
Make a web server stop serving certain HTML elements

The MediaWiki content management system creates many links to webpages that I don't want search engine crawlers to discover.

It's not only that I don't want them indexed, and not only that I don't want them crawled: I don't even want them discovered!

In theory I could customize the skin (theme/template) of my MediaWiki website to remove the HTML elements that link to these webpages, but doing so sanely requires learning a great deal about the MediaWiki architecture, which I'd prefer to avoid if simpler solutions are available.

  • CSS display: none won't help, as the markup would still be present in the DOM.
  • JavaScript document.querySelector("#x").remove(); won't help, because until it runs, crawlers may discover the link element.
  • I cannot use PHP 8.1.3 to undo its own earlier output, because once any markup containing such a link has been processed, it is served to the user.
  • I can use robots.txt to try to prevent crawling (if not indexing) of these pages, but since my website URLs are multilingual and follow many patterns, this might be a hard task.
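
If the difficulty with robots.txt is mainly the number of language and path combinations, the rules could be generated rather than written by hand. The following is a minimal sketch under assumptions that are not from the question: URLs are assumed to look like /<language>/<page path>, and the language codes and path prefixes shown are placeholders. Note that this only asks crawlers not to fetch those URLs; it does not hide the links themselves.

```python
def build_robots_txt(lang_prefixes, blocked_paths, user_agent="*"):
    """Build robots.txt text with one Disallow rule per language/path pair.

    Assumes (hypothetically) that URLs have the form /<lang>/<path>.
    """
    lines = [f"User-agent: {user_agent}"]
    for prefix in lang_prefixes:
        for path in blocked_paths:
            lines.append(f"Disallow: /{prefix}/{path}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder language codes and path prefixes; replace with the real ones.
    print(build_robots_txt(["en", "he"], ["Special:", "index.php?"]), end="")
```

The output can be written to the robots.txt file at the site root whenever the list of languages or blocked path prefixes changes.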

The only trick that might be left is to somehow ask the server not to serve any such markup, selected by CSS ID or class.
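
To make the idea concrete, here is a rough sketch of such a filter: it re-emits an HTML document while dropping every element whose id or class is on a block list. It is an illustration only, not a MediaWiki or web server feature. The names ElementStripper and strip_elements, and the example ID and class values, are hypothetical, and the code assumes the blocked elements are well-formed (every non-void start tag has a matching end tag). It would still need to be wired into whatever sits between PHP and the client, for example a filtering reverse proxy.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not change nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class ElementStripper(HTMLParser):
    """Re-emits HTML while dropping elements matched by id or class."""

    def __init__(self, blocked_ids=(), blocked_classes=()):
        # Keep character references as written instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.blocked_ids = set(blocked_ids)
        self.blocked_classes = set(blocked_classes)
        self.out = []
        self.skip_depth = 0  # > 0 while inside a blocked element

    def _is_blocked(self, attrs):
        attr_map = dict(attrs)
        if attr_map.get("id") in self.blocked_ids:
            return True
        classes = (attr_map.get("class") or "").split()
        return any(c in self.blocked_classes for c in classes)

    def handle_starttag(self, tag, attrs):
        if self.skip_depth:
            if tag not in VOID_TAGS:
                self.skip_depth += 1  # nested element inside a blocked one
        elif self._is_blocked(attrs):
            if tag not in VOID_TAGS:
                self.skip_depth = 1  # start skipping until the matching end tag
        else:
            self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing syntax such as <br/>: emit or drop, depth is unchanged.
        if not self.skip_depth and not self._is_blocked(attrs):
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if self.skip_depth:
            if tag not in VOID_TAGS:
                self.skip_depth -= 1
        else:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(data)

    def handle_entityref(self, name):
        if not self.skip_depth:
            self.out.append(f"&{name};")

    def handle_charref(self, name):
        if not self.skip_depth:
            self.out.append(f"&#{name};")

    def handle_comment(self, data):
        if not self.skip_depth:
            self.out.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")


def strip_elements(html, blocked_ids=(), blocked_classes=()):
    """Return html with all elements matching the given ids/classes removed."""
    parser = ElementStripper(blocked_ids, blocked_classes)
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

For example, calling strip_elements(page, blocked_ids={"x"}) on a page containing <li id="x"><a href="/hidden">...</a></li> would return the page without that list item, so the link never reaches the client. Parsing and rewriting every response adds latency, and malformed markup inside a blocked element could cause more or less to be removed than intended.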

Crude as it may be, can it work? If not, what other options do I have left?

Mat
If you don't want stuff discovered, don't put it on the public web. Keep your private stuff behind required authentication.
If MediaWiki does not support your requirements, you should look into other software for the purpose that supports the requirements. That is the only reasonable and maintainable way to reach your objectives. All other methods require lots of effort and can have many undesired side effects.
@TeroKilkanen I strongly agree. I would migrate to Drupal, but the site already has 2400 webpages, and manually transferring the content could take about 4 months and would be hard. I also like MediaWiki syntax a lot.
I can use **robots.txt** to try to prevent crawling (if not indexing) of these pages, but since my website URLs are multilingual and follow many patterns, this might be a hard task. Still, it is much easier than migrating to Drupal.