You know what? There is an Italian proverb that doesn’t make much sense in English: translated literally, it roughly says “something can grow out of something else”, though the English “one thing leads to another” carries the same idea.
This is what happens when you are talking, reading or writing about something and suddenly come up with a new idea and fresh points for the conversation.
That’s what happened to me while reading the article on SEO Mofo about how to spam your competitor’s SERPs. I wish this kind of thing didn’t happen, but unfortunately Black Hat SEO techniques are unlikely to die.
I guess this wasn’t the purpose of the SEO Mofo post, but the article opened my eyes, and since some things simply can’t be avoided, the best solution is trying “to live together as good friends”. So I tried to imagine a couple of guidelines that can mitigate the negative effects of a possible “query spam” attack.
Manage your DNS properly
From the blog example it is clear that creating a wildcard (“ALL”) record in your DNS is not a good choice. Since some providers let people manage their own DNS records, have a look and check whether such a record is there: a star (*) generally indicates it.
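To make this concrete, here is a minimal sketch of what a wildcard record looks like in a BIND-style zone file (example.com and the IP address are placeholders); the last line is the one to look for and, ideally, remove:

```
; Zone file for example.com (placeholder values throughout)
$TTL 3600
@     IN  A  203.0.113.10   ; example.com itself
www   IN  A  203.0.113.10   ; www.example.com
*     IN  A  203.0.113.10   ; wildcard: ANY subdomain resolves here
```

With that wildcard in place, anything-you-like.example.com resolves to your site, which is exactly what a query spam attack needs.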
Apart from the scenario above, there is another good reason, and it has to do with website promotion.
The canonicalization concept has been going around the net for a few years now, and many articles have tried to explain how to avoid the problem; the search engines themselves suggest promoting only one version of the same website.
I don’t want to go deep into canonicalization, as I expect you already know what it is. Given that, it is clear that having your website accessible at more than one URL, whether through a subdomain or the second-level domain itself, is not useful at all.
So why bother creating such a record in your DNS, making the website reachable at any hostname anyone can come up with?
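If you cannot remove the wildcard record, or simply want to consolidate the usual www/non-www variants, a 301 redirect to a single canonical hostname limits the damage. Here is a minimal sketch using Apache’s mod_rewrite in an .htaccess file; www.example.com stands in for your canonical hostname:

```
# Redirect every non-canonical hostname to the canonical one.
# Requires mod_rewrite; www.example.com is a placeholder.
RewriteEngine On
RewriteCond %{HTTP_HOST} !^www\.example\.com$ [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
```

This way, whatever hostname a visitor (or a spammer’s link) uses, search engines are told that only one version of the site exists.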
Robots.txt and the exclusion protocol
We all know what a robots.txt is, don’t we? All right, maybe you don’t, so I’ll briefly tell you. Robots.txt is a file that informs the search engines (those that respect the protocol) which files or folders we don’t want crawled.
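For reference, a typical robots.txt looks like the sketch below; the paths are placeholders:

```
# robots.txt: asks compliant crawlers to stay out of these paths.
# Note: this file is publicly readable and is only a request.
User-agent: *
Disallow: /private/
Disallow: /admin/
```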
However, a good SEO should know that in certain circumstances robots.txt directives may be ignored; Google, for instance, can still list a blocked URL in its results if other pages link to it.
So why would you list in your robots.txt specific files and folders that contain only private stuff? Whenever possible, just use an .htaccess file or implement some other access restriction on the directory to keep unauthorized visitors out. This limits the chance of revealing the structure of your website and where your important documents are.
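As a sketch of what such a restriction could look like on Apache, here is HTTP Basic authentication via .htaccess; the realm name and the .htpasswd path are placeholders, and the password file itself is created separately with the htpasswd utility:

```
# .htaccess inside the private directory: require a valid login.
AuthType Basic
AuthName "Private area"
# Placeholder path; keep the .htpasswd file outside the web root.
AuthUserFile /home/example/.htpasswd
Require valid-user
```

Unlike a robots.txt entry, this keeps the directory out of reach without advertising its existence to anyone who bothers to read a public file.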
The suggestions above won’t prevent every Black Hat SEO scenario, nor are they intended to mitigate all the side effects connected to the article I read, but I guess we can discuss it further, maybe by starting another SEO conversation?