Some time ago now, I made a note of when I first submitted this site to Google. The big question was how long it would take Google to come and find this new site, and then how long before the first visitors found it.
The results are in!
Looking at the log files just now shows that “Googlebot/2.1 (+http://www.google.com/bot.html)” (which is the Google spider that visits sites and gets information about them) visited for the first time on 10/Jan/2005 at 07:42 (and 54 seconds) GMT.
Google is a well-behaved search engine because the first file it tried to get was the robots.txt file. This file essentially tells a search engine what it can and cannot do on a web site. So, for example, I could tell it to index everything it can find on the web site except for things in two particular folders, /temp and /private. There’s no guarantee a search engine will follow the robots.txt instructions, but all the good ones do.
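To make that concrete, here is a minimal sketch of what such a robots.txt might look like, checked with Python's standard `urllib.robotparser` module. The file contents and the page paths are made up for illustration (this is not the actual robots.txt from this site), and it only shows how a polite crawler could interpret the rules, not how Googlebot itself does it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: every crawler may index everything
# except the /temp and /private folders.
ROBOTS_TXT = """\
User-agent: *
Disallow: /temp/
Disallow: /private/
"""


def build_parser(text: str) -> RobotFileParser:
    """Load robots.txt rules from a string instead of fetching them."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# A polite crawler asks before fetching each page (paths are examples).
for path in ("/index.html", "/temp/draft.html", "/private/notes.html"):
    allowed = parser.can_fetch("Googlebot", path)
    print(path, "allowed" if allowed else "blocked")
```

Nothing here enforces the rules: the crawler has to choose to ask, which is why only the good ones can be relied on to stay out of those folders.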
Having read the robots.txt file, Googlebot waited another hour before it came back to look at all the pages. It then read a page every five minutes or so, which is standard practice for a search engine spidering a site, so that it doesn’t overload the site with requests (and stop the site working).
The first visitor came to this site by searching for “scheema definition” (and if you follow that link you’ll perform the same search on Google, though the result may be different by the time you do that). Amazingly (I think), they came on 11/Jan/2005 at 19:27:02 GMT, just under 36 hours after Google first found this site!
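For anyone who wants to check the arithmetic, here is a small sketch that works out the gap between the two log timestamps quoted above. The format string and the helper name `parse_gmt` are my own choices for illustration, and parsing the month name "Jan" assumes an English locale.

```python
from datetime import datetime, timezone

# Day/month/year layout as quoted from the log entries in this post.
LOG_TIME_FORMAT = "%d/%b/%Y %H:%M:%S"


def parse_gmt(stamp: str) -> datetime:
    """Parse a log-style timestamp and mark it as GMT (UTC)."""
    return datetime.strptime(stamp, LOG_TIME_FORMAT).replace(tzinfo=timezone.utc)


first_crawl = parse_gmt("10/Jan/2005 07:42:54")    # Googlebot's first visit
first_visitor = parse_gmt("11/Jan/2005 19:27:02")  # first search visitor

gap = first_visitor - first_crawl
print(gap)                                    # 1 day, 11:44:08
print(round(gap.total_seconds() / 3600, 1))   # 35.7 (hours)
```

So the first visitor arrived 35 hours and 44 minutes after the first crawl.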
Now that I’ve got a search engine to feed, I’d better write some more fodder…erm, text for it to consume.