Bots visiting the Kaprekar Site
What is a bot/robot?
A robot is an automated program that accesses a web site and traverses it by following the links on its pages.
Visits from bots are quite normal for any web site.
The most common robots on my site are search engine robots.
Robots are also referred to as Web Crawlers, or Spiders.
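The link-following behaviour described above can be sketched in a few lines of Python. This is only an illustration, not how any particular robot listed on this page works: the `LinkCollector` and `crawl` names, the `example.test` URLs, and the in-memory `PAGES` dictionary are all made up for the example, and a real crawler would fetch pages over HTTP and respect robots.txt.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch):
    """Visit pages breadth-first; fetch(url) returns HTML text or None."""
    seen = {start_url}
    queue = deque([start_url])
    order = []
    while queue:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue  # page does not exist, nothing to follow
        order.append(url)
        parser = LinkCollector()
        parser.feed(html)
        for href in parser.links:
            target = urljoin(url, href)  # resolve relative links
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return order


# Hypothetical three-page site held in memory, so nothing goes over the network.
PAGES = {
    "http://example.test/": '<a href="a.html">A</a> <a href="b.html">B</a>',
    "http://example.test/a.html": '<a href="/">Home</a> <a href="b.html">B</a>',
    "http://example.test/b.html": '<a href="missing.html">Broken</a>',
}

visited = crawl("http://example.test/", PAGES.get)
```

The `seen` set is what keeps the robot from looping forever when pages link back to each other, as the home page and `a.html` do here.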
Baidu.com is China's largest search engine. "Set up in 1999 in California's Silicon Valley, Beijing-based Baidu says it is China's most popular search engine, averaging tens of millions text searches a day in Chinese alone." [yahoo news]
Trigger: I don't know where it got my site from!
Googlebot is Google's web-crawling robot. It collects documents from the web to build a searchable index for the Google search engine.
Trigger: I submitted my site to Google for indexing using their Add URL page.
Trigger: Got this Googlebot user agent after adding Google Ads to my pages.
IBM Almaden Research Center
This robot is from the IBM Almaden Research Center. The site mentioned in the robot's user agent does not give any information about what this robot is for.
Trigger: They have visited my site only once; I don't know what the trigger was.
The Alexa crawler
OK, it's the Alexa crawler.
Trigger: I submitted my site to be archived, by using their add url link.
iSiloX is the desktop application that converts content to the iSilo™ 3.x/4.x document format, enabling you to carry that content on your Palm OS® PDA, Pocket PC PDA, Windows® CE Handheld PC, or Windows® computer for viewing using iSilo.
iSiloXC is the command-line version of iSiloX.
Larbin is a web crawler. It was initially developed for the XYLEME project in the VERSO team at INRIA.
ScanSoft.com has no information on this robot. It's supposedly part of a research project at ScanSoft.
The web crawler for the Teoma.com search engine.
"Mozilla/2.0 (compatible; Ask Jeeves/Teoma)"
Is this a bot? No idea!
"Mozilla/4.0 (compatible; Donut : L 15; Windows 98;)"
This is the GPU Distributed Search Engine crawler.
"Mozilla/4.0 (compatible; GPU p2p crawler)"
"Mozilla/4.0 (compatible; MSIE 5.0; Windows NT; Girafabot; girafabot at girafa dot com; http://www.girafa.com)"
Freshmeat URL Validator
"Mozilla/5.0 (compatible; fmII URL validator/1.1)"
Trigger: Used by freshmeat.net to validate the URLs I submit for my project entries there.
"Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)"
Trigger: I submitted my site to their directory.
MSNBot is a prototype web-crawling robot developed by MSN Search (http://search.msn.com/).
Trigger: I don't know how they heard of my site, but I don't mind!
W3C Markup Validation Service
Trigger: Agent for the W3C Markup Validation Service. Triggered whenever anyone (mostly me) checks whether my site is valid XHTML.
W3C Link Checker
"W3C-checklink/3.9.2 [3.17] libwww-perl/5.79"
Trigger: Agent for the W3C Link Checker page. Triggered whenever anyone (I!!) tries to check the links on my site.
World Wide Web Offline Explorer
The wwwoffle program is a simple proxy server with special features for use with dial-up internet links. It lets you browse web pages and read them without having to remain connected.
There has been only one hit registered using this user agent on my site.
Trigger: Anyone accessing my site using the wwwoffle program.
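To spot these visitors in a server log, one simple approach is to look for a distinctive substring in each user agent string. The sketch below uses only the user agent strings quoted on this page. The `KNOWN_AGENTS` table and the `identify` function are illustrative names I made up, not part of any log analysis tool, and user agents are easy to fake, so a match is a hint rather than proof.

```python
# Distinctive substrings taken from the user agent strings quoted on this page,
# paired with the name of the robot or tool that sends them.
KNOWN_AGENTS = [
    ("Ask Jeeves/Teoma", "Teoma crawler"),
    ("GPU p2p crawler", "GPU Distributed Search Engine crawler"),
    ("Girafabot", "Girafabot"),
    ("fmII URL validator", "Freshmeat URL Validator"),
    ("Yahoo! Slurp", "Yahoo! Slurp"),
    ("W3C-checklink", "W3C Link Checker"),
]


def identify(user_agent):
    """Return the name of a known robot, or None if nothing matches."""
    lowered = user_agent.lower()
    for marker, name in KNOWN_AGENTS:
        if marker.lower() in lowered:
            return name
    return None


teoma = identify("Mozilla/2.0 (compatible; Ask Jeeves/Teoma)")
checker = identify("W3C-checklink/3.9.2 [3.17] libwww-perl/5.79")
unknown = identify("Mozilla/4.0 (compatible; Donut : L 15; Windows 98;)")
```

The "Donut" agent returns None because it is not in the table, which matches the "Is this a bot? No idea!" entry above.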