User-agent pattern | Description | Category | |
---|---|---|---|
| -- | Suchmaschine | |
AAfter | AAfter looks like a legit search engine. () | Suchmaschine | |
AboutUsBot | AboutUsBot is used by the About Us website to determine the contents, aspect, logo and owner of a website. It is legit. | Suchmaschine | |
aiHitBot | aiHitBot seems to be a legit search engine. () | Suchmaschine | |
ia_archiver | Alexa ia_archiver () | Suchmaschine | |
almaden | almaden, Unstructured Information Analysis and Search @ IBM () | Suchmaschine | |
Scooter | AltavistaBot, Altavista () | Suchmaschine | |
aport | Aport () | Suchmaschine | |
appie | Appie-spider/Walhello () | Suchmaschine | |
ApptusBot | ApptusBot is the Apptus crawler bot, some business driven search engine. () | Suchmaschine | |
Ask Jeeves | Ask Jeeves, Teoma () | Suchmaschine | |
askpeter_bot | Ask Peter is a German-based search engine () | Suchmaschine | |
ASPseek | ASPSeek () | Suchmaschine | |
Baiduspider | Baidu search engine web crawler. Not always respectful of robots.txt, sometimes a bit pushy as well. () | Suchmaschine | |
BecomeBot | BecomeBot shopping search () | Suchmaschine | |
Blazer | Blazer Browser, Sharp Zaurus () | Suchmaschine | |
CatchBot | CatchBot is a business page crawler. They claim to resell information for companies, academics and various professional fields. Bot behaves correctly. | Suchmaschine | |
abby/ | Ellerdale determines trends through the semantic web, usually by gathering recent tweets or Facebook entries. () | Suchmaschine | |
ExaBot | Exalead Exabot () | Suchmaschine | |
facebookexternalhit | Facebook External Hit () | Suchmaschine | |
fast | FAST-WebCrawler () | Suchmaschine | |
sitedossier.com | Featuring sitedossier.com as a referer, the IP 69.71.222.186 seems to check for websites recently crawled by one of their competitors, domaintools. Seems harmless. () | Suchmaschine | |
feedfetcher | Feedfetcher Google, gathers news feeds from websites () | Suchmaschine | |
Feedtrace-bot | Feedtrace-bot makes a list of the most popular Twitter feeds (parses the most recent feeds around the clock). () | Suchmaschine | |
ftxbrowser | ftxBrowser, Windows CE () | Suchmaschine | |
gais | Gais () | Suchmaschine | |
Gigabot | Gigablast's Gigabot () | Suchmaschine | |
Mediapartners | Google AdSense () | Suchmaschine | |
Google Desktop | Google Desktop is a desktop data manager/search. It should be harmless. () | Suchmaschine | |
googlebot | Googlebot () | Suchmaschine | |
ichiro | ichiro @ Goo Japan / Inktomi () | Suchmaschine | |
IconSurf | Icon Surf () | Suchmaschine | |
ICRA_Semantic_spider | ICRA semantic spider, Internet Content Rating association. () | Suchmaschine | |
infoseek | InfoSeek () | Suchmaschine | |
JS-Kit | JS-Kit is a blabla software for blogs. It usually connects here and there to promote their stuff through curiosity. For that reason no URL is provided here. | Suchmaschine | |
Linguee Bot | Linguee Bot is a legit search engine bot. However it WILL get banned from your Beamreactor enabled website for its extreme crawling speed with the argument 'flood'. () | Suchmaschine | |
LinkWalker | LinkWalker () | Suchmaschine | |
grub | Looksmart/Grub () | Suchmaschine | |
Mail.RU | mail.ru (Поиск@mail.ru) is tied to the mail.ru search engine () | Suchmaschine | |
MJ12bot | Majestic-12 distributed search engine bot () | Suchmaschine | |
MetaQuerier | MetaQuerier (University of Illinois in Urbana-Champaign) () | Suchmaschine | |
bingbot | Microsoft Bing () | Suchmaschine | |
Media Center PC 5.0 | Microsoft EnhanceIE enables Windows users to mod their useragents and emulate other user agents (!?) () | Suchmaschine | |
MLbot | MLBot is an mp3/video crawler. The true purpose of MLbot is undisclosed but might be related to piracy protection. This robot is fairly clean. () | Suchmaschine | |
Yahoo-MMCrawler | MM Crawler looks for images on the web. () | Suchmaschine | |
MOT-A768 | Motorola A768 browser client. Might be fairly harmless. | Suchmaschine | |
msnbot | MSN Search Crawler () | Suchmaschine | |
MSTV | MSTV WebTV () | Suchmaschine | |
MyIE2 | MyIE2 @ Turkey? | Suchmaschine | |
netcraft | Netcraft () | Suchmaschine | |
Naverbot | NHN Corp bot/Naver.com () | Suchmaschine | |
Ocelli | Ocelli Engineering search () | Suchmaschine | |
OnetSzukaj | OnetSzukaj () | Suchmaschine | |
avantgo | PalmOs AvantGo () | Suchmaschine | |
PSbot | Picsearch web crawler () | Suchmaschine | |
plucker | Plucker Browser, Windows CE () | Suchmaschine | |
Plukkie | Plukkie: a search engine robot, fairly harmless. () | Suchmaschine | |
pompos | Pompos () | Suchmaschine | |
Moo | qsdfqs () | Suchmaschine | |
quepasacreep | QuePasaCreep () | Suchmaschine | |
StackRambler | Rambler search robot () | Suchmaschine | |
RSScache | RSS Cache website bandwidth saver () | Suchmaschine | |
SapphireWebCrawler | SapphireWebCrawler crawls for a computer science project from Carnegie Mellon University. | Suchmaschine | |
scrubby | scrubby () | Suchmaschine | |
Shim | Shim Crawler (University of Tokyo) () | Suchmaschine | |
slurp | Slurp () | Suchmaschine | |
spbot | spbot; "we just want to find out to which web pages you link to" () | Suchmaschine | |
Speedy Spider | Speedy Spider is a part of the highly advanced search engine Entireweb.com, that was developed in Halmstad, Sweden during 1998-2000. () | Suchmaschine | |
sproose | Sproose Crawler () | Suchmaschine | |
Apple-PubSub | The PubSub client is checking your RSS for an Apple computer owner! Don't remove or block this client/IP. () | Suchmaschine | |
bnf.fr_bot | This robot comes from the French National Library. It makes a web archive of your website for various reasons and may or may not respect robots.txt, depending on its settings. Harmless nonetheless. () | Suchmaschine | |
seznambot | Tied to the Seznam Czech search engine. () | Suchmaschine | |
turnitin | Turn It In () | Suchmaschine | |
Twiceler | Twiceler is the legit Cuil search engine crawler () | Suchmaschine | |
Twingly Recon | Twingly Recon is an RSS parser focused on blogs. Usually triggered by syndication tools, such as third-party Facebook / Twitter posting APIs. () | Suchmaschine | |
Twitturls | Twitter URL parser. Someone linked to your content on Twitter. () | Suchmaschine | |
Vagabondo | Vagabondo () | Suchmaschine | |
VideoSurf_bot | VideoSurf bot looks for videos. It picks the URLs it visits from social networks, so a visit may mean that some of your website's content was posted on Twitter or Facebook. | Suchmaschine | |
VoilaBot | VoilaBot is from the Voila search engine, owned by the "France Telecom - Orange" group. Basically harmless. () | Suchmaschine | |
Jigsaw | W3C CSS validator - JFouffa () | Suchmaschine | |
W3C_Validator | W3C Validator () | Suchmaschine | |
WMP | Windows Media Player | Suchmaschine | |
Xenu Link Sleuth | Xenu Link Sleuth validates your website for dead links () | Suchmaschine | |
Yahoo! Mindset | Yahoo Mindset () | Suchmaschine | |
Yandex | Yandex. A trailing I refers to search; a trailing H looks for mirror copies, P for images, F for favicons, D for websites declared to Yandex, and B for RSS. () | Suchmaschine | |
YandexSomething | YandexSomething searches for news related RSS feeds for their news system. () | Suchmaschine | |
zyborg | Zyborg () | Suchmaschine | |
ImagesiftBot | AI image analysis crawler () | KI-Crawler | |
Anthropic-AI | Anthropic AI services general agent () | KI-Crawler | |
Claude-Web | Anthropic Claude web crawler () | KI-Crawler | |
Bytespider | ByteDance (TikTok) AI crawler () | KI-Crawler | |
CCBot | CCBot, or CommonCrawl bot, has claimed since 2009, like many others, that it will bring interesting search content for xyz in the near future. It offers nothing but crawling stats. Its CC name suggests "Creative Commons" but has NOTHING in common with it. () | KI-Crawler | |
ChatGPT-User | ChatGPT user agent for browsing () | KI-Crawler | |
cohere-ai | Cohere AI language model crawler () | KI-Crawler | |
Diffbot | Diffbot AI extraction and analysis () | KI-Crawler | |
Omgilibot | Omgili conversation analysis bot () | KI-Crawler | |
GPTBot | OpenAI GPT crawler for AI training () | KI-Crawler | |
PerplexityBot | Perplexity AI search crawler () | KI-Crawler | |
MyIE2 | | Scraper | |
_viewer | -- | Scraper | |
<? | '<?': Some script kiddo attempted to bypass your website securities through php injection. | Scraper | |
aboundex | Aboundexbot claims to index websites, and whilst it abides by robots.txt, its activity remains doubtful. () | Scraper | |
America Online | America Online is a rather weak imitation of the AOL referer and hides an aggressive forum spammer | Scraper | |
synapse | Apache Synapse isn't documented. Frequently seen. Suspicious. | Scraper | |
avantbrowswer.com | Avant Browser Second Street Research @ Shawcable.net proxy, AB, CA | Scraper | |
BackStreet | BackStreet Browser | Scraper | |
Bluecoat | Bluecoat DRTR | Scraper | |
Brick House | Brick House | Scraper | |
calif univ | Calif Univ Tools @ btcentralplus.com | Scraper | |
Cam finder | Cam finder | Scraper | |
Orbiter | DailyOrbit's Orbiter, supposedly dead. Shouldn't crawl your website. | Scraper | |
domainratio | Domainratio bot belongs to a website that claims to "sort interesting websites", but is really a whois frontend with a lot of advertisement. () | Scraper | |
drupal | Drupal web management () | Scraper | |
Daum | EDI/Edacious & Intelligent Web Robot | Scraper | |
EmeraldShield | EmeraldShield.com web spider () | Scraper | |
FlashGet | FlashGet | Scraper | |
Franklin | Franklin Locator (eclipse.net.uk) @ XO Communications, Reston, VA, US | Scraper | |
FunWebProducts | FunWebProducts enters the Adware/spyware category with their set of dirty toys from IWon. Their bot can be related to Mr Sputnik. | Scraper | |
gqbi | gqbi hnxupsxgfgnX berXjteu (!) | Scraper | |
http generic | http generic @ mchsi.com, Mediacom, NY, US | Scraper | |
Huasai | Huasai ignores robots.txt. It is a harvester; its purpose is unknown. | Scraper | |
huaweisymantec | Huawei Symantec; Chinese bot that claims to "fix websites security holes". It isn't in any way related to Symantec, and most probably a scam. () | Scraper | |
IE/4.0 | IE/4.0 @ mesh.ad.jp, JP | Scraper | |
intelium_bot | Intelium does respect robots.txt, doesn't flood, but is much too discreet to be considered safe. | Scraper | |
Internet Explore 5.x | Internet Explore 5.x @ Dynegy-Comm, Beijing, CN | Scraper | |
ISC Systems | ISC Systems iRc Search 2.1 | Scraper | |
Java/ | Java/xxxx. Various users, usually used for cheap crawlers (hispeed.ch), more rarely for harmful actions. | Scraper | |
Java1.3.1 | Java1.3.1 @ antelecom.net | Scraper | |
Java1.4.0_02 | Java1.4.0_02 @ Speakeasy.NET, US | Scraper | |
jobo | JoBo/1.3 @ Technikzentrum Luebeck tzl.de, DE () | Scraper | |
libwww-perl | libwww-perl | Scraper | |
linguee | Linguee bot. Flooder () | Scraper | |
Mac Finder | Mac Finder 1.0.26 @ rr.com | Scraper | |
Mac_Power | Mac_Power | Scraper | |
mnoGoSearch | mnoGoSearch () | Scraper | |
Mozilla(IE Compatible) | Mozilla(IE Compatible) @ UNSX, RU | Scraper | |
indy library | Mozilla/3.0 (compatible; Indy Library) @ Bijing Gold, sina.com, CN | Scraper | |
MRSPUTNIK | Mr Sputnik is a strong adware/malware crap from IWon - maybe linked to hiyo.com | Scraper | |
nerdbynature | Nerd By Nature indexes French and German websites. It is supposed to establish maps of links tied to a website, but does it in some unobvious way. () | Scraper | |
Nextopia | NextopiaBOT () | Scraper | |
OmniExplorer | Omni Explorer () | Scraper | |
PlantyNet_WebRobot | PlantyNet Web Robot @ hinet.net, TW | Scraper | |
Poirot | Poirot | Scraper | |
Program Shareware | Program Shareware | Scraper | |
RPT-HTTPClient | RPT-HTTPClient/0.3-3 | Scraper | |
Search17Bot | Search17Bot claims to be a semantic search engine. It is closed to the public, therefore might be anything. | Scraper | |
second life lsl | Second Life LSL. LSL is a programming language for the Second Life and OpenSIM game environments. In the wrong hands, and provided Linden Lab doesn't check the contents of outgoing traffic, it can be used to flood, spam or seriously hit a website. () | Scraper | |
SiteBot | SiteBot is a link collector. Since its origin and customers aren't disclosed, it may be considered privacy infringing or some cheap harvester. () | Scraper | |
HMSE | Spammer | Scraper | |
surveybot | SurveyBot () | Scraper | |
teleport pro | Teleport pro @ interbusiness.it, IT | Scraper | |
tiehttp | Tiehttp: related to AskPeter, the tiehttp software was developed as freeware by Kyriacos Michael for the Delphi platform. It is a free bot. Normally shouldn't be on your website. () | Scraper | |
W3CRobot | W3CRobot/5.4.0 @ CommunityEngine, Tokyo, JP | Scraper | |
wbdbot | WBD bot @ hostcasters.com, TX, USA | Scraper | |
WebDataCentre | WebDataCentre is, according to their website, yet another 'team of scientists' cruising the web with automated systems to reveal the future of the internet, or whatever. The bot ignores robots.txt and leeches full website content whenever it finds a (yet undetermined) trigger word; otherwise it goes away after hitting the website homepage. () | Scraper | |
WebDAV | WebDAV shared server document editor for Excel () | Scraper | |
WGet | WebGet "Multi-Threaded File Downloader". It leeches your content. () | Scraper | |
WEP search | WEP search 00 @ rr.com, USA | Scraper | |
lwp | lwp-trivial/1.32 & LWP::Simple/5.48 @ OLM Llc, Lisle, IL, US | Scraper | |
wsowner | WSOwner is a poorly maintained PHP crawler tied to a broken website. () | Scraper | |
Zeus 2.6 | Zeus 2.6 @ Dynegy-Comm, Beijing, CN | Scraper | |
Psycheclone | | Malware | |
8484 boston | -- | Malware | |
cerberian | Cerberian drtrs @ TW () | Malware | |
core-project | core-project/1.0 frontpage exploiter | Malware | |
DataCha0s | DataCha0s | Malware | |
DigExt | Dig Extense | Malware | |
GalaxyBot | GalaxyBot/1.0 (http://www.galaxy.com/galaxybot.html) @ Logika Corp, Chicago, IL, US () | Malware | |
GetRight | GetRight () | Malware | |
google_three_web | Google_three_web is... not Google, obviously. Related to #*$! viewer, probably other '_viewers' using Larbin. Ignores robots.txt and insists on trying to reach documents forbidden by robots.txt | Malware | |
Green Research, inc | Green research, inc [ Nigerian 419-scam email ] @ linkserve Nigeria, linkserve.com.ng, NG | Malware | |
metabot | human-guided@lerly.net @ Cogent Co, DC, US | Malware | |
IPiumBot | IPiumBot laurion(dot)com @ CommunityEngine, Tokyo, JP | Malware | |
JikeSpider | Jike is a very doubtful Chinese crawler, tied to phishing and spyware () | Malware | |
Lachesis | Lachesis @ NEC Research Inst. Corp., Princeton, NJ, US | Malware | |
URL control | Microsoft URL Control - 6.00.8862 () | Malware | |
chartercom.com | Microsoft URL Control - 6.00.8862 @ chartercom.com | Malware | |
Missigua | Missigua Locator 1.9 | Malware | |
NetResearchServer | NetResearchServer/2.7 @ RNCI New Media, Pittsburgh, PA, US. Theoretically bankrupt. | Malware | |
nhnbot | NHNbot@naver.com, KR | Malware | |
Offline Explorer/([0-9].[0-9]{ | OfflineExplorer | Malware | |
openfind | Openfind/Openbot 3.0+ (robot-response@openfind.com.tw) @ OpenFind.com.tw, HINET-NET, CHTD, TW () | Malware | |
PHP version tracker | PHP version tracker | Malware | |
Port Huron Labs | Port Huron Labs @ cox.net, GA, USA | Malware | |
Purebot | Purebot, malicious Content Scraper and Spam Agent, rule breaker. () | Malware | |
river valley | River Valley inc @ cox.net, GA, USA | Malware | |
Snapbot | Snapbot | Malware | |
TopBlogsInfo | TopBlogsInfo is a spammer, potentially harmful. | Malware | |
URL_Spider_SQL | URL_Spider_SQL/1.0 | Malware | |
WebCopier | webcopier @ Hughes Network Systems / HOT, hns.com, DE | Malware | |
webdup | Webdup/0.9 @ Chinanet-BJ Beijing Province network, Beijing, CN | Malware | |
wells | Wells search @ NL | Malware | |
yanga | Yanga WorldSearch is a dangerous personal data harvester. They are probably related to identity theft. | Malware | |
fimap | Yet another free, yet dangerous, piece of software that leads to catastrophes in the wrong hands: a Python tool that can find, prepare, audit, exploit and google automatically for local and remote file inclusion bugs in webapps. Someone wants to upload crap onto your website. A single hit is usually just a visit; multiple hits reveal a clear attempt to ruin your website and should be monitored carefully. () | Malware | |
AddThis.com | AddThisCrawler, tied to the "Add This" social network plugin. () | Legitimate | |
ahrefs | Ahrefs indexes the links of websites. It doesn't abide by robots.txt. () | Legitimate | |
arste.info | arste.info, related to AskPeter.info, probably running the tiehttp software, crawls for a cheap search engine from Germany. Doesn't abide by robots.txt. | Legitimate | |
BlackBerry | BlackBerry device browser. Normally not a threat. () | Legitimate | |
DotBot | DotBot claims to be making a structure display of the web. Whilst fairly open about it, it is unverifiable, hence the level 1 rank. () | Legitimate | |
findlinks | Find Links () | Legitimate | |
HTTPRetriever | HTTP Retriever PHP class | Legitimate | |
InfoPath.1 | InfoPath is a Microsoft web environment/framework normally not supposed to reach the web (most generally limited to a LAN/WAN). Whilst not a big threat, it is still doubtful () | Legitimate | |
Jakarta Commons | Jakarta Commons Java HTTP client () | Legitimate | |
Lipperhey | Lipperhey usually crawls websites to advertise their SEO tools () | Legitimate | |
mAgent | mAgent is an adware at the user browser level | Legitimate | |
MyFamilyBot | MyFamilyBot () | Legitimate | |
Nutch | Nutch robot software @ Apache () | Legitimate | |
page_verifier | page_verifier is Secure Computing's anti-malware () | Legitimate | |
ParchBot | ParchBot is a robot from Parchment Hill supposedly optimised to provide websites instead of webpages during a search. Bot is currently down (03/2008) () | Legitimate | |
Pingdom | Pingdom Website monitoring () | Legitimate | |
ShareThisFetch | ShareThisFetch. Probably linked to the twitter or facebook API. Respects robots.txt and only parses documents. | Legitimate | |
SheenBot | SheenBot belongs to Amazon web services (cloud computing). The behaviour can be mixed, but it is most often harmless. () | Legitimate | |
Sogou | Sogou web crawler is dirty: it systematically ignores robots.txt (although it seems to parse it). Sogou is otherwise a legit search engine. | Legitimate | |
Steeler | Steeler () | Legitimate | |
Szukacz | Szukacz () | Legitimate | |
ezooms | There is very little information about Ezooms. It behaves correctly, abides by robots.txt and doesn't flood websites. | Legitimate | |
WebVulnCrawl | WebVulnCrawl.blogspot.com () | Legitimate | |
wotbox | Wotbox complies with robots.txt, and doesn't flood. It is tied to an obscure search engine. () | Legitimate | |
CMS Survey | CMS Survey belongs to punkt.de, a CMS development company. This robot seems to be "visiting" their competitors. It follows robots.txt, but comes with no explanation. () | Unbekannt | |
compspybot | CompSpyBot - Competitive Spying and Scraping. This robot is probably a joke made by some bored wannabe James Bond or content leecher. It does seem to abide by robots.txt. () | Unbekannt | |
F-Bot test pilot | f-bot test @ pilotad.jp | Unbekannt | |
larbin | Larbin is a multi purpose bot. Usually not too critical. | Unbekannt | |
NaverRobot | minibot(NaverRobot) @ KORNET-NETINFRA-JUNGANG-KR, Seoul, KR | Unbekannt | |
Missouri College | Missouri College Browse @ Sprint DSL-Net, sprint-hsd.net, KS, US | Unbekannt | |
natcrawl | natcrawler (France Telecom Interactive, Orange, Voila, Voilachat, etc) often misses robots.txt. Worse, it can suddenly LOOP over the very same page tens of thousands of times; many examples of this behavior can be found on the web. | Unbekannt | |
PycURL | PycURL, Python interface for cURL, might be good or bad news, syndication related or SEO related, or a crawler. () | Unbekannt | |
Python-urllib | Python-urllib is used by the Python high level language to "open arbitrary resources by URL" () | Unbekannt | |
robotgenius | Robotgenius is supposed to monitor company PCs over the www. () | Unbekannt | |
ScoutJet | ScoutJet is the web crawler for blekko, a new Silicon Valley based search engine. Seems ok, with interesting leaders. However, this search engine has been down for months, and this raises their RR () | Unbekannt | |
Second Street | Second Street Research @ Shawcable.net proxy, AB, CA | Unbekannt | |
SISTRIX | Sistrix is a German based private SEO engine. It crawls at a very high speed and triggers the Beamreactor anti flood protection. () | Unbekannt | |
sitecheck | Sitecheck () | Unbekannt | |
SBider | Sitesell statistics robot () | Unbekannt | |
Syntryx | Syntryx ANT Scout Chassis Pheromone () | Unbekannt | |
T-H-U-N-D-E-R-S-T-O-N-E | T-H-U-N-D-E-R-S-T-O-N-E is a free web crawler. Also refers to 'webinator'. Might or might not be dangerous, depending on the use script kiddies make of it. () | Unbekannt | |
T312461 | T312461 UNKNOWN BOT @ Corex technologies | Unbekannt | |
TuringOS | TuringOS anonymizer | Unbekannt | |
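
A table like the one above can be used to tag incoming requests by user agent. The following is only a minimal sketch, not how any particular product applies the list: it assumes that each pattern is matched as a case-insensitive substring of the User-Agent header and that the first matching row wins. The names `SIGNATURES` and `classify_user_agent` are hypothetical, and the rows are a small hand-picked subset of the table.

```python
# Hypothetical subset of the table: (pattern, category), in table order.
# Assumption: patterns are case-insensitive substrings of the User-Agent header.
SIGNATURES = [
    ("googlebot", "Suchmaschine"),
    ("bingbot", "Suchmaschine"),
    ("GPTBot", "KI-Crawler"),
    ("libwww-perl", "Scraper"),
    ("fimap", "Malware"),
    ("Pingdom", "Legitimate"),
    ("larbin", "Unbekannt"),
]


def classify_user_agent(user_agent, signatures=SIGNATURES):
    """Return the category of the first pattern found in user_agent, or None."""
    ua = user_agent.lower()
    for pattern, category in signatures:
        if pattern.lower() in ua:
            return category
    return None


print(classify_user_agent("libwww-perl/6.05"))  # Scraper
print(classify_user_agent("Mozilla/5.0 (X11; Linux x86_64) Firefox/115.0"))  # None
```

Because the first match wins, row order matters wherever patterns overlap: `Linguee Bot` and `linguee`, for example, appear under two different categories. Very short patterns such as `fast` or `Moo` would also hit many unrelated user agents under plain substring matching, so a real deployment would probably need stricter rules for those.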