Oct 27, 2006

php version tracker

php version tracker (http://www.nexen.net/phpversion/bot.php)
tabarnak.nexen.net 217.174.203.41

PHP version tracker runs continuously. It doesn't request anything more than HEAD from the web site (i.e., http://www.nexen.net/, GET /), and does not recurse into folders. It runs from different IPs, which are not stable.



Everyone knows the famous PHP phpinfo(), which provides the programmer with invaluable information about his server configuration and setup. This is a useful tool as soon as one gets a new server, and it is also a tool for talking with any administrator.

Yet, after use, it is usually recommended to remove it, or to restrict access to a few people. Indeed, phpinfo may be dangerous by itself: in the past, it was even vulnerable to XSS injection. Even when secured, phpinfo() publishes information about your architecture, and it is always recommended to keep it from prying eyes.

Sadly enough, the common habit to set up a phpinfo page on every web site is now so widely spread that even search engines are starting to pick them up : there are literally thousands of phpinfo indexed on Yahoo and Google. Just hit a search with the words 'phpinfo()' 'GoogleBot' and



So this bot reads the phpinfo script that you forgot to remove and compiles all your server's info.

customhostingservers.com guestbook spammers

Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)
1530-29-1.customhostingservers.com 216.32.76.202

Looks like this is a heavy guestbook spammer.

The website has nothing but an image file that gives an address to report abuse. Is this a real hosting company? If so, why doesn't it have a real website?

I am banning this domain based on the guestbook spamming I am seeing on other sites.

schibstedsokbot Just what is this

schibstedsokbot (compatible; mozilla/5.0; msie 5.0; fast freshcrawler 6; +http://www.schibstedsok.no/bot/)
sch-fast-se-crawl02.osl.basefarm.net 81.93.168.72
sch-fast-se-crawl04.osl.basefarm.net 81.93.168.74


The website this bot says it comes from doesn't exist, so what is this bot?

Also basefarm.net doesn't have a website.

Oct 26, 2006

clarissa.empyreum.com bot

silk/1.0
clarissa.empyreum.com 194.213.194.206

A look at the site's webpage at empyreum.com says:

Typical performance of the Crawler technology is 1 - 5 thousands documents (www pages) per second on a single conventional server with UNIX operation system.

Some internet resources perform real-time user agent behaviour analysis in order to identify automated crawler systems based on request delay measuring and / or traversing method detection.

EMPYREUM Crawler technology therefore supports many advanced techniques including virtual user emulation allowing customer to monitor selected public information silently



That's really something to brag about: a bot that breaks into your site and pretends to be a browser. Nothing on this site says why they are spidering me, so this domain is added to the domain ban.

Also I have not yet found out where silk comes from but it may also be banned.

209-103-237-195.excel.net spambot?

mozilla/4.0 (compatible ; msie 6.0; windows nt 5.1)
209.103.237.195 209-103-237-195.excel.net

This bot fell right into the bot traps. It's also blocked by BB.
Excel.net is an ISP in Wisconsin, serving Sheboygan and Plymouth.
So this looks like a DSL line.

Have to watch this one..

pdxg1n-o.webtrends.com What is it? Scraper?

400 Required header 'Accept' missing
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; .NET CLR 1.0.3705; .NET CLR 1.1.4322)
63.88.212.164 pdxg1n-o.webtrends.com

The Webtrends robot fakes a useragent and doesn't send the proper headers, so it cannot pass Bad Behavior scanning.

From looking at the site, it says nothing about a cloaked robot. So what are they doing on my site?

For now, based on all the text on the website about helping you get a top position, it looks like they may be scraping other sites to get content to raise you in the listings. This would make them a scraper. If that's not what's going on, someone explain it.

Why do they hide what they are doing?

Amazon bots don't get the point

nutchec2test/nutch-0.9-dev (testing nutch on amazon ec2.; http://lucene.apache.org/nutch/bot.html; ec2test at lucene.com)
216.182.236.46 domU-12-31-34-00-00-6A.usma2.compute.amazonaws.com

Now look here, Amazon: you're not going to get into my site or others using that crappy bot. You need to create your own useragent with a link to a page explaining what you are doing on my site. Otherwise you will get a brick wall.

Oct 21, 2006

zippy2.cs.cornell.edu abuse

wget/1.8.2
128.84.97.129 zippy2.cs.cornell.edu

So cornell.edu, why are you trying to download my entire website?
GET LOST!

Edacious & Intelligent Web Robot (what a joke)

Mozilla/4.0 (compatible; EDI/1.6.6; Edacious & Intelligent Web Robot; Daum Communications Corp.- Korea)
222.231.42.12

This bot can't even send the proper headers.
The website doesn't say anything about a bot.

Oct 17, 2006

mozilla/4.0 (compatible; msie 6.0; windows nt)

mozilla/4.0 (compatible; msie 6.0; windows nt) ::elnsb50::000061100320025802a00111000000000507000900000000
83.167.112.156

mozilla/4.0 (compatible; msie 6.0; windows nt) ::elnsb50::000061100320025802a00111000000000507000900000000
66.110.119.170 gauntlet.angolatelecom.com

mozilla/4.0 (compatible; msie 6.0; windows nt) ::elnsb50::000061100320025802a00111000000000507000900000000
219.153.13.45


Agent: mozilla/4.0 (compatible; msie 6.0; windows nt) ::elnsb50::000061100320025802a00111000000000507000900000000
83.167.112.156

mozilla/4.0 (compatible; msie 6.0; windows nt) ::elnsb50::000061100320025802a00111000000000507000900000000
201.21.208.95 C915D05F.poa.virtua.com.br

The above bot came in claiming to be Windows NT. Microsoft says no such useragent was ever built. When they were blocked, they tried again using several other IPs, thinking it was an IP ban.

gauntlet.angolatelecom.com is an open proxy.
virtua.com.br is identified with guestbook spamming.

topicblogs/0.9 Email harvester or scraper?

topicblogs/0.9
72.36.205.226 226.205.36.72.reverse.layeredtech.com

topicblogs has no real working website, just an email collection system to trick you into giving them your email address.

We throw this into our list of fake startup companies running scraper and email harvester bots. If it was a valid bot it would have a link to a real working webpage.


Also layeredtech.com is banned for abusive bots.

Oct 14, 2006

tellusioncrawler/1.0 bot

tellusioncrawler/1.0
wildebeest.gnoos.com.au 64.34.161.44

http://www.gnoos.com.au/ looks to be some type of search system for blogs.

Oct 13, 2006

Internet Explorer 6 (MSIE 6; Windows XP) abusive bot

Internet Explorer 6 (MSIE 6; Windows XP)

The above useragent was sent by Web Scrapper +
I have tested this software and it created the above fake invalid useragent.
BB stops it dead.

Oct 12, 2006

vodka.fark.com bot

Prohibited header 'Range' present

FARK.com link verifier; see http://www.fark.com/farq/tech.shtml
207.58.150.113 vodka.fark.com

Looks like more drunk on vodka bots this time from fark.com

I don't use fark, but its link checker will never verify anything because its improper headers are blocked by BB.

dedicatedcentral.com abuse

mozilla/4.0 (compatible; msie 6.0; windows nt 5.1)
216.139.224.70 ip-d88be046.dedicatedcentral.com

The above bot fell right into the bot trap and then tried to scan the other sites.

On the 15th failure it tried to submit an order for a giveaway I am running.
How strange. It must be a spider with a human monitoring it.

Actual name and address removed, but the address was in Boston.
spam=[+city=Boston&+state=OK&+zip=02116&]

dedicatedcentral.com has no A record or website. So what is it?
| Domain Name: dedicatedcentral.com
| Created on .............Tue Jul 22 14:27:43 2003
| Expires on .............Fri Jul 22 14:27:43 2011
| Record last updated on .Tue Sep 26 17:15:13 2006
| Status .................LOCK
|
| Administrative Contact:
| DomainPeople, Inc.
| Dom Reg
| 200-550 Burrard Street
|
| Vancouver, BC
| V6C2B5, CA
| (604)6391680
| ()
| hostway-cdm@domainpeople.com
|
| Technical Contact:
| DomainPeople, Inc.
| Dom Reg
| 200-550 Burrard Street
|
| Vancouver, BC
| V6C2B5, CA
| (604)6391680
| ()
| hostway-cdm@domainpeople.com
|
| Domain servers in listed order:
| a.ns.dedicatedcentral.com 216.139.254.200
| b.ns.dedicatedcentral.com 216.139.223.11
|
| (dedicatedcentral.com)


The IP is from Texas.

OrgName: SouthWeb Ventures
OrgID: SOUTH-29
Address: 501 Waller St
City: Austin
StateProv: TX
PostalCode: 78702
Country: US

NetRange: 216.139.208.0 - 216.139.255.255
CIDR: 216.139.208.0/20, 216.139.224.0/19
NetName: SOUTHWEB-AUSTIN
NetHandle: NET-216-139-208-0-1
Parent: NET-216-0-0-0-0
NetType: Direct Allocation
NameServer: A.NS.SOUTHWEBVENTURES.COM
NameServer: B.NS.SOUTHWEBVENTURES.COM
Comment: ADDRESSES WITHIN THIS BLOCK ARE NON-PORTABLE
RegDate: 2000-06-28
Updated: 2005-10-25

OrgAbuseHandle: ABUSE548-ARIN
OrgAbuseName: Abuse
OrgAbusePhone: +1-512-469-9939
OrgAbuseEmail: abuse@southwebventures.com

OrgTechHandle: NOC1501-ARIN
OrgTechName: NOC
OrgTechPhone: +1-512-469-9939
OrgTechEmail: noc@southwebventures.com
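
As a quick sanity check on the whois output above, here is a small sketch using Python's standard ipaddress module. It rebuilds the CIDR blocks from the listed NetRange and confirms that the bot's IP falls inside them. The helper name in_southweb_range is just something made up for this example.

```python
import ipaddress

# NetRange copied from the whois record above.
first = ipaddress.ip_address("216.139.208.0")
last = ipaddress.ip_address("216.139.255.255")

# Collapse the range into the smallest list of CIDR blocks.
blocks = list(ipaddress.summarize_address_range(first, last))


def in_southweb_range(ip):
    """Return True if the IP sits inside any block of the range (hypothetical helper)."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in blocks)


print([str(b) for b in blocks])
print(in_southweb_range("216.139.224.70"))
```

The same check works for any range you want to put in a server-level IP ban.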


The southwebventures.com website is blank.

Because of all the above and no clear idea what this is (is it an ISP? It can't be with no website), the domain will be added as an abusive bot.


dedicatedcentral.com,Unknown abusive bots

Oct 11, 2006

Guestbooks a thing of the past?

With all the spam, guestbooks have become a thing of the past, because unless you are going to moderate them they will fill with thousands of spam and trackback messages.

Once your system fills up with the spam, your webspace will be used up and your bandwidth will go way up as others load the huge pages of spam.
Once the search systems see you linking to this spam, your page will go down in the ratings, as you will be identified with the spammers. Even if you moderate, if the spam stays on your site for more than a day your domain could be damaged by it.

I switched off all my guestbooks and converted to WordPress because it has spam plugins that are mostly automated. This still allows users to comment but doesn't take up all my time watching for spam.

Suggest WordPress with Spam Karma and Bad Behavior.

Just post a message for your new guestbook and link to that post from your old guestbook link.


It would be nice if someone would write a new guestbook program that could work with the WordPress anti-spam plugins, but until they do I recommend everyone shut down their guestbooks and convert to WP.

Having these old guestbooks up is encouraging all the guestbook spammers.

Union injection hackers

A hacker not yet noted on your blog tried to hack my site last week by trying to inject code into my PHP page. If you are interested, here are their IPs and domain name: 66.110.9.76 89.108.91.144 202.8.85.44, domain icezinhu.by.ru. They tried the injection with this text command file: http://icezinhu.by.ru/ice.txt


I checked the url and the hack file no longer exists. If you have not already installed M&M Autoban you should because it will let you scan for all of those hacks.

I am adding the domain by.ru as a hacker website. Here is the current list of sites hosting the injection scripts. For those who don't understand: they post a union injection into your script with the URL of a text file to run. Your poorly written script will then run that file, and they will get full access to your server.

void.ru
paupal.info
expl0itz.com
echo.or.id
200.72.130.29
persiangig.com
fullcrew.net
paginas.aol.com.br
shikoe.net
by.ru

M&M Autoban scans all the POST and GET data strings looking for anything that might be an injection.
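
To give an idea of what that kind of scan looks like, here is a rough Python sketch. This is not M&M Autoban's actual code. The two patterns and the function name suspicious_params are assumptions picked for this example, based on the attack described above (a remote URL passed in a parameter) plus a plain "union ... select" check.

```python
import re
from urllib.parse import parse_qsl

# Illustrative patterns only; a real scanner would carry a longer list.
SUSPICIOUS = [
    re.compile(r"(?:https?|ftp)://", re.I),             # remote file passed as a value
    re.compile(r"\bunion\b.+\bselect\b", re.I | re.S),  # SQL union injection
]


def suspicious_params(query_string):
    """Return the names of parameters whose decoded value matches a pattern."""
    hits = []
    for name, value in parse_qsl(query_string, keep_blank_values=True):
        if any(p.search(value) for p in SUSPICIOUS):
            hits.append(name)
    return hits


print(suspicious_params("page=http%3A%2F%2Ficezinhu.by.ru%2Fice.txt%3F&id=5"))
```

A check this blunt will also flag honest URL fields, such as a website box on a comment form, so those parameter names would need to be exempted.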

ns.metalsusa.com abuse

Mozilla/4.0 (compatible; MSIE 5.0; Windows NT)
63.123.84.196 ns.metalsusa.com

Why are metal roofing companies like metalsusa.com sending bots to our sites?


Microsoft useragents explained

As per the Microsoft website, they have never used a useragent that identified itself as Windows NT without a version number after the NT, so any useragent with this in it is fake. So we will now start looking for fake agents containing the following.

Windows NT) or Windows NT;

If anyone has anything else on this please post it.
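
Here is a minimal sketch of that check in Python. The function name has_versionless_nt is made up for this example, and allowing optional whitespace before the closing character is my own assumption, not something taken from Microsoft.

```python
import re

# "Windows NT" followed directly by ")" or ";" means no version number was given.
NT_WITHOUT_VERSION = re.compile(r"windows nt\s*[);]", re.I)


def has_versionless_nt(user_agent):
    """Return True if the useragent claims Windows NT with no version."""
    return bool(NT_WITHOUT_VERSION.search(user_agent))


print(has_versionless_nt("Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)"))
print(has_versionless_nt("Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"))
```

The first agent is flagged and the second is not, since "Windows NT 5.1" carries a version.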

Oct 10, 2006

dabworx.com abuse

Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)
66.98.212.79 dabworx.com

The above bot is a guestbook spammer.


The useragent is seen a lot from bots, and I am beginning to wonder if any real browsers even use this useragent.

net::trackback/1.01 abuse

net::trackback/1.01
209-9-169-114.sdsl.cais.net 209.9.169.114
209-9-169-124.sdsl.cais.net 209.9.169.124

Clearly some type of scraper tool in HK.

Added to ban list.

What is sdsl.cais.net?
What is cais.net?

I think the entire domain needs to be banned.

PCCW-HKT DataCom Services Limited
39/F PCCW Tower, Taikoo Place
979 King's Rd
Quarry Bay, Hong Kong 0
HK

xbox.dedi.inhoster.com / xbox.inhoster.com abuse

mozilla/4.0 (compatible; msie 6.0; windows nt 5.1)
85.255.117.226 85.255.117.226-xbox.dedi.inhoster.com


This xbox robot comes in posting to the contact forms, but not the blogs, with his trackback links. He doesn't understand the error messages he gets or that he is wasting his time; he just keeps hitting the contact form every few days or so.

There should really be no traffic from inhoster.com since it's not an ISP, so it's blocked.
Scrapers have also been reported on this domain.

Update: Trackback spam now pouring in using a trackback agent.

net::trackback/1.01
85.255.114.132 85.255.114.132-xbox.dedi.inhoster.com
net::trackback/1.01
85.255.114.131 85.255.114.131-xbox.dedi.inhoster.com
net::trackback/1.01
85.255.114.133 85.255.114.133-xbox.dedi.inhoster.com
net::trackback/1.01
85.255.114.134 85.255.114.134-xbox.dedi.inhoster.com

net::trackback/1.01
85.255.113.78 85.255.113.78-xbox.inhoster.com
net::trackback/1.01
85.255.113.77 85.255.113.77-xbox.inhoster.com
net::trackback/1.01
85.255.113.76 85.255.113.76-xbox.inhoster.com


Also recommend adding these IPs to your server's IP ban.

inhoster.com,Used by scraper robots - Trackback Spam
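
For anyone wondering how a domain ban line like the one above gets applied, here is a small sketch, assuming the file is one "domain,reason" pair per line and that the ban covers the domain and every subdomain under it. The function names are made up for this example; it is not the code of any particular plugin.

```python
# Entries copied from the ban lines posted on this blog.
BAN_LINES = [
    "inhoster.com,Used by scraper robots - Trackback Spam",
    "dedicatedcentral.com,Unknown abusive bots",
    "reverse.layeredtech.com,unknown bots",
]


def load_bans(lines):
    """Turn 'domain,reason' lines into a dict keyed by lowercase domain."""
    bans = {}
    for line in lines:
        domain, _, reason = line.partition(",")
        if domain.strip():
            bans[domain.strip().lower()] = reason.strip()
    return bans


def ban_reason(hostname, bans):
    """Return the ban reason if hostname is a banned domain or a subdomain of one."""
    host = hostname.lower().rstrip(".")
    for domain, reason in bans.items():
        # Match on a dot boundary so "notinhoster.com" is not caught.
        if host == domain or host.endswith("." + domain):
            return reason
    return None


bans = load_bans(BAN_LINES)
print(ban_reason("85.255.114.132-xbox.dedi.inhoster.com", bans))
```

Keep in mind the hostname comes from a reverse lookup, which is set by whoever controls the IP block, so a domain ban works best alongside an IP ban.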

random useragents from mail.midlandsteel.com

da4pxbtx4dbp iun xgt hifgbggalg4gcq
mail.midlandsteel.com 216.226.39.146

We see this a lot: random useragents created by spam bots.
It's not clear why the mail server from midlandsteel.com is visiting our site with one, but I thought I would post a notice about it.

For the most part random useragents are now under control and can be blocked by BB.

bb.answers.com 64.34.162.210 abuse bot

mozilla/4.0 (compatible; msie 5.5; windows nt 5.0)
bb.answers.com 64.34.162.210

Answers.com is sending out a robot with a FAKE useragent.

Nothing on the site explains why they are doing this or why they must fake a useragent.

Oct 9, 2006

ev1servers.net abuse

More info on the Evel Server

This domain was banned some time ago for sending out abusive bots. It's not clear if they are spam bots or trackbots. They have also been known to fake the googlebot.

This domain is used by servers, not users, so everything from this domain is a bot and not a browser.

Well, today they started hitting a blog again and went nuts when blocked. They tried several IPs but never got in.

Abuse from this domain is very active.

mozilla/4.0 (compatible; msie 5.01; windows nt 5.0)
209.85.54.147 ev1s-209-85-54-147.ev1servers.net
mozilla/4.0 (compatible; msie 5.01; windows nt 5.0)
209.85.54.132 ev1s-209-85-54-132.ev1servers.net
mozilla/4.0 (compatible; msie 5.01; windows nt 5.0)
209.85.54.148 ev1s-209-85-54-148.ev1servers.net
mozilla/4.0 (compatible; msie 5.01; windows nt 5.0)
209.85.54.145 ev1s-209-85-54-145.ev1servers.net
mozilla/4.0 (compatible; msie 5.01; windows nt 5.0)
209.85.54.143 ev1s-209-85-54-143.ev1servers.net
mozilla/4.0 (compatible; msie 5.01; windows nt 5.0)
209.85.54.140 ev1s-209-85-54-140.ev1servers.net
mozilla/4.0 (compatible; msie 5.01; windows nt 5.0)
209.85.54.136 ev1s-209-85-54-136.ev1servers.net
mozilla/4.0 (compatible; msie 5.01; windows nt 5.0)
209.85.54.148 ev1s-209-85-54-148.ev1servers.net
mozilla/4.0 (compatible; msie 5.01; windows nt 5.0)
209.85.54.142 ev1s-209-85-54-142.ev1servers.net

Update:

User-Agent: User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)
67.15.119.25 ev1s-67-15-119-25.ev1servers.net

In the latest hits above, the bots have gone nuts, inserting User-Agent: twice in front of the fake agent.

a Better Bot Trap that can not be detected.

I have been using the bot traps described by others only to find that the bots are avoiding them by scanning the robots file.

The solution is to unlist your robot trap. Oh yeah, you say, good bots will just fall into it. Well, that's OK: we just ignore the major search systems and only scan for useragents starting with mozilla. Once we ignore the major search systems and other bots known to use mozilla, everything that is left will be a bot faking a web browser and we can ban it.
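
The decision for a hit on the unlisted trap can be sketched like this in Python. The allowed token list here is only an illustration and an assumption on my part; you would fill in whichever search systems you want to let through. The function name trap_verdict is made up for this example.

```python
# Illustrative list of search-engine tokens that appear in mozilla-style agents.
ALLOWED_TOKENS = ("googlebot", "yahoo! slurp")


def trap_verdict(user_agent):
    """Decide what to do with a visitor that requested the unlisted trap URL."""
    ua = user_agent.strip().lower()
    if not ua.startswith("mozilla"):
        return "ignore"  # self-declared bots are not what this trap is for
    if any(token in ua for token in ALLOWED_TOKENS):
        return "ignore"  # major search system that happens to use mozilla
    return "ban"         # claims to be a browser, but a browser never finds the trap


print(trap_verdict("mozilla/4.0 (compatible; msie 6.0; windows nt 5.1)"))
```

Since the tokens are only text in the useragent, a bot can fake them, so this works best combined with a check of where the hit came from.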

I have a beta test of the unlisted bot trap up and running and I will soon see what it does.

If this still doesn't work then bots must be avoiding the traps in other ways.

Update on this: no one has fallen into the new bot trap. This tells me that the bots are not really spidering from image links; they must be following in links from Google.
Going to switch to using text links to the trap and see what I catch.

http: user-agent = mozilla/4.0 Fake useragent

http: user-agent = mozilla/4.0 (compatible; msie 6.0; windows nt 5.1; .net clr 1.0.3
65.54.225.173

This is a strange one. The useragent has http: and user-agent = in it, so it's clearly an attempt to fake a useragent.

Stranger is that the IP is a MS IP. Making it look more and more like MS IPs are being used to send spam.


Adding this check phrase to UA-Anywhere ban list.
user-agent =,unknown bot
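
For those who have not used this kind of list, here is a rough sketch of how a line like the one above could be applied, assuming the format is "phrase,reason" and the phrase is matched anywhere in the useragent, ignoring case. The function names are made up for this example.

```python
UA_ANYWHERE_LINES = ["user-agent =,unknown bot"]


def load_phrases(lines):
    """Split each line at its last comma into (phrase, reason)."""
    phrases = []
    for line in lines:
        phrase, _, reason = line.rpartition(",")
        if phrase:
            phrases.append((phrase.lower(), reason.strip()))
    return phrases


def ua_ban_reason(user_agent, phrases):
    """Return the reason for the first phrase found anywhere in the useragent."""
    ua = user_agent.lower()
    for phrase, reason in phrases:
        if phrase in ua:
            return reason
    return None


phrases = load_phrases(UA_ANYWHERE_LINES)
print(ua_ban_reason("http: user-agent = mozilla/4.0 (compatible; msie 6.0)", phrases))
```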

hn.kd.dhcp abuse

mozilla/4.0 (compatible; msie 6.0; windows nt 5.0)
hn.kd.dhcp 61.54.11.169

This is not a real domain, but if you see it, it's a giveaway for spam.
Add this to your domain ban.

reverse.layeredtech.com snoopy v1.2.3 Abuse

POST HTTP/1.0
snoopy v1.2.3
72.232.60.162 162.60.232.72.reverse.layeredtech.com

This bot came in using snoopy and tried to post 6 loads of spam to the blog.
A legit snoopy agent would never post to your blog. Because the latest spam bots have started using the snoopy useragent, it is now banned.


layeredtech.com is not an ISP; it's a hosting company where abusers are setting up bots to abuse websites. The domain needs to be added to the domain ban list.


reverse.layeredtech.com,unknown bots

IconSurf/2.0 favicon monitor (abuse)

400 Prohibited header 'Range'
IconSurf/2.0 favicon monitor (see http://iconsurf.com/robot.html)
12.146.74.139 xoba.com

The above robot is blocked by Bad Behavior due to improper headers.

I suggest that the useragent and domain name also be added to the block list.

This website indexes your .ico files and then hotlinks to them from its site. Resulting hotlinking will run up your bandwidth.

If you have not done so already suggest adding .ico to any hotlink protection system you are running. If you are not running one you really need to.


The site says they will honor the robots file, so you should add a line to that as well.

User-agent: IconSurf
Disallow: /
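
If you want to check what those two lines do before you upload them, Python's standard urllib.robotparser can read them. This sketch only shows how a well-behaved parser reads the rule; it says nothing about whether IconSurf really obeys it. The second agent name is made up for the test.

```python
from urllib.robotparser import RobotFileParser

# The two robots.txt lines suggested above.
rules = [
    "User-agent: IconSurf",
    "Disallow: /",
]

parser = RobotFileParser()
parser.parse(rules)

# IconSurf is shut out of everything, including the favicon.
iconsurf_allowed = parser.can_fetch("IconSurf/2.0 favicon monitor", "/favicon.ico")

# An agent with no matching rule is still allowed.
other_allowed = parser.can_fetch("SomeOtherBot/1.0", "/favicon.ico")

print(iconsurf_allowed, other_allowed)
```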

Oct 7, 2006

security-lab1.juniper.net abuse

python-urllib/1.16
208.223.208.181 security-lab1.juniper.net

The above bot fell into a bot trap. In order to do this it had to violate the rules in the robots.txt file. It is not clear why www.juniper.net is sending out a bot.

I have written to them and am waiting for a reply on what this thing is and why it was caught in an improper crawl.

In the meantime python-urllib/1.16 is added as an email harvester, since that's what I find about that useragent on other sites.

203.117.201.35 mx1.khattarholdings.com abuse

Missigua Locator 1.9
203.117.201.35 mx1.khattarholdings.com

The locator agent is thought to be an email harvester. However, this IP is connected with guestbook spam. The domain has no website and is from Singapore.
Registrant:
KHATTAR HOLDINGS PRIVATE LIMITED (KHATTARH272)
80 RAFFLES PLACE
#25-01 UOB PLAZA 1
SINGAPORE, , 048624
SG


Domain added to domain ban

User-Agent: Mozilla/4.0 abuse

What do you think of this one? Is it bad robot?
IP: 72.13.32.7
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)


If you see a useragent with User-Agent: in the agent, then it's a known robot.
We are not sure what its doing.

Oct 4, 2006

sdcresearchlabs-testbot abuse

sdcresearchlabs-testbot/0.8-dev (www.shopping.com/bot.html; http://lucene.apache.org/nutch/bot.html; researchbot@shopping.com)
72.5.173.21

As part of our ongoing efforts to improve the buying experience for shoppers online, Shopping.com is experimenting with new ways to collect and aggregate data through web crawling. At this point we do not plan on integrating inventory from our web crawling index with inventory


Say what! You pirate our content and bandwidth for your own internal test and don't even plan to index the data on your website. Now that's what you call abuse.

Why on earth should we allow you to do that?

Shopping.com is a pay site: you have to pay to get listed, unlike on Google's Froogle where it's free. Stay off my site unless you want to pay me or actually list the info on your site!

To start with, you're using Nutch, a free package you downloaded. Nutch is banned from all my sites.

And you were not even scanning my store; you were scanning a local blog. How lame is that?

How often will sdcresearchlabs-testbot access my web pages?
For most sites, sdcresearchlabs-testbot shouldn't access your site more than once every few seconds on average.


What the ________? ONCE every FEW SECONDS? Do you know how much bandwidth that is?



They say you should add "sdcresearchlabs-testbot" to your robots file. I am going to try that and see what it does.

Oct 2, 2006

net-sweeper.com is content filtering remove from domain ban

When this one was added, the robot on this domain was caught in DoS attacks.
See here. It now looks like the bot was just broken and this is some type of content filtering for schools and businesses. So it is being removed from the ban list; adjust yours.

Also see this list of bad bots, where it is listed as not looking at the robots file. I will be monitoring to see if it falls into a bot trap.


Remove from domain ban file
net-sweeper.com,Hides what it is was caught in DOS atacks.

vodka.ietsmetinternet.nl abuse bot drunk on vodka

66.232.113.20 vodka.ietsmetinternet.nl

The above bot has started scanning with no useragent. It is not clear what it's trying to do, and it keeps coming back even though it gets a blocked screen. Must be drunk on vodka.

A search for this only shows this subdomain hitting other sites. The main website gives a file not found page.

The domain is added to the domain ban

ietsmetinternet.nl,unknown drunk on vodka bot.

ns1.input-box.com abuse

'User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)'
81.177.34.184 ns1.input-box.com

This is a bot using a known useragent.

input-box.com has a blank page for a website so the entire domain is banned.


added to domain ban

input-box.com,Unknown bots

spokane.dailydns.com Blog abuse SPAM

snoopy v1.2.3
69.16.221.13 spokane.dailydns.com

This is a blog spammer using the useragent snoopy v1.2.3. We ban this useragent.
If you surf to the subdomain you will see a website that is not set up.
A Google search finds nothing on this subdomain. However, spam has been seen from other subdomains and the main domain is not really working, so I am banning it and all subdomains.


added to the domain ban.

dailydns.com,Blog spam website is down