AuthorPosts
-
November 29, 2005 at 4:31 pm #591330AnonymousInactive
As I type this, one of my new sites is being hit once every minute by the following IP address: 64.5.245.23
It hits a different page at exactly the same time, at 48 seconds past the minute. Here is a snippet of the hits to my site in the last 5 minutes:
“page 1” Tue 11/29 – 11:11:48 am 64.5.245.23
“page 2” Tue 11/29 – 11:12:48 am 64.5.245.23
“page 3” Tue 11/29 – 11:14:48 am 64.5.245.23
“page 4” Tue 11/29 – 11:15:48 am 64.5.245.23
“page 5” Tue 11/29 – 11:16:48 am 64.5.245.23
Whois shows this, but I don’t know what to do with the information:
64.5.245.23
Blacklist Status: Clear
Cached Whois: Cached today
Whois History: 5 records stored
Oldest: 2005-11-24
Newest: 2005-11-29
Record Type: IP Address
IP Location: Canada – Nova Scotia – Halifax – It Interactive
Reverse IP: No websites hosted using this IP address
Reverse DNS: h64-5-245-23.gtcust.grouptelecom.net
GT Group Telecom Services Corp. GROUPTELECOM-BLK-5 (NET-64-5-192-0-1)
64.5.192.0 – 64.5.255.255
IT Interactive GT-64-5-245-0-CX (NET-64-5-245-0-1)
64.5.245.0 – 64.5.245.63
Does anyone know what is going on here? I don’t know why this thing would be crawling through every single page like this… It’s been chewing up my bandwidth for almost 18 hours now. :angry:
Any thoughts?
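For anyone wanting to check whether hits like these are scheduled rather than human, here is a minimal Python sketch. It only uses the five timestamps quoted in the post; the helper names `gaps_in_seconds` and `looks_scheduled` are made up for illustration, and the 60-second period is an assumption taken from the "once every minute" description.

```python
from datetime import datetime

# Timestamps copied from the hits listed in the post above.
HIT_TIMES = [
    "11:11:48 am",
    "11:12:48 am",
    "11:14:48 am",
    "11:15:48 am",
    "11:16:48 am",
]


def gaps_in_seconds(times):
    """Return the number of seconds between each pair of consecutive hits."""
    parsed = [datetime.strptime(t.upper(), "%I:%M:%S %p") for t in times]
    return [
        int((later - earlier).total_seconds())
        for earlier, later in zip(parsed, parsed[1:])
    ]


def looks_scheduled(times, period=60):
    """True if every gap is a positive whole multiple of `period` seconds."""
    return all(gap > 0 and gap % period == 0 for gap in gaps_in_seconds(times))


print(gaps_in_seconds(HIT_TIMES))
print(looks_scheduled(HIT_TIMES))
```

Every gap in the snippet is a whole number of minutes, which is what a timer-driven crawler would produce and what a human visitor almost never does.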
November 29, 2005 at 4:42 pm #677478AnonymousInactive
If it were me… I’d block the IP.
November 29, 2005 at 4:49 pm #677479AnonymousInactive
Hmmm… Do you do that by adding a rule to the .htaccess file? Or to the robots.txt file?
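On the robots.txt side of that question: robots.txt rules are keyed by user agent name, not by IP address, and they only work if the crawler chooses to obey them. A rough sketch with Python's standard `urllib.robotparser` shows the matching; the bot names `ExampleBot` and `OtherBot` and the path `/page-1` are hypothetical, not anything taken from the logs above.

```python
from urllib.robotparser import RobotFileParser

# A hypothetical robots.txt that asks one named crawler to stay away.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Rules are matched on the user agent string the crawler sends.
print(parser.can_fetch("ExampleBot", "/page-1"))
print(parser.can_fetch("OtherBot", "/page-1"))
```

A crawler that does not announce a recognizable name, like the one in this thread, cannot be targeted this way, which is one reason to reach for an IP rule instead.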
November 29, 2005 at 5:14 pm #677481AnonymousInactive
Okay, I blocked it with .htaccess, and it worked. The hits have stopped. :xmas:
Here’s the code I added to my .htaccess file, if anyone is interested:
[CODE]order allow,deny
deny from 64.5.245.23
allow from all[/CODE]
November 29, 2005 at 5:17 pm #677482AnonymousInactive
Perfect. I wasn’t sure which, but figured .htaccess would block a bot as well (if it only used 1 IP address). Anyone know for sure?
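On the "only 1 IP address" caveat: the Whois output earlier in the thread lists the customer range 64.5.245.0 – 64.5.245.63, so a crawler could show up from a neighbouring address. As a rough sketch, Python's standard `ipaddress` module can turn that range into CIDR form and test whether a given hit falls inside it. The helper name `in_whois_range` is made up for illustration.

```python
import ipaddress

# Start and end of the customer block shown in the Whois output.
FIRST = ipaddress.ip_address("64.5.245.0")
LAST = ipaddress.ip_address("64.5.245.63")

# Collapse the start/end pair into the smallest list of CIDR networks.
BLOCKS = list(ipaddress.summarize_address_range(FIRST, LAST))


def in_whois_range(ip):
    """True if `ip` belongs to any network covering the Whois range."""
    address = ipaddress.ip_address(ip)
    return any(address in block for block in BLOCKS)


print(BLOCKS)
print(in_whois_range("64.5.245.23"))
```

The range collapses to a single /26 block. Apache's deny from directive also accepts CIDR notation, so the same block could be denied in one line, at the cost of shutting out everyone else on that range.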
November 29, 2005 at 9:11 pm #677523AnonymousGuest
I was gonna say: are you sure it’s a bad thing before you block it?
I don’t know crapola about all this but seems to me it’d be a shame to find out later it was a SE spider.
… just a thought.
November 29, 2005 at 9:36 pm #677534AnonymousInactive
Steve, you were right! I did a more detailed search, and it turns out that this was a spider from the GenieKnows search engine. GAAAHHH!!!
I wonder why the spider didn’t identify itself properly in my stats, and in the Whois information? If they would just name the thing “GenieKnows Bot” or something, this whole thing wouldn’t have happened. :sarcasm:
November 29, 2005 at 10:11 pm #677541AnonymousInactive
This IP spidered my site too. Good to know it is an SE bot.
Thanks guys!
December 1, 2005 at 8:29 pm #677695AnonymousGuest
“Steve, you were right!”
You know what they say: even a blind squirrel finds an acorn once in a while.
I’m surprised … pleasantly, to hear GenieKnows is spidering. I thought they were strictly a PPC (one which I have avoided after initially trying out).
December 25, 2005 at 4:27 pm #679224AnonymousInactive
I went ahead and banned this IP. It was using 1 gig a month in bandwidth and never left my site. It seems like this spider is not programmed correctly. I hope I did the right thing, but it was a waste of bandwidth.
December 26, 2005 at 3:25 am #679237AnonymousInactive
Engineer wrote:
“If they would just name the thing “GenieKnows Bot” or something, this whole thing wouldn’t have happened. :sarcasm:”
I could name myself “GenieKnows Bot” if I liked. User agent spoofing is easy. That’s why user agent based cloaking is a bad idea.
But, yeah, I see your point. I’d also think they’d be smart to identify their IP better. IP spoofing is a different game.
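Since a user agent string can be faked, one common way to check a crawler is a forward-confirmed reverse DNS lookup: resolve the IP to a hostname, check the hostname's domain, then resolve the hostname back and confirm it returns the same IP. Below is a minimal Python sketch of that idea. The function name `verify_crawler` and the suffix `.crawler.example` are hypothetical, and it only helps when the operator publishes reverse DNS under its own domain, which was not the case for the IP in this thread (its reverse DNS pointed at the ISP).

```python
import socket


def verify_crawler(ip, allowed_suffixes,
                   reverse_lookup=socket.gethostbyaddr,
                   forward_lookup=socket.gethostbyname_ex):
    """Forward-confirmed reverse DNS check for a claimed crawler IP.

    The lookup functions are parameters so they can be swapped for
    fakes when testing without network access.
    """
    try:
        # Step 1: IP address to hostname (PTR record).
        hostname = reverse_lookup(ip)[0]
    except OSError:
        return False

    # Step 2: the hostname must sit under a domain we expect.
    if not hostname.lower().endswith(tuple(allowed_suffixes)):
        return False

    try:
        # Step 3: hostname back to IP addresses (A records).
        addresses = forward_lookup(hostname)[2]
    except OSError:
        return False

    # Step 4: the original IP must be among them.
    return ip in addresses


if __name__ == "__main__":
    # Hypothetical suffix; replace with the domain the crawler operator documents.
    print(verify_crawler("64.5.245.23", (".crawler.example",)))
```

The suffix check alone is not enough, because whoever controls an IP block controls its reverse DNS; the forward lookup in step 3 is what ties the hostname back to the real domain owner.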