Is this a scraper or just an annoying spider?

November 29, 2005 at 4:31 pm #591330

Anonymous
Inactive

As I type this, one of my new sites is being hit once every minute by the following IP address:

64.5.245.23

It hits a different page each time, always at exactly 48 seconds past the minute. Here is a snippet of the hits to my site in the last 5 minutes:

“page 1” Tue 11/29 – 11:11:48 am 64.5.245.23
“page 2” Tue 11/29 – 11:12:48 am 64.5.245.23
“page 3” Tue 11/29 – 11:14:48 am 64.5.245.23
“page 4” Tue 11/29 – 11:15:48 am 64.5.245.23
“page 5” Tue 11/29 – 11:16:48 am 64.5.245.23
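
For anyone who wants to check a log for the same pattern, here is a minimal Python sketch. It only uses the five timestamps shown above; the function names and the 0.9 threshold are assumptions made for illustration, not something from the original stats tool. The idea is simply that a human visitor's hits land on scattered seconds, while a timer-driven client keeps landing on the same one.

```python
from collections import Counter
from datetime import datetime

# Hit times copied from the log excerpt above (one IP, 64.5.245.23).
HITS = ["11:11:48", "11:12:48", "11:14:48", "11:15:48", "11:16:48"]


def dominant_second(times):
    """Return (second, share): the most common seconds value and its share of hits."""
    seconds = [datetime.strptime(t, "%H:%M:%S").second for t in times]
    second, count = Counter(seconds).most_common(1)[0]
    return second, count / len(seconds)


def looks_scheduled(times, threshold=0.9, min_hits=3):
    """Heuristic only: most hits on the same second suggests a timer-driven client.

    threshold and min_hits are arbitrary illustrative values.
    """
    if len(times) < min_hits:
        return False
    _, share = dominant_second(times)
    return share >= threshold


if __name__ == "__main__":
    print(dominant_second(HITS))
    print(looks_scheduled(HITS))
```

This says nothing about whether the client is a scraper or a search engine spider; it only confirms the hits are automated.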

Whois shows this, but I don’t know what to do with the information:

64.5.245.23

Blacklist Status: Clear
Cached Whois: Cached today
Whois History: 5 records stored
Oldest: 2005-11-24
Newest: 2005-11-29
Record Type: IP Address
IP Location: Canada – Nova Scotia – Halifax – It Interactive
Reverse IP: No websites hosted using this IP address
Reverse DNS: h64-5-245-23.gtcust.grouptelecom.net

GT Group Telecom Services Corp. GROUPTELECOM-BLK-5 (NET-64-5-192-0-1)
64.5.192.0 – 64.5.255.255
IT Interactive GT-64-5-245-0-CX (NET-64-5-245-0-1)
64.5.245.0 – 64.5.245.63

Does anyone know what is going on here? I don’t know why this thing would be crawling through every single page like this… It’s been chewing up my bandwidth for almost 18 hours now. :angry:

Any thoughts?

November 29, 2005 at 4:42 pm #677478

Anonymous
Inactive

If it were me… I’d block the IP.

November 29, 2005 at 4:49 pm #677479

Anonymous
Inactive

Hmmm… Do you do that by adding a rule to the .htaccess file? Or to the robots.txt file?

November 29, 2005 at 5:14 pm #677481

Anonymous
Inactive

Okay, I blocked it with .htaccess, and it worked. The hits have stopped. :xmas:

Here’s the code I added to my .htaccess file, if anyone is interested:

order allow,deny
deny from 64.5.245.23
allow from all

November 29, 2005 at 5:17 pm #677482

Anonymous
Inactive

Perfect. I wasn’t sure which one to use, but figured .htaccess would block a bot as well (if it only used 1 IP address). Anyone know for sure?
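
On the .htaccess versus robots.txt question: a deny rule in .htaccess is enforced by the web server, while robots.txt is only a request that the crawler chooses to honor, and it matches on user agent names rather than IP addresses. The Python sketch below shows the crawler's side of that check using the standard library's `urllib.robotparser`. The bot name `ExampleBot` and the URLs are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# A hypothetical robots.txt: ask "ExampleBot" to stay out, allow everyone else.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Disallow:
"""


def allowed(user_agent, url, robots_txt=ROBOTS_TXT):
    """Return True if robots_txt permits user_agent to fetch url.

    This is the check a polite crawler runs on itself before fetching.
    Nothing on the server enforces the answer.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(allowed("ExampleBot", "http://example.com/page1"))
    print(allowed("OtherBot", "http://example.com/page1"))
```

A crawler that never runs this check, or that sends a different user agent name, is unaffected by robots.txt, which is why blocking the IP in .htaccess stopped the hits right away.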

November 29, 2005 at 9:11 pm #677523

Anonymous
Guest

I was gonna say: are you sure it’s a bad thing before you block it?

I don’t know crapola about all this but seems to me it’d be a shame to find out later it was a SE spider.

… just a thought.

November 29, 2005 at 9:36 pm #677534

Anonymous
Inactive

Steve, you were right! I did a more detailed search, and it turns out that this was a spider from the GenieKnows search engine. GAAAHHH!!!

I wonder why the spider didn’t identify itself properly in my stats, and in the Whois information? If they would just name the thing “GenieKnows Bot” or something, this whole thing wouldn’t have happened. :sarcasm:

November 29, 2005 at 10:11 pm #677541

Anonymous
Inactive

This IP spidered my site too. Good to know if it is an SE bot.

Thanks guys!

December 1, 2005 at 8:29 pm #677695

Anonymous
Guest

Steve, you were right!

You know what they say: even a blind squirrel finds an acorn once in a while.

I’m pleasantly surprised to hear GenieKnows is spidering. I thought they were strictly a PPC engine (one which I have avoided after initially trying it out).

December 25, 2005 at 4:27 pm #679224

Anonymous
Inactive

I went ahead and banned this IP. It was using 1 GB a month in bandwidth and never left my site. It seems like this spider is not programmed correctly. I hope I did the right thing; it was a waste of bandwidth, though.

December 26, 2005 at 3:25 am #679237

Anonymous
Inactive

Engineer wrote:

If they would just name the thing “GenieKnows Bot” or something, this whole thing wouldn’t have happened. :sarcasm:

I could name myself “GenieKnows Bot” if I liked. User agent spoofing is easy. That’s why user agent based cloaking is a bad idea.

But, yeah, I see your point. I’d also think they’d be smart to identify their IP better. IP spoofing is a different game.
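
Since the user agent can't be trusted, one common way to check a crawler's identity from its IP is forward-confirmed reverse DNS: look up the hostname for the IP, check that it falls under the domain the crawler's operator uses, then resolve that hostname back and confirm it returns the same IP. Below is a minimal Python sketch. The function name and the `expected_suffix` parameter are assumptions for illustration, and the resolvers are passed in as arguments so the logic can be tried without network access.

```python
import socket


def verify_crawler_ip(ip, expected_suffix,
                      reverse=socket.gethostbyaddr,
                      forward=socket.gethostbyname_ex):
    """Forward-confirmed reverse DNS check (illustrative sketch).

    expected_suffix should start with a dot, e.g. ".crawler.example"
    (a hypothetical domain), so "evilcrawler.example" cannot match.
    """
    try:
        hostname = reverse(ip)[0]          # reverse lookup: IP -> hostname
    except OSError:
        return False
    if not hostname.lower().rstrip(".").endswith(expected_suffix.lower()):
        return False
    try:
        addresses = forward(hostname)[2]   # forward lookup: hostname -> IPs
    except OSError:
        return False
    return ip in addresses


if __name__ == "__main__":
    # Stand-in resolvers built from the Whois data earlier in this thread.
    def fake_reverse(ip):
        return ("h64-5-245-23.gtcust.grouptelecom.net", [], [ip])

    def fake_forward(host):
        return (host, [], ["64.5.245.23"])

    print(verify_crawler_ip("64.5.245.23", ".grouptelecom.net",
                            fake_reverse, fake_forward))
    print(verify_crawler_ip("64.5.245.23", ".crawler.example",
                            fake_reverse, fake_forward))
```

With the reverse DNS shown in this thread, the check can only confirm that the IP belongs to the ISP's customer range, not to any search engine, which is the identification problem described above.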