Ravens PHP Scripts: Forums
 

 

Ravens PHP Scripts And Web Hosting Forum Index -> NukeSentinel(tm) v2.5.x

mercman
Regular
Joined: Nov 29, 2006
Posts: 64
Location: TN, USA

Posted: Wed Dec 20, 2006 4:39 pm

Hi all!
I am running Raven's 76 v2.02.02 distro with Sentinel v.2.5.03 and have been getting abuse-flood reports from Sentinel for what I believe are Google bot crawlers. See sample below.

Code:
Date & Time: 2006-12-19 21:29:59 EST GMT -0500
Blocked IP: 66.249.72.76
User ID: Anonymous (1)
Reason: Abuse-Flood
--------------------
User Agent: Mediapartners-Google/2.1
Query String: www.merc-man.net/nuke/modules.php?name=PHP-Nuke_HOWTO&page=change-buggy-php-nuke-theme.html
Get String: www.merc-man.net/nuke/modules.php?name=PHP-Nuke_HOWTO&page=change-buggy-php-nuke-theme.html
Post String: www.merc-man.net/nuke/modules.php
Forwarded For: none
Client IP: none
Remote Address: 66.249.72.76
Remote Port: 50191
Request Method: GET
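
When several of these emails pile up, it can help to pull the fields into a structure and compare user agents and addresses across reports. Below is a minimal sketch in Python, assuming the field labels are exactly the ones shown in this report (other NukeSentinel versions may word them differently). `parse_report` is a made-up helper name, not part of NukeSentinel, and it works whether the fields arrive one per line or run together.

```python
import re

# Field labels as they appear in the report quoted above (assumed complete).
FIELDS = [
    "Date & Time", "Blocked IP", "User ID", "Reason", "User Agent",
    "Query String", "Get String", "Post String", "Forwarded For",
    "Client IP", "Remote Address", "Remote Port", "Request Method",
]

# Matches any known label followed by a colon, capturing the label itself.
_LABEL = re.compile("(" + "|".join(re.escape(f) for f in FIELDS) + "):")


def parse_report(text: str) -> dict:
    """Split a report into {label: value} using the known labels."""
    # re.split with one capture group yields:
    # [text before first label, label1, value1, label2, value2, ...]
    parts = _LABEL.split(text)
    report = {}
    for label, value in zip(parts[1::2], parts[2::2]):
        # Trim whitespace and the dashed separator line that follows "Reason".
        report[label] = value.strip(" \n-")
    return report


if __name__ == "__main__":
    sample = (
        "Blocked IP: 66.249.72.76 User ID: Anonymous (1)\n"
        "Reason: Abuse-Flood\n"
        "--------------------\n"
        "User Agent: Mediapartners-Google/2.1\n"
    )
    for label, value in parse_report(sample).items():
        print(f"{label}: {value}")
```

Grouping parsed reports by "User Agent" or "Remote Address" would then show whether the same crawler keeps tripping the blocker.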


I reviewed a few posts on this subject, but nothing really seems to fit the bill.
I've checked my "robots.txt" file and it seems ok, but... ??
My "robots.txt":

Code:
User-agent: *

Disallow: /abuse/
Disallow: /admin/
Disallow: /blocks/
Disallow: /cgi-bin/
Disallow: /db/
Disallow: /images/
Disallow: /includes/
Disallow: /language/
Disallow: /modules/
Disallow: /themes/
Disallow: /admin.php
Disallow: /config.php
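
A quick way to sanity-check rules like these is to run them through a robots.txt parser. The sketch below uses Python's standard `urllib.robotparser` on the rules as posted, with the blank line after `User-agent` left out because some parsers treat a blank line as the end of a group. The second URL is a hypothetical path at the site root, added only for contrast. Note that the reported URL sits under `/nuke/`, so prefixes such as `/modules/` would not be expected to match it.

```python
from urllib.robotparser import RobotFileParser

# Rules from the post above, without the blank line after User-agent.
ROBOTS_TXT = """\
User-agent: *
Disallow: /abuse/
Disallow: /admin/
Disallow: /blocks/
Disallow: /cgi-bin/
Disallow: /db/
Disallow: /images/
Disallow: /includes/
Disallow: /language/
Disallow: /modules/
Disallow: /themes/
Disallow: /admin.php
Disallow: /config.php
"""

# URL taken from the NukeSentinel report (scheme added so it parses as a URL).
REPORTED_URL = (
    "http://www.merc-man.net/nuke/modules.php"
    "?name=PHP-Nuke_HOWTO&page=change-buggy-php-nuke-theme.html"
)
# Hypothetical URL at the site root, for contrast only.
ROOT_MODULES_URL = "http://www.merc-man.net/modules/"


def is_allowed(user_agent: str, url: str) -> bool:
    """Return True if Python's parser says user_agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (REPORTED_URL, ROOT_MODULES_URL):
        print(url, "->", is_allowed("Mediapartners-Google/2.1", url))
```

This only shows how Python's parser reads the file. Whether a particular crawler honors the `*` group is up to that crawler, and `Disallow` rules say nothing about how quickly an allowed page may be requested.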


Can someone point me in the right direction to correct this please?

Thank you,

-Merc
 
Guardian2003
Site Admin
Joined: Aug 28, 2003
Posts: 6799
Location: Ha Noi, Viet Nam

Posted: Wed Dec 20, 2006 6:35 pm

I have yet to see Google flood a site with HTTP requests (unlike Slurp).
What is your flood level set to? You might want to knock it down a peg or two.
 
mercman

Posted: Wed Dec 20, 2006 6:44 pm

Guardian,
Thank you for your reply.
The flood delay is set to 2 seconds (default?).

-Merc
 
Guardian2003

Posted: Wed Dec 20, 2006 6:49 pm

Out of curiosity, what's the 'default' blocker template set to?
I'm not saying Google didn't cause the flood blocker to trip; it's just the first time I have known it to happen. Google is usually pretty well behaved.

I'm not sure on the default flood setting, mine has always been set to 3 seconds.
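
For anyone unsure what a flood delay of 2 or 3 seconds means in practice, here is a minimal sketch of one common approach. It is not NukeSentinel's actual code, which is not shown in this thread. It assumes a request counts as a flood when it arrives less than `delay_seconds` after the previous request from the same IP; `FloodGuard` and `is_flood` are names made up for the illustration.

```python
from typing import Dict


class FloodGuard:
    """Hypothetical per-IP flood check (illustration only)."""

    def __init__(self, delay_seconds: float) -> None:
        self.delay_seconds = delay_seconds
        # Timestamp of the most recent request seen from each IP.
        self._last_seen: Dict[str, float] = {}

    def is_flood(self, ip: str, now: float) -> bool:
        """Record a request at time `now` and report whether it came too soon."""
        previous = self._last_seen.get(ip)
        self._last_seen[ip] = now
        return previous is not None and (now - previous) < self.delay_seconds


if __name__ == "__main__":
    guard = FloodGuard(delay_seconds=2.0)
    # Three requests from the same address, 1.5 s and then 3.5 s apart.
    for t in (0.0, 1.5, 5.0):
        print(t, guard.is_flood("66.249.72.76", now=t))
```

Under that assumption a larger delay makes the check stricter, not more lenient: two requests 2.5 seconds apart pass with a 2 second delay but trip a 3 second one.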
 
mercman

Posted: Wed Dec 20, 2006 6:54 pm

Activate: Email Admin
Write to htaccess: No
Forward to: ""
IP Block Type: Full IP (127.2.3.4)
Default Page: Flood
Email IP Lookup: Off
Reason: Abuse-Flood
Block Duration: Permanent
 
Guardian2003

Posted: Wed Dec 20, 2006 7:06 pm

Interesting. So it might not have been a *real* flood after all, as that is the default template.
I would be tempted to disregard the warning email on this occasion.
 
mercman

Posted: Wed Dec 20, 2006 7:12 pm

Yeah, I sort of figured that too.
I am making some changes in the templates now (Write to htaccess: on, Email IP Lookup: on, etc.).
Do you think I should "open up" the flood delay to maybe 3 or 4 seconds?

-Merc
 
evaders99
Former Moderator in Good Standing
Joined: Apr 30, 2004
Posts: 3221

Posted: Wed Dec 20, 2006 10:41 pm

I suggest using the Crawl-delay parameter in robots.txt

Guardian2003

Posted: Thu Dec 21, 2006 2:54 am

From the Yahoo help pages
Quote:
Frequency of Access
There is a Yahoo-Blogs/v3.9 specific extension to robots.txt which allows you to set a lower limit on our crawler request rate.

You can add a "Crawl-delay: xx" instruction, where "xx" is the minimum delay in seconds between successive crawler accesses. Our default crawl-delay value is 1 second. If the crawler rate is a problem for your server, you can set the delay up to 5 or 20 or a comfortable value for your server.

Setting a crawl-delay of 20 seconds for Yahoo-Blogs/v3.9 would look something like:

User-agent: Yahoo-Blogs/v3.9
Crawl-delay: 20

Not all bots will obey the directive but Slurp, Google and MSNBot should.

For Google you would use something like
Code:
User-agent: Google
Crawl-delay: 20
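
To see how a parser reads a `Crawl-delay` line, here is a minimal sketch using Python's standard `urllib.robotparser`. It reflects only that library's behavior, not what any particular crawler actually does, and `crawl_delay_for` is a helper name made up for this example. It also shows why the directive should sit directly under its `User-agent` line: Python's parser, for one, ends a group at a blank line.

```python
from typing import Optional
from urllib.robotparser import RobotFileParser


def crawl_delay_for(robots_txt: str, user_agent: str) -> Optional[int]:
    """Return the Crawl-delay Python's parser applies to user_agent, if any."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.crawl_delay(user_agent)


# Directive directly under its User-agent line.
GROUPED = "User-agent: Google\nCrawl-delay: 20\n"
# Same two lines separated by a blank line.
SPLIT = "User-agent: Google\n\nCrawl-delay: 20\n"

if __name__ == "__main__":
    print(crawl_delay_for(GROUPED, "Googlebot"))  # delay read from the group
    print(crawl_delay_for(SPLIT, "Googlebot"))    # blank line ended the group
    print(crawl_delay_for(GROUPED, "Slurp"))      # no group matches this agent
```

With this parser the first call returns 20 and the other two return `None`, so a delay meant for several bots needs a group for each of them (or a `User-agent: *` group).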
 
montego
Site Admin
Joined: Aug 29, 2004
Posts: 9457
Location: Arizona

Posted: Thu Dec 21, 2006 9:17 am

Google ignores the Crawl-delay parameter (so says their robots.txt validation checker), BUT I still use it for other bots, specifically Yahoo!, which was tripping the flood blocker on me all the time.

jakec
Site Admin
Joined: Feb 06, 2006
Posts: 3048
Location: United Kingdom

Posted: Thu Dec 21, 2006 2:46 pm

If you have registered and verified your site with Google, there is an option to set the crawl speed to Normal or Slower.

Once you have selected your domain under Webmaster Tools, the option is located under Diagnostic and then Crawl rate.

I hope this helps.

Jakec
 
montego

Posted: Fri Dec 22, 2006 6:48 am

jakec, I had forgotten about that. Thanks!

I am in agreement, though, with Guardian in that I have not had an issue with Google's bot tripping the flood. Maybe it was a one-off issue???
 
mercman

Posted: Fri Dec 22, 2006 5:39 pm

Everyone, thank you so much for all the input and suggestions.

I, too, think this was simply a "one-off" issue; I had just that day placed a new Google ad on the site. (Thanks go out to hitwalker for fixing my blocks!)

My site is registered with Google, so I'll check out that setting in the Webmaster tools. Thank you jakec!

I'll definitely check/add that setting in the robots.txt file.

Thank you all

-Merc
 
montego

Posted: Sun Dec 24, 2006 9:02 am

RavensScripts
 
All times are GMT - 6 Hours
 