| Author |
Message |
nb1 Regular


Joined: Mar 03, 2005 Posts: 92 Location: OZ
|
Posted:
Sat Jul 14, 2007 1:50 am |
|
I am posting this in the hope that it may help someone else.
Yahoo has changed its crawling method and also added more crawlers. The way I read it, there will be an excessive amount of crawling of web sites for a short period of time.
NetRange: 74.6.0.0 - 74.6.255.255
In effect, this may result in higher CPU usage and a higher number of hits, which can also cause pages to load more slowly.
So as a webmaster, this is something you may want to be informed about.
The links in this post explain what can be done to help slow the crawling down by setting a crawl delay.
Thank you for your time.
::NB:: |
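For anyone who wants to check whether an address from their logs falls inside the NetRange quoted above, here is a minimal Python sketch using the standard `ipaddress` module. The helper name `in_quoted_range` and the sample addresses are made up for illustration. It only tests membership in the range; it does not confirm that a given hit really is a Yahoo crawler.

```python
import ipaddress

# First and last address of the NetRange quoted in the post.
FIRST = ipaddress.ip_address("74.6.0.0")
LAST = ipaddress.ip_address("74.6.255.255")

# Collapse the first-last range into one or more CIDR networks.
NETWORKS = list(ipaddress.summarize_address_range(FIRST, LAST))


def in_quoted_range(ip: str) -> bool:
    """Return True if ip lies inside the quoted NetRange (hypothetical helper)."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in NETWORKS)


if __name__ == "__main__":
    print(NETWORKS)
    # Sample addresses for illustration only, not real log data.
    for sample in ("74.6.12.34", "74.7.0.1"):
        print(sample, in_quoted_range(sample))
```

The range collapses to a single /16 block, which is the form most firewall and log-filter tools expect.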
|
|
|
 |
kguske Site Admin

Joined: Jun 04, 2004 Posts: 4873
|
Posted:
Sat Jul 14, 2007 7:45 am |
|
Thanks! In short, here's the relevant stuff from the links above:
There is a Yahoo! Slurp-specific extension to robots.txt which allows you to set a lower limit on our crawler request rate.
You can add a "Crawl-delay: xx" instruction, where "xx" is a delay value between successive crawler accesses. If the crawler rate is a problem for your server, you can set the delay up to 5 or 10 or a comfortable value for your server.
Setting a crawl-delay of 10 for Yahoo! Slurp would look something like:
User-agent: Slurp
Crawl-delay: 10 |
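If you want to sanity-check what a parser makes of those two lines, here is a rough sketch using Python's standard `urllib.robotparser`. This is only one robots.txt reader and says nothing about how Yahoo's own crawler is implemented; the helper name `crawl_delay_for` is an assumption for illustration.

```python
from urllib.robotparser import RobotFileParser

# The two lines from the example above.
ROBOTS_TXT = """\
User-agent: Slurp
Crawl-delay: 10
"""


def crawl_delay_for(robots_txt: str, user_agent: str):
    """Return the Crawl-delay that applies to user_agent, or None (hypothetical helper)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.crawl_delay(user_agent)


if __name__ == "__main__":
    print(crawl_delay_for(ROBOTS_TXT, "Slurp"))      # delay set for Slurp
    print(crawl_delay_for(ROBOTS_TXT, "Googlebot"))  # no group matches this agent
```

With this file, only agents whose name contains "Slurp" get a delay; every other agent gets `None` because no group applies to it.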
|
|
|
 |
montego Site Admin

Joined: Aug 29, 2004 Posts: 7481 Location: Arizona
|
Posted:
Sat Jul 14, 2007 8:44 am |
|
This also helps keep well-behaved search engine bots from getting banned by the Flood Blocker if you use "User-agent: *".
At one point Google said that they do not support that directive, but their webmaster tools provide a way to set up a crawl delay. (Sorry if this particular info is outdated. I have had both the robots.txt and Google set this way for quite some time. The old adage: "if it ain't broke, don't fix it!" LOL.) |
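To illustrate the "User-agent: *" idea, here is a small sketch with a catch-all group plus a stricter group for Slurp, again read with Python's `urllib.robotparser`. The file contents and the delay values of 5 and 10 are assumptions for illustration, not anyone's actual robots.txt, and a real crawler may or may not honor the directive.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: a catch-all group plus a stricter one for Slurp.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 5

User-agent: Slurp
Crawl-delay: 10
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def delay_for(user_agent: str):
    """Return the Crawl-delay this parser applies to user_agent (hypothetical helper)."""
    return parser.crawl_delay(user_agent)


if __name__ == "__main__":
    for agent in ("Slurp", "SomeOtherBot"):  # "SomeOtherBot" is a made-up name
        print(agent, delay_for(agent))
```

In this parser, a group that names the agent takes precedence, and the "*" group covers everything else.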
|
|
|
 |
nb1 Regular


Joined: Mar 03, 2005 Posts: 92 Location: OZ
|
Posted:
Sat Jul 14, 2007 11:23 am |
|
Hmm, outdated it may be. The only thing I know is that several other sites I visit that show visitors' IP addresses are usually flooded with that range, and yes, it also sets off the Flood Blocker quite often.
But if this helps anyone, it's a good thing.
I was wondering, do you have a direct link to your Google sitemap in your robots.txt file?
Sitemap: http://YOUR SITE/sitemap.xml
The reason I ask: beginning in April, or maybe before, all major search engines have been able to read this method, and I have had some trouble getting Google to validate my sitemap.
Any follow-up on this? |
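As a quick way to confirm that the Sitemap line in a robots.txt file is at least readable, here is a minimal sketch using `site_maps()` from Python's `urllib.robotparser` (available in Python 3.8 and later). The host `www.example.com` stands in for "YOUR SITE", and the helper name `sitemaps_in` is made up. It only shows that the directive parses; it does not fetch or validate the sitemap itself, so it will not explain a validation failure on Google's side.

```python
from urllib.robotparser import RobotFileParser

# www.example.com is a placeholder for "YOUR SITE".
ROBOTS_TXT = """\
User-agent: *
Disallow:

Sitemap: http://www.example.com/sitemap.xml
"""


def sitemaps_in(robots_txt: str) -> list:
    """Return the Sitemap URLs listed in robots_txt (hypothetical helper)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap line is present.
    return parser.site_maps() or []


if __name__ == "__main__":
    print(sitemaps_in(ROBOTS_TXT))
```

The Sitemap line is independent of the User-agent groups, so it can sit anywhere in the file.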
|
|
|
 |
montego Site Admin

Joined: Aug 29, 2004 Posts: 7481 Location: Arizona
|
Posted:
Sun Jul 15, 2007 9:51 am |
|
nb1, I use nukeSEO to provide my XML sitemap.  |
|
|
|
 |
nb1 Regular


Joined: Mar 03, 2005 Posts: 92 Location: OZ
|
Posted:
Sun Jul 15, 2007 11:28 am |
|
|
|
 |
montego Site Admin

Joined: Aug 29, 2004 Posts: 7481 Location: Arizona
|
Posted:
Mon Jul 16, 2007 5:36 am |
|
Then have you asked kguske about it over on nukeSEO.com? Mine has always validated... |
|
|
|
 |