hinksta
Worker


Joined: Dec 23, 2005
Posts: 226
Location: UK

Posted: Mon Mar 12, 2007 7:09 am

First of all, well done to Raven and the team for all the hard work dragging PHP-Nuke into the 21st century.

Just a couple of questions regarding robots.txt:

What is Crawl-delay: 5 ?
and should ShortLinks be included?
jakec
Moderator


Joined: Feb 06, 2006
Posts: 1727
Location: United Kingdom

Posted: Mon Mar 12, 2007 7:20 am

Crawl-delay should slow down how quickly bots crawl your site. Although I do recall reading somewhere that Google ignores this option, if you have your site registered with Google you can specify a delay there instead.

I'm at work at the moment so I don't have any files to look at, but I guess bots don't necessarily need to crawl the shortlinks directory, so you could include it if you wanted to. Maybe Montego can confirm/clarify this?

I don't think this is really a bug as such, so I have moved the post.
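
For anyone who wants to see what the directive looks like to a crawler, here is a minimal sketch using Python's standard urllib.robotparser module (crawl_delay needs Python 3.6 or later). The robots.txt content and the bot name ExampleBot are assumptions for illustration; only the value 5 comes from the question above. Reading the value says nothing about whether a given crawler actually honours it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; only the Crawl-delay value is from the thread.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 5
"""


def crawl_delay_for(robots_txt, user_agent):
    """Return the Crawl-delay in seconds that applies to user_agent, or None."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.crawl_delay(user_agent)


# "ExampleBot" is a made-up crawler name; it falls under the "*" record.
print(crawl_delay_for(ROBOTS_TXT, "ExampleBot"))
```

A polite crawler would wait that many seconds between requests to the site.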

Jakec
kguske
Site Admin


Joined: Jun 04, 2004
Posts: 4686

Posted: Mon Mar 12, 2007 10:56 am

They should not need to crawl the shortlinks directory, and it would be better if it were added to the robots.txt file.
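
A rough sketch of what that exclusion could look like, checked with Python's standard urllib.robotparser. The directory path /ShortLinks/ and the file name example.php are assumptions, not taken from the distribution; adjust them to wherever the ShortLinks files live on your install. Note that the match is case-sensitive.

```python
from urllib.robotparser import RobotFileParser

# Assumption: the ShortLinks files sit under /ShortLinks/ at the site root.
ROBOTS_TXT = """\
User-agent: *
Disallow: /ShortLinks/
"""


def is_allowed(robots_txt, user_agent, path):
    """Return True if user_agent may fetch path under the given robots.txt."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


# "ExampleBot" and "example.php" are made-up names for the check.
print(is_allowed(ROBOTS_TXT, "ExampleBot", "/ShortLinks/example.php"))
print(is_allowed(ROBOTS_TXT, "ExampleBot", "/index.php"))
```

Only the directory is excluded; the rest of the site stays crawlable.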
montego
Site Admin


Joined: Aug 29, 2004
Posts: 7330
Location: Arizona

Posted: Tue Mar 13, 2007 6:22 am

I agree. My oversight. Sorry about that.
kguske
Site Admin


Joined: Jun 04, 2004
Posts: 4686

Posted: Tue Mar 13, 2007 6:32 am

Not a big deal. I don't think it will really hurt anything.
Susann
Moderator


Joined: Dec 19, 2004
Posts: 2194
Location: Germany (Moderator, German NukeSentinel Support)

Posted: Tue Mar 20, 2007 11:57 am

Did we ever add Audioslave's Google Tap folder to the robots.txt?
There isn't any instruction that says you should add this folder to the robots.txt.
In my opinion it's not required, because in past years I have never seen a bot indexing the Google Tap folder on my website.
montego
Site Admin


Joined: Aug 29, 2004
Posts: 7330
Location: Arizona

Posted: Wed Mar 21, 2007 5:38 am

I do not know, Susann. I just thought that it could not hurt, is all.
bugsTHoR
Worker


Joined: Apr 05, 2006
Posts: 172

Posted: Wed Apr 04, 2007 12:36 pm

I believe you have to enter this in the robots.txt file at the site root to stop Googlebot indexing your site:

User-agent: Googlebot
Disallow: /
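
As a quick sanity check of that record, here is a minimal sketch with Python's standard urllib.robotparser. The path /index.php and the second bot name OtherBot are made up for illustration. It shows that the rule shuts out Googlebot only and leaves other crawlers unaffected.

```python
from urllib.robotparser import RobotFileParser

# The two-line record quoted in the post above.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /
"""


def may_crawl(robots_txt, user_agent, path="/index.php"):
    """Return True if user_agent is allowed to fetch path (path is an assumption)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


print(may_crawl(ROBOTS_TXT, "Googlebot"))  # blocked by the record
print(may_crawl(ROBOTS_TXT, "OtherBot"))   # hypothetical bot, no record applies
```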
jakec
Moderator


Joined: Feb 06, 2006
Posts: 1727
Location: United Kingdom

Posted: Wed Apr 04, 2007 12:47 pm

Why would you want to stop Googlebot from indexing your site, unless you never want it to appear in Google search?
bugsTHoR
Worker


Joined: Apr 05, 2006
Posts: 172

Posted: Wed Apr 04, 2007 12:55 pm

jakec wrote, in reply to the first post:
Quote:
Although I do recall reading somewhere that Google ignores this option

So I thought I would add that you need this to stop Googlebot.

Someone might want it. You never know.
jakec
Moderator


Joined: Feb 06, 2006
Posts: 1727
Location: United Kingdom

Posted: Wed Apr 04, 2007 2:06 pm

OK I see what you are saying.

My point was that although Googlebot doesn't recognise the Crawl-delay in the robots.txt file, you can apply a delay if your site is registered with Google.

If you block Googlebot it is unlikely that people will find your website. Maybe that is what you want, if it is a private site.
bugsTHoR
Worker


Joined: Apr 05, 2006
Posts: 172

Posted: Wed Apr 04, 2007 5:26 pm

I have seen that some guys/ladies on here have private sites (friends/family), so it would be useful to them.

I can see that if you have a really busy site it would also be very useful against other indexing robots or whatnot on the net.

I do think it's useful since, according to the Google page on Googlebot that I saw, there are more than 10,000,000 robots out there.
jakec
Moderator


Joined: Feb 06, 2006
Posts: 1727
Location: United Kingdom

Posted: Thu Apr 05, 2007 2:44 am

Like I said, I understand what you are saying, and I too have some personal sites that I don't really want indexed, although I'm not bothered if people do visit them. So blocking good bots in this way is one solution.

...but the original question was:

hinksta wrote:
What is Crawl-delay: 5 ?
and should ShortLinks be included?
Guardian2003
Site Admin


Joined: Aug 28, 2003
Posts: 4653

Posted: Thu Apr 05, 2007 2:06 pm

Going off topic slightly, but if you want to stop all bots (that adhere to the robots.txt) then you can simply use:
Code:

User-agent: *
Disallow: /

As far as I am aware, Crawl-delay only works with Slurp (at least Yahoo claim it's a directive specifically for their bots).
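
Both points can be tried out with Python's standard urllib.robotparser (crawl_delay needs Python 3.6 or later). This is only a sketch: the second robots.txt, which scopes Crawl-delay to Slurp and leaves everything else open, is an assumed layout rather than anything from the distribution, and the delay of 5 and the path /index.php are illustrative.

```python
from urllib.robotparser import RobotFileParser

# The blanket rule from the post: every compliant bot is shut out.
BLOCK_ALL = """\
User-agent: *
Disallow: /
"""

# Assumed layout: Crawl-delay applies to Slurp only, all other bots unrestricted.
SLURP_DELAY = """\
User-agent: Slurp
Crawl-delay: 5

User-agent: *
Disallow:
"""


def load(robots_txt):
    """Parse robots.txt text into a RobotFileParser without any network access."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


blocked = load(BLOCK_ALL)
print(blocked.can_fetch("Googlebot", "/index.php"))

scoped = load(SLURP_DELAY)
print(scoped.crawl_delay("Slurp"))
print(scoped.crawl_delay("Googlebot"))
print(scoped.can_fetch("Googlebot", "/index.php"))
```

Scoping the directive to one user agent keeps the file tidy for crawlers that do not recognise it.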
All logos and trademarks in this site are property of their respective owner.
The comments are property of their posters, all the rest © 2002-2008 by Raven