Ravens PHP Scripts: Forums
 

 

Ravens PHP Scripts And Web Hosting Forum Index -> v2.4 RN Issues
dad7732
RavenNuke(tm) Development Team



Joined: Mar 18, 2007
Posts: 1242

Posted: Mon Nov 30, 2009 7:22 pm

MSNBOT/2.0b does not honor the robots.txt file, so here is what to add to the .htaccess file in your site's root directory:

Code:


RewriteCond %{HTTP_USER_AGENT} ^msnbot/2\.0b [NC]
RewriteRule .* - [F,L]

//Returns a 403-Forbidden response and no content.


Works well. My site was being inundated with MSNBOT hits daily. This is better than adding the bot to the harvester list in NS (NukeSentinel).

If this doesn't belong here, move it - couldn't find a more appropriate place.

Cheers
 
hicuxunicorniobestbuildpc
The Mouse Is Extension Of Arm



Joined: Aug 13, 2009
Posts: 1123

Posted: Mon Nov 30, 2009 8:23 pm

Hi dad7732,

I tried to use this trick, but unfortunately I get a 500 error and the page doesn't show.
 
dad7732







Posted: Mon Nov 30, 2009 8:36 pm

Did you copy and paste it into your .htaccess file exactly as shown? It doesn't matter where it goes as long as it's below the opening lines; mine is somewhere in the middle. It looks like a syntax error. Can you post your .htaccess file here, leaving out any login/password information?
 
dad7732







Posted: Tue Dec 01, 2009 10:56 pm

What a sneaky bunch of "you know whats" at MSN. Looking at my tracked IPs, I noticed several hundred hits from the MSN Bot tonight .. say what??? There must be some auto-detection for block attempts on their bots, so guess what they did: they changed the user-agent string to eliminate MSNBOT/2.0b!! So, seeing as how the IP block 65.55.xxx.xxx is dedicated to the MSN Bot, I added 65.55.*.* as a "flood" blocker, and so as not to be emailed several hundred times I set it to just "block, default page". We'll see what happens now. And yes, I could just as easily block the IP at the server level, e.g. in hosts.allow/hosts.deny on some BSD servers.
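
For anyone who wants the same range blocked at the web-server level rather than in the flood blocker, a minimal .htaccess sketch follows. It assumes Apache 2.2-style access control (mod_authz_host) and that 65.55.0.0/16 really is all crawler traffic, which this thread has not verified. On Apache 2.4 the equivalent would use "Require not ip" instead.

```apache
# Sketch only: deny every address in 65.55.0.0/16 (Apache 2.2 syntax).
# With Order Allow,Deny the Deny lines are evaluated last, so they win.
Order Allow,Deny
Allow from all
Deny from 65.55.0.0/16
```

Blocked clients receive a 403 Forbidden, the same response the rewrite rule earlier in the thread is meant to give.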

Cheers
 
Raven
Site Admin/Owner



Joined: Aug 27, 2002
Posts: 17088

Posted: Tue Dec 01, 2009 11:13 pm

According to this article, it should now be honoring the robots file:

http://news.softpedia.com/news/MSNBot-2-0b-Kills-MSNBot-1-1-in-Bing-Crawler-Evolution-126202.shtml
 
hicuxunicorniobestbuildpc







Posted: Wed Dec 02, 2009 1:30 am

It is working fine now, dad7732. I just changed this:

Before:
//Returns a 403-Forbidden response and no content.
After:
#Returns a 403-Forbidden response and no content.

I forgot to comment out this line. lol. Thanks
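
The 500 error comes from the comment syntax: Apache configuration files, .htaccess included, only treat lines starting with # as comments, so a line starting with // is read as an unknown directive and the request fails. Below is a sketch of the block with the comment on its own # line. It assumes mod_rewrite is already switched on with RewriteEngine On earlier in the file, and it tests the User-Agent header, which is where the bot name appears.

```apache
# Returns a 403 Forbidden response and no content.
RewriteCond %{HTTP_USER_AGENT} ^msnbot/2\.0b [NC]
RewriteRule .* - [F,L]
```

Comments need a line of their own; Apache does not allow a trailing comment on the same line as a directive.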
 
dad7732







Posted: Wed Dec 02, 2009 7:54 am

Maybe MSN hasn't read the article!! Adding the MSN Bot to robots.txt has had no effect here on any domain as of this morning. Adding the IP blocker is the only thing that works, simply because, as I mentioned, they changed their UA string. I haven't blocked their IP server-wide because I have some users (30+ domains) who may actually want the visits.

Cheers, thanks for the article

PS: unicornio - that was my next suggestion. :)
 
dad7732







Posted: Wed Dec 02, 2009 8:52 am

Forgot to add that 65.55.xxx.xxx (the MSN Bot) was denied domain access 1688 times between midnight last night and the time of this post, only 9 hours. Over a 24-hour period this bot can be responsible for a pretty hefty resource drain.

Cheers
 
dad7732







Posted: Wed Dec 02, 2009 3:38 pm

As of a few minutes ago, we're up to 2940 MSN Bot hits since midnight, all of which are now being rejected. Robots.txt isn't working. These people are ruthless, to say the least. If they want my site that badly then they can PAY for it .. :)

Cheers
 
Guardian2003
Site Admin



Joined: Aug 28, 2003
Posts: 6799
Location: Ha Noi, Viet Nam

Posted: Fri Dec 04, 2009 2:19 pm

This doesn't sound right to me; there is something else going on here. There's no reason for a bot to change its UA, hmmm...
 
Guardian2003







Posted: Fri Dec 04, 2009 2:22 pm

FYI
http://iplists.com/nw/msn.txt
 
dad7732







Posted: Fri Dec 04, 2009 2:36 pm

Old list. Where is MSNBOT 2.0b? I've done a lot of research on this, and the primary IP range for the MSN bots is 65.55.xxx.xxx regardless of the UA. I saw the UA change immediately after blocking MSNBOT 2.0b in the harvester menu.
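
If the crawler's address range is more stable than its user-agent string, one option is to match on either. Here is a minimal mod_rewrite sketch, assuming RewriteEngine On appears earlier in the file and that the 65.55. prefix is the range to block (taken from the posts in this thread, not independently confirmed):

```apache
# Forbid the request if the User-Agent mentions msnbot
# OR the client IP address starts with 65.55.
RewriteCond %{HTTP_USER_AGENT} msnbot [NC,OR]
RewriteCond %{REMOTE_ADDR} ^65\.55\.
RewriteRule .* - [F,L]
```

The [OR] flag joins the two conditions; without it mod_rewrite requires both to match.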

Cheers
 
hicuxunicorniobestbuildpc







Posted: Sat Dec 05, 2009 4:38 pm

Because "msnbot/2.0b" continued to crawl numerous pages and directories that are officially off limits via META tags, robots.txt rules and X-Robots-Tag directives, I just officially blocked it.

Open your .htaccess and paste in:

Code:
RewriteCond %{HTTP_USER_AGENT} ^msnbot/2\.0b [NC]
RewriteCond %{REQUEST_URI} !^/robots\.txt$
RewriteRule .* /robots.txt [L,R=301]

Upload it again. It is working for me. Can you please test it so we can see how to improve on this?
 
All times are GMT - 6 Hours
 