Author |
Message |
bugsy
Worker


Joined: May 08, 2007
Posts: 130
|
Posted:
Tue Sep 04, 2007 3:16 am |
|
Hi
Our site was loading fine at http://press.xtvworld.com with the Force Nuke URL option turned on in Sentinel. It was indexed properly in Google and was getting a good number of hits.
A few days back we modified the .htaccess to do a permanent 301 redirect to http://www.press.xtvworld.com and turned off the Force Nuke URL option. All pages are loading properly, but unfortunately hits have dropped to almost half.
So while checking whether the pages are indexed properly in Google, I came across this as the first result in Google search:
Quote: |
Press Booth: You have been blocked
You have been blocked from entering this site. You have attempted to bypass the Filter System on this site. All of the following information has been ...
www.press.xtvworld.com/ - 2k -
|
You can see this for yourself in the Google search results.
So does this mean that, because of the redirect or for some other reason, Sentinel has blocked Googlebot from indexing the www-redirected pages? If so, what is the workaround?
Thanks in advance |
|
|
|
 |
kguske
Site Admin

Joined: Jun 04, 2004
Posts: 6437
|
Posted:
Tue Sep 04, 2007 4:27 am |
|
Beware of attacks that spoof Googlebot's IP address. They are specifically designed to harm your search engine rankings. You can protect the Googlebot IP range and attacks like that will still be blocked, even though they aren't added to the blocked IP table or htaccess. |
_________________ I search, therefore I exist...
|
|
 |
montego
Site Admin

Joined: Aug 29, 2004
Posts: 9457
Location: Arizona
|
Posted:
Tue Sep 04, 2007 7:20 am |
|
bugsy, I don't understand why you would have gone from press.xtvworld.com to www.press.... especially if you were already indexed with just "press.". I didn't think that was what you were trying to do in your original thread on the topic. Of course it is going to take time to re-index for the "www.press".
Did you also tell Googlebot to only cache the "www.press" version now? You do this by logging into your webmaster account there.
Personally, I would have left it at "press." and redirected all the "www.press" to what was well indexed.  |
|
|
 |
bugsy

|
Posted:
Tue Sep 04, 2007 8:04 am |
|
kguske, I checked the Sentinel panel and it has indeed blocked Googlebot; at least the entry says Googlebot 2.1! In fact it blocked Yahoo Slurp too. I have cleared both now. As for the .htaccess, I found that even though it is chmod 666, Sentinel is not writing the IPs there. I am also getting lots of IP-blocked email messages, but those IPs show up neither in the Sentinel panel nor in .htaccess!
Montego, I know it was kind of stupid on my part! I tried a 301 redirect from all www.press requests to http://press.xtvworld.com and that ended in a redirect loop, so I decided to have a go with http://www.press instead. What I didn't foresee is that Sentinel would block Googlebot, Yahoo Slurp and even the AdSense bot when they tried to follow the 301 redirect!
So, to make matters less complicated, I have removed the redirect and am again forcing the Nuke URL to http://press.xtvworld.com as before, hoping things will cool down a bit! |
|
|
|
 |
montego

|
Posted:
Wed Sep 05, 2007 5:50 am |
|
What do you mean by "fell on a redirect loop"?
I would still log onto your Google web master account and tell Google to only index your now NON-www pages... Have you done this? |
|
|
|
 |
bugsy

|
Posted:
Wed Sep 05, 2007 6:06 am |
|
Yes, in the Google Sitemaps panel I have now set http:// (non-www) as the preference for crawling over www. But the sitemap contains non-GTapped URLs while Google Search is actually crawling the GTapped URLs, so I am feeding two kinds of information to Google anyway, which I do not think is good.
That loop is the message I got in the browser saying that the site is redirecting to a page that will keep redirecting in a loop. I was trying to redirect http://www to http://.
But on second thought, even if I do the 301 that way again, chances are Sentinel will block the bots that try to crawl the redirects. Why, I do not know. Any idea why Sentinel is not writing the IPs to .htaccess? I did check that there is a space at the bottom of the .htaccess file, and chmod is 666 as instructed. |
|
|
|
 |
montego

|
Posted:
Thu Sep 06, 2007 6:37 am |
|
Quote: |
That loop is the message I got in the browser saying that the site is redirecting to a page that will keep redirecting in a loop.
|
That sounds to me like your rewrite rule wasn't "tight" enough. There should be no looping.
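One common way to end up in such a loop is a rule that also matches the host it redirects to. The following is only a rough Python model of that idea, not Apache itself; `redirect` and `follow` are hypothetical helper names made up for illustration. A rule whose host condition matches only the www host stops after one hop, while a rule with no host condition keeps sending the non-www host back to itself.

```python
import re

# Target host of the redirect, as used elsewhere in this thread.
CANONICAL = "press.xtvworld.com"

# Models a host condition that matches only the www host (anchored at both ends).
WWW_ONLY = re.compile(r"^www\.press\.xtvworld\.com$", re.IGNORECASE)


def redirect(host, condition):
    """Return the host to redirect to, or None if the request is served as-is.

    condition=None models a rule with no host condition at all.
    """
    if condition is None or condition.search(host):
        return CANONICAL
    return None


def follow(host, condition, max_hops=10):
    """Follow redirects and return (final_host, hops, looped)."""
    hops = 0
    while hops < max_hops:
        target = redirect(host, condition)
        if target is None:
            return host, hops, False
        host = target
        hops += 1
    # Gave up after max_hops, which is roughly what a browser does.
    return host, hops, True


# Tight rule: one hop, then the non-www host is served normally.
print(follow("www.press.xtvworld.com", WWW_ONLY))
# Loose rule: every request is redirected again, so it never settles.
print(follow("www.press.xtvworld.com", None))
```

This does not cover other possible causes, such as a second mechanism on the site redirecting in the opposite direction.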
Quote: |
But on second thought, even if I do the 301 that way again, chances are Sentinel will block the bots that try to crawl the redirects. Why, I do not know.
|
I just can't see how or why this is happening. Makes no sense to me. Would have to ask the author of NukeSentinel that one (BobMarion).
Quote: |
Any idea why Sentinel is not writing the IPs to .htaccess? I did check that there is a space at the bottom of the .htaccess file, and chmod is 666 as instructed.
|
There are numerous threads here which talk about all the common reasons why. I don't want to extend this thread into a different topic which has already been addressed many times. Feel free to post a new thread under NukeSentinel 2.5.x if you cannot find your solution, but please do search first as there is a wealth of info here and multiple identical threads / solutions just makes it all the harder for others to find their answers. Thanks. |
|
|
|
 |
bugsy

|
Posted:
Thu Sep 06, 2007 8:27 am |
|
Clarifying the loop:
Yes, I might have used a bad rewrite rule. I think I used something like this:
Code:
RewriteEngine On
RewriteCond %{HTTP_HOST} ^www\.press\.xtvworld\.com
RewriteRule ^(.*)$ http://press.xtvworld.com/$1 [R=301]
|
As for Sentinel blocking the Google and Yahoo bots, it could be for other reasons too, but it happened just after the redirect to http://www was put into effect.
Here are the messages I received about the blocks:
Code:
User ID: Anonymous (1)
Reason: Abuse-Filter
--------------------
User Agent: Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
Query String: www.press.xtvworld.com/modules.php?name=News&file=article&sid=1341%2520id=r-3
Forwarded For: none
Client IP: none
Remote Address: xx.xxx.xx.xx
Remote Port: XXXXX
Request Method: GET
---------------------------------------------------------------------
User ID: Anonymous (1)
Reason: Abuse-Filter
--------------------
User Agent: Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)
Query String: www.press.xtvworld.com/modules.php?name=News&file=article&sid=6760%2520
Forwarded For: none
Client IP: none
Remote Address: XX.X.XX.XXX
Remote Port: XXXXX
Request Method: GET
-------------------------------------------------------------------
User ID: Anonymous (1)
Reason: Abuse-Filter
--------------------
User Agent: Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)
Query String: www.press.xtvworld.com/modules.php?name=News&file=article%2526sid=2549
Forwarded For: none
Client IP: none
Remote Address: XX.X.XX.XXX
Remote Port: XXXXX
Request Method: GET
|
As for the .htaccess, yes, I will check the forum. Actually I was looking for a solution other than loosening the permissions further on the .htaccess, as a hacker once replaced some of my files which were chmod 777! So that's one chmod I am wary of! |
|
|
|
 |
montego

|
Posted:
Fri Sep 07, 2007 6:33 am |
|
bugsy, well, not sure because that rewrite rule looks reasonable with possibly one exception. Have you tried it with [L,R=301] instead of just the [R=301]? Well, you probably don't want to do that now given your original concern.
However, I still do not see how the redirect could be causing those blocks. What IS concerning to me are the following characters that I see in your URL strings:
%2520
%2526
Unfortunately, I am not sure how those are getting there. Have you added these IP addresses already to your NS Protected IPs or Ranges? If not, then I would expect you to still see these blocks even now. However, if you have added them, then you may be covering up the fact that the issue still exists.
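For reference, those sequences look like double percent-encoding: %25 is the encoding of a literal "%", so %2520 decodes once to %20 (a space) and %2526 decodes once to %26 (an ampersand). A small Python sketch using only the standard library shows the decoding passes on the query strings from the block emails above; `decode_passes` is a hypothetical helper name. It only shows what the characters are, not which client or redirect produced them.

```python
from urllib.parse import unquote

# Query strings copied from the block emails quoted earlier in this thread.
blocked = [
    "modules.php?name=News&file=article&sid=1341%2520id=r-3",
    "modules.php?name=News&file=article&sid=6760%2520",
    "modules.php?name=News&file=article%2526sid=2549",
]


def decode_passes(value, max_passes=5):
    """Percent-decode repeatedly until the string stops changing.

    Returns the list of intermediate strings, starting with the input.
    """
    steps = [value]
    for _ in range(max_passes):
        decoded = unquote(steps[-1])
        if decoded == steps[-1]:
            break
        steps.append(decoded)
    return steps


for query in blocked:
    # Two decoding passes are needed before the string settles,
    # which is the signature of a value that was encoded twice.
    print(" -> ".join(repr(step) for step in decode_passes(query)))
```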
.htaccess does NOT need 777. 666 should work just fine.
BTW, having a file be 777 permissions does not automatically mean someone can just arbitrarily overwrite it from the web. Something within your site's code has to allow that to happen... i.e., another hole. |
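To make the difference between the two modes concrete, here is a short Python sketch (standard library only; `describe` is a hypothetical helper name) that renders octal modes the way `ls -l` would show them for a regular file. 666 grants read and write to owner, group and others without the execute bit, while 777 adds execute for everyone.

```python
import stat


def describe(mode):
    """Render an octal permission mode as `ls -l` shows it for a regular file."""
    return stat.filemode(stat.S_IFREG | mode)


for mode in (0o644, 0o666, 0o777):
    print(oct(mode), describe(mode))
```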
|
|
|
 |
bugsy

|
Posted:
Fri Sep 07, 2007 11:37 am |
|
No, I have not tried [L,R=301]; I will test it on some other site. I also do not know how those URLs came about. I will add the IPs in Sentinel. So far it has not blocked the Google or Yahoo bots again since I cleared the IPs.
That hacker somehow managed to get into lots of 777 folders, both Nuke and Mambo, and also changed files which were chmod 777. How he did that I don't know, but I did manage to get the issue sorted out later. I am positive that if I leave any 777 lying around, even a cache folder, he will hack it! It looked like part of some hacking syndicate or something! |
|
|
|
 |
montego

|
Posted:
Fri Sep 07, 2007 6:54 pm |
|
The point about 777 that I am making is that just having 777 on a file or directory doesn't, in and of itself, expose it to being hacked (updated). What likely happened is that you had another hole in your system which allowed the hacker to upload a file or execute a PHP or shell command, which then took advantage of the 777 permissions.
Yes, it is best to not be this wide open if you don't know exactly what you have on your system and how secure it is. Things like Attachment mods, galleries, vwar, chat, etc. which allow file system level commands to be executed are the notorious "holes". |
|
|
|
 |
bugsy

|
Posted:
Sat Sep 08, 2007 12:52 pm |
|
Yes, I got your point, but I am not exactly aware of where the hole lies. I believe upgrading to later versions will take care of lots of issues in one go, as they will be more secure anyway! |
|
|
|
 |
|