RickW

Forum User
Full Member
Registered: 01/28/04
Posts: 240
Quote by LWC:
Update:
As for your theory, RickW (are you still here? Don't go...): if that were true, Google would be completely worthless. When Google updates the index, it divides the work into two parts: verifying existing pages and crawling in search of brand-new pages. If they only did the latter, they'd have not 8 billion but 8 googol pages in their index... Of course, the first part is not perfect.


Depends on how many levels down the page is, what its PageRank is, and who is linking to that page. I've seen some pages get really stagnant. Maybe in your case some of the issue is with the robots.txt and htaccess: it could be that a combination of algo flags and new algos just put some of your pages into a part of the index that never gets updated. I still think that if you remove the block from robots.txt (Google has to be able to crawl those pages before it can see any change), put a tiny link to those pages in your footer, and add the PHP meta routine that sets noarchive, then you'll get rid of them for good.

edit:
Try putting this in your htaccess:

Text Formatted Code

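# Answer 410 Gone for the old /lior path (the pattern is a regex, not a wildcard)
RedirectMatch 410 ^/lior(/|$)
# Send any other non-empty hostname to lior.weissbrod.com with a permanent redirect
RewriteCond %{HTTP_HOST} !^lior\.weissbrod\.com [NC]
RewriteCond %{HTTP_HOST} !^$
RewriteRule ^(.*) http://lior.weissbrod.com/$1 [L,R=301]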


 


You can also add ErrorDocument 410 /410.php if you want; just make a copy of 404.php and change the message.

I think that htaccess will match up with exactly what you need to accomplish.

edit2:
Hey, have you checked your htaccess to make sure it's doing what you think it's doing? I just put http://lior.weissbrod.com into the checker at http://www.searchenginepromotionhelp.com/m/http-server-response/code-checker.php and got back a normal 200 response. But then I tried http://www.weissbrod.com/lior, and it gave back a 302! Since a 302 is only a temporary redirect, Google will continue to assume that your old path is the correct one.
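
If you would rather check the response codes yourself than rely on an online checker, something like the following rough Python sketch should do. It uses only the standard library's http.client, which does not follow redirects, so a 301, 302, or 410 shows up as-is. The function name fetch_status is made up for this example and is not part of any tool mentioned in this thread.

```python
import http.client
from urllib.parse import urlsplit


def fetch_status(url, timeout=10):
    """Return (status code, Location header) for url without following redirects.

    fetch_status is a hypothetical helper name chosen for this sketch.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        # Rebuild the request target; an empty path means the site root.
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        # HEAD is enough here: only the status line and headers matter.
        conn.request("HEAD", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()
```

Calling fetch_status("http://www.weissbrod.com/lior") returns the raw status code together with the redirect target, if any, which makes it easy to tell a 301 from a 302 or a 410.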
www.antisource.com

LWC

Forum User
Full Member
Registered: 02/19/04
Posts: 818
Oops, that was because I shouldn't have used
Text Formatted Code

RewriteCond %{HTTP_HOST} !^lior\.weissbrod\.com$ [NC,OR]
RewriteCond %{REQUEST_URI} ^/lior/ [NC]
RewriteRule ^.*$ http://lior.weissbrod.com/ [G,L]

 

The correct way is
Text Formatted Code

RewriteCond %{HTTP_HOST} !^lior\.weissbrod\.com$ [NC,OR]
RewriteCond %{REQUEST_URI} ^/lior/ [NC]
RewriteRule .* - [G,L]

 

(the difference is in the last line)
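
To spell out when the corrected rule fires, here is a small Python sketch that mimics the two conditions. It is only a model of the logic, not how Apache evaluates it internally: the function name is_gone is invented for this example, and it assumes the Host header carries no port number.

```python
import re

# Both patterns are case-insensitive, like the [NC] flag on each RewriteCond.
HOST_RE = re.compile(r"^lior\.weissbrod\.com$", re.IGNORECASE)
URI_RE = re.compile(r"^/lior/", re.IGNORECASE)


def is_gone(host, uri):
    """Return True when the request would be answered with 410 Gone.

    is_gone is a hypothetical name; host is the Host header value and
    uri is the request path including its leading slash.
    """
    wrong_host = HOST_RE.search(host) is None   # the "!" negates the host match
    old_path = URI_RE.search(uri) is not None   # the path starts with /lior/
    return wrong_host or old_path               # [OR] joins the two conditions
```

For example, is_gone("lior.weissbrod.com", "/lior/page.html") is true because of the old path, while is_gone("lior.weissbrod.com", "/index.html") is false, so normal pages on the subdomain are left alone.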

Thanks!

If you want to re-test it, note that it should now return 410, not 301 (Dirk suggested it's even stronger).

Also, like I said, Google only has one such WWW link anyway. The problem is with the endless http://lior.weissbrod.com/lior/ links.

P.S.
Did you know this topic is on Google already? :-)
Dirk could probably pass such a fix in no time...

RickW

When you paste your htaccess onto here, make sure to wrap it in CODE tags - your escaping backslashes are getting stripped (assuming you're using them).

LWC

Fixed all of them, thanks!
But the same goes for your quotes of my .htaccess.
