

RickW

Forum User
Full Member
Registered: 28/01/2004
Posts: 240
Location: United States
Quote by LWC:
Update:
As for your theory, RickW (are you still here? Don't go...) - if that were true, Google would be completely worthless. When Google updates the index, it divides the work into 2 parts: verifying existing pages and crawling in search of brand new pages. If they only did the latter part, they'd have not 8 billion, but 8 googol pages in their index... Of course, the first part is not perfect.


That depends on how many levels down the page is, what its PageRank is, and who is linking to that page - I've seen some pages get really stagnant. Maybe in your case part of the issue is with the robots.txt and htaccess - it could be that a combination of algo flags and new algos put some of your pages into a part of the index that just never gets updated. I still think that if you remove the block from robots.txt, put a tiny link in your footer to those pages, and put in the PHP meta routine to set noarchive, then you'll get rid of them for good.

edit:
Try putting this in your htaccess:

PHP Formatted Code

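# Return 410 Gone for /lior and anything under it (the pattern is a regex, not a glob)
RedirectMatch 410 ^/lior(/|$)
# Send requests for any other hostname to the canonical host
RewriteCond %{HTTP_HOST} !^lior\.weissbrod\.com [NC]
RewriteCond %{HTTP_HOST} !^$
RewriteRule ^(.*) http://lior.weissbrod.com/$1 [L,R=301]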




You can also add ErrorDocument 410 /410.php if you want; just make a copy of 404.php and change the message.

I think that htaccess will match up with exactly what you need to accomplish.

edit2:
Hey, have you checked your htaccess to make sure it's doing what you think it's doing? I just put http://lior.weissbrod.com into http://www.searchenginepromotionhelp.com/m/http-server-response/code-checker.php and got back a normal 200 response. But then I tried http://www.weissbrod.com/lior, and it gave back a 302! A 302 is a temporary redirect, so with that response Google will continue to assume that your old path is the correct one.
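If you'd rather check the response codes yourself than rely on an online checker, something like the following might do. It's only a rough sketch: the function name fetch_status is made up for illustration, and it sends a single HEAD request without following redirects, so a 301, 302 or 410 is reported exactly as the server returned it.

```python
import http.client
from urllib.parse import urlsplit


def fetch_status(url, timeout=10.0):
    """Return (status_code, Location header or None) for one HEAD request.

    http.client does not follow redirects, so a 301/302 shows up as-is
    instead of being silently resolved to the final page.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


# Example call (needs network access):
# print(fetch_status("http://www.weissbrod.com/lior"))
```

Some servers answer HEAD differently from GET, so if a result looks odd it may be worth repeating the check with a GET request.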
www.antisource.com

LWC

Forum User
Full Member
Registered: 19/02/2004
Posts: 818
Oops, that was because I shouldn't have used:
PHP Formatted Code

RewriteCond %{HTTP_HOST} !^lior\.weissbrod\.com$ [NC,OR]
RewriteCond %{REQUEST_URI} ^/lior/ [NC]
RewriteRule ^.*$ http://lior.weissbrod.com/ [G,L]

 

The correct way is:
PHP Formatted Code

RewriteCond %{HTTP_HOST} !^lior\.weissbrod\.com$ [NC,OR]
RewriteCond %{REQUEST_URI} ^/lior/ [NC]
RewriteRule .* - [G,L]

 

(the difference is in the last line)
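To spell out which requests the corrected rule should answer with 410, here is a small model of the two conditions. It's a simplified sketch, not mod_rewrite itself: the name expected_gone is made up for illustration, and it ignores details such as a port number in the Host header.

```python
import re

# Mirrors the two RewriteCond patterns; re.IGNORECASE stands in for [NC].
HOST_RE = re.compile(r"^lior\.weissbrod\.com$", re.IGNORECASE)
PATH_RE = re.compile(r"^/lior/", re.IGNORECASE)


def expected_gone(host, path):
    """True if either condition holds, i.e. the [G] rule would fire.

    The first condition is negated (host does NOT match) and is joined
    to the second with [OR].
    """
    return HOST_RE.search(host) is None or PATH_RE.search(path) is not None
```

Under this model, expected_gone("lior.weissbrod.com", "/lior/index.php") is true and expected_gone("lior.weissbrod.com", "/index.php") is false. Any request for a different hostname comes out true as well, whatever the path.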

Thanks!

If you want to re-test it, note that it should now return 410, not 301 (Dirk suggested that 410 is an even stronger signal).

Also, like I said, Google only has one such WWW link anyway. The problem is with the endless http://lior.weissbrod.com/lior/ links.

P.S.
Did you know this topic is on Google already? :-)
Dirk could probably pass such a fix in no time...

RickW

Forum User
Full Member
Registered: 28/01/2004
Posts: 240
Location: United States
When you paste your htaccess onto here, make sure to wrap it in CODE tags - your escaping backslashes are getting stripped (assuming you're using them).
www.antisource.com

LWC

Forum User
Full Member
Registered: 19/02/2004
Posts: 818
Fixed all of them, thanks!
But the same goes for your quotes of my .htaccess.
