Geeklog Forums
robots.txt to exclude "Print Format Page" of Stories?
tokyoahead
Anonymous
How would I write a robots.txt so that robots do not index the print layout of my stories? I found that users search for things on Google and find my site, but they land on the print layout, without menus etc.
thanks
Dirk
Site Admin
Admin
Registered: 01/12/02
Posts: 13073
Location:Stuttgart, Germany
That's not possible since the robots.txt standard does not support regular expressions or the like. It only allows substrings, so you can only exclude URLs that start with a certain string but not those that end in some string (like .../print for the printable pages).
It may be possible to do something about this with some .htaccess magic (anyone?) or you could disable the link to the printable format altogether.
And, yes, the bots love the printable pages (no clutter, just text). There's a link back to the article at the bottom of that page - you may want to make that more obvious for your human visitors.
bye, Dirk
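To illustrate the prefix-only matching described above, here is a minimal sketch in PHP. It is a simplified model of the original robots.txt rules, not a full parser, and the function name `isDisallowed`, the rules, and the example paths are all hypothetical.

```php
<?php
// Simplified model of original robots.txt matching: a Disallow value
// blocks a URL path only if the path STARTS with that value.
function isDisallowed(string $path, array $disallowPrefixes): bool
{
    foreach ($disallowPrefixes as $prefix) {
        // An empty Disallow value blocks nothing.
        if ($prefix !== '' && strncmp($path, $prefix, strlen($prefix)) === 0) {
            return true;
        }
    }
    return false;
}

// Hypothetical rules and paths, for illustration only.
$rules = ['/admin/', '/print'];

var_dump(isDisallowed('/admin/index.php', $rules));              // bool(true)
var_dump(isDisallowed('/article.php/some-story/print', $rules)); // bool(false)
```

The second call shows the problem: a rule such as `/print` never matches a path that merely ends in `/print`, so a pure prefix rule cannot single out the printable pages.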
drshakagee
Forum User
Full Member
Registered: 10/01/03
Posts: 231
I don't know how well it works, but you can add the rel="nofollow" attribute to your link, and Google at least shouldn't follow it.
Yes I am mental.
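As a rough sketch of the rel="nofollow" suggestion, the print link could be generated like this. The helper name `printLink`, the example URL (including the `mode=print` query parameter), and the link label are assumptions made for illustration, not Geeklog's actual template code.

```php
<?php
// Build an anchor tag for the print view that carries rel="nofollow".
// Hypothetical helper; Geeklog builds its links elsewhere.
function printLink(string $url, string $label): string
{
    return '<a href="' . htmlspecialchars($url, ENT_QUOTES) . '" rel="nofollow">'
         . htmlspecialchars($label, ENT_QUOTES) . '</a>';
}

echo printLink('https://www.example.com/article.php?story=demo&mode=print', 'Printable version');
```

Note that nofollow only asks crawlers not to follow this particular link; a print page that is linked from somewhere else can still end up indexed.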
eg0master
Forum User
Regular Poster
Registered: 07/21/05
Posts: 73
Location:Stockholm
I've solved it using a hack in article.php and staticpages/index.php
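This is the code added in article.php:
Text Formatted Code
// Redirect print-view requests that did not come from this site to the normal article page.
// HTTP_REFERER is optional, so treat a missing value as an empty string.
$referer = isset($_SERVER['HTTP_REFERER']) ? $_SERVER['HTTP_REFERER'] : '';
if (0 == strcmp($mode, 'print') && 0 != strncmp($referer, $_CONF['site_url'], strlen($_CONF['site_url']))) {
    echo COM_refresh($_CONF['site_url'] . '/article.php?story=' . $story);
    exit();
}
Maybe something like this should be in the Geeklog distribution, controlled from config.php.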
Geeklog Plugins: http://plugincms.com
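The idea of controlling this from config.php could look roughly like the sketch below. It pulls the check into a standalone function so it can be tried outside Geeklog; the function name `shouldRedirectPrintView` and the `$enabled` switch are hypothetical, and no such config option is confirmed anywhere in this thread.

```php
<?php
// Decide whether a print-view request should be sent to the normal article page.
// $enabled stands in for a hypothetical on/off setting in config.php.
function shouldRedirectPrintView(string $mode, string $referer, string $siteUrl, bool $enabled): bool
{
    if (!$enabled || $mode !== 'print') {
        return false;
    }
    // Redirect when the referer does not start with the site URL
    // (this includes requests that send no referer at all).
    return strncmp($referer, $siteUrl, strlen($siteUrl)) !== 0;
}

$site = 'https://www.example.com'; // placeholder site URL

var_dump(shouldRedirectPrintView('print', 'https://www.google.com/search', $site, true));   // bool(true)
var_dump(shouldRedirectPrintView('print', $site . '/article.php?story=demo', $site, true)); // bool(false)
var_dump(shouldRedirectPrintView('print', '', $site, false));                               // bool(false)
```

One trade-off to keep in mind: visitors whose browsers do not send a referer are also redirected, so opening the print view in a privacy-hardened browser may bounce back to the article.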