Geeklog Forums

Duplicate content generated by search.class.php


beewee

Forum User
Full Member
Registered: 08/05/03
Posts: 969
Location:The Netherlands, where else?
When url_rewrite is enabled, the article URL changes from /article.php?story=story_id&query=xxx to /article.php/story_id.

Unfortunately, the search function still generates the old /article.php?story=story_id&query=xxx form. For Google, that means duplicate content.
How can I change the following code from the search class to produce /article.php/story_id? I'll skip the query, as you can see.

Text Formatted Code
                        $articleUrl = COM_buildUrl ($_CONF['site_url']
                                        . '/article.php?story=' . $A['sid']);
                    } else {
                        $articleUrl = $_CONF['site_url'] . '/article.php?story='
                            . $A['sid'] . '&query=' . urlencode ($urlQuery);
                    }
                    $row = array ('<a href="' . $articleUrl . '">'
                            . stripslashes ($A['title']) . '</a>', $thetime[0],
                            DB_getItem ($_TABLES['users'], 'username',
                            "uid = '{$A['uid']}'"), $A['hits']);

Dutch Geeklog sites about camping/hiking:
www.kampeerzaken.nl | www.campersite.nl | www.caravans.nl | www.caravans.net

Dirk

Site Admin
Admin
Registered: 01/12/02
Posts: 13073
Location:Stuttgart, Germany
Somewhat to my surprise, URLs like
Text Formatted Code
http://www.geeklog.net/article.php/geeklog-1.5.0b2?query=beta
do actually work (at least with the Apache setup here - haven't done any more testing yet). So that should be relatively easy to change (assuming it works on your server, too).

bye, Dirk

beewee

Forum User
Full Member
Registered: 08/05/03
Posts: 969
Location:The Netherlands, where else?
Dirk, I've already tried that, but it didn't work on my server. And if I write a .htaccess rewrite rule, it conflicts with the Geeklog url_rewrite, resulting in a 500 Internal Server Error.

jmucchiello

Forum User
Full Member
Registered: 08/29/05
Posts: 985
Put a deny rule in your robots.txt for "query=".
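A minimal sketch of what such a rule could look like. It assumes the crawler honours the `*` wildcard inside Disallow paths (Google documents support for it; not every crawler does), and the pattern itself is an illustration, not something tested in this thread:

```text
# Applies to all crawlers that read this file
User-agent: *
# Skip any URL that contains "query=" (e.g. search-result links to articles)
Disallow: /*query=
```

This only stops compliant crawlers from fetching the ?query= variants; the search class still emits those links.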

Dirk

Site Admin
Admin
Registered: 01/12/02
Posts: 13073
Location:Stuttgart, Germany
Quote by: beewee

Dirk, I've already tried that, but it didn't work on my server.


Too bad, it sounded like an easy way out.

We've added it to Sami's to-do list :wink:

bye, Dirk

beewee

Forum User
Full Member
Registered: 08/05/03
Posts: 969
Location:The Netherlands, where else?
Great, I'll be patient!

sbarakat

Forum User
Junior
Registered: 04/22/08
Posts: 27
Location:Norwich, United Kingdom
I have tested the URL on a fresh install of beta 2:
http://localhost/geeklog/public_html/article.php/welcome?query=geeklog
and for some reason it directs back to the home page. I will do a couple more tests and keep URL rewriting in mind when I work on the improvements to the search page. If I find a quick-and-dirty hack, I will post it; otherwise you may have to wait until the project is finished and included in the next version of GL (1.6, maybe?).

Dirk

Site Admin
Admin
Registered: 01/12/02
Posts: 13073
Location:Stuttgart, Germany
Quote by: furiousdog

for some reason it directs back to the home page.


That's Geeklog's way of dealing with manipulated / illegal requests that it doesn't consider worth reporting. So on your setup, that request ended up as something that Geeklog didn't like.

bye, Dirk

beewee

Forum User
Full Member
Registered: 08/05/03
Posts: 969
Location:The Netherlands, where else?
Quote by: furiousdog

I have tested the URL on a fresh install of beta 2:
http://localhost/geeklog/public_html/article.php/welcome?query=geeklog
and for some reason it directs back to the home page. I will do a couple more tests and keep URL rewriting in mind when I work on the improvements to the search page. If I find a quick-and-dirty hack, I will post it; otherwise you may have to wait until the project is finished and included in the next version of GL (1.6, maybe?).



I've also tried several rewrite rules, but now I know why they didn't work at all: Geeklog didn't like them.

sbarakat

Forum User
Junior
Registered: 04/22/08
Posts: 27
Location:Norwich, United Kingdom
Sorry for the late response on this. I have been busy with other things and it completely slipped my mind.
To fix the problem, open search.class.php and replace this block of code:
Text Formatted Code

                if (empty($this->_query)) {
                    $articleUrl = COM_buildUrl($_CONF['site_url']
                                    . '/article.php?story=' . $A['sid']);
                } else {
                    $articleUrl = $_CONF['site_url'] . '/article.php?story='
                        . $A['sid'] . '&amp;query=' . urlencode($this->_query);
                }
 

with this:
Text Formatted Code

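                // Build the base URL first; COM_buildUrl applies url_rewrite when it is enabled.
                $articleUrl = COM_buildUrl($_CONF['site_url'] . '/article.php?story=' . $A['sid']);
                if (!empty($this->_query)) {
                    // Use '&' if the URL already has a query string, '?' otherwise.
                    $articleUrl .= ((strpos($articleUrl, '?') !== false) ? '&' : '?')
                                 . 'query=' . urlencode($this->_query);
                }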
 

It should be around line 306.
I have tested this briefly and it seems to work with url_rewrite both enabled and disabled. Let me know how you get on. Maybe this fix can be included in a future GL 1.5.0-1?
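The separator choice in that replacement can be tried outside Geeklog with a small standalone sketch. `appendSearchQuery` is a hypothetical helper, not a Geeklog function, and the two example.com URLs are assumptions that only imitate the rewritten and non-rewritten article URLs discussed in this thread:

```php
<?php
// Hypothetical helper (not part of Geeklog): appends the search term to an
// article URL, choosing '?' or '&' depending on whether the URL already
// contains a query string.
function appendSearchQuery($articleUrl, $query)
{
    if ($query === '') {
        return $articleUrl; // nothing to append
    }

    // strpos() returns false when '?' is absent; the strict comparison keeps
    // a match at offset 0 from being mistaken for "not found".
    $separator = (strpos($articleUrl, '?') !== false) ? '&' : '?';

    return $articleUrl . $separator . 'query=' . urlencode($query);
}

// Assumed URL shapes, modelled on the ones quoted in this thread.
$rewritten = 'http://www.example.com/article.php/welcome';       // url_rewrite on
$plain     = 'http://www.example.com/article.php?story=welcome'; // url_rewrite off

echo appendSearchQuery($rewritten, 'beta 2'), "\n";
echo appendSearchQuery($plain, 'beta 2'), "\n";
```

If the result is written into an href attribute, passing it through htmlspecialchars() first keeps the '&' valid in HTML.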

beewee

Forum User
Full Member
Registered: 08/05/03
Posts: 969
Location:The Netherlands, where else?
Hey, works like a charm over here!

Thanks! :banana:

Please include this in GL1.51

ankur_mmy

Forum User
Newbie
Registered: 02/18/08
Posts: 14
That works on my GL 1.4.1. Cheers!