Status: offline

asmaloney

Forum User
Full Member
Registered: 02/08/04
Posts: 214
I noticed a problem with the "What's Related" box - it would only add links up to the first image. So I took a look at the function, made it more efficient, and fixed the problem.

The main difference in the match is that this one doesn't recognize any tags in between <a href=""> and </a>. So, for example, <a href="..."><b>foo</b></a> will not be matched. Maybe I'll futz around with it to handle this if anyone's interested.


function COM_extractLinks( $fulltext, $maxlength = 26 )
{
    $rel = array();

    preg_match_all( "/(<a href=[^>]+>)([^<]*)(<\/a>)/i", $fulltext, $matches );

    for ( $i = 0; $i < count( $matches[0] ); $i++ )
    {
        // if link is too long, shorten it and add ... at the end
        if ( ( $maxlength > 0 ) && ( strlen( $matches[2][$i] ) > $maxlength ) )
        {
            $matches[2][$i] = substr( $matches[2][$i], 0, $maxlength - 3 ) . '...';
            $matches[0][$i] = $matches[1][$i] . $matches[2][$i] . $matches[3][$i];
        }

        $rel[] = COM_checkHTML( $matches[0][$i] );
    }

    return $rel;
}


[Note I did not post this in HTML mode because it changed some of the code into smilies...]
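
A minimal sketch of the limitation described above, using the same pattern as the function. The helper name countPlainLinks and the sample strings are made up for illustration; this is not Geeklog code.

```php
<?php
// Hypothetical helper: counts the links the pattern above would pick up.
function countPlainLinks( $html )
{
    // Same pattern as in COM_extractLinks above: the link text may not contain '<'
    return preg_match_all( "/(<a href=[^>]+>)([^<]*)(<\/a>)/i", $html, $matches );
}

// Made-up sample input
$plain  = '<a href="http://www.geeklog.net/">Geeklog</a>';
$nested = '<a href="http://www.geeklog.net/"><b>Geeklog</b></a>';

echo countPlainLinks( $plain ), "\n";   // link with plain text
echo countPlainLinks( $nested ), "\n";  // link whose text is wrapped in <b>
```

The plain link is counted once, while the link with the nested <b> tag is not matched at all, because [^<]* cannot consume the tag that follows the opening <a ...>.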

vinny

Site Admin
Admin
Registered: 06/24/02
Posts: 352
In case you're curious, this is what we went with:

Text Formatted Code

function COM_extractLinks( $fulltext, $maxlength = 26 )
{
    $rel = array();

    preg_match_all( "/(<a.*?href=\"(.*?)\".*?>)(.*?)(<\/a>)/", $fulltext, $matches );
    for ( $i=0; $i< count( $matches[0] ); $i++ )
    {
        $matches[3][$i] = strip_tags( $matches[3][$i] );
        if ( !strlen( trim( $matches[3][$i] ) ) ) {
            $matches[3][$i] = strip_tags( $matches[2][$i] );
        }

        // if link is too long, shorten it and add ... at the end
        if ( ( $maxlength > 0 ) && ( strlen( $matches[3][$i] ) > $maxlength ) )
        {
            $matches[3][$i] = substr( $matches[3][$i], 0, $maxlength - 3 ) . '...';
        }

        $rel[] = $matches[1][$i] . $matches[3][$i] . $matches[4][$i];
    }

    return( $rel );
}

 


Or you can take a look at it in lib-common.php (without the smiley faces).
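
As a rough illustration of what the capture groups in this version hold, here is a sketch that runs the same pattern and the same strip_tags() fallback over made-up sample HTML. The example.com URLs and variable names are assumptions for the demo only.

```php
<?php
// Pattern as in the function above; the sample HTML is made up for illustration.
$pattern = "/(<a.*?href=\"(.*?)\".*?>)(.*?)(<\/a>)/";
$html    = '<a href="http://example.com/one"><b>One</b></a> and '
         . '<a href="http://example.com/two"><img src="two.png"></a>';

preg_match_all( $pattern, $html, $matches );

$texts = array();
for ( $i = 0; $i < count( $matches[0] ); $i++ )
{
    // group 3 is the link text, possibly containing nested tags
    $text = strip_tags( $matches[3][$i] );
    if ( !strlen( trim( $text ) ) ) {
        // nothing left after stripping tags: fall back to the href (group 2)
        $text = strip_tags( $matches[2][$i] );
    }
    $texts[] = $text;
}

print_r( $texts );
```

With this input, the first link keeps its text once the <b> tags are stripped, and the second link, whose body is only an <img> tag, falls back to its href.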

Blaine

Forum User
Moderator
Registered: 07/16/02
Posts: 1232
I'm just curious: is there any reason you are not using the [ code ] bb tags when posting code in the forum?

If it is not working well, I'd like to know.
Geeklog components by PortalParts -- www.portalparts.com

asmaloney


Vinny - thanks for posting that. Don't we want a case-insensitive match though?

Blaine - I did that because posting it using CODE translates smilies

e.g.
Text Formatted Code

 preg_match_all( "/(<a href=[^>]+&gt<img align=absmiddle src='images/smilies/wink.gif' alt='Wink'>([^<]*)(</a&gt<img align=absmiddle src='images/smilies/wink.gif' alt='Wink'>/i", $fulltext, $matches );

 

Blaine

Yeah, it appears that recent updates to GL have affected this feature.

vinny

Blaine,

I put my code snippet in the code tags, but it put the smilies in there anyway.

Also, I'm sure you noticed but the QUOTE tags are acting funny as well.

-Vinny

Blaine

Quote by vinny: I put my code snippet in the code tags, but it put the smilies in there anyway.

Also, I'm sure you noticed but the QUOTE tags are acting funny as well.


With the geeklog.net upgrade, the allowable HTML was changed. I need the pre tags for the code block formatting. Dirk fixed it a few hours ago. Let's see if that fixed both the quotes and code formatting.

vinny

I added the case-insensitive flag to the regex for COM_extractLinks. (Good catch, asmaloney.) It should show up in -rc2, and if not there then in the final release of 1.3.9.

-Vinny
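
A small sketch of what the flag changes. It assumes the fix simply appends i to the pattern posted earlier in this thread; the released code may differ, and the sample link is made up.

```php
<?php
// Made-up sample: uppercase tag and attribute names.
$html = '<A HREF="http://www.geeklog.net/">Geeklog</A>';

// Assumption: the same pattern as posted earlier, with and without the i flag.
$caseSensitive   = "/(<a.*?href=\"(.*?)\".*?>)(.*?)(<\/a>)/";
$caseInsensitive = "/(<a.*?href=\"(.*?)\".*?>)(.*?)(<\/a>)/i";

$without = preg_match_all( $caseSensitive, $html, $m1 );
$with    = preg_match_all( $caseInsensitive, $html, $m2 );

echo "without i: $without, with i: $with\n";
```

Without the flag the uppercase link is not found; with it, the link is matched once.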

Anonymous
GL 1.3.9sr1 still has the problem that links are only added up to the first image. Perhaps someone could finally get this messy preg_match_all() sorted out?

vinny

I've just tested this in GL 1.3.9sr1 and it works, with the exception of when you have a link like this:

Text Formatted Code

<a href="link1">[image1]</a>

 


which just won't show up in the What's Related field, though links after this image still will. I'll work on this last little bug related to COM_extractLinks(). If you can demonstrate another bug, please post a URL so I can see it.

Thanks,
Vinny

asmaloney

I still have this problem too. I have a story with images [which themselves are links to unscaled versions] and none of the links on the page show up in What's Related.

Here's an example from my site.

vinny

Your problem has nothing to do with images; you use single quotes instead of double quotes in your links, i.e.

Text Formatted Code

<a href='link1'>link1</a>
--instead of--
<a href="link1">link1</a>

 


The HTML spec calls for the use of double quotes. I'll see about accepting both when 1.3.10 is released though.

-Vinny
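
One possible way to accept both quote styles, sketched here with a backreference. This is only an assumption about how it could be done, not the code that went into 1.3.10; the pattern and the sample links are made up for illustration.

```php
<?php
// Sketch only: (["']) captures the opening quote and \2 requires the same
// character as the closing quote.
// Groups: 1 = opening tag, 2 = quote character, 3 = href, 4 = link text, 5 = closing tag
$pattern = "/(<a.*?href=([\"'])(.*?)\\2.*?>)(.*?)(<\/a>)/i";

// Made-up sample with one single-quoted and one double-quoted link
$html = "<a href='link1'>first</a> <a href=\"link2\">second</a>";

preg_match_all( $pattern, $html, $matches );

print_r( $matches[3] );  // href values
print_r( $matches[4] );  // link texts
```

Note that the extra group shifts the numbering compared with the function posted earlier in the thread, so any code reading $matches would need to be adjusted accordingly.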

asmaloney

Quote by vinny: Your problem has nothing to do with images; you use single quotes instead of double quotes in your links, i.e.



Heh. That was one of the first things I checked. Using Firefox, if you select some text on the page and use the context menu to 'View Selection Source', it shows double quotes even though the page source shows single quotes... I guess I out-Foxed myself.

Quote by vinny:
The HTML spec calls for the use of double quotes. I'll see about accepting both when 1.3.10 is released though.


Yet the W3C validator validates them alright.

Thanks for catching that for me.

vinny

The next version of Geeklog (1.3.10) will have a COM_extractLinks that supports single quotes and also nested HTML tags (including images, i.e. [imageX]).

-Vinny