Geeklog Forums
Issue of the multibyte char sets
vvprok
Geeklog is translated into many languages, which is great!
However, GL does not handle multibyte characters correctly.
As you know, the string functions strlen, strpos, substr, etc. do not take the string encoding into account and work on byte sequences only. As a result, the links plugin, for example, incorrectly builds the shortened title for the "What's New" block: it keeps the first 16 bytes of the link title and then appends "...". With the uk_UA.UTF-8 locale I got 7 characters of the Ukrainian title, followed by some garbage characters before the "...".
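To make the problem concrete, here is a small sketch that reproduces it. The title is a made-up example, not the actual link from my site, and the 16-byte cut is written out by hand instead of calling the plugin code:

```php
<?php
// Hypothetical link title; every Cyrillic letter takes 2 bytes in UTF-8.
$title = 'Привіт усім, друзі';

echo strlen($title), "\n";             // number of bytes
echo mb_strlen($title, 'UTF-8'), "\n"; // number of characters

// Byte-based truncation: keep the first 16 bytes of the title.
$cut = substr($title, 0, 16);

// The cut falls in the middle of a 2-byte character, so the result
// is no longer valid UTF-8 and the browser shows garbage at the end.
var_dump(mb_check_encoding($cut, 'UTF-8'));
```

The space in the title is a single byte, which is why the cut point stops lining up with the 2-byte letters.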
And as you also know, there is another set of functions specifically for multibyte encodings: mb_strlen, mb_strpos, mb_substr, and so on.
I have already fixed the links plugin with the mb_* functions (see here).
I simply changed calls of the form str...(...) to mb_str...(..., $LANG_CHARSET).
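Applied to the title-shortening case, swapping the byte-based calls for their mb_* counterparts might look roughly like the sketch below. The function name shorten_title and its parameters are only illustrative and are not taken from the links plugin; the charset is passed in as an argument here, where the real code would use $LANG_CHARSET:

```php
<?php
// Hypothetical helper, not part of Geeklog: shorten a title to at most
// $max characters (not bytes) and append "..." if anything was cut off.
function shorten_title($title, $max, $charset)
{
    // Count characters in the given encoding, not bytes.
    if (mb_strlen($title, $charset) <= $max) {
        return $title;
    }

    // mb_substr() cuts on character boundaries, so no character is split.
    return mb_substr($title, 0, $max, $charset) . '...';
}

echo shorten_title('Привіт усім, друзі', 16, 'UTF-8'), "\n";
```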
However, that looks too complicated to serve as a general solution for all string operations.
So I propose creating a lib-strings.php module that contains the string-related functions. These functions will hide the implementation details of string handling from the rest of the GL code. All of them would look like this:
function gl_strlen($string)
{
    global $LANG_CHARSET;

    return mb_strlen($string, $LANG_CHARSET);
}
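Two more wrappers in the same style could look like the sketch below. Only gl_strlen is part of the proposal; the names gl_substr and gl_strpos, and the fallback to the byte-based functions when the mbstring extension is not loaded, are assumptions for illustration. $LANG_CHARSET is set at the top only so the sketch runs on its own; in GL it would already be defined elsewhere.

```php
<?php
// Assumption: in Geeklog this global is already defined elsewhere.
$LANG_CHARSET = 'utf-8';

// Hypothetical wrapper: character-aware substr().
function gl_substr($string, $start, $length = null)
{
    global $LANG_CHARSET;

    if (function_exists('mb_substr')) {
        if ($length === null) {
            // "Up to the end of the string", expressed in characters.
            $length = mb_strlen($string, $LANG_CHARSET);
        }
        return mb_substr($string, $start, $length, $LANG_CHARSET);
    }

    // mbstring is not loaded: fall back to byte semantics.
    return ($length === null) ? substr($string, $start)
                              : substr($string, $start, $length);
}

// Hypothetical wrapper: character-aware strpos().
function gl_strpos($haystack, $needle, $offset = 0)
{
    global $LANG_CHARSET;

    if (function_exists('mb_strpos')) {
        return mb_strpos($haystack, $needle, $offset, $LANG_CHARSET);
    }

    return strpos($haystack, $needle, $offset);
}
```

With wrappers like these, a plugin would call gl_substr($title, 0, 16) and would not need to know about the charset or whether mbstring is available.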
So, what do you think?