Welcome to Geeklog, Anonymous Wednesday, December 11 2024 @ 10:20 am EST
Geeklog Forums
Geeklog vs. Unicode in MySQL (v4+)
Status: offline
LWC
Forum User
Full Member
Registered: 02/19/04
Posts: 818
MySQL v4 had added Unicode support.
The problem is if I try to make Unicode tables instead of Ascii ones, then Geeklog pastes any none English Unicode chars as Gibberish.
The solution is to use Ascii tables (I decided to use "latin1_general_ci" because it looked the most...default for Ascii).
The result is that Geeklog stores none English Unicode chars as Unicode symbols instead of in their native language.
Therefore, I simply can't use PHPMyAdmin to edit anything that isn't English.
In other words, why can't the Unicode letters show up natively both in MySQL and in Geeklog (currently it must be the latter)?
So in conclusion, Geeklog has a problem with Unicode tables!
What do you think?
P.S.
How is it that despite using Ascii tables, "MySQL connection collation" can still be "utf8_general_ci"?
The problem is if I try to make Unicode tables instead of Ascii ones, then Geeklog pastes any none English Unicode chars as Gibberish.
The solution is to use Ascii tables (I decided to use "latin1_general_ci" because it looked the most...default for Ascii).
The result is that Geeklog stores none English Unicode chars as Unicode symbols instead of in their native language.
Therefore, I simply can't use PHPMyAdmin to edit anything that isn't English.
In other words, why can't the Unicode letters show up natively both in MySQL and in Geeklog (currently it must be the latter)?
So in conclusion, Geeklog has a problem with Unicode tables!
What do you think?
P.S.
How is it that despite using Ascii tables, "MySQL connection collation" can still be "utf8_general_ci"?
26
17
Quote
Status: offline
Dirk
Site Admin
Admin
Registered: 01/12/02
Posts: 13073
Location:Stuttgart, Germany
Quote by LWC: So in conclusion, Geeklog has a problem with unicode tables!
To be honest, I'm continually amazed about how well (more or less) it works at all. Geeklog doesn't do any special handling for Unicode / UTF-8 and even uses all the wrong functions most of the time.
Don't expect any major changes for this in the 1.3 branch though. This is something that must be done right from the ground up. In other words: GL2 ...
bye, Dirk
22
23
Quote
Status: offline
LWC
Forum User
Full Member
Registered: 02/19/04
Posts: 818
So is your official recommendation for the tables' language "latin1_general_ci"?
I suggest you mention it (or your other recommendation) in the FAQ so people who install Geeklog would know which default language for tables to choose when they create a database (it would also help those who already have Geeklog and happen to update MySQL to v4+).
Remember, if you use Unicode tables because you only use English anyway, someday you might use other languages (and then you'd have to convert all of your tables to Ascii, which is a manual chore).
I suggest you mention it (or your other recommendation) in the FAQ so people who install Geeklog would know which default language for tables to choose when they create a database (it would also help those who already have Geeklog and happen to update MySQL to v4+).
Remember, if you use Unicode tables because you only use English anyway, someday you might use other languages (and then you'd have to convert all of your tables to Ascii, which is a manual chore).
18
19
Quote
Status: offline
Dirk
Site Admin
Admin
Registered: 01/12/02
Posts: 13073
Location:Stuttgart, Germany
I wasn't aware that there were any changes in MySQL regarding Unicode, but then again I don't use UTF-8 on my sites ...
I'll look into this and see if I can make that language option the default when creating the tables.
bye, Dirk
I'll look into this and see if I can make that language option the default when creating the tables.
bye, Dirk
21
22
Quote
Status: offline
LWC
Forum User
Full Member
Registered: 02/19/04
Posts: 818
By "that" language, do you mean "latin1_general_ci"? Well, I believe you should keep letting the user choose the language (i.e. the default language they choose when they create a new database for Geeklog). You just need to stress it in the FAQ.
16
24
Quote
anonymouspoopdisturber
Anonymous
Quote by LWC: Hey, just wondered if you did some reading.
It's really a pain not being able to use PHPMyAdmin for anything that isn't English.
It's really a pain not being able to use PHPMyAdmin for anything that isn't English.
If this is as desperate as you make it sound, I'm sure Dirk wouldn't mind if you did some reading yourself and submit a patch
I'll reserve comments on mentioning that English is the most prolific language on the planet...
17
20
Quote
Status: offline
ByteEnable
Forum User
Full Member
Registered: 10/20/03
Posts: 138
Actually the database interaction between GL and MySQL is done through PHP and GL really doesn't have much control over that, except other than table creation and type of connects. I currently use utf8_general_ci on my database. I could use utf8_unicode_ci but according to MySQL general is faster.
So you need to do several things.
1. Select your language in Geeklog if supported. (mine is english)
2. Create your database using your collation (mine is utf8_general_ci).
One thing is that your http server should also have your supported character set. Mine is UTF-8. You can also set your default_charset in php.ini (I don't).
Also you need to change MySQL (my.cnf) as follows for your collation
default-character-set=utf8
init_connect='SET collation_connection = utf8_general_ci'
init_connect='SET NAMES utf8'
default-character-set=utf8
character-set-server = utf8
collation-server = utf8_general_ci
I have also changed Geeklog to connect to MySQL using UTF-8.
from system/databases/mysql.class.php
// Connect to MySQL server
$this->_db = mysql_connect($this->_host,$this->_user,$this->_pass) or die('Cannnot connect to DB server');
mysql_query("SET NAMES 'UTF8'"); <----- Need to add this for UTF-8
If you want to modify a geeklog install, you would have to modify sql/mysql_tableanddata.php as shown below. Every SQL statement would need to be modified.
$_SQL[1] = "
CREATE TABLE {$_TABLES['access']} (
acc_ft_id mediumint(8) NOT NULL default '0',
acc_grp_id mediumint(8) NOT NULL default '0',
PRIMARY KEY (acc_ft_id,acc_grp_id)
) TYPE=MyISAM CHARACTER SET `utf8` COLLATE `utf8_general_ci`;
";
So as you can see from above its a complex process for the uninitiated. Each subsystem has to be on the same page so to speak. http, mysql, php and geeklog to some extent. I hope this helps.
Byte
So you need to do several things.
1. Select your language in Geeklog if supported. (mine is english)
2. Create your database using your collation (mine is utf8_general_ci).
One thing is that your http server should also have your supported character set. Mine is UTF-8. You can also set your default_charset in php.ini (I don't).
Also you need to change MySQL (my.cnf) as follows for your collation
Text Formatted Code
default-character-set=utf8
init_connect='SET collation_connection = utf8_general_ci'
init_connect='SET NAMES utf8'
default-character-set=utf8
character-set-server = utf8
collation-server = utf8_general_ci
I have also changed Geeklog to connect to MySQL using UTF-8.
from system/databases/mysql.class.php
Text Formatted Code
// Connect to MySQL server
$this->_db = mysql_connect($this->_host,$this->_user,$this->_pass) or die('Cannnot connect to DB server');
mysql_query("SET NAMES 'UTF8'"); <----- Need to add this for UTF-8
If you want to modify a geeklog install, you would have to modify sql/mysql_tableanddata.php as shown below. Every SQL statement would need to be modified.
Text Formatted Code
$_SQL[1] = "
CREATE TABLE {$_TABLES['access']} (
acc_ft_id mediumint(8) NOT NULL default '0',
acc_grp_id mediumint(8) NOT NULL default '0',
PRIMARY KEY (acc_ft_id,acc_grp_id)
) TYPE=MyISAM CHARACTER SET `utf8` COLLATE `utf8_general_ci`;
";
So as you can see from above its a complex process for the uninitiated. Each subsystem has to be on the same page so to speak. http, mysql, php and geeklog to some extent. I hope this helps.
Byte
17
19
Quote
Status: offline
ByteEnable
Forum User
Full Member
Registered: 10/20/03
Posts: 138
Oh yeah. I went from MySQL 3.X to 4.X. I made a dump of my database then modified every SQL statement to the new collation as shown in my previous post. Then I restored the database using my modified dump file.
Byte
Byte
22
20
Quote
Status: offline
LWC
Forum User
Full Member
Registered: 02/19/04
Posts: 818
I'm so glad that for every wise guy anonymous user, there's a ByteEnable.
You've fixed it!
But let's start from the beginning: I could only find /etc/my.cnf and it had nothing about character sets or collations. Nevertheless, I guess my default is UTF8 so there's no problem.
Now I'll skip to the end: I believe sql/mysql_tableanddata.php should keep letting the user decide what they want (i.e. keep using the default collation they chose in the database they created for Geeklog).
So the only thing I did was:
[quote system/databases/mysql.class.php]
// Connect to MySQL server
$this->_db = mysql_connect($this->_host,$this->_user,$this->_pass) or die('Cannnot connect to DB server');
// custom code - start
mysql_query("SET NAMES 'UTF8'");
// custom code - end
[/quote]
And now Unicode letters show up natively both in MySQL and in Geeklog!
Which means I can finally use PHPMyAdmin for everything!
Of course, since it wasn't a fresh installation, searching and replacing the collatation inside the backup file wasn't enough! I had to decode - yes, decode - my UTF-8 backup file from UTF-8 because when I export it MySQL converts the UTF-8 symbols...into UTF-8 and that's one level too much. So I had to decode it first.
Hey, it's not so easy! The only way I managed to do it was using my site.
So if this helps anyone:
$fileread = file('geeklog_old.sql');
$fileread = utf8_decode(join("", $fileread));
$filewrite = fopen('geeklog.sql', "w");
fwrite($filewrite, $fileread);
fclose($filewrite);
?>
You've fixed it!
But let's start from the beginning: I could only find /etc/my.cnf and it had nothing about character sets or collations. Nevertheless, I guess my default is UTF8 so there's no problem.
Now I'll skip to the end: I believe sql/mysql_tableanddata.php should keep letting the user decide what they want (i.e. keep using the default collation they chose in the database they created for Geeklog).
So the only thing I did was:
[quote system/databases/mysql.class.php]
// Connect to MySQL server
$this->_db = mysql_connect($this->_host,$this->_user,$this->_pass) or die('Cannnot connect to DB server');
// custom code - start
mysql_query("SET NAMES 'UTF8'");
// custom code - end
[/quote]
And now Unicode letters show up natively both in MySQL and in Geeklog!
Which means I can finally use PHPMyAdmin for everything!
Of course, since it wasn't a fresh installation, searching and replacing the collatation inside the backup file wasn't enough! I had to decode - yes, decode - my UTF-8 backup file from UTF-8 because when I export it MySQL converts the UTF-8 symbols...into UTF-8 and that's one level too much. So I had to decode it first.
Hey, it's not so easy! The only way I managed to do it was using my site.
So if this helps anyone:
Text Formatted Code
<?php$fileread = file('geeklog_old.sql');
$fileread = utf8_decode(join("", $fileread));
$filewrite = fopen('geeklog.sql', "w");
fwrite($filewrite, $fileread);
fclose($filewrite);
?>
19
23
Quote
Status: offline
LWC
Forum User
Full Member
Registered: 02/19/04
Posts: 818
Just wondering what if I kept that UTF8 setting in system/databases/mysql.class.php but used Ascii tables?
I hope it still works because if it does, it means Geeklog should just add that setting no matter what.
If not, it should be a choice in config.php .
I hope it still works because if it does, it means Geeklog should just add that setting no matter what.
If not, it should be a choice in config.php .
16
22
Quote
All times are EST. The time is now 10:20 am.
- Normal Topic
- Sticky Topic
- Locked Topic
- New Post
- Sticky Topic W/ New Post
- Locked Topic W/ New Post
- View Anonymous Posts
- Able to post
- Filtered HTML Allowed
- Censored Content