Welcome to Geeklog, Anonymous Wednesday, December 11 2024 @ 10:20 am EST

Geeklog Forums

Geeklog vs. Unicode in MySQL (v4+)


Status: offline

LWC

Forum User
Full Member
Registered: 02/19/04
Posts: 818
MySQL v4 had added Unicode support.

The problem is if I try to make Unicode tables instead of Ascii ones, then Geeklog pastes any none English Unicode chars as Gibberish.

The solution is to use Ascii tables (I decided to use "latin1_general_ci" because it looked the most...default for Ascii).
The result is that Geeklog stores none English Unicode chars as Unicode symbols instead of in their native language.
Therefore, I simply can't use PHPMyAdmin to edit anything that isn't English.

In other words, why can't the Unicode letters show up natively both in MySQL and in Geeklog (currently it must be the latter)?

So in conclusion, Geeklog has a problem with Unicode tables!

What do you think?

P.S.
How is it that despite using Ascii tables, "MySQL connection collation" can still be "utf8_general_ci"?
 Quote

Status: offline

Dirk

Site Admin
Admin
Registered: 01/12/02
Posts: 13073
Location:Stuttgart, Germany
Quote by LWC: So in conclusion, Geeklog has a problem with unicode tables!

To be honest, I'm continually amazed about how well (more or less) it works at all. Geeklog doesn't do any special handling for Unicode / UTF-8 and even uses all the wrong functions most of the time.

Don't expect any major changes for this in the 1.3 branch though. This is something that must be done right from the ground up. In other words: GL2 ...

bye, Dirk
 Quote

Status: offline

LWC

Forum User
Full Member
Registered: 02/19/04
Posts: 818
So is your official recommendation for the tables' language "latin1_general_ci"?

I suggest you mention it (or your other recommendation) in the FAQ so people who install Geeklog would know which default language for tables to choose when they create a database (it would also help those who already have Geeklog and happen to update MySQL to v4+).

Remember, if you use Unicode tables because you only use English anyway, someday you might use other languages (and then you'd have to convert all of your tables to Ascii, which is a manual chore).
 Quote

Status: offline

Dirk

Site Admin
Admin
Registered: 01/12/02
Posts: 13073
Location:Stuttgart, Germany
I wasn't aware that there were any changes in MySQL regarding Unicode, but then again I don't use UTF-8 on my sites ...

I'll look into this and see if I can make that language option the default when creating the tables.

bye, Dirk
 Quote

Status: offline

LWC

Forum User
Full Member
Registered: 02/19/04
Posts: 818
By "that" language, do you mean "latin1_general_ci"? Well, I believe you should keep letting the user choose the language (i.e. the default language they choose when they create a new database for Geeklog). You just need to stress it in the FAQ.
 Quote

Status: offline

Dirk

Site Admin
Admin
Registered: 01/12/02
Posts: 13073
Location:Stuttgart, Germany
Quote by LWC: By "that" language, do you mean "latin1_general_ci"?

I don't know what I mean Mr. Green This is all new to me - I have to do some reading first.

bye, Dirk
 Quote

Status: offline

LWC

Forum User
Full Member
Registered: 02/19/04
Posts: 818
Hey, just wondered if you did some reading.

It's really a pain not being able to use PHPMyAdmin for anything that isn't English.
 Quote

anonymouspoopdisturber

Anonymous
Quote by LWC: Hey, just wondered if you did some reading.

It's really a pain not being able to use PHPMyAdmin for anything that isn't English.

If this is as desperate as you make it sound, I'm sure Dirk wouldn't mind if you did some reading yourself and submit a patch Mr. Green

I'll reserve comments on mentioning that English is the most prolific language on the planet... Leaves me speechless
 Quote

Status: offline

ByteEnable

Forum User
Full Member
Registered: 10/20/03
Posts: 138
Actually the database interaction between GL and MySQL is done through PHP and GL really doesn't have much control over that, except other than table creation and type of connects. I currently use utf8_general_ci on my database. I could use utf8_unicode_ci but according to MySQL general is faster.

So you need to do several things.

1. Select your language in Geeklog if supported. (mine is english)
2. Create your database using your collation (mine is utf8_general_ci).

One thing is that your http server should also have your supported character set. Mine is UTF-8. You can also set your default_charset in php.ini (I don't).

Also you need to change MySQL (my.cnf) as follows for your collation
Text Formatted Code

default-character-set=utf8
init_connect='SET collation_connection = utf8_general_ci'
init_connect='SET NAMES utf8'
default-character-set=utf8
character-set-server = utf8
collation-server = utf8_general_ci

 


I have also changed Geeklog to connect to MySQL using UTF-8.
from system/databases/mysql.class.php

Text Formatted Code

// Connect to MySQL server
        $this->_db = mysql_connect($this->_host,$this->_user,$this->_pass) or die('Cannnot connect to DB server');
        mysql_query("SET NAMES 'UTF8'"); <----- Need to add this for UTF-8

 


If you want to modify a geeklog install, you would have to modify sql/mysql_tableanddata.php as shown below. Every SQL statement would need to be modified.
Text Formatted Code

$_SQL[1] = "
CREATE TABLE {$_TABLES['access']} (
  acc_ft_id mediumint(8) NOT NULL default '0',
  acc_grp_id mediumint(8) NOT NULL default '0',
  PRIMARY KEY  (acc_ft_id,acc_grp_id)
) TYPE=MyISAM CHARACTER SET `utf8` COLLATE `utf8_general_ci`;
";


 


So as you can see from above its a complex process for the uninitiated. Each subsystem has to be on the same page so to speak. http, mysql, php and geeklog to some extent. I hope this helps.

Byte
 Quote

Status: offline

ByteEnable

Forum User
Full Member
Registered: 10/20/03
Posts: 138
Oh yeah. I went from MySQL 3.X to 4.X. I made a dump of my database then modified every SQL statement to the new collation as shown in my previous post. Then I restored the database using my modified dump file.

Byte
 Quote

Status: offline

LWC

Forum User
Full Member
Registered: 02/19/04
Posts: 818
I'm so glad that for every wise guy anonymous user, there's a ByteEnable.

You've fixed it!

But let's start from the beginning: I could only find /etc/my.cnf and it had nothing about character sets or collations. Nevertheless, I guess my default is UTF8 so there's no problem.

Now I'll skip to the end: I believe sql/mysql_tableanddata.php should keep letting the user decide what they want (i.e. keep using the default collation they chose in the database they created for Geeklog).

So the only thing I did was:
[quote system/databases/mysql.class.php]
// Connect to MySQL server
$this->_db = mysql_connect($this->_host,$this->_user,$this->_pass) or die('Cannnot connect to DB server');
// custom code - start
mysql_query("SET NAMES 'UTF8'");
// custom code - end
[/quote]
And now Unicode letters show up natively both in MySQL and in Geeklog!
Which means I can finally use PHPMyAdmin for everything!

Of course, since it wasn't a fresh installation, searching and replacing the collatation inside the backup file wasn't enough! I had to decode - yes, decode - my UTF-8 backup file from UTF-8 because when I export it MySQL converts the UTF-8 symbols...into UTF-8 and that's one level too much. So I had to decode it first.
Hey, it's not so easy! The only way I managed to do it was using my site.
So if this helps anyone:
Text Formatted Code
<?php
$fileread = file('geeklog_old.sql');
$fileread = utf8_decode(join("", $fileread));
$filewrite = fopen('geeklog.sql', "w");
fwrite($filewrite, $fileread);
fclose($filewrite);
?>

 
 Quote

Status: offline

LWC

Forum User
Full Member
Registered: 02/19/04
Posts: 818
Just wondering what if I kept that UTF8 setting in system/databases/mysql.class.php but used Ascii tables?

I hope it still works because if it does, it means Geeklog should just add that setting no matter what.

If not, it should be a choice in config.php .
 Quote

All times are EST. The time is now 10:20 am.

  • Normal Topic
  • Sticky Topic
  • Locked Topic
  • New Post
  • Sticky Topic W/ New Post
  • Locked Topic W/ New Post
  •  View Anonymous Posts
  •  Able to post
  •  Filtered HTML Allowed
  •  Censored Content