Description of problem:
When mysql uses non-latin1 language, the character conversion does not work.

Version-Release number of selected component (if applicable):
mysql-server-5.1.61-1.el6_2.1

How reproducible:
always

Steps to Reproduce:
1. start mysqld with "--language=czech --character-set-server=utf8"
2. mysql --default-character-set=utf8
3. mysql> select bla;

Actual results:
ERROR 1054 (42S22): NeznB�m� sloupec 'bla' v field list

Expected results:
ERROR 1054 (42S22): Neznámý sloupec 'bla' ...
("v field list" is incorrect but it is another case)

Additional info:
mysql> \s
--------------
mysql Ver 14.14 Distrib 5.1.61, for redhat-linux-gnu (i386) using readline 5.1

Connection id: 2
Current database:
Current user: root@localhost
SSL: Not in use
Current pager: stdout
Using outfile: ''
Using delimiter: ;
Server version: 5.1.61 Source distribution
Protocol version: 10
Connection: Localhost via UNIX socket
Server characterset: utf8
Db characterset: utf8
Client characterset: utf8
Conn. characterset: utf8
UNIX socket: /var/lib/mysql/mysql.sock
Uptime: 2 min 4 sec

Threads: 1 Questions: 7 Slow queries: 0 Opens: 15 Flush tables: 1 Open tables: 8 Queries per second avg: 0.56
--------------
This looks like your terminal window is set to use some encoding other than what you told mysql to use (i.e., utf8).
(In reply to comment #1)
> This looks like your terminal window is set to use some encoding other than
> what you told mysql to use (ie, utf8).

I've got settings forwarded from my local machine

.qa.[root@x86-64-6s-m1 tps]# locale
LANG=cs_CZ.UTF-8
LC_CTYPE="cs_CZ.UTF-8"
LC_NUMERIC="cs_CZ.UTF-8"
LC_TIME="cs_CZ.UTF-8"
LC_COLLATE="cs_CZ.UTF-8"
LC_MONETARY="cs_CZ.UTF-8"
LC_MESSAGES="cs_CZ.UTF-8"
LC_PAPER="cs_CZ.UTF-8"
LC_NAME="cs_CZ.UTF-8"
LC_ADDRESS="cs_CZ.UTF-8"
LC_TELEPHONE="cs_CZ.UTF-8"
LC_MEASUREMENT="cs_CZ.UTF-8"
LC_IDENTIFICATION="cs_CZ.UTF-8"
LC_ALL=

And it seems to work:

.qa.[root@x86-64-6s-m1 tps]# echo -e "\x50\xC5\x99\xC3\xAD\x6C\x69\xC5\xA1\x20\xC5\xBE\x6C\x75\xC5\xA5\x6F\x75\xC4\x8D\x6B\xC3\xBD\x20\x6B\xC5\xAF\xC5\x88\x20\xC3\xBA\x70\xC4\x9B\x6C\x20\xC4\x8F\xC3\xA1\x62\x65\x6C\x73\x6B\xC3\xA9\x20\x6B\xC3\xB3\x64\x79\x2E"
Příliš žluťoučký kůň úpěl ďábelské kódy.

And in addition, connecting to mysql without charset specified leads to the same output:

.qa.[root@x86-64-6s-m1 tps]# mysql
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 2
Server version: 5.1.61 Source distribution

Copyright (c) 2000, 2011, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> \s
--------------
mysql Ver 14.14 Distrib 5.1.61, for redhat-linux-gnu (x86_64) using readline 5.1

Connection id: 2
Current database:
Current user: root@localhost
SSL: Not in use
Current pager: stdout
Using outfile: ''
Using delimiter: ;
Server version: 5.1.61 Source distribution
Protocol version: 10
Connection: Localhost via UNIX socket
Server characterset: utf8
Db characterset: utf8
Client characterset: latin1
Conn. characterset: latin1
UNIX socket: /var/lib/mysql/mysql.sock
Uptime: 17 min 7 sec

Threads: 1 Questions: 5 Slow queries: 0 Opens: 15 Flush tables: 1 Open tables: 8 Queries per second avg: 0.4
--------------

mysql> select bla;
ERROR 1054 (42S22): NeznB�m� sloupec 'bla' v field list
just for the record, the same happens in RHEL 5 with mysql-5.0.95-1.el5_7.1 (not cloning yet, if this doesn't get resolved in RHEL6, I doubt there would be any chance to get it in RHEL5 ...)
I've reproduced it with default configuration after a fresh install in RHEL-6. What's more, I've tried many combinations, but haven't found a working configuration.

It seems to be the same as an upstream bug report [1], which has been fixed in mysql-5.4. I tried it in mysql-5.5.20, which is currently in all maintained Fedora releases, and except [2] it works fine there.

Also, 5.1 documentation mentions [3] possible issues with error message encoding and users are redirected to a current mysql-5.5, which is fixed:

"The preceding method of error-message construction can result in messages that contain a mix of character sets unless all items involved contain only ASCII characters. This issue is resolved in MySQL 5.5, in which error messages are constructed internally within the server using UTF-8 and returned to the client in the character set specified by the character_set_results system variable." [3]

Unfortunately, I haven't found a patch that could be easily applied. It looks like a more complicated issue, that probably won't be fixed by upstream in 5.1 any more :(

[1] http://bugs.mysql.com/bug.php?id=1406
[2] http://bugs.mysql.com/bug.php?id=64310
[3] http://dev.mysql.com/doc/refman/5.1/en/charset-errors.html
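
For illustration only, the mixed-character-set effect described in [3] can be imitated outside the server. The sketch below assumes that the Czech message text is stored in ISO-8859-2 and sent without conversion. The report does not confirm that encoding, and the function and constant names are hypothetical. It does not account for the stray "B" seen in the actual output.

```python
# Sketch only: imitates a non-UTF-8 error message reaching a UTF-8 client.
# ASSUMPTION: the Czech message text is stored as ISO-8859-2; the report
# does not state which encoding the 5.1 message file really uses.
ASSUMED_MESSAGE_ENCODING = "iso-8859-2"


def as_seen_by_utf8_client(message, server_encoding=ASSUMED_MESSAGE_ENCODING):
    """Encode the message as the server would, then decode it as UTF-8."""
    raw = message.encode(server_encoding)  # bytes sent without conversion
    # A UTF-8 terminal shows U+FFFD for every byte it cannot decode.
    return raw.decode("utf-8", errors="replace")


expected = "Neznámý sloupec 'bla'"
garbled = as_seen_by_utf8_client(expected)
print(garbled)
```

Under that assumption, each accented letter becomes a single byte that is not valid UTF-8, so only the ASCII part of the message survives on a UTF-8 terminal.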
I'm inclined to consider this a WONTFIX. Even if we could extract a reasonably-sized patch from mysql 5.5, I would be hesitant to apply it because it would amount to a significant behavioral change, which is exactly the kind of thing our users don't want in a stable RHEL release. It's not hard to imagine that there are apps out there that are looking at error message texts and will be broken by a change that affects their encoding, even if the new behavior is "more correct".