Bug 705135

Summary: mysql client character encoding behavior is confusing
Product: [Fedora] Fedora
Reporter: Christopher Beland <beland>
Component: mysql
Assignee: Tom Lane <tgl>
Status: CLOSED CURRENTRELEASE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Priority: unspecified
Version: 14
CC: hhorak, tgl
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Doc Type: Bug Fix
Last Closed: 2012-02-13 15:12:06 UTC

Description Christopher Beland 2011-05-16 18:40:02 UTC
The environment variable 'LANG' is set to 'en_US.utf8' in my environment, which I assume is the default.

Inside mysql, the output of "SHOW CREATE DATABASE db_name;" contains "/*!40100 DEFAULT CHARACTER SET utf8 */".

The output of "SHOW CREATE TABLE table_name;" contains "DEFAULT CHARSET=utf8".

When I run "SELECT * FROM table_name;", accented characters are displayed incorrectly.  To correct the problem, I need to add the following to ~/.my.cnf:

>>
[client]
default-character-set=utf8
<<

It seems to me that the client has enough information to negotiate the correct character encoding automatically.  The fact that it didn't caused me some consternation when I was trying to input UTF-8 text into my UTF-8 database.  It's also surprising that the character encoding isn't UTF-8 for everything these days.

Version: mysql-5.1.56-1.fc14.x86_64

Comment 1 Tom Lane 2011-05-16 19:54:22 UTC
I doubt this is a bug; at best it's a feature request.  I would not venture to make Fedora deviate from upstream's behavior on this point.  I suggest you file it at http://bugs.mysql.com/ if you think you can make the case for them to change the behavior ... but first you might want to read http://dev.mysql.com/doc/refman/5.1/en/charset-connection.html and see whether responding to LANG would sensibly be part of their model at all.

BTW, it's also possible that this isn't a mysql bug but an incorrect interaction with xterm, or whichever terminal emulator you were using to call the mysql client.  If the terminal isn't generating the encoding that mysql thinks it is receiving, you'll have problems ...
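
To illustrate the kind of mismatch described above, here is a minimal Python sketch. It is not taken from this report and does not claim to reproduce the reporter's exact symptom; the sample word "café" is an arbitrary choice. It only shows what happens in general when one side produces bytes in one encoding and the other side assumes a different one.

```python
# Illustration only: the sample text is an assumption, not data from this bug.
text = "café"

# Bytes a UTF-8 terminal would send for this text.
utf8_bytes = text.encode("utf-8")

# A receiver that assumes latin1 treats every byte as one character,
# so the two-byte UTF-8 sequence for "é" turns into two characters.
misread = utf8_bytes.decode("latin-1")
print(misread)

# The opposite direction: latin1 bytes shown by a UTF-8 terminal.
# The single byte for "é" is not valid UTF-8, so a replacement
# character is displayed instead.
latin1_bytes = text.encode("latin-1")
shown = latin1_bytes.decode("utf-8", errors="replace")
print(shown)
```

Either direction produces visibly wrong accented characters, which is why the client, the connection, and the terminal all need to agree on the encoding.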

Comment 2 Honza Horak 2012-02-13 15:12:06 UTC
This behavior changed in mysql-5.5: the client now takes $LANG into account. The following output is from a fresh mysql client/server on Fedora 16:

$ rpm -q mysql mysql-server
mysql-5.5.20-1.fc16.x86_64
mysql-server-5.5.20-1.fc16.x86_64

$ echo $LANG
en_US.utf8

$ echo '\s' | mysql | grep characterset
Server characterset:	latin1
Db     characterset:	latin1
Client characterset:	utf8
Conn.  characterset:	utf8

$ echo '\s' | LANG=en_US mysql | grep characterset
Server characterset:	latin1
Db     characterset:	latin1
Client characterset:	latin1
Conn.  characterset:	latin1

$ echo '\s' | LANG=cs_CZ mysql | grep characterset
Server characterset:	latin1
Db     characterset:	latin1
Client characterset:	latin2
Conn.  characterset:	latin2
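
For anyone who wants to run the same check from a script rather than by eye, the following Python sketch parses status lines in the format shown above. The function name parse_charsets is hypothetical, and the sample text is copied from the en_US.utf8 run; the sketch assumes the "characterset:" line layout stays as it appears in this output.

```python
def parse_charsets(status_text):
    # Hypothetical helper: collect the "characterset" lines of the client's
    # status output into a dict such as {"Client": "utf8", ...}.
    charsets = {}
    for line in status_text.splitlines():
        label, sep, value = line.partition("characterset:")
        if sep:
            # "Conn.  " becomes "Conn", "Db     " becomes "Db".
            key = label.strip().rstrip(".")
            charsets[key] = value.strip()
    return charsets


# Sample copied from the LANG=en_US.utf8 run above.
sample = (
    "Server characterset:\tlatin1\n"
    "Db     characterset:\tlatin1\n"
    "Client characterset:\tutf8\n"
    "Conn.  characterset:\tutf8\n"
)

result = parse_charsets(sample)
# The client and connection character sets should agree with each other.
client_matches_connection = result["Client"] == result["Conn"]
print(result, client_matches_connection)
```

In practice the text would come from piping '\s' into the mysql client, as in the commands above, rather than from a hard-coded string.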

Since the newer mysql-5.5.20 is available on all supported Fedora releases, let's close this bug.