From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.12) Gecko/20050922 Fedora/1.7.12-1.3.1

Description of problem:
psql cannot communicate with UNICODE databases when 'advanced' column descriptions are used.

Version-Release number of selected component (if applicable):

How reproducible:
Always

Steps to Reproduce:
1. Install and run the postgres server.
2. Create a database with UNICODE encoding:
   $ createdb -E UNICODE mydb
3. Set a KOI8 locale:
   $ export LANG=ru_RU.koi8r
4. Connect to this database from a terminal with KOI8 encoding:
   $ psql -d mydb
5. Set the RIGHT encoding to communicate with the database:
   mydb=> \encoding KOI8
6. Create a table with the 'PRIMARY KEY' specifier (I took this table definition from the pgsql docs at http://www.postgresql.org/docs/7.3/interactive/ddl-constraints.html#AEN1849 ):
   mydb=> CREATE TABLE products (
   mydb(>     product_no integer PRIMARY KEY,
   mydb(>     name text,
   mydb(>     price numeric
   mydb(> );
   ERROR: ignoring unconvertible UTF-8 character 0xd3cf
   mydb=>

Actual Results:
ERROR: ignoring unconvertible UTF-8 character 0xd3cf
and nothing happens.

Expected Results:
The table is created.

Additional info:
If I remove the PRIMARY KEY keyword, everything is OK.

Moreover, with UNICODE it is not really usable either:
1. Switch the terminal to UTF-8 encoding.
2. Set the corresponding locale at the command prompt:
   $ export LANG=ru_RU.utf8
3. Connect to the same UNICODE database:
   $ psql -d mydb
4. Set the right encoding (I don't really need to, but):
   mydb=> \encoding UNICODE
5. Try to create a table with a primary key column:
   mydb=> CREATE TABLE products ( product_no integer PRIMARY KEY, name text, price numeric);
   NOTICE: CREATE TABLE / PRIMARY KEY \uffff\uffff\uffff\uffff\uffff\uffff\uffff \uffff\uffff\uffff\uffff\uffff\uffff\uffff\uffff\uffff\uffff\uffff\uffff\uffff\uffff\uffff \uffff\uffff\uffff\uffff\uffff\uffff "products_pkey" \uffff\uffff\uffff \uffff\uffff\uffff\uffff\uffff\uffff\uffff "products"
   CREATE TABLE

In my gnome-terminal each '\uffff' is shown as a small square. As far as I understand, these should be Russian-language messages about creating the additional index "products_pkey" on table "products". But when I run the \d command, everything is OK and is shown in Russian with the right encoding.
*** Bug 171173 has been marked as a duplicate of this bug. ***
*** Bug 171172 has been marked as a duplicate of this bug. ***
Um ... the proposed test case works fine for me. Which Postgres and OS versions are you working with, exactly?
FC3 (kernel 2.6.12-1.1378_FC3)

$ rpm -qa | grep postgres
postgresql-libs-7.4.8-1.FC3.1
postgresql-server-7.4.8-1.FC3.1
postgresql-7.4.8-1.FC3.1

I get the same results with rh-postgresql-7.3.10 on whitebox 3.0 (kernel 2.4.21-27.0.4.ELsmp).

Did you set LANG to Russian? My results are highly reproducible with Russian...
OK, as noted in the upstream discussion of this problem,
http://archives.postgresql.org/pgsql-bugs/2005-10/msg00233.php
this is something that doesn't seem practical to fix in existing versions of Postgres. For the time being, running a database with an encoding different from what is implied by the initdb-time LC_CTYPE is just not a supported configuration.

This is not adequately warned against in the 7.4 documentation, but newer PG releases do have a warning about it. http://www.postgresql.org/docs/8.0/static/multibyte.html says:

    Important: Although you can specify any encoding you want for a database, it is unwise to choose an encoding that is not what is expected by the locale you have selected. The LC_COLLATE and LC_CTYPE settings imply a particular encoding, and locale-dependent operations (such as sorting) are likely to misinterpret data that is in an incompatible encoding.

    Since these locale settings are frozen by initdb, the apparent flexibility to use different encodings in different databases of a cluster is more theoretical than real. It is likely that these mechanisms will be revisited in future versions of PostgreSQL.

    One way to use multiple encodings safely is to set the locale to C or POSIX during initdb, thus disabling any real locale awareness.
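
The mismatch described above can be illustrated with a small Python sketch. It is not taken from the upstream discussion and rests on an assumption: that the server builds its Russian NOTICE text in KOI8-R (the encoding implied by the cluster's LC_CTYPE) while the database encoding is UNICODE. The word "создаст" is likewise assumed for illustration as the first translated word of the message; it is not quoted anywhere in this report.

```python
# Assumed first Russian word of the NOTICE (an illustration, not quoted from the report).
word = "создаст"

# Bytes as they would be produced under a KOI8-R LC_CTYPE.
koi8_bytes = word.encode("koi8_r")

# The first two bytes match the value in the reported error, 0xd3cf.
first_two = koi8_bytes[:2]
print(first_two.hex())


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes form well-formed UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Treated as UTF-8 (the database encoding), those two bytes are not a valid character.
print(is_valid_utf8(first_two))

# A UTF-8 terminal that receives the KOI8-R bytes unconverted has to substitute
# a replacement character for each byte it cannot decode.
garbled = koi8_bytes.decode("utf-8", errors="replace")
print(len(garbled), garbled == "\ufffd" * len(word))
```

Under that assumption, both symptoms in the report come from the same cause: with client encoding KOI8 the server tries to convert bytes that were never UTF-8 and gives up at 0xd3cf, and with client encoding UNICODE the bytes pass through unchanged and the terminal shows one square per Russian letter.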