Bug 171174 - problems with UNICODE databases and psql with KOI8 clients
Status: CLOSED UPSTREAM
Product: Fedora
Classification: Fedora
Component: postgresql
Version: 3
Hardware: i386 Linux
Priority: medium, Severity: medium
Assigned To: Tom Lane
QA Contact: David Lawrence
Duplicates: 171172 171173
Reported: 2005-10-18 22:29 EDT by CTAC
Modified: 2013-07-02 23:06 EDT (History)
Doc Type: Bug Fix
Last Closed: 2005-10-20 01:34:51 EDT


Description CTAC 2005-10-18 22:29:36 EDT
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.12) Gecko/20050922 Fedora/1.7.12-1.3.1

Description of problem:
psql cannot communicate with UNICODE databases when 'advanced' column descriptions (such as PRIMARY KEY) are used.

Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. Install and run the postgres server
2. Create a db with UNICODE encoding 
     $ createdb -E UNICODE mydb
3. Set a KOI8 locale 
     $ export LANG=ru_RU.koi8r
4. Connect to this DB from a terminal with KOI8 encoding
     $ psql -d mydb
5. Set the RIGHT encoding to communicate with the DB
     mydb=> \encoding KOI8
6. Create a table with a 'PRIMARY KEY' specifier (I took this table description from the pgsql docs at http://www.postgresql.org/docs/7.3/interactive/ddl-constraints.html#AEN1849)
     mydb=> CREATE TABLE products (
     mydb(>     product_no integer PRIMARY KEY,
     mydb(>     name text,
     mydb(>     price numeric
     mydb(> );
     ERROR:  ignoring unconvertible UTF-8 character 0xd3cf
     mydb=>

Actual Results:  ERROR:  ignoring unconvertible UTF-8 character 0xd3cf
and the table is not created

Expected Results:  the table is created
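
The two bytes named in the error can be checked by hand. A minimal Python sketch, under the assumption (not confirmed in this report) that the server produced its message text in KOI8-R, as the ru_RU.koi8r locale suggests; the variable names are illustrative:

```python
# Bytes quoted in the error "unconvertible UTF-8 character 0xd3cf".
raw = bytes([0xD3, 0xCF])

# ASSUMPTION: the text was really KOI8-R. Read that way, the two bytes
# are ordinary Cyrillic letters.
as_koi8 = raw.decode("koi8_r")

# Read as UTF-8 they are invalid: 0xD3 opens a two-byte sequence, but
# 0xCF is not a continuation byte (those lie in 0x80..0xBF).
try:
    raw.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False

print(repr(as_koi8), valid_utf8)
```

If the assumption holds, the bytes are valid KOI8-R text but not a valid UTF-8 sequence, which would be consistent with the conversion error above.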

Additional info:

If I remove the PRIMARY KEY keyword, everything is OK.

Moreover, with UNICODE on the client side the output is not really usable either:
1. Switch the terminal to UTF-8 encoding
2. Set the corresponding locale at the command prompt
   $ export LANG=ru_RU.utf8
3. Connect to the same UNICODE DB
   $ psql -d mydb
4. Set the right encoding (I don't really need to, but) 
   mydb=> \encoding UNICODE
5. Try to create a table with a primary key column
   mydb=> CREATE TABLE products ( product_no integer PRIMARY KEY, name text, price numeric);
NOTICE:  CREATE TABLE / PRIMARY KEY \uffff\uffff\uffff\uffff\uffff\uffff\uffff \uffff\uffff\uffff\uffff\uffff\uffff\uffff\uffff\uffff\uffff\uffff\uffff\uffff\uffff\uffff \uffff\uffff\uffff\uffff\uffff\uffff "products_pkey" \uffff\uffff\uffff \uffff\uffff\uffff\uffff\uffff\uffff\uffff "products"
CREATE TABLE
    
In my gnome-terminal each '\uffff' is displayed as a small square. 
As far as I understand, these should be Russian-language messages about creating the additional index "products_pkey" on table "products".

However, when I run the \d command, everything is OK and is shown in Russian with the right encoding.
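
The row of squares is consistent with single-byte Cyrillic text being read as UTF-8. A hedged Python sketch: the sample word is a made-up stand-in, not the actual server message, and Python's decoder substitutes U+FFFD where psql printed \uffff, so this only approximates what the terminal showed:

```python
# Hypothetical sample text; the real server message is not known from this report.
sample = "таблица"

# Encode as KOI8-R: each letter of this sample becomes a single byte >= 0xC0.
koi8_bytes = sample.encode("koi8_r")

# Decode those bytes as if they were UTF-8, substituting the placeholder
# U+FFFD for every byte that does not start a valid sequence.
garbled = koi8_bytes.decode("utf-8", errors="replace")

# Number of letters, number of bytes, number of placeholders.
print(len(sample), len(koi8_bytes), garbled.count("\ufffd"))
```

In this sketch every letter turns into one placeholder, so word lengths and spaces survive while the text itself is lost, much like the NOTICE line above.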
Comment 1 CTAC 2005-10-18 22:32:35 EDT
*** Bug 171173 has been marked as a duplicate of this bug. ***
Comment 2 CTAC 2005-10-18 22:33:57 EDT
*** Bug 171172 has been marked as a duplicate of this bug. ***
Comment 3 Tom Lane 2005-10-18 23:00:56 EDT
Um ... the proposed test case works fine for me.  Which Postgres and OS versions are you working with, exactly?
Comment 4 CTAC 2005-10-19 00:29:59 EDT
FC3 (kernel 2.6.12-1.1378_FC3)
$ rpm -qa | grep postgres
postgresql-libs-7.4.8-1.FC3.1
postgresql-server-7.4.8-1.FC3.1
postgresql-7.4.8-1.FC3.1

I get the same results with rh-postgresql-7.3.10 on whitebox 3.0 (kernel 2.4.21-27.0.4.ELsmp).

Did you set LANG to Russian? My results are highly reproducible with a Russian locale...
Comment 5 Tom Lane 2005-10-20 01:34:51 EDT
OK, as noted in the upstream discussion of this problem,
http://archives.postgresql.org/pgsql-bugs/2005-10/msg00233.php
this is something that doesn't seem practical to fix in existing versions of Postgres.  For the time being, 
running a database with an encoding different from what is implied by the initdb-time LC_CTYPE is just 
not a supported configuration.  This is not adequately warned against in the 7.4 documentation, but 
newer PG releases do have a warning about it.
http://www.postgresql.org/docs/8.0/static/multibyte.html
says:

Important: Although you can specify any encoding you want for a database, it is unwise to choose an 
encoding that is not what is expected by the locale you have selected. The LC_COLLATE and LC_CTYPE 
settings imply a particular encoding, and locale-dependent operations (such as sorting) are likely to 
misinterpret data that is in an incompatible encoding.

Since these locale settings are frozen by initdb, the apparent flexibility to use different encodings in 
different databases of a cluster is more theoretical than real. It is likely that these mechanisms will be 
revisited in future versions of PostgreSQL.

One way to use multiple encodings safely is to set the locale to C or POSIX during initdb, thus disabling 
any real locale awareness.
