Red Hat Bugzilla – Bug 113231
[PostgreSQL] - ERROR: Invalid UNICODE character sequence found (0xc000)
Last modified: 2013-07-02 23:00:26 EDT
Description of problem:
Unable to query a UNICODE database with LIKE 'z%'
Version-Release number of selected component (if applicable):
Linux Fedora Core 1.
uname -r returns:
In psql \encoding reports me:
Using psql SELECT version() returns:
PostgreSQL 7.3.4-RH on i386-redhat-linux-gnu, compiled by GCC
i386-redhat-linux-gcc (GCC) 3.3.2 20031022 (Red Hat Linux 3.3.2-1)
Steps to Reproduce:
1.Create a Database using UNICODE encoding:
createdb -E UNICODE test
2. Run psql:
3. Create a test table:
CREATE TABLE auth_role
rol_id int4 not null,
rol_name varchar(50) unique not null,
rol_enable boolean default true,
INSERT INTO auth_role(rol_id, rol_name) VALUES (1, 'admin');
INSERT INTO auth_role(rol_id, rol_name) VALUES (2, 'zorro');
4. Execute this query:
SELECT * FROM auth_role WHERE rol_name LIKE 'z%';
SELECT * FROM auth_role WHERE rol_name LIKE 'Z%';
ERROR: Invalid UNICODE character sequence found (0xc000):
0 or more rows without error.
If you replace 'z%' with 'za%' or 'a%' you don't get the error.
Seems like the offending sequence is a 'z' char (upper or lower case)
followed by a '%' char.
I already contacted people at postgres and they said the problem is
only reproducible on Fedora. In other Linux distros even in RedHat9 it
is not reproducible at all.
I think we need to upgrade to postgres or recompile it.
Hope this help.
Here is the thread in PostgreSQL maillist that concluded the problem
is only with rpm fedora version:
I've spent quite a bit of time trying to reproduce this, without any
success. Where exactly did you get the Postgres binaries you are
using? Also, what locale are you using in the database
(pg_controldata can be used to check)?
Okay, further info from Paul Thomas helped me isolate the problem.
The failure only occurs when using a multibyte encoding (such as Unicode)
and a non-C locale (probably not all non-C locales will show it).
Paul's example was en_GB.UTF-8.
I've committed a patch into the upstream CVS repository. This is post
7.3.5 and it's unlikely that there will be any more 7.3.* community
releases, but possibly we can sneak it into Red Hat's SRPM as a patch.
(That's assuming Red Hat makes any more 7.3.* SRPMs; we may go to 7.4.)
I got the binaries from the Fedora distribution. As you pointed out
maybe the problem is related to locale. My locale is es_NI.UTF-8
numero de version de pg_control: 72
numero de version de Catalog: 200211021
Database cluster state: in production
ultima modificaciÃ³n de pg_control: miÃ© 04 feb 2004 17:59:59 CST
ID de archivo de log actual: 0
El siguiente segmento archivo log: 78
Latest checkpoint location: 0/4D5AE848
Prior checkpoint location: 0/4D5AC784
Latest checkpoint's REDO location: 0/4D5AE848
Latest checkpoint's UNDO location: 0/0
Latest checkpoint's StartUpID: 105
Latest checkpoint's NextXID: 2703308
Latest checkpoint's NextOID: 1285772
Time of latest checkpoint: miÃ© 04 feb 2004 17:59:57 CST
TamaÃ±o de bloque de la base de datos: 8192
Bloques/segmento en relacion grande: 131072
Longitud maxima de los identificadores: 64
Numero maximo de argumentos de funciÃ³n: 32
Tipo de almacenamiento de Date/Time: Punto flotante
Longitud maxima de nombre locales: 128
If you need more info, please let know. I will be glad to help to
resolve this bug.
A fix slated for RHEL3 Update 2 is now in QA.
An errata has been issued which should help the problem described in this bug report.
This report is therefore being closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files, please follow the link below. You may reopen
this bug report if the solution does not work for you.