Bug 675259 - incorrect numeric settings for French, Spanish, and German locales
Summary: incorrect numeric settings for French, Spanish, and German locales
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: glibc
Version: 5.6
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Jeff Law
QA Contact: qe-baseos-tools-bugs
URL:
Whiteboard:
Depends On: 236212
Blocks: 688720 1047909
Reported: 2011-02-04 18:13 UTC by Jeff Bastian
Modified: 2019-04-16 13:59 UTC

Fixed In Version: glibc-2.5-79
Doc Type: Bug Fix
Doc Text:
Clone Of: 236212
Clones: 688720
Environment:
Last Closed: 2012-02-21 06:32:55 UTC
Target Upstream Version:
Embargoed:


Attachments
patch to correct French thousands separator (1.47 KB, patch)
2011-02-04 23:09 UTC, Jeff Bastian
patch to correct French thousands separator and grouping (1.16 KB, patch)
2011-02-05 00:09 UTC, Jeff Bastian
patch for French, Spanish, and German locales for LC_NUMERIC (9.71 KB, patch)
2011-02-15 18:48 UTC, Jeff Bastian


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHBA-2012:0260 | 0 | normal | SHIPPED_LIVE | glibc bug fix update | 2012-02-20 15:06:42 UTC

Description Jeff Bastian 2011-02-04 18:13:06 UTC
+++ This bug was initially created as a clone of Bug #236212 +++

Description of problem:
In French, the thousands separator is a space, for example, "1 024".  The fr_FR locale in RHEL 5 has the thousands separator incorrectly defined as a null.

This was fixed with upstream commit bfe6bf1:
http://sourceware.org/git/?p=glibc.git;a=commit;h=bfe6bf17a38b407eb1210f6f7e6d2561c7da05aa

Please update the RHEL 5 fr_FR locale definitions.


Version-Release number of selected component (if applicable):
glibc-2.5-58

How reproducible:
Always

Steps to Reproduce:
env LC_ALL=fr_FR.UTF-8 /usr/bin/printf "%'d\n" 4294967296

Actual Results:
4294967296

Expected Results:
4 294 967 296

Comment 1 Jeff Bastian 2011-02-04 18:17:27 UTC
Commit b632f9a also updates the self-tests when building glibc:

http://sourceware.org/git/?p=glibc.git;a=commit;h=b632f9a81640db676905250257e677b415c963f9

Comment 3 Jeff Bastian 2011-02-04 23:09:11 UTC
Created attachment 477140
patch to correct French thousands separator

The upstream patches combined for RHEL 5

Comment 5 Jeff Bastian 2011-02-04 23:17:03 UTC
Hmm, even with the patch, it's not printing correctly:

[user@localhost ~]$ rpm -q glibc
glibc-2.5-58.bz675259.x86_64
glibc-2.5-58.bz675259.i686

[user@localhost ~]$ env LC_ALL=fr_FR.UTF-8 /usr/bin/printf "%'d\n" 4294967296
4294967296


However, another quick test shows that the thousands separator is correctly set to a space:

[user@localhost ~]$ cat thousands_sep.c
#include <locale.h>
#include <stdio.h>

int main(void)
{
    struct lconv *locale_ptr;

    /* Select the French locale before querying its numeric settings. */
    setlocale(LC_ALL, "fr_FR.UTF-8");

    /* localeconv() returns a pointer to a static struct lconv. */
    locale_ptr = localeconv();
    printf("Thousands Separator: '%s'\n", locale_ptr->thousands_sep);
    return 0;
}

[user@localhost ~]$ gcc -o thousands_sep thousands_sep.c

[user@localhost ~]$ ./thousands_sep
Thousands Separator: ' '


I'll do some more research next week.

Comment 6 Jeff Bastian 2011-02-04 23:27:58 UTC
Oh, I think I found the problem: the grouping also has to be changed.
 LC_NUMERIC
 decimal_point             "<U002C>"
 thousands_sep             "<U0020>"
-grouping                  0;0
+grouping                  3
 END LC_NUMERIC

This was fixed with commit d03eba1:
http://sourceware.org/git/?p=glibc.git;a=commit;h=d03eba121c430068fe97f3f85495b2d1fe9b694f

Reported upstream at:
http://sourceware.org/bugzilla/show_bug.cgi?id=6040

I'll try fixing the grouping too.
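
To illustrate why a correct thousands_sep alone was not enough, here is a minimal sketch of how a separator and a group size combine when formatting a number. This is not glibc's code: format_grouped and its parameters are hypothetical names, it only handles a single repeating group size, and treating a group size of 0 as "no grouping" is an assumption based on the behaviour seen in comment 5.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not glibc code): write `value` into `out`, inserting
 * `sep` between groups of `group_size` digits counted from the right.
 * A group_size of 0 is treated as "no grouping" (assumption, see above).
 * Returns 0 on success, -1 if `out` is too small.
 */
int format_grouped(unsigned long long value, const char *sep,
                   int group_size, char *out, size_t out_size)
{
    char digits[32];
    int ndigits, i;
    size_t pos = 0, sep_len;

    assert(sep != NULL && out != NULL && group_size >= 0);
    sep_len = strlen(sep);

    /* Plain decimal digits first, with no locale involvement. */
    ndigits = snprintf(digits, sizeof digits, "%llu", value);

    for (i = 0; i < ndigits; i++) {
        int remaining = ndigits - i;  /* digits left, including this one */

        /* A separator goes before this digit when whole groups remain. */
        if (i > 0 && group_size > 0 && remaining % group_size == 0) {
            if (pos + sep_len >= out_size)
                return -1;
            memcpy(out + pos, sep, sep_len);
            pos += sep_len;
        }
        if (pos + 1 >= out_size)
            return -1;
        out[pos++] = digits[i];
    }
    out[pos] = '\0';
    return 0;
}
```

With sep " " and group_size 3, the value 4294967296 comes out as "4 294 967 296". With group_size 0 the separator is never consulted and the result is "4294967296", which matches what comment 5 saw even though thousands_sep was already a space.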

Comment 7 Jeff Bastian 2011-02-05 00:04:23 UTC
Yes, changing the grouping fixed the problem.

I did a quick rebuild of the fr_FR locales instead of rebuilding all of glibc to verify it.

1. First, as root, edit the fr_FR locale file and change the grouping in
   the LC_NUMERIC section from 0;0 to 3
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
vim /usr/share/i18n/locales/fr_FR
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

2. Next, still as root, compile the fr_FR and fr_FR@euro locale files:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
zcat /usr/share/i18n/charmaps/ISO-8859-1.gz > \
     /usr/share/i18n/charmaps/ISO-8859-1
zcat /usr/share/i18n/charmaps/UTF-8.gz > \
     /usr/share/i18n/charmaps/UTF-8
localedef -c -f /usr/share/i18n/charmaps/ISO-8859-1 \
             -i /usr/share/i18n/locales/fr_FR \
             /usr/lib/locale/fr_FR
localedef -c -f /usr/share/i18n/charmaps/ISO-8859-1 \
             -i /usr/share/i18n/locales/fr_FR@euro \
             /usr/lib/locale/fr_FR@euro
localedef -c -f /usr/share/i18n/charmaps/UTF-8 \
             -i /usr/share/i18n/locales/fr_FR@euro \
             /usr/lib/locale/fr_FR.utf8
build-locale-archive
rm /usr/share/i18n/charmaps/ISO-8859-1
rm /usr/share/i18n/charmaps/UTF-8
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

3. Finally, test it out:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[user@localhost ~]$ env LC_ALL=fr_FR /usr/bin/printf "%'d\n" 4294967296 
4 294 967 296

[user@localhost ~]$ env LC_ALL=fr_FR.UTF-8 /usr/bin/printf "%'d\n" 4294967296
4 294 967 296

[user@localhost ~]$ env LC_ALL=fr_FR.ISO-8859-1 /usr/bin/printf "%'d\n" 4294967296
4 294 967 296
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

It works!
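
The same check can be made from C by looking at the grouping member of struct lconv next to thousands_sep. The sketch below is only an illustration: grouping_to_string is a hypothetical helper, and the "none" and "stop" labels are my own choice. It relies on the standard C encoding of grouping, where each byte is a group size, CHAR_MAX means no further grouping, and the terminating 0 byte means the previous size repeats.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: render the `grouping` member of struct lconv as
 * readable text such as "3" or "3;2". An empty grouping string is
 * reported as "none"; a CHAR_MAX byte is reported as "stop".
 * Returns 0 on success, -1 if `out` is too small.
 */
int grouping_to_string(const char *grouping, char *out, size_t out_size)
{
    size_t pos = 0;
    const char *g;

    assert(grouping != NULL && out != NULL);

    if (*grouping == '\0')
        return snprintf(out, out_size, "none") < (int)out_size ? 0 : -1;

    for (g = grouping; *g != '\0'; g++) {
        const char *delim = pos ? ";" : "";
        int n;

        if (*g == CHAR_MAX)
            n = snprintf(out + pos, out_size - pos, "%sstop", delim);
        else
            n = snprintf(out + pos, out_size - pos, "%s%d", delim, *g);

        /* snprintf reports the length it wanted; reject truncation. */
        if (n < 0 || (size_t)n >= out_size - pos)
            return -1;
        pos += (size_t)n;

        if (*g == CHAR_MAX)
            break;  /* nothing after CHAR_MAX is meaningful */
    }
    return 0;
}
```

Passing localeconv()->grouping after setlocale(LC_ALL, "fr_FR.UTF-8") would be expected to give "3" once the rebuilt locale is installed. What it reports for the unpatched "grouping 0;0" definition was not checked in this bug.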

Comment 8 Jeff Bastian 2011-02-05 00:09:48 UTC
Created attachment 477151
patch to correct French thousands separator and grouping

Comment 11 Jeff Bastian 2011-02-15 18:48:01 UTC
Created attachment 478946
patch for French, Spanish, and German locales for LC_NUMERIC

We've discovered more problems with LC_NUMERIC settings in some Spanish and German locales.

I've compared the fr_*, es_*, and de_* files in RHEL 5 glibc against the latest Unicode CLDR (Common Locale Data Repository), version 1.9, and patched the LC_NUMERIC section where appropriate.  See the attached patch.

http://unicode.org/Public/cldr/1.9.0/posix.zip
http://unicode.org/cldr/trac/browser/tags/release-1-9/posix/

Comment 13 Jeff Bastian 2011-02-15 22:32:22 UTC
Testing the patch from comment 11.  All French, Spanish, and German locales now have a grouping of 3 and a non-null thousands separator.

[username@localhost ~]$ rpm -q glibc-common
glibc-common-2.5-58.bz675259.2.x86_64

[username@localhost ~]$ cat numeric.sh 
#!/bin/bash

# Print an integer and a float, ungrouped and grouped, in each locale
for L in /usr/share/locale/{fr,es,de}_* ; do
  loc=$(basename "$L").UTF-8
  echo "$loc"
  env LC_ALL="$loc" printf "%d = %'d\n" 4294967296 4294967296
  env LC_ALL="$loc" printf "%f = %'f\n" 1234.1234 1234.1234
  echo
done

[username@localhost ~]$ ./numeric.sh 
fr_BE.UTF-8
4294967296 = 4.294.967.296
1234,123400 = 1.234,123400

fr_CA.UTF-8
4294967296 = 4 294 967 296
1234,123400 = 1 234,123400

fr_CH.UTF-8
4294967296 = 4'294'967'296
1234.123400 = 1'234.123400

fr_FR.UTF-8
4294967296 = 4 294 967 296
1234,123400 = 1 234,123400

es_AR.UTF-8
4294967296 = 4.294.967.296
1234,123400 = 1.234,123400

es_CL.UTF-8
4294967296 = 4.294.967.296
1234,123400 = 1.234,123400

es_CO.UTF-8
4294967296 = 4.294.967.296
1234,123400 = 1.234,123400

es_CR.UTF-8
4294967296 = 4.294.967.296
1234,123400 = 1.234,123400

es_DO.UTF-8
4294967296 = 4,294,967,296
1234.123400 = 1,234.123400

es_EC.UTF-8
4294967296 = 4.294.967.296
1234,123400 = 1.234,123400

es_ES.UTF-8
4294967296 = 4.294.967.296
1234,123400 = 1.234,123400

es_GT.UTF-8
4294967296 = 4,294,967,296
1234.123400 = 1,234.123400

es_HN.UTF-8
4294967296 = 4,294,967,296
1234.123400 = 1,234.123400

es_MX.UTF-8
4294967296 = 4,294,967,296
1234.123400 = 1,234.123400

es_NI.UTF-8
4294967296 = 4,294,967,296
1234.123400 = 1,234.123400

es_PA.UTF-8
4294967296 = 4,294,967,296
1234.123400 = 1,234.123400

es_PE.UTF-8
4294967296 = 4,294,967,296
1234.123400 = 1,234.123400

es_PR.UTF-8
4294967296 = 4,294,967,296
1234.123400 = 1,234.123400

es_SV.UTF-8
4294967296 = 4,294,967,296
1234.123400 = 1,234.123400

es_UY.UTF-8
4294967296 = 4.294.967.296
1234,123400 = 1.234,123400

es_VE.UTF-8
4294967296 = 4.294.967.296
1234,123400 = 1.234,123400

de_AT.UTF-8
4294967296 = 4.294.967.296
1234,123400 = 1.234,123400

de_CH.UTF-8
4294967296 = 4'294'967'296
1234.123400 = 1'234.123400

de_DE.UTF-8
4294967296 = 4.294.967.296
1234,123400 = 1.234,123400

Comment 21 Jeff Law 2011-12-13 20:52:50 UTC
Miroslav,

You are correct: Andreas failed to include the fix for several locales which were noted as incorrect in c13.

However, the desired output in c13 is itself incorrect (and thus the patch is also incorrect), in that many of the thousands separators for the es_* locales are wrong.  I had a discussion about these problems with Uli a month or so ago.

I'm not going to try to backport the exact changes Uli made, as they include a variety of unrelated fixes.  However, I do have a patch which fixes the thousands separator and grouping for all the locales mentioned in this BZ.  I expect I'll have those builds today.  I'll also include the correct output for the numeric.sh test script.

Comment 22 Jeff Law 2011-12-14 19:49:56 UTC
Here's the proper output for numeric.sh.

fr_BE.UTF-8
4294967296 = 4.294.967.296
1234,123400 = 1.234,123400

fr_CA.UTF-8
4294967296 = 4 294 967 296
1234,123400 = 1 234,123400

fr_CH.UTF-8
4294967296 = 4'294'967'296
1234.123400 = 1'234.123400

fr_FR.UTF-8
4294967296 = 4 294 967 296
1234,123400 = 1 234,123400

es_AR.UTF-8
4294967296 = 4.294.967.296
1234,123400 = 1.234,123400

es_CL.UTF-8
4294967296 = 4.294.967.296
1234,123400 = 1.234,123400

es_CO.UTF-8
4294967296 = 4.294.967.296
1234,123400 = 1.234,123400

es_CR.UTF-8
4294967296 = 4 294 967 296
1234,123400 = 1 234,123400

es_DO.UTF-8
4294967296 = 4.294.967.296
1234,123400 = 1.234,123400

es_EC.UTF-8
4294967296 = 4.294.967.296
1234,123400 = 1.234,123400

es_ES.UTF-8
4294967296 = 4.294.967.296
1234,123400 = 1.234,123400

es_GT.UTF-8
4294967296 = 4.294.967.296
1234,123400 = 1.234,123400

es_HN.UTF-8
4294967296 = 4.294.967.296
1234,123400 = 1.234,123400

es_MX.UTF-8
4294967296 = 4.294.967.296
1234,123400 = 1.234,123400

es_NI.UTF-8
4294967296 = 4.294.967.296
1234,123400 = 1.234,123400

es_PA.UTF-8
4294967296 = 4.294.967.296
1234,123400 = 1.234,123400

es_PE.UTF-8
4294967296 = 4.294.967.296
1234,123400 = 1.234,123400

es_PR.UTF-8
4294967296 = 4.294.967.296
1234,123400 = 1.234,123400

es_SV.UTF-8
4294967296 = 4.294.967.296
1234,123400 = 1.234,123400

es_UY.UTF-8
4294967296 = 4.294.967.296
1234,123400 = 1.234,123400

es_VE.UTF-8
4294967296 = 4.294.967.296
1234,123400 = 1.234,123400

de_AT.UTF-8
4294967296 = 4.294.967.296
1234,123400 = 1.234,123400

de_CH.UTF-8
4294967296 = 4'294'967'296
1234.123400 = 1'234.123400

de_DE.UTF-8
4294967296 = 4.294.967.296
1234,123400 = 1.234,123400
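
For regression testing, the output above can be condensed into a small lookup table. This is a sketch only: the struct and function names are hypothetical, the values are transcribed from the listing in this comment, and recording the space separators as a plain U+0020 is an assumption (the listing does not show which space character each locale uses).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Expected LC_NUMERIC characters, transcribed from the numeric.sh output
 * in comment 22. Spaces are stored as U+0020 (assumption). */
struct numeric_expectation {
    const char *locale;         /* locale name without the .UTF-8 suffix */
    const char *thousands_sep;
    const char *decimal_point;
};

/* Locales in the listing that do not use '.' for grouping and ',' for decimals. */
static const struct numeric_expectation exceptions[] = {
    { "fr_CA", " ", "," },
    { "fr_FR", " ", "," },
    { "es_CR", " ", "," },
    { "fr_CH", "'", "." },
    { "de_CH", "'", "." },
};

/* Every other fr_*, es_* and de_* locale in the listing. */
static const struct numeric_expectation common_case = { NULL, ".", "," };

/* Look up the expectation for a locale from the listing; names that are
 * not in the exception table fall back to the common case. */
const struct numeric_expectation *expected_numeric(const char *locale)
{
    size_t i;

    assert(locale != NULL);
    for (i = 0; i < sizeof exceptions / sizeof exceptions[0]; i++)
        if (strcmp(exceptions[i].locale, locale) == 0)
            return &exceptions[i];
    return &common_case;
}
```

Compared with the output in comment 13, the entries that differ are es_CR (space instead of '.') and es_DO, es_GT, es_HN, es_MX, es_NI, es_PA, es_PE, es_PR, and es_SV (now '.' for grouping and ',' for the decimal point).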

Comment 23 Jeff Bastian 2011-12-14 21:04:26 UTC
For future reference, what sources did you use for the locales?

I based my patch on the POSIX data at unicode.org.  Is this not a reliable source?

Comment 24 Jeff Law 2011-12-14 21:10:09 UTC
I'm not sure what source Uli used for all of them; however, he explicitly noted that CLDR is not considered accurate or authoritative.  Generally, gov't specs are considered authoritative.

Comment 25 errata-xmlrpc 2012-02-21 06:32:55 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2012-0260.html

