Bug 109568

Summary: gedit giving warnings under a Persian locale
Product: [Fedora] Fedora
Reporter: Roozbeh Pournader <roozbeh>
Component: gedit
Assignee: Havoc Pennington <hp>
Status: CLOSED UPSTREAM
QA Contact:
Severity: medium
Docs Contact:
Priority: medium
Version: 1
CC: behdad
Target Milestone: ---
Keywords: i18n, MoveUpstream
Target Release: ---
Hardware: i686
OS: Linux
Doc Type: Bug Fix
Last Closed: 2003-11-11 08:59:36 EST
Attachments:
  Description                                         Flags
  patch to fix two-byte decimal separator problem     none
  replacement patch                                   none
  glib-2.2.3 strtod 2byte decimal separator patch     none

Description Roozbeh Pournader 2003-11-09 13:41:31 EST
Description of problem:
gedit gives warnings when running under the "fa_IR.UTF-8" locale. It
seems that it can't parse floating point numbers. This is most
probably a fontconfig, gtk2 or glib2 bug, but I can't say exactly
which one.

Version-Release number of selected component (if applicable):
2.4.0-3

How reproducible:
Always

Steps to Reproduce:
1. Run "LANG=fa_IR.UTF-8 gedit" from a gnome-terminal.


Actual Results:  The following warnings appear on the terminal before
gedit comes up:

/usr/share/themes/Bluecurve/gtk-2.0/gtkrc:50: error: scanner: digit is beyond radix
Fontconfig error: line 172: "7.5": not a valid double
Fontconfig warning: line 173: missing test expression
Fontconfig error: line 184: "7.5": not a valid double
Fontconfig warning: line 185: missing test expression
Fontconfig error: line 196: "7.5": not a valid double
Fontconfig warning: line 197: missing test expression
Fontconfig error: line 436: "0.2": not a valid double
Fontconfig error: line 438: wrong number of matrix elements
Fontconfig error: Cannot load default config file


Expected Results:  No warnings should appear.

Additional info:

This is most probably due to the Persian locale using a decimal
separator that becomes two bytes when encoded in UTF-8.

I appreciate any help or pointer on how to find the source of the
problem, so I can patch it as soon as possible.
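
To make the "two bytes" point concrete, here is a minimal sketch of a UTF-8 encoder. It assumes the separator in question is U+066B ARABIC DECIMAL SEPARATOR; the report itself does not name the code point. The function name utf8_encode is made up for this illustration and comes from no library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encode one Unicode code point as UTF-8.
 * out must have room for 4 bytes. Returns the number of bytes
 * written, or 0 if cp is not a valid scalar value. */
size_t utf8_encode(uint32_t cp, unsigned char *out)
{
    assert(out != NULL);
    if (cp < 0x80) {                        /* ASCII, e.g. '.' */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                       /* two-byte range */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {                     /* three-byte range */
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;                       /* surrogates are not encodable */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {                   /* four-byte range */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}
```

Under that assumption, utf8_encode(0x066B, buf) yields the two bytes 0xD9 0xAB, while the ASCII dot encodes to a single byte. Code that assumes a one-byte separator would therefore be off by one.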
Comment 1 Behdad Esfahbod 2003-11-09 19:23:02 EST
You got right to the point. I can't imagine how you missed the rest.
Well, it's the first appearance of a very wide and important problem
with the Persian locale: the decimal separator is the same for both the
Latin and the Persian sets of digits, and is NOT the usual dot ("."). So
you should have got it already: when the whole locale is set to fa_IR,
scanf simply does not parse floats that use a dot as the separator!
Here is a simple test:

#include <stdio.h>
#include <locale.h>

int
main()
{
    double g = 0;
    setlocale(LC_ALL, "fa_IR.UTF-8");
    scanf("%d", &g);
    printf("%d\n", g);
}

Compile and run and see how 12.3 would be parsed as 12.

Fix: in the meantime, fontconfig should parse its config file under the
C locale. In the long run, add another decimal separator to the locale
definition. You already know the issue.

behdad
Comment 2 Behdad Esfahbod 2003-11-10 01:27:27 EST
Oops, of course I meant "%lg", not "%d", in both cases above.
Here it is, fixed:

#include <stdio.h>
#include <locale.h>

int
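main(void)
{
    setlocale(LC_ALL, "fa_IR");
    for (;;) {
        double g = 0;
        char buf[100];
        /* Read a number, then swallow the rest of the line (at most 99 bytes). */
        if (scanf("%lg%99[^\n]", &g, buf) < 1)
            break;      /* stop on EOF or on input that is not a number */
        printf("You entered %lg\n", g);
    }
    return 0;
}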


And here is the run:

123
You entered 123
123.456
You entered 123
123٫456
You entered 123٫456


See how the Persian decimal separator works.
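
A quick way to check how long the separator is in a given locale is to ask localeconv() directly. The following is only a sketch: decimal_point_len is a made-up helper name, and it assumes the locale being queried is installed on the machine.

```c
#include <assert.h>
#include <locale.h>
#include <string.h>

/* Hypothetical helper: byte length of the decimal separator in the
 * named locale, or -1 if the locale cannot be selected. The previous
 * LC_NUMERIC setting is restored before returning. */
int decimal_point_len(const char *locale_name)
{
    char saved[256];
    const char *cur;
    int len;

    assert(locale_name != NULL);

    /* setlocale(cat, NULL) only queries; copy the result because a
     * later setlocale() call may overwrite the returned string. */
    cur = setlocale(LC_NUMERIC, NULL);
    if (cur == NULL || strlen(cur) >= sizeof saved)
        return -1;
    strcpy(saved, cur);

    if (setlocale(LC_NUMERIC, locale_name) == NULL)
        return -1;
    len = (int)strlen(localeconv()->decimal_point);

    setlocale(LC_NUMERIC, saved);
    return len;
}
```

In the "C" locale this returns 1. On a system like the one in this report, decimal_point_len("fa_IR.UTF-8") would be expected to return 2, but that depends on the installed locale data.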
Comment 3 Roozbeh Pournader 2003-11-10 07:34:08 EST
Behdad, it's not that problem. Try the same with a "fr_FR.UTF-8"
locale and see that the output of your program is the same (use the
comma instead of the Persian decimal separator in your last example).
Yet running gedit with a fr_FR locale doesn't give you any warnings.

BTW, fontconfig can't switch locales while running. It's not thread
safe to do that, I've heard from Keith Packard.

So, no, I don't know the issue here. My real guess is two bugs, one in
glib and one in fontconfig.
Comment 4 Behdad Esfahbod 2003-11-10 08:45:34 EST
Created attachment 95867 [details]
patch to fix two-byte decimal separator problem
Comment 5 Behdad Esfahbod 2003-11-10 08:46:31 EST
Attaching the one-line patch.  Here is a description of the problem:
in FcStrtod (fcxml.c), when the decimal separator is two bytes long,
*end is computed incorrectly, so FcParseDouble later complains.

Keith:  Would you take care of upstream? ;-)
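
For readers without the attachment at hand, here is a rough sketch of the kind of end-pointer bookkeeping described above. It is NOT the fontconfig code or the attached patch, neither of which is reproduced in this report. It assumes a wrapper that copies the input, swaps the "." for the locale's separator, calls strtod() on the copy, and then maps the stop position back onto the original string. The names map_end_offset and strtod_dot are made up for the illustration.

```c
#include <assert.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: translate an offset in the substituted copy
 * back to an offset in the original string.
 *   buf_off: bytes strtod() consumed in the copy
 *   dot_off: offset of "." in the original
 *   dlen:    byte length of the locale's separator
 * Assumes strtod() consumes the separator whole or not at all. */
size_t map_end_offset(size_t buf_off, size_t dot_off, size_t dlen)
{
    assert(dlen >= 1);
    if (buf_off <= dot_off)
        return buf_off;             /* stopped before the separator */
    return buf_off - (dlen - 1);    /* separator shrinks back to one byte */
}

/* Hypothetical wrapper: parse a double written with "." regardless of
 * the current locale's separator. On allocation failure it returns 0
 * and sets *end to s. */
double strtod_dot(const char *s, char **end)
{
    const char *sep = localeconv()->decimal_point;
    size_t dlen = strlen(sep);
    const char *dot = strchr(s, '.');
    size_t slen, dot_off;
    char *buf, *buf_end;
    double v;

    if (dot == NULL || dlen == 0)
        return strtod(s, end);      /* nothing to substitute */

    slen = strlen(s);
    dot_off = (size_t)(dot - s);
    buf = malloc(slen + dlen);      /* (slen - 1) + dlen + NUL */
    if (buf == NULL) {
        if (end != NULL)
            *end = (char *)s;
        return 0.0;
    }
    memcpy(buf, s, dot_off);
    memcpy(buf + dot_off, sep, dlen);
    memcpy(buf + dot_off + dlen, dot + 1, slen - dot_off); /* tail + NUL */

    v = strtod(buf, &buf_end);
    if (end != NULL)
        *end = (char *)s + map_end_offset((size_t)(buf_end - buf), dot_off, dlen);
    free(buf);
    return v;
}
```

With a one-byte separator the mapping is the identity, which is why an off-by-(dlen - 1) mistake of this kind would go unnoticed in most locales and only show up with a two-byte separator.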
Comment 6 Behdad Esfahbod 2003-11-10 08:47:15 EST
Oops, s/Keith/Roozbeh/
Comment 7 Behdad Esfahbod 2003-11-10 09:43:14 EST
Created attachment 95871 [details]
replacement patch

fixed a bug in the patch ;)
Comment 8 Behdad Esfahbod 2003-11-10 09:49:11 EST
Created attachment 95872 [details]
glib-2.2.3 strtod 2byte decimal separator patch

Here is the patch for exactly the same bug in glib2.
Comment 9 Roozbeh Pournader 2003-11-10 11:26:34 EST
Tested both patches. They fix the problem. Trying to report them upstream.
Comment 10 Roozbeh Pournader 2003-11-10 11:44:12 EST
Behdad, I just talked to keithp. keithp says there is no need for the
second patch. "buf_end will be > dot whenever dlen is > 1". Can you
confirm?
Comment 11 Roozbeh Pournader 2003-11-10 13:16:12 EST
fontconfig's bug fixed in its CVS HEAD and 2.2 branch. glib2's bug
reported at <http://bugzilla.gnome.org/show_bug.cgi?id=126640>.
Comment 12 Behdad Esfahbod 2003-11-10 21:27:19 EST
Roozbeh:  Just for the record, we talked on IRC and your comment #10 is
not correct.  Did you apply the second patch to fontconfig?

Please close the bug as upstream.
Comment 13 Roozbeh Pournader 2003-11-11 08:59:36 EST
Behdad: Yes, Keithp agreed that you were correct. The second patch was
applied to fontconfig.

Closing the bug as upstream.