Red Hat Bugzilla – Bug 108484
[:alpha:] character class wrong for (some?) UTF-8 locales
Last modified: 2007-04-18 12:58:55 EDT
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6a) Gecko/20031014
Description of problem:
In at least some UTF-8 locales, the open bracket character ('[') is included in
the set of alphabetic characters. This breaks matching on word boundaries, for
example.
The closing bracket is *not* included in the set of alphabetic characters,
nor are parentheses or braces.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. echo [ | LANG=en_AU.UTF-8 grep -E "[[:alpha:]]" -
Actual Results: The echoed string is matched (so '[' is returned).
Expected Results: Nothing should have been matched.
Replace en_AU.UTF-8 with just en_AU and nothing is matched.
Replace en_AU.UTF-8 with de_DE.UTF-8 and the match happens.
Replace en_AU.UTF-8 with de_DE and nothing is matched.
Replace the '[' with ']' in all cases and nothing is matched.
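The variations above can be run in one pass. This is only a sketch: the helper name check_alpha is made up for illustration, and it assumes the listed locales are installed (if one is missing, grep will normally fall back to the C locale and report no match). It sets LC_ALL rather than LANG so that no other locale variable in the environment overrides the choice.

```shell
#!/bin/sh
# check_alpha LOCALE CHAR: hypothetical helper. Prints "match" if grep treats
# CHAR as a member of [[:alpha:]] under LOCALE, and "no match" otherwise.
check_alpha() {
    if printf '%s\n' "$2" | LC_ALL="$1" grep -E '[[:alpha:]]' >/dev/null 2>&1; then
        echo "match"
    else
        echo "no match"
    fi
}

# Locales and characters taken from the report, plus C as a baseline.
for loc in en_AU.UTF-8 en_AU de_DE.UTF-8 de_DE C; do
    for ch in '[' ']' '(' ')' '{' '}'; do
        printf '%-12s %s  %s\n' "$loc" "$ch" "$(check_alpha "$loc" "$ch")"
    done
done
```

On an affected system, only the '[' rows for the UTF-8 locales would be expected to say "match".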
We originally discovered this when running
echo "offset" | grep -w "a[offset]"
and it would only work in some locales.
Hmm ... the last example was overly simplified and does not work in any locale.
But put something like "a[offset] = 6;" into a file called foo and run
grep -w offset foo
and it doesn't work in en_AU.UTF-8 (my default locale), but does work in C and
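The file-based reproduction can be scripted roughly as follows. This is a sketch, not part of the original report: grep_word is a hypothetical helper, a temporary file stands in for foo, and en_AU.UTF-8 is assumed to be installed.

```shell
#!/bin/sh
# grep_word LOCALE WORD FILE: hypothetical helper that runs grep -w for WORD
# in FILE with LC_ALL set to LOCALE.
grep_word() {
    LC_ALL="$1" grep -w -- "$2" "$3"
}

# Temporary stand-in for the file "foo" from the report.
tmp=$(mktemp) || exit 1
printf '%s\n' 'a[offset] = 6;' > "$tmp"

for loc in C en_AU.UTF-8; do
    if grep_word "$loc" offset "$tmp" >/dev/null; then
        echo "$loc: word match found"
    else
        echo "$loc: no word match"
    fi
done

rm -f "$tmp"
```

If '[' is wrongly classified as alphabetic, "offset" is no longer preceded by a non-word character, which is why -w would fail to match in the affected locale.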
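/* Needs <ctype.h>, <locale.h>, <regex.h> and <stdlib.h>. */
int main (void)
{
  regex_t re; regmatch_t rm[2];
  setlocale (LC_ALL, "");
  if (isalpha ('[')) abort ();
  if (regcomp (&re, "[[:alpha:]]", 0) || !regexec (&re, "[", 2, rm, 0)) abort ();
}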
doesn't abort in LC_ALL=en_AU.UTF-8 or in any other locale that I've tried, so
I'd say this has nothing to do with glibc; the problem is in grep.
echo [ | LANG=en_AU.UTF-8 sed -n "/[[:alpha:]]/p"
doesn't print anything either.
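To compare the two tools side by side, something like the following could be used. It is a rough sketch: alpha_via is a hypothetical helper name, and the locale is assumed to be installed.

```shell
#!/bin/sh
# alpha_via TOOL LOCALE CHAR: hypothetical helper. Echoes CHAR back if TOOL
# treats it as [[:alpha:]] under LOCALE, and prints nothing otherwise.
alpha_via() {
    case "$1" in
        grep) printf '%s\n' "$3" | LC_ALL="$2" grep -E '[[:alpha:]]' ;;
        sed)  printf '%s\n' "$3" | LC_ALL="$2" sed -n '/[[:alpha:]]/p' ;;
        *)    echo "unknown tool: $1" >&2; return 2 ;;
    esac
}

for tool in grep sed; do
    # grep exits non-zero when nothing matches, so ignore the exit status here.
    out=$(alpha_via "$tool" en_AU.UTF-8 '[') || true
    echo "$tool: ${out:-<no match>}"
done
```

If only the grep line shows a match, that points at grep's own character class handling rather than at the C library.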
Created attachment 95604
Here is a potential fix.
An erratum has been issued which should address the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.