Bug 40571 - locale bug?
Status: CLOSED NOTABUG
Product: Red Hat Linux
Classification: Retired
Component: glibc
Version: 7.1
Hardware: i386 Linux
Priority: medium
Severity: medium
Assigned To: Jakub Jelinek
QA Contact: Aaron Brown
Depends On:
Blocks:
Reported: 2001-05-14 14:02 EDT by Alexander Kanevskiy
Modified: 2016-11-24 10:06 EST

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2001-05-15 05:52:29 EDT


Description Alexander Kanevskiy 2001-05-14 14:02:28 EDT
Description of Problem:
It seems that capital 'O' is treated as lying between 'a' and 'z'.

How Reproducible:

Steps to Reproduce:

[kad@bofh kad]$ set|grep -E 'LANG|LC_'
LANG=en_US
LC_CTYPE=ru_RU.KOI8-R
[kad@bofh kad]$ echo O|grep [a-z]
O
[kad@bofh kad]$ echo O|LANG= grep [a-z]
[kad@bofh kad]$ echo O|LANG=C grep [a-z]
[kad@bofh kad]$ echo O|LANG=ru_RU.KOI8-R grep [a-z]
O
[kad@bofh kad]$ echo O|LANG=de_DE grep [a-z]
O
[kad@bofh kad]$ echo O|LANG=uk_UA grep [a-z]
O
[kad@bofh kad]$ echo O|LANG=netu grep [a-z]
[kad@bofh kad]$ echo O|LANG=en_GB grep [a-z]
O
[kad@bofh kad]$ echo O|LC_CTYPE= grep [a-z]
O

Expected Results:
O must not be present in any output.

Additional Information:
sort orders the letters like this:
a
...
l
m
n
O
o
p
...
z
Comment 1 Jakub Jelinek 2001-05-15 05:43:34 EDT
Why do you think it is a bug?
In several locales, O is after n and before p.
If you expect the POSIX locale collation, you should use the POSIX locale,
or e.g. use [[:lower:]] if you really mean all lower case letters.
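
Both suggestions can be sketched roughly as follows, assuming a POSIX shell and a grep that supports -q. The helper names in_az_posix and is_lower are hypothetical and exist only for illustration. LC_ALL=C overrides LANG and every LC_* variable for that one command, so the range is evaluated in POSIX collation order.

```shell
# Hypothetical helpers for illustration; they are not part of the original report.

# Succeeds if the input contains a character in [a-z] under POSIX collation.
in_az_posix() {
    printf '%s\n' "$1" | LC_ALL=C grep -q '[a-z]'
}

# Succeeds if the input contains a lower-case letter according to the
# current locale's character class, independent of collation order.
is_lower() {
    printf '%s\n' "$1" | grep -q '[[:lower:]]'
}

if in_az_posix O; then echo "O matches [a-z]"; else echo "O does not match [a-z] under LC_ALL=C"; fi
if is_lower O; then echo "O is lower case"; else echo "O is not lower case"; fi
```

Quoting the bracket expression also keeps the shell from expanding it as a filename pattern if a single-letter file happens to exist in the current directory.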
Comment 2 Alexander Kanevskiy 2001-05-15 05:52:24 EDT
But why does it match [a-z]?! Capital 'O' should be matched by [A-Z], not by [a-z].
Comment 3 Jakub Jelinek 2001-05-15 05:59:03 EDT
Because if in a particular locale `O' comes between `a' and `z', it fits into
this range and thus matches `[a-z]' regular expression.
E.g. read info grep about regular expressions:
   For example, `[[:alnum:]]' means `[0-9A-Za-z]', except the latter
depends upon the POSIX locale and the ASCII character encoding, whereas
the former is independent of locale and character set.  (Note that the
brackets in these class names are part of the symbolic names, and must
be included in addition to the brackets delimiting the bracket list.)
As you can see, [a-z] is dependent on the locale (and whether grep has been
built with NLS support).
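
One way to see the ordering a range is evaluated against is to sort the same letters under different locales. This is only a sketch: the letters helper is hypothetical, en_US.UTF-8 is an assumed locale name that may not be installed, and the second result depends on the locale data and C library version in use.

```shell
# Hypothetical helper: emit a few letters, one per line.
letters() { printf '%s\n' n O o p; }

# POSIX collation is plain byte order, so 'O' sorts before every lower-case letter.
posix_order=$(letters | LC_ALL=C sort | tr -d '\n')
echo "C locale:    $posix_order"

# Locale-aware collation. Assumption: en_US.UTF-8 is installed; if it is not,
# sort typically behaves as in the C locale.
locale_order=$(letters | LC_ALL=en_US.UTF-8 sort | tr -d '\n')
echo "en_US.UTF-8: $locale_order"
```

If 'O' appears between 'n' and 'p' in the second line, a locale-aware [a-z] range can include it, which is the behaviour described in this comment.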
Comment 4 Eugene Kanter 2001-06-23 22:54:51 EDT
Looking at the regex.c file, it seems that the pattern is converted to
lower case unless (probably) told otherwise.
