Bug 217359 - [A-Z] globbing doesn't match glob() or fnmatch() behaviour
Status: CLOSED RAWHIDE
Product: Fedora
Classification: Fedora
Component: bash
Version: 8
Hardware: All Linux
Priority: medium, Severity: medium
Assigned To: Roman Rakus
QA Contact: Ben Levenson
Depends On:
Blocks: F8Target
Reported: 2006-11-27 09:26 EST by Tim Waugh
Modified: 2014-01-12 19:06 EST (History)
Doc Type: Bug Fix
Last Closed: 2008-05-29 09:29:41 EDT


Attachments
regex_vs_strcoll.c (2.16 KB, text/plain)
2007-01-23 12:04 EST, Tim Waugh

Description Tim Waugh 2006-11-27 09:26:51 EST
Description of problem:
In certain locales (e.g. en_US and en_GB), bash's handling of [A-Z] in a file
pattern differs from that of glob() or fnmatch(): in particular bash will match
'h' even though glibc will not.

Version-Release number of selected component (if applicable):
bash-3.1-16.1

How reproducible:
100%

Steps to Reproduce:
1. touch h
2. sh -c 'LC_ALL=en_US; echo [A-Z]'
3. python -c 'from locale import *; import glob; setlocale(LC_ALL,"en_US");
print glob.glob("[A-Z]")'

Actual results:
h
[]

Expected results:
Both lists identical.
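
Step 3 above uses Python 2 syntax. The following is a minimal Python 3 sketch of the same check, not part of the original report. It assumes a POSIX system, and the function name glob_upper_range and the scratch directory are hypothetical. If the en_US locale is not installed, the sketch keeps whatever locale is already active.

```python
import glob
import locale
import os
import tempfile


def glob_upper_range(locale_name="en_US"):
    """Create a file named 'h' in a scratch directory and glob for [A-Z]."""
    try:
        locale.setlocale(locale.LC_ALL, locale_name)
    except locale.Error:
        # The requested locale is not installed; keep the current one.
        pass
    with tempfile.TemporaryDirectory() as scratch:
        # Equivalent of step 1: touch h
        open(os.path.join(scratch, "h"), "w").close()
        # Escape the directory part so only [A-Z] is treated as a pattern.
        pattern = os.path.join(glob.escape(scratch), "[A-Z]")
        return [os.path.basename(path) for path in glob.glob(pattern)]


if __name__ == "__main__":
    print(glob_upper_range())
```

On a POSIX system this should print an empty list, which corresponds to the second line of the actual results above.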
Comment 1 Tim Waugh 2007-01-23 11:28:40 EST
Another example:

echo h | LC_ALL=en_US.UTF-8 grep '[A-Z]'

This is with the grep from Fedora, so it uses re_search() directly.
Comment 2 Tim Waugh 2007-01-23 11:42:21 EST
And here's a demo using glob():

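#include <glob.h>
#include <stdio.h>

/* Print every pathname that glob() matches for the pattern in argv[1]. */
int
main (int argc, char *argv[])
{
        glob_t globr;
        size_t i;
        int r;

        if (argc < 2) {
                fprintf (stderr, "usage: %s PATTERN\n", argv[0]);
                return 2;
        }
        r = glob (argv[1], 0, NULL, &globr);
        if (r == 0) {
                for (i = 0; i < globr.gl_pathc; i++)
                        puts (globr.gl_pathv[i]);
                globfree (&globr);
        }
        return r;
}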

Try:

touch h H
./glob '[A-Z]'

Output: H
Comment 3 Tim Waugh 2007-01-23 12:04:40 EST
Created attachment 146319
regex_vs_strcoll.c

Jakub's test case comparing regex, fnmatch and strcoll.
Comment 4 Tim Waugh 2007-02-08 08:21:04 EST
Changing component to glibc, since that seems the most likely culprit.
Comment 5 Ulrich Drepper 2007-02-09 16:55:06 EST
I fail to see the problem so far.  Jakub's program shows this:

A 380
H 417
h 297
Z 491
A 101
H 108
h 82
Z 126
regex [A-Z] did not match h
strcoll (A, h) -7 strcoll (h, Z) -18 means in range
fnmatch ([A-Z], h, 0) did not match

That's all perfectly fine.  The strcoll result has nothing whatsoever to do with
the range match.  strcoll uses collation weights, ranges use collation sequence
values, completely different concept.  Not matching 'h' (note, lowercase) is
correct since if you look at the locale definition you'll see that first all
lower characters are described and then the uppercase.  So h is not in A-Z.  H
(uppercase) of course is.

From all I can see so far it's entirely bash's fault by not implementing
globbing correctly.  bash really must use the fnmatch code from glibc itself.
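
The distinction drawn above can be illustrated with the numbers from the quoted output. This is a minimal Python sketch, not the attached test case. It assumes the first block of values (A 380, H 417, h 297, Z 491) are collation sequence values, and the table and function names are hypothetical.

```python
# Values copied from the first block of the quoted output.
# Treating them as collation sequence values is an assumption.
COLLSEQ = {"A": 380, "H": 417, "h": 297, "Z": 491}

# strcoll() results copied from the quoted output.
STRCOLL = {("A", "h"): -7, ("h", "Z"): -18}


def in_range_by_collseq(ch, low, high, table=COLLSEQ):
    """Range test on sequence values, as described for [low-high] matching."""
    return table[low] <= table[ch] <= table[high]


def in_range_by_strcoll(ch, low, high, results=STRCOLL):
    """Range test on strcoll() ordering: low <= ch and ch <= high."""
    return results[(low, ch)] <= 0 and results[(ch, high)] <= 0


if __name__ == "__main__":
    print("collseq: h in [A-Z]?", in_range_by_collseq("h", "A", "Z"))
    print("collseq: H in [A-Z]?", in_range_by_collseq("H", "A", "Z"))
    print("strcoll: h in [A-Z]?", in_range_by_strcoll("h", "A", "Z"))
```

Under these assumptions the two tests disagree about 'h', which is the difference between the strcoll line and the regex and fnmatch lines in the output.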
Comment 6 Hrunting Johnson 2007-06-27 21:29:45 EDT
Well, whatever the cause, the current matching is extremely dangerous. Today I
wiped out a bunch of files I wasn't expecting to because rm -f [A-Z] didn't
behave as it should under en_US.UTF-8, the default LANG.

Has there been any progress on this issue?
Comment 7 Tim Waugh 2007-06-28 05:34:50 EDT
No, there has been no progress.  Upstream has no interest in copying code from
glibc internals, and insists that the provided interfaces should be sufficient.
The end result of course is that glibc's behaviour differs from bash's, but the
point about the interfaces being deficient remains.

Ulrich, is it really the case that bash must copy code from the glibc internals?
Comment 8 Ulrich Drepper 2007-08-13 17:52:09 EDT
It's been a long time since I looked at bash code.  The problem is that it
insists on using its own glob and fnmatch because of the additional
functionality to interrupt the operation?

I don't think this interrupt functionality ever really worked.  If a glob call
hangs it is in readdir calls which are not interrupted.  In the very old days
glob might gobble up too much memory and slow down but this really isn't that
much of an issue anymore.  We have a lot more memory.

You should just use the glibc functions as they are.  If for some reason this
isn't possible you have to copy the glibc code.
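
To see what the C library's own fnmatch() reports for a pattern, independent of any shell, one option is to call it through ctypes. This is a minimal sketch under stated assumptions: a POSIX system where ctypes.CDLL(None) exposes the C library's fnmatch symbol, and a hypothetical helper name libc_fnmatch. The result depends on the active locale and on the C library version.

```python
import ctypes
import locale

# Assumption: on POSIX, CDLL(None) gives access to the C library's symbols.
_libc = ctypes.CDLL(None)
_libc.fnmatch.argtypes = [ctypes.c_char_p, ctypes.c_char_p, ctypes.c_int]
_libc.fnmatch.restype = ctypes.c_int


def libc_fnmatch(pattern, name):
    """Return True when the C library's fnmatch() returns 0 (a match)."""
    return _libc.fnmatch(pattern.encode(), name.encode(), 0) == 0


if __name__ == "__main__":
    try:
        # Adopt the locale from the environment, as a shell would.
        locale.setlocale(locale.LC_ALL, "")
    except locale.Error:
        pass  # Unsupported locale settings; stay in the current locale.
    for name in ("h", "H"):
        print(name, libc_fnmatch("[A-Z]", name))
```

Running it with different LC_ALL values gives a reference result to compare against what the shell expands for the same pattern.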
Comment 9 Till Maas 2008-01-05 09:05:38 EST
This bug is still present in Fedora 7
Comment 10 William Shotts 2008-02-13 15:12:41 EST
I see this problem with Ubuntu 7.10 as well as Fedora 7.

There is a discussion of the bug over at CentOS:

http://bugs.centos.org/view.php?id=1511

From this, I have to assume that this issue also affects RHEL?  Can't believe
those customers haven't raised holy heck about it.
Comment 11 Bug Zapper 2008-05-14 08:05:28 EDT
This message is a reminder that Fedora 7 is nearing the end of life. Approximately 30 (thirty) days from now Fedora will stop maintaining and issuing updates for Fedora 7. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as WONTFIX if it remains open with a Fedora 'version' of '7'.

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version prior to Fedora 7's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that we may not be able to fix it before Fedora 7 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora please change the 'version' of this bug. If you are unable to change the version, please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete. If possible, it is recommended that you try the newest available Fedora distribution to see if your bug still exists.

Please read the Release Notes for the newest Fedora distribution to make sure it will meet your needs:
http://docs.fedoraproject.org/release-notes/

The process we are following is described here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping
Comment 12 Ulrich Drepper 2008-05-14 10:24:17 EDT
The bash in F8 (3.2-20) still doesn't use the glob from glibc as far as I can
see.  Moving the version to F8.
Comment 13 Roman Rakus 2008-05-28 09:48:05 EDT
bash should be compiled with -DUSE_POSIX_GLOB_LIBRARY if you want to use glob
from glibc.
Comment 14 Roman Rakus 2008-05-29 09:29:41 EDT
Compiled with -DUSE_POSIX_GLOB_LIBRARY in rawhide bash-3.2-24.fc10
