Bug 217359

Summary: [A-Z] globbing doesn't match glob() or fnmatch() behaviour
Product: [Fedora] Fedora
Reporter: Tim Waugh <twaugh>
Component: bash
Assignee: Roman Rakus <rrakus>
Status: CLOSED RAWHIDE
QA Contact: Ben Levenson <benl>
Severity: medium
Priority: medium
Version: 8
CC: bshotts, drepper, robatino, triage, tsmetana
Hardware: All   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2008-05-29 13:29:41 UTC
Bug Blocks: 235704    
Attachments: regex_vs_strcoll.c (flags: none)

Description Tim Waugh 2006-11-27 14:26:51 UTC
Description of problem:
In certain locales (e.g. en_US and en_GB), bash's handling of [A-Z] in a file
pattern differs from that of glob() or fnmatch(): in particular bash will match
'h' even though glibc will not.

Version-Release number of selected component (if applicable):
bash-3.1-16.1

How reproducible:
100%

Steps to Reproduce:
1. touch h
2. sh -c 'LC_ALL=en_US; echo [A-Z]'
3. python -c 'from locale import *; import glob; setlocale(LC_ALL,"en_US");
print glob.glob("[A-Z]")'

Actual results:
h
[]

Expected results:
Both lists identical.
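
For comparison with the shell's result in step 2, the glibc side can also be queried directly through fnmatch(). The following is a minimal sketch, not part of the original report; the helper name fnmatch_in_locale is hypothetical, and the result for a locale such as en_US depends on the locale data installed.

```c
#include <assert.h>
#include <fnmatch.h>
#include <locale.h>
#include <stddef.h>

/* Hypothetical helper (not from the report): returns 1 if NAME matches
   PATTERN under LOCALE_NAME, 0 if it does not, and -1 if the locale
   cannot be selected or fnmatch() reports an error.  */
int
fnmatch_in_locale (const char *locale_name, const char *pattern,
                   const char *name)
{
        int r;

        assert (pattern != NULL && name != NULL);
        /* Switch the whole process to the requested locale.  */
        if (setlocale (LC_ALL, locale_name) == NULL)
                return -1;
        r = fnmatch (pattern, name, 0);
        if (r == 0)
                return 1;
        return r == FNM_NOMATCH ? 0 : -1;
}
```

Calling fnmatch_in_locale ("en_US", "[A-Z]", "h") and comparing the answer with what echo [A-Z] prints in the same locale shows whether the two implementations agree on a given system.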

Comment 1 Tim Waugh 2007-01-23 16:28:40 UTC
Another example:

echo h | LC_ALL=en_US.UTF-8 grep '[A-Z]'

This is with the grep from Fedora, so it uses re_search() directly.
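
The same check can be sketched in C. Note the swap: this uses the POSIX regcomp()/regexec() interface, not the GNU re_search() call that grep uses, so it is an approximation of the grep test under that assumption. The helper name regex_in_locale is hypothetical.

```c
#include <assert.h>
#include <locale.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if TEXT contains a match for the POSIX
   basic regular expression PATTERN under LOCALE_NAME, 0 if it does not,
   and -1 if the locale or the pattern is rejected.  */
int
regex_in_locale (const char *locale_name, const char *pattern,
                 const char *text)
{
        regex_t re;
        int r;

        assert (pattern != NULL && text != NULL);
        if (setlocale (LC_ALL, locale_name) == NULL)
                return -1;
        /* REG_NOSUB: only a yes/no answer is needed, no match offsets.  */
        if (regcomp (&re, pattern, REG_NOSUB) != 0)
                return -1;
        r = regexec (&re, text, 0, NULL, 0);
        regfree (&re);
        if (r == 0)
                return 1;
        return r == REG_NOMATCH ? 0 : -1;
}
```

regex_in_locale ("en_US.UTF-8", "[A-Z]", "h") corresponds to the grep command above; what it returns depends on the installed locale data.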

Comment 2 Tim Waugh 2007-01-23 16:42:21 UTC
And here's a demo using glob():

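#include <glob.h>
#include <stdio.h>

/* Print every path that matches the glob pattern given as argv[1].  */
int
main (int argc, char *argv[])
{
        glob_t globr;
        size_t i;               /* gl_pathc is a size_t */
        int r;

        if (argc < 2) {
                fprintf (stderr, "usage: %s PATTERN\n", argv[0]);
                return 2;
        }
        /* There is no setlocale() call, so glob() runs in the "C" locale.  */
        r = glob (argv[1], 0, NULL, &globr);
        if (r == 0) {
                for (i = 0; i < globr.gl_pathc; i++)
                        puts (globr.gl_pathv[i]);
                globfree (&globr);
        }
        return r;
}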

Try:

touch h H
./glob '[A-Z]'

Output: H


Comment 3 Tim Waugh 2007-01-23 17:04:40 UTC
Created attachment 146319 [details]
regex_vs_strcoll.c

Jakub's test case comparing regex, fnmatch and strcoll.

Comment 4 Tim Waugh 2007-02-08 13:21:04 UTC
Changing component to glibc, since that seems the most likely culprit.

Comment 5 Ulrich Drepper 2007-02-09 21:55:06 UTC
I fail to see the problem so far.  Jakub's program shows this:

A 380
H 417
h 297
Z 491
A 101
H 108
h 82
Z 126
regex [A-Z] did not match h
strcoll (A, h) -7 strcoll (h, Z) -18 means in range
fnmatch ([A-Z], h, 0) did not match

That's all perfectly fine.  The strcoll result has nothing whatsoever to do with
the range match.  strcoll uses collation weights, ranges use collation sequence
values: a completely different concept.  Not matching 'h' (note, lowercase) is
correct since if you look at the locale definition you'll see that first all
lowercase characters are described and then the uppercase.  So h is not in A-Z.
H (uppercase) of course is.
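
The distinction can be made concrete with a small sketch that puts a strcoll()-based range test next to fnmatch(). This is not Jakub's attachment; the function names are hypothetical and the pattern is fixed to [A-Z] for illustration. In the output above, the strcoll values place h between A and Z while fnmatch() reports no match, which is the disagreement this sketch detects.

```c
#include <assert.h>
#include <fnmatch.h>
#include <locale.h>
#include <string.h>

/* Returns 1 if LO <= S <= HI according to strcoll() in the current
   locale, 0 otherwise.  This compares collation order only.  */
int
strcoll_in_range (const char *lo, const char *s, const char *hi)
{
        assert (lo != NULL && s != NULL && hi != NULL);
        return strcoll (lo, s) <= 0 && strcoll (s, hi) <= 0;
}

/* Hypothetical helper: returns 1 if the strcoll()-based test and
   fnmatch() disagree about S being in [A-Z] under LOCALE_NAME,
   0 if they agree, and -1 if the locale cannot be selected.  */
int
range_methods_differ (const char *locale_name, const char *s)
{
        int by_strcoll, by_fnmatch;

        if (setlocale (LC_ALL, locale_name) == NULL)
                return -1;
        by_strcoll = strcoll_in_range ("A", s, "Z");
        by_fnmatch = fnmatch ("[A-Z]", s, 0) == 0;
        return by_strcoll != by_fnmatch;
}
```

In the "C" locale both tests use plain byte order and agree. A difference can only show up in locales whose collation order interleaves upper and lower case, and whether it does depends on the glibc version and locale data in use.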

From all I can see so far it's entirely bash's fault by not implementing
globbing correctly.  bash really must use the fnmatch code from glibc itself.

Comment 6 Hrunting Johnson 2007-06-28 01:29:45 UTC
Well, whatever the cause, the current matching is extremely dangerous. Today I
wiped out a bunch of files I wasn't expecting to because rm -f [A-Z] didn't
behave as it should under en_US.UTF-8, the default LANG.
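
One way to avoid depending on range semantics at all, offered here as a sketch and not as something proposed in this thread, is the POSIX character class [[:upper:]]. It is defined by the locale's LC_CTYPE category instead of by collation order. The helper name is_single_upper is hypothetical.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if NAME is exactly one uppercase
   character according to the current locale's LC_CTYPE, 0 otherwise.  */
int
is_single_upper (const char *name)
{
        assert (name != NULL);
        /* [[:upper:]] is a character class, not a range, so collation
           order plays no part in the match.  */
        return fnmatch ("[[:upper:]]", name, 0) == 0;
}
```

The same bracket expression can be used in a shell pattern in place of [A-Z].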

Has there been any progress on this issue?

Comment 7 Tim Waugh 2007-06-28 09:34:50 UTC
No, there has been no progress.  Upstream has no interest in copying code from
glibc internals, and insists that the provided interfaces should be sufficient.
The end result, of course, is that glibc's behaviour differs from bash's, but the
point about the interfaces being deficient remains.

Ulrich, is it really the case that bash must copy code from the glibc internals?

Comment 8 Ulrich Drepper 2007-08-13 21:52:09 UTC
It's been a long time since I looked at bash code.  The problem is that it
insists on using its own glob and fnmatch because of the additional
functionality to interrupt the operation?

I don't think this interrupt functionality ever really worked.  If a glob call
hangs it is in readdir calls which are not interrupted.  In the very old days
glob might gobble up too much memory and slow down but this really isn't that
much of an issue anymore.  We have a lot more memory.

You should just use the glibc functions as they are.  If for some reason this
isn't possible you have to copy the glibc code.

Comment 9 Till Maas 2008-01-05 14:05:38 UTC
This bug is still present in Fedora 7

Comment 10 William Shotts 2008-02-13 20:12:41 UTC
I see this problem with Ubuntu 7.10 as well as Fedora 7.

There is a discussion of the bug over at CentOS:

http://bugs.centos.org/view.php?id=1511

From this, I have to assume that this issue also affects RHEL?  Can't believe
those customers haven't raised holy heck about it.

Comment 11 Bug Zapper 2008-05-14 12:05:28 UTC
This message is a reminder that Fedora 7 is nearing the end of life. Approximately 30 (thirty) days from now Fedora will stop maintaining and issuing updates for Fedora 7. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as WONTFIX if it remains open with a Fedora 'version' of '7'.

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version prior to Fedora 7's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that we may not be able to fix it before Fedora 7 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora please change the 'version' of this bug. If you are unable to change the version, please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete. If possible, it is recommended that you try the newest available Fedora distribution to see if your bug still exists.

Please read the Release Notes for the newest Fedora distribution to make sure it will meet your needs:
http://docs.fedoraproject.org/release-notes/

The process we are following is described here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 12 Ulrich Drepper 2008-05-14 14:24:17 UTC
The bash in F8 (3.2-20) still doesn't use the glob from glibc as far as I can
see.  Moving the version to F8.

Comment 13 Roman Rakus 2008-05-28 13:48:05 UTC
bash should be compiled with -DUSE_POSIX_GLOB_LIBRARY if you want to use glob
from glibc.

Comment 14 Roman Rakus 2008-05-29 13:29:41 UTC
Compiled with -DUSE_POSIX_GLOB_LIBRARY in rawhide bash-3.2-24.fc10