Bug 73394
Summary: | default LC_COLLATE should be C | ||
---|---|---|---|
Product: | [Retired] Red Hat Linux | Reporter: | seth arnold <sarnold> |
Component: | distribution | Assignee: | Bill Nottingham <notting> |
Status: | CLOSED NOTABUG | QA Contact: | Brock Organ <borgan> |
Severity: | medium | Docs Contact: | |
Priority: | medium | ||
Version: | 7.3 | CC: | ali, ed, jkeating, mitr, rvokal |
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | All | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Last Closed: | 2002-09-04 03:21:14 UTC | Type: | --- |
Description
seth arnold
2002-09-03 23:39:53 UTC
No. The sorting order in the locales is the standard order used by people in that locale for non-computer sorting for much longer than computers have been around.

I'd consider it an RFE. I don't know how people were doing the sorts before computers were around, but I'm pretty sure sysadmins weren't writing scripts before computers were around (or not long before) ;-) I, for one, would expect "C" or "POSIX" to be the default. And I've seen plenty of things written expecting as much. Even in RH 7.3 /etc/init.d/innd:

    # INN uses too many un-checked shell scripts
    unset LANG
    unset LC_COLLATE

It's my opinion that people who ~want~ it to behave as LC_COLLATE=en_US (or whatever locale) can set that explicitly. Like the xinetd startup script does.

~I~ didn't find a place on the Open Group site which specifies/recommends LC_COLLATE settings. Hrmm. I just found lots of docs on how locales should behave but not defaults. { There were some bits about POSIX being the fall-back.. }

In any case, I think Seth has a valid concern. I'm guessing more people will run into trouble this way (en_US or what-not) and, in some cases, run into trouble that backups will have to get them out of. I'm also guessing more people will be unhappy about this when it happens.

One more note... right now Red Hat is targeting the data center of the enterprise, right? A lot of these places are moving from Solaris, AIX, IRIX, Hockey-PUX, etc. At least at Pratt & Whitney, where I currently work, a quick survey of non-Red Hat boxen shows "C" and "POSIX" more often than not. Even the Cobalt boxes (based on RH 6.x) are "POSIX".. I hope Red Hat will re-consider the default behavior.

Thanks much,
-Ali

I agree with the original poster. This is the behaviour I've come to expect from my work with Perl regexes, and other unixen. If I *wanted* [[:ascii:]] or [[:alpha:]] I'd say so explicitly. I do NOT expect [a-z] to arbitrarily include [A-Z] unless I ask *explicitly* for it.
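
For anyone skimming the thread, here is a minimal sketch of what "saying so explicitly" can look like. It assumes a POSIX grep; the sample names are invented for illustration and are not taken from any report in this bug. The bracket classes ask for a kind of character, so the result does not hinge on where a locale happens to sort uppercase letters relative to 'a' and 'z'.

```shell
#!/bin/sh
# Sketch only: the sample names below are invented for illustration.
names='alpha
Beta
gamma
Zeta'

# [[:lower:]] and [[:upper:]] are POSIX character classes. They select by
# character type instead of by position in the locale's collation order.
# LC_ALL=C is set per command here to keep the demonstration deterministic.
lower_count=$(printf '%s\n' "$names" | LC_ALL=C grep -c '^[[:lower:]]')
upper_count=$(printf '%s\n' "$names" | LC_ALL=C grep -c '^[[:upper:]]')

echo "lowercase-initial: $lower_count, uppercase-initial: $upper_count"
```

The same classes can also be used in shell globs, which is probably the closest explicit equivalent of what [a-z] is being asked to mean in this report.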
Setting the default LC_COLLATE="C" in /etc/sysconfig/i18n only makes sense. Those who WANT the non-standard-unix behaviour can easily set this otherwise, but for those of us who have come to expect a unix-like OS to behave like one, having this not be the default (and not being warned about it) is a considerable annoyance. I, too, hope that Red Hat will consider the default behaviour.

Bill, that is the problem: using "non-computer sorting" on a computer is highly surprising. Thanks.

Wow, I would be really upset if rm [a-z]* deleted stuff that started w/ an uppercase letter. This would really ruin the expected results. What is the technical reasoning for having [a-z] include [A-Z] as well?

Go read the standards. Range expressions (such as [a-z]) are explicitly undefined outside the C (POSIX) locale (SUSv3, XBD6 (Base Definitions), section 9.3.5 RE Bracket Expression, paragraph 7). This discussion has already been beaten at the Austin working group and elsewhere. If you are a human, LC_COLLATE=en_US makes sense. If you are a script, set LC_ALL=C at the beginning and be happy.

I've always considered bash to be a constant script, with me feeding it line by line. Since the bash prompt doesn't take commands such as: "Please go and find every html file and convert the permissions so that the world can read, group can read, and the owner can read and write.", I don't expect other aspects of the bash prompt to be 'human' as well. As I stated above, I, along with most of the development community I am associated with, expect things to be sorted as the C or POSIX standard state. [a-z] does _not_ include A-Z. With all the work that Red Hat does to preserve standards, I really feel that the ball was dropped on this one. It just adds one more step that one has to do to make Red Hat a sane and usable OS again.
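
The "set LC_ALL=C at the beginning" advice given earlier in this thread could look roughly like the sketch below. The temporary directory and the file names are hypothetical, made up purely for the demonstration. It only shows the C-locale side; what the same glob matches under en_US depends on the shell and libc in use, so nothing is claimed about that here.

```shell
#!/bin/sh
# Sketch: pin the locale at the top of a script so that [a-z] follows
# byte order, as the C (POSIX) locale defines it.
# The directory and file names are invented for illustration.
LC_ALL=C
export LC_ALL

tmpdir=$(mktemp -d)
touch "$tmpdir/alpha" "$tmpdir/Beta" "$tmpdir/gamma" "$tmpdir/Zeta"

# In the C locale this glob cannot pick up names that start with an
# uppercase letter, so Beta and Zeta are left alone.
matched=$(cd "$tmpdir" && echo [a-z]*)
echo "$matched"

rm -rf "$tmpdir"
```

Trying the glob with echo before handing it to rm is a cheap way to see what a pattern expands to on a given box.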