Bug 28414
Summary: | Spanish locale at glibc seems to be bad | | |
---|---|---|---|
Product: | [Retired] Red Hat Linux | Reporter: | Carlos Perelló Marín <carlos> |
Component: | glibc | Assignee: | Jakub Jelinek <jakub> |
Status: | CLOSED NOTABUG | QA Contact: | Aaron Brown <abrown> |
Severity: | high | Priority: | medium |
Version: | 7.0 | CC: | fweimer |
Hardware: | i386 | OS: | Linux |
Doc Type: | Bug Fix | Last Closed: | 2001-02-20 13:18:49 UTC |
Description
Carlos Perelló Marín
2001-02-20 12:01:44 UTC
No, glibc (and sort) is correct on this. LC_COLLATE=es_ES@euro sort sorts the way things are sorted in a Spanish dictionary, not the way they are ordered in ISO-8859-15. Use LC_ALL=C sort if you want ASCII sorting. The data you put above should be sorted the same way under es_ES as under e.g. en_US. Just think of it as if -, . and ; were replaced with nothing and the result were then sorted; you'd get the same result as with the es_ES collating sort.

Please, if you read the link to the table that I sent you (http://czyborra.com/charsets/iso8859.html#ISO-8859-1), you can see that the "-" character must go BEFORE the "0" (zero) character, so the sort command is sorting incorrectly (according to that table). I know that the sort I get with LC_ALL=en_US is not the same as the LC_ALL=es_ES one, but I also know that the LC_ALL=es_ES sort result is not the correct one for es_ES (or es_ES@euro). Please, have you read Derek Tattersall's answer? Thanks.

Sorry, but the table has nothing to do with this issue. From the table you can see that strcmp("-B", "0A") < 0, which is not the same as strcoll("-B", "0A") in most locales (including Spanish). i18n sorting is not about comparing character values; it is a complex set of rules. Most West European locales use ISO/IEC TR 14652 for this. Please look in a printed Spanish dictionary and you'll find that, e.g., a hyphen is not treated as a separate letter when the letter characters differ, as in this order:

aa
a-b
ac

I don't know whom Derek talked to about this, but the thing is really this: if you want to sort the way Unix sorted from the '70s until several years ago, use LC_ALL=C sort; if you want to sort the way people have sorted things for centuries, use your own locale.
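
The difference between the two orderings discussed above can be illustrated with a short Python sketch. This is only an approximation under stated assumptions: `ignore_punctuation_key` and the `IGNORED` set are hypothetical helpers that mimic the "replace -, . and ; with nothing" rule of thumb from the comment, not the real glibc collation tables, and `collate_sign` simply returns `None` when the requested locale is not installed. What `strcoll` returns under `es_ES@euro` depends on the installed glibc locale data, so no result is claimed for it here.

```python
import locale

# Characters the comment suggests mentally deleting before comparing
# (an illustrative assumption, not the actual locale definition).
IGNORED = "-.;"


def byte_order_sign(a, b):
    """Sign of a strcmp-style comparison on ISO-8859-1 byte values."""
    ka, kb = a.encode("latin-1"), b.encode("latin-1")
    return (ka > kb) - (ka < kb)


def ignore_punctuation_key(s):
    """Hypothetical sort key: drop IGNORED characters, break ties on the raw string."""
    return ("".join(ch for ch in s if ch not in IGNORED), s)


def collate_sign(a, b, locale_name):
    """Sign of strcoll(a, b) under locale_name, or None if it is unavailable."""
    saved = locale.setlocale(locale.LC_COLLATE)  # query the current setting
    try:
        locale.setlocale(locale.LC_COLLATE, locale_name)
    except locale.Error:
        return None
    try:
        result = locale.strcoll(a, b)
        return (result > 0) - (result < 0)
    finally:
        locale.setlocale(locale.LC_COLLATE, saved)  # always restore


words = ["ac", "a-b", "aa"]
print(sorted(words))                              # ['a-b', 'aa', 'ac']  (code-point order)
print(sorted(words, key=ignore_punctuation_key))  # ['aa', 'a-b', 'ac']  (punctuation ignored)
print(byte_order_sign("-B", "0A"))                # -1, i.e. strcmp("-B", "0A") < 0
print(collate_sign("-B", "0A", "C"))              # -1, the C locale compares by code value
print(collate_sign("-B", "0A", "es_ES@euro"))     # locale-dependent; None if not installed
```

The first two lines of output reproduce the aa / a-b / ac example: plain code-point sorting puts the hyphenated word first, while ignoring punctuation places it between its neighbours, as a dictionary would.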