Description of problem:
sort -n -t, does not work as expected (in the default, i.e. en_US.UTF-8 locale).

Version-Release number of selected component (if applicable):
coreutils-6.9-3.fc7

How reproducible:
100%

Steps to Reproduce:
1. keep the default locales (LC_COLLATE and LC_NUMERIC) value of en_US.UTF-8
2. sort -n -t, -k1 <<'EOF'
2101,:4AgE3<G4RNDP`
21012,:A0QIX6AI10gMP
2101,2IJIETPY=g<10@
21012,V8:AACI4TD925@
21014,:1MG<hEb@AIhU`
2101,4H@38`5ELC66M`
2101,4h>HM812P4820P
21014,V8:AACI4TD925@
2101,5AHBVEQW@dUGE@
EOF

Actual results:
2101,:4AgE3<G4RNDP`
21012,:A0QIX6AI10gMP
2101,2IJIETPY=g<10@
21012,V8:AACI4TD925@
21014,:1MG<hEb@AIhU`
2101,4H@38`5ELC66M`
2101,4h>HM812P4820P
21014,V8:AACI4TD925@
2101,5AHBVEQW@dUGE@

Expected results:
2101,4H@38`5ELC66M`
2101,4h>HM812P4820P
2101,5AHBVEQW@dUGE@
2101,:4AgE3<G4RNDP`
21012,:A0QIX6AI10gMP
21012,V8:AACI4TD925@
21014,:1MG<hEb@AIhU`
21014,V8:AACI4TD925@

Additional info:
Locales from glibc-common-2.6-4. When LC_ALL=C is set, sorting works correctly. Apparently sort ignores the comma inside the number values (though in the en_US locale it probably should only be used as _thousands_ separator, so 2101,4 should not be a valid number in English). Also sort does not handle the "-t," separator argument correctly, because it seems the value after the comma is still being included in the sorting key.
LC_COLLATE seems to be irrelevant; LC_NUMERIC=en_US.UTF-8 appears to be responsible for the problem. The output comes out unsorted only with the en_US locales (en_US, en_US.UTF-8 and en_US.iso885915 all behave the same) AND with the comma as the separator. All other locales and separators I checked seem to be OK. I will try to dig out more by debugging.
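One way to check the LC_NUMERIC theory is to ask each locale which thousands separator it defines. The sketch below uses the standard locale(1) utility; the helper name show_thousands_sep is only illustrative, and the en_US.UTF-8 call assumes that locale is installed on the machine.

```shell
# Hypothetical helper: print the thousands separator a locale defines.
# $1 is a locale name such as C or en_US.UTF-8.
show_thousands_sep() {
    LC_ALL="$1" locale thousands_sep
}

# The C locale defines no thousands separator, so this prints an empty line.
show_thousands_sep C

# Assumption: en_US.UTF-8 is installed. If it is not, locale(1) may warn
# and fall back to the C locale values.
show_thousands_sep en_US.UTF-8 2>/dev/null || true
```

If the second call prints a comma, that would be consistent with sort -n treating the comma as part of the number under en_US.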
Thanks for the report, but what you're seeing is the required behavior. The problem is that by using -k1 you're telling it to use the entire line as the key, when you really want to use just the first column. Use -k1,1 instead, and it works the way you expect.
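As a minimal sketch of the suggested fix: with -k1 the key runs to the end of the line, so under en_US (where the comma is the thousands separator) a line such as 2101,4H... is read as 21014, which is consistent with the order shown under "Actual results". Ending the key at field 1 avoids that. The function name sort_by_first_field is only illustrative; the sort options are the ones from this report.

```shell
# Hypothetical helper: numeric sort on the first comma-separated field only.
sort_by_first_field() {
    # -n     compare keys numerically
    # -t,    use ',' as the field separator
    # -k1,1  the key starts AND ends at field 1 (the separator is excluded)
    sort -n -t, -k1,1
}

# Same input as in the report.
sort_by_first_field <<'EOF'
2101,:4AgE3<G4RNDP`
21012,:A0QIX6AI10gMP
2101,2IJIETPY=g<10@
21012,V8:AACI4TD925@
21014,:1MG<hEb@AIhU`
2101,4H@38`5ELC66M`
2101,4h>HM812P4820P
21014,V8:AACI4TD925@
2101,5AHBVEQW@dUGE@
EOF
```

All 2101 lines now come before the 21012 lines, which come before the 21014 lines. Lines with equal keys are ordered by sort's last-resort whole-line comparison, so their relative order still depends on LC_COLLATE.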
OK, maybe the sort(1) manpage should be fixed then. Currently it says:

       -k, --key=POS1[,POS2]
              start a key at POS1, end it at POS2 (origin 1)

Maybe add something like "Without POS2, the whole part of the line starting at POS1 to the end of line is used." there.
The suggested manpage improvement has been added in RAWHIDE (coreutils-6.10-2.fc9). Closing this bug as NOTABUG; I will also backport the patch in the next coreutils update for F7/F8.