From Bugzilla Helper:
User-Agent: Mozilla/4.77 [en] (X11; U; Linux 2.2.16-3 i686)

For specific input files (see attached file), the sort utility will go into an infinite loop. I first noticed this problem when sorting a 404MB file, and managed to reproduce the problem with a smaller input file using the "-S 4" option. When I set LC_ALL=C, the problem does not appear.

Reproducible: Always

Steps to Reproduce:
1. Execute "sort -S 4 test1 -o test1.output" (program hangs--you can examine temp files)
2. Execute "LC_ALL=C sort -S 4 test1 -o test1.output" (program is slow, but finishes)

Actual Results: For step 1, nothing happens. For step 2, the sort utility finishes as expected.

Expected Results: The sort utility should finish successfully in both cases.

I get the following when I interrupt the sort utility in gdb:

    (gdb) cont
    Continuing.

    Program received signal SIGINT, Interrupt.
    0x400a33a9 in strcoll () from /lib/libc.so.6
    (gdb) where
    #0  0x400a33a9 in strcoll () from /lib/libc.so.6
    #1  0x0804ef62 in memcoll (s1=0x8057000 "0-0-0-0-0-0-0-0-0-0.COM.", s1len=25, s2=0x8056f78 "00000-00000.COM.", s2len=17) at memcoll.c:44
    #2  0x0804b73a in compare (a=0x8057060, b=0x8056fd8) at sort.c:1410
    #3  0x0804be3e in mergefps (fps=0xbfffd6c0, nfps=16, ofp=0x8056e08, output_file=0x8057280 "/tmp/sortnm2z9d") at sort.c:1583
    #4  0x0804c7c4 in merge (files=0x8056980, nfiles=288, ofp=0x8056810, output_file=0xbffffc04 "zxcv") at sort.c:1758
    #5  0x0804cd45 in sort (files=0x80567ac, nfiles=-1, ofp=0x8056810, output_file=0xbffffc04 "zxcv") at sort.c:1863
    #6  0x0804e231 in main (argc=6, argv=0xbffffabc) at sort.c:2459
    #7  0x4003a237 in __libc_start_main () from /lib/libc.so.6
    (gdb) Quit
Created attachment 15664: Test input file for sort failure
The problem is still present in textutils-2.0.13-1.
Seems to be a problem with the strcoll() function. Consider the following program:

    #include <stdio.h>
    #include <string.h>
    #include <locale.h>

    int main( int argc, char *argv[] )
    {
        char *t1 = "0-0-0-0-0-0-0-0-0-0.COM";
        char *t2 = "00000-00000.COM";

        setlocale( LC_ALL, "" );
        printf( "strcoll( \"%s\", \"%s\" ) = %d\n", t1, t2, strcoll( t1, t2 ) );
        printf( "strcoll( \"%s\", \"%s\" ) = %d\n", t2, t1, strcoll( t2, t1 ) );
    }

Yields the output:

    strcoll( "0-0-0-0-0-0-0-0-0-0.COM", "00000-00000.COM" ) = 1
    strcoll( "00000-00000.COM", "0-0-0-0-0-0-0-0-0-0.COM" ) = 4

So when the sort utility is trying to merge the temporary files together in mergefps(), it keeps swapping the same two entries corresponding to the test samples above, as the first argument will ALWAYS be considered "greater" than the second one, regardless of the order they are passed.

Will go digging through glibc unless someone else finds this one first...
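The property being violated here is antisymmetry: for a usable ordering, the sign of strcoll(a, b) should be the opposite of the sign of strcoll(b, a). A minimal sketch of a check for that property follows. The helper names sign_of, collation_is_antisymmetric, and report_pair are made up for this illustration and are not part of glibc or textutils.

```c
#include <locale.h>
#include <stdio.h>
#include <string.h>

/* Reduce a comparison result to -1, 0, or 1 so that only the
   direction of the ordering is compared, not the magnitude. */
int sign_of(int v)
{
    return (v > 0) - (v < 0);
}

/* Hypothetical helper: return 1 if strcoll() orders a and b
   consistently in both directions, that is
   sign(strcoll(a, b)) == -sign(strcoll(b, a)); return 0 otherwise. */
int collation_is_antisymmetric(const char *a, const char *b)
{
    return sign_of(strcoll(a, b)) == -sign_of(strcoll(b, a));
}

/* Hypothetical helper: print one diagnostic line for the pair and
   return the result of the antisymmetry check. */
int report_pair(const char *a, const char *b)
{
    int ok = collation_is_antisymmetric(a, b);

    printf("%s: \"%s\" vs \"%s\"\n",
           ok ? "consistent" : "INCONSISTENT", a, b);
    return ok;
}
```

To use it, call setlocale(LC_ALL, "") first so the environment's locale is in effect, then call report_pair("0-0-0-0-0-0-0-0-0-0.COM", "00000-00000.COM"). With the results reported above (1 and 4, both positive), the check would return 0, since both directions claim the first argument is greater. A merge step that relies on such a comparator has no stable order to converge on.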
I've checked a patch into the CVS archive to fix this problem. The fixed version will be in 2.2.3.
Can this be included if an errata RPM is made of glibc-2.2.2? This affects lots of other utilities in addition to "sort", such as "ls".
It's fixed in rawhide and will be in the glibc errata we'll release in a while.