i386 (all?)
Program: sort
Environ: any
Sporadic -- no. Systematic -- yes.
Possible system crash -- no.
Possible user program crash -- yes.

Description:
Merging with FIFOs as input files always opens a temporary file, which can crash the program and fill up the file system.

Testbed:
    mkfifo /tmp/aaa /tmp/aab
    awk 'BEGIN{for(i=0;i<10000000;i++) print i;}' >> /tmp/aab &
    awk 'BEGIN{for(i=0;i<10000000;i++) print i;}' >> /tmp/aaa &
    sort -m -n /tmp/aaa /tmp/aab >/dev/null &

The actual problem is merging very big files, usually when the result is not saved to disk. It seems that RH 6.x did not have this problem.

Expected:
-- normal termination, no result.

Actual results:
-- No space on filesystem /tmp/

Lateral effects:
-- can crash any program that uses /tmp.
-- performance impact (file written twice).
-- if /tmp/ is located on another filesystem, the bug can lead to loss of data.

Probable location:
Buggy check for same file in the input list.
So, put simply: 'sort uses temporary file'?
Can you please explain in more detail where you think the bug is? I'm not sure what you mean by 'Buggy check for same file in the input list.'. Thanks.
Sorry for the delay. I think that the error is in the function first_same_file(), in combination with the temporary-file creation logic in merge(), when pipes are used.

The reason for the bug: when a named pipe (pp) is used as the output of sort, there might be feedback (that is, something like cat pp > entry_of_sort can exist), and so, according to /bin/sort, the result must be saved in a temporary file in order to avoid the immediate feedback that would block the program.

This is not the behaviour the user expects. Users usually treat pipes as a dynamic structure and expect the result to be written directly into the pipe. It is possible to construct some rather rare examples where data feedback via named pipes is useful, but the "normal" use of pipes is as a unidirectional data flow. Besides, anyone who needs the buffering can add a buffer after sort, for example

    sort | buffer.sh > pp

and the buffer will behave exactly like the current version of sort.

buffer.sh:

    #!/bin/csh
    while (1)
        cat > tmpfile
        cat tmpfile
    end

By contrast, with the current logic of sort it is not possible to avoid the temporary file. A more sophisticated approach would be to check for the existence of a data-flow loop, but I think that too much sophistication in sort can only do harm. At the very least, the moment at which to make the check cannot be defined.
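For illustration, a one-shot variant of the buffer.sh idea in POSIX sh might look like the sketch below. It is only a sketch under stated assumptions: the function name buffer and the variables buf_tmp and buf_status are hypothetical, mktemp(1) is assumed to be available, and unlike the csh loop it handles a single stream and then returns. The point is just that store-then-forward behaviour can be added outside sort by whoever needs it.

```shell
#!/bin/sh
# One-shot store-and-forward buffer (illustrative sketch, not part of sort).
# Assumes mktemp(1) is available; the name "buffer" is hypothetical.
buffer() {
    # Create a private temporary file; give up if that fails.
    buf_tmp=$(mktemp) || return 1
    # Read all of stdin until EOF before writing anything.
    cat > "$buf_tmp"
    # Only now forward the stored data to stdout.
    cat "$buf_tmp"
    buf_status=$?
    # Remove the temporary file and report whether forwarding succeeded.
    rm -f "$buf_tmp"
    return "$buf_status"
}

# Usage (pp is the named pipe from the comment above):
#   sort -m -n /tmp/aaa /tmp/aab | buffer > pp
```

With a helper like this in the pipeline, the temporary file is created only when the user asks for it, which is the trade-off argued for above.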
This is the sort of thing that needs to be addressed by the upstream maintainers of coreutils.