Bug 102208 - Bug in sort when merging FIFO's
Summary: Bug in sort when merging FIFO's
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Linux
Classification: Retired
Component: coreutils
Version: 8.0
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Tim Waugh
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2003-08-12 16:08 UTC by Kostadin Koruchev
Modified: 2007-04-18 16:56 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2004-12-07 11:42:25 UTC
Embargoed:



Description Kostadin Koruchev 2003-08-12 16:08:28 UTC
i386 (all?)
   
Program:
 sort

Environ: any

Sporadic -- no. Systematic -- yes.

Possible system crash -- no. 

Possible user program crash -- yes.

Description:

  Merging with FIFOs as input files always opens
  a temporary file, which can crash the
  program and fill up the file system.

Testbed:

 mkfifo /tmp/aaa /tmp/aab
 awk 'BEGIN{for(i=0;i<10000000;i++) print i;}' >> /tmp/aab &
 awk 'BEGIN{for(i=0;i<10000000;i++) print i;}' >> /tmp/aaa &
 sort -m -n /tmp/aaa /tmp/aab >/dev/null &

 Well, the actual problem is merging very big files, usually when the result is
 not saved on the disk.

 It seems that RH 6.x had no problems.

Expected:

 -- normal termination, no result.

Actual results:

 -- No space left on the /tmp/ filesystem

Side effects:

 -- can crash any program that uses /tmp.
 -- performance impact (file written twice).
 -- if /tmp/ is located on another filesystem the bug can
    lead to loss of data.
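
One way to limit the effects listed above, sketched here under the assumption that GNU sort is in use: its -T option (or the TMPDIR environment variable) redirects temporary files away from /tmp. This does not avoid the temporary file itself. The scratch directory and file names below are hypothetical, and the line counts are scaled down from the testbed so the example finishes quickly.

```shell
# Scaled-down variant of the testbed. All paths here are hypothetical.
work=$(mktemp -d)       # holds the FIFOs and the merged result
scratch=$(mktemp -d)    # stand-in for a directory on a filesystem with free space
mkfifo "$work/aaa" "$work/aab"

# Two producers, each writing an already sorted run of 1000 numbers.
awk 'BEGIN{for(i=0;i<1000;i++) print i;}' > "$work/aaa" &
awk 'BEGIN{for(i=0;i<1000;i++) print i;}' > "$work/aab" &

# -T tells GNU sort where to create any temporary files it needs.
sort -m -n -T "$scratch" "$work/aaa" "$work/aab" > "$work/merged"
wait

merged_lines=$(wc -l < "$work/merged" | tr -d ' ')
echo "merged lines: $merged_lines"
rm -rf "$work" "$scratch"
```

With the real testbed sizes, the scratch directory would have to sit on a filesystem large enough to hold the whole merged output.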

Probable location:

  Buggy check for same file in the input list.

Comment 1 Tim Waugh 2003-08-12 16:20:11 UTC
So, put simply: 'sort uses temporary file'?

Comment 2 Tim Waugh 2003-10-03 09:22:19 UTC
Can you please explain in more detail where you think the bug is?  I'm not sure
what you mean by 'Buggy check for same file in the input list.'.  Thanks.

Comment 3 Kostadin Koruchev 2003-10-14 11:37:55 UTC
Sorry for the delay.

I think that the error is in the function first_same_file() in
combination with the logic of temporary file creation in merge(),
when using pipes.
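
To illustrate what a same-file check of this kind looks like, here is a minimal shell sketch. It is not the coreutils implementation of first_same_file(); the function name output_is_input is hypothetical, and it assumes a shell whose test builtin supports -ef (same device and inode), as bash and dash do.

```shell
# Hypothetical helper: succeed if the output path refers to the same file
# (same device and inode) as any of the input paths.
output_is_input() {
  out=$1
  shift
  for input in "$@"; do
    if [ "$out" -ef "$input" ]; then
      return 0
    fi
  done
  return 1
}

# Example: a FIFO named as both input and output is reported as the same file.
dir=$(mktemp -d)
mkfifo "$dir/pp"
: > "$dir/data"
if output_is_input "$dir/pp" "$dir/data" "$dir/pp"; then
  echo "output is also an input"
fi
```

A check like this only sees the case where the output and an input are literally the same file; it cannot see an indirect loop in which another process copies the output pipe back into an input.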

The reason for the bug:

When named pipes (pp) are used as the output of sort,
there might be feedback (for example, cat pp > entry_of_sort may exist),
and the result, according to /bin/sort,
must be saved in a temporary file in order to avoid the immediate feedback
that would block the program.

This behaviour is not what the user expects. Users usually
treat pipes as a dynamic structure and expect the result to
be written directly into the pipe.

It is possible to construct some rather rare examples where data
feedback via named pipes is useful, but the "normal" usage of pipes is
as a unidirectional dataflow. After all, it is also possible
to add a buffer to sort, for example sort | buffer.sh > pp, and the buffer
will behave exactly as the current version of sort does.

buffer.sh:

   #!/bin/csh
   while(1)
     cat > tmpfile
     cat tmpfile
   end

On the contrary, it is not possible to avoid the temporary file with the current
logic of the sort program.

A more sophisticated approach would be to check for the existence of a
data flow loop,
but I think that too much sophistication in sort can only bring harm.
At the very least, the moment at which to perform the check cannot be defined.



Comment 4 Tim Waugh 2004-12-07 11:42:25 UTC
This is the sort of thing that needs to be addressed by the upstream maintainers
of coreutils.

