Bug 145124 (IT#61189_IT#60427)

Summary: Bash appears to be mishandling sub-processes that use recycled PIDs
Product: Red Hat Enterprise Linux 3
Component: bash
Version: 3.0
Hardware: i386
OS: Linux
Status: CLOSED ERRATA
Severity: medium
Priority: medium
Reporter: Steve Conklin <sconklin>
Assignee: Tim Waugh <twaugh>
QA Contact: Ben Levenson <benl>
Docs Contact:
CC: cward, kmori, tao
Target Milestone: ---
Target Release: ---
Doc Type: Bug Fix
Last Closed: 2005-05-19 12:53:20 UTC
Bug Blocks: 132991, 191463    
Attachments:
  testcase logs (flags: none)

Description Steve Conklin 2005-01-14 16:17:00 UTC
Description of problem:

The customer runs a Bash script that sleeps and executes tests in a loop.
Occasionally, sleep terminates prematurely, which causes errors.

From IT#60427:
I found an interesting fact in the strace log collected during my
reproduction. Whenever the problem happened, the affected process had
been assigned a PID that a background job process had used earlier.
For instance, PID 29255 was first assigned to "test_sub.sh" in the
test case and was later assigned to "sleep". The sleep process was
then treated as a background job, and as a result the problem
occurred. The same applies to the grep issue.

29255 21:28:53.966341 execve("/mnt/test/60427/testing/test_sub.sh",
["/mnt/test/60427/testing/test_sub"...], [/* 31 vars */] <unfinished ...>
29255 22:31:41.371266 execve("/bin/sleep", ["sleep", "10"], [/* 31
vars */] <unfinished ...>
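
The mechanism described above can be pictured with a toy model. The
sketch below is not bash's job-control code and not the eventual fix;
the table layout and the helper names (record_job, forget_job,
lookup_kind) are hypothetical. It only shows how a record left behind
for a finished background job could cause a later process with the same
PID to be classified as that background job.

```shell
#!/bin/sh
# Toy model only: this is NOT bash's real job-control code.
# job_table holds one "PID KIND COMMAND" record per line (hypothetical layout).
job_table=""

record_job() {   # usage: record_job PID KIND COMMAND
    job_table=$(printf '%s\n%s %s %s\n' "$job_table" "$1" "$2" "$3" | awk 'NF')
}

forget_job() {   # usage: forget_job PID  (drops every record for that PID)
    job_table=$(printf '%s\n' "$job_table" | awk -v pid="$1" 'NF && $1 != pid')
}

lookup_kind() {  # usage: lookup_kind PID  (prints KIND of the first matching record)
    printf '%s\n' "$job_table" | awk -v pid="$1" '$1 == pid { print $2; exit }'
}

# Case 1: the background job has exited, but its record was never dropped.
# The kernel then hands the same PID to a foreground "sleep".
record_job 29255 background test_sub.sh
record_job 29255 foreground sleep
stale_result=$(lookup_kind 29255)   # the stale record is found first

# Case 2: the stale record is dropped before the PID is recorded again.
forget_job 29255
record_job 29255 foreground sleep
fresh_result=$(lookup_kind 29255)

echo "stale record kept:    sleep is seen as $stale_result"
echo "stale record dropped: sleep is seen as $fresh_result"
```

In this model the first lookup reports "background" and the second
reports "foreground", which mirrors the symptom in the strace excerpt.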


Version-Release number of selected component (if applicable):
bash-2.05b-29


How reproducible:
The customer has supplied scripts that reproduce the problem; they are
attached to issue 61189.

Steps to Reproduce:
1. Run the test scripts.

Actual results:
Every now and then, sleep returns prematurely.

Expected results:
Sleep should not return prematurely.
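
One way a script could notice the symptom is to compare wall-clock time
against the requested sleep duration. The following is a minimal sketch
under that assumption; checked_sleep is a hypothetical helper, not part
of the customer's scripts, and it relies on "date +%s", so it only has
whole-second resolution.

```shell
#!/bin/sh
# checked_sleep SECONDS
# Sleeps for a whole number of seconds and returns 1 if less wall-clock
# time than requested has passed when the shell regains control.
checked_sleep() {
    want=$1
    start=$(date +%s)
    sleep "$want"
    status=$?
    now=$(date +%s)
    elapsed=$((now - start))
    if [ "$elapsed" -lt "$want" ]; then
        echo "premature: sleep $want came back after ${elapsed}s (status $status)" >&2
        return 1
    fi
    return 0
}

# Example use inside a test loop (left commented out so that sourcing
# this file does not block):
# while :; do
#     checked_sleep 10 || echo "sleep was cut short at $(date)" >> sleep_errors.log
# done
```

Logging the exit status alongside the elapsed time may help tell an
interrupted sleep apart from a shell that stopped waiting too early.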

Additional info:

Comment 1 Steve Conklin 2005-01-14 16:19:18 UTC
This has been reproduced on U1 and U2. I have requested that it also be
reproduced on U4.

Comment 2 Steve Conklin 2005-01-17 15:10:33 UTC
This is reported to have been reproduced on U4. Based on the way Bash performs
fork/exec, I think that the re-use of PIDs must be happening outside BASH.

Comment 5 Keiichi Mori 2005-01-18 07:00:57 UTC
If you run several test cases at the same time (changing the log file name
in each test case, of course), the problem occurs sooner, because PIDs are
reused more quickly. With 5 test cases running at the same time, the problem
reproduced in about one hour.
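
The parallel run described here could be scripted roughly as follows.
This is only a sketch: run_parallel, the LOG_DIR variable, and the
testcase_N.log naming are assumptions made for illustration, and
./testcase.sh stands in for whatever the real test script is called.

```shell
#!/bin/sh
# run_parallel COUNT COMMAND [ARGS...]
# Starts COUNT copies of COMMAND in the background, each writing to its
# own log file under LOG_DIR (default: current directory), then waits
# for all of them to finish.
run_parallel() {
    rp_count=$1
    shift
    rp_i=1
    while [ "$rp_i" -le "$rp_count" ]; do
        # A separate log per copy, so the runs do not overwrite each other.
        "$@" > "${LOG_DIR:-.}/testcase_${rp_i}.log" 2>&1 &
        rp_i=$((rp_i + 1))
    done
    wait
}

# Example (hypothetical script name), five copies at once:
# run_parallel 5 ./testcase.sh
```

More copies mean more short-lived processes per second, so the PID
counter wraps around sooner and a recycled PID is seen earlier.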



Comment 6 Tim Waugh 2005-01-18 11:39:07 UTC
From comment #2 it seems that this isn't a bash problem after all?

I'm not quite sure what this is about.  Could someone please explain it in more
detail?  What signal is delivered?  What does strace say at the time? (The
strace fragment in the original description only shows execve and no signals.)


Comment 14 Tim Waugh 2005-01-20 12:37:32 UTC
I understand now.  Thanks for the detailed strace.  I am now analysing it.

Comment 15 Tim Waugh 2005-01-25 15:50:48 UTC
This also seems to happen with bash-3.0.

Comment 16 Tim Waugh 2005-01-27 15:04:23 UTC
Reported upstream with simplified test case.

Comment 17 Tim Waugh 2005-01-28 13:16:17 UTC
Please try this package:

  ftp://people.redhat.com/twaugh/tmp/bash/bash-2.05b-41.2.i386.rpm


Comment 25 Dennis Gregorovic 2005-05-19 12:53:20 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2005-437.html