Bug 49911 - bash 2.04 has runaway process bug
Status: CLOSED CURRENTRELEASE
Product: Red Hat Linux
Classification: Retired
Component: bash
Version: 7.1
Hardware: All
OS: Linux
Priority: medium
Severity: high
Assigned To: Tim Waugh
QA Contact: Aaron Brown
URL: http://www.foogod.com/autonet/
Reported: 2001-07-24 21:00 EDT by Alex Stewart
Modified: 2007-04-18 12:35 EDT (History)
Doc Type: Bug Fix
Last Closed: 2002-09-25 21:55:11 EDT


Description Alex Stewart 2001-07-24 21:00:53 EDT
From Bugzilla Helper:
User-Agent: Mozilla/4.72 [en] (X11; U; Linux 2.2.14-5.0 i586)

Description of problem:
bash 2.04 as shipped with RHL 7.1 has a bug which can cause scripts to
mysteriously spin out of control, consuming large amounts of CPU time
(looping endlessly on the wait4 system call).

A new version of bash (2.05) is available which fixes this problem.

This bug can render long-running scripts unusable.  The unpredictable
nature and resource-hogging behavior make this bug rather annoying, to say
the least.

How reproducible:
Usually

Steps to Reproduce:
1.  Download and install 'autonet' utility from above URL.
2.  Type 'autonet start'.
3.  Leave running for several hours.

(this process can be sped up a bit by changing the INTERVAL settings in
/etc/sysconfig/autonet to "1", but it still takes over an hour usually)
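
Since the failure only shows up after hours of running, it may help to let a small watcher flag the moment the script starts burning CPU rather than checking by hand. The following is a minimal sketch, assuming a Linux /proc filesystem; the function names, the 5-second sampling window and the 0.9 threshold are illustrative choices, not part of autonet or bash.

```python
import os
import time


def cpu_ticks_from_stat(stat_line):
    """Return utime + stime (in clock ticks) from a /proc/<pid>/stat line."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so split on the last ')' before counting fields.
    fields = stat_line.rsplit(")", 1)[1].split()
    # fields[0] is field 3 (state); utime and stime are fields 14 and 15.
    return int(fields[11]) + int(fields[12])


def cpu_fraction(ticks_before, ticks_after, seconds, ticks_per_second):
    """Fraction of one CPU used between two samples."""
    return (ticks_after - ticks_before) / (seconds * ticks_per_second)


def read_cpu_ticks(pid):
    """Read the accumulated CPU ticks of a live process."""
    with open(f"/proc/{pid}/stat") as handle:
        return cpu_ticks_from_stat(handle.read())


def looks_runaway(pid, seconds=5.0, threshold=0.9):
    """Hypothetical check: True if the process used most of a CPU
    during the sampling window (threshold is an assumed value)."""
    ticks_per_second = os.sysconf("SC_CLK_TCK")
    before = read_cpu_ticks(pid)
    time.sleep(seconds)
    after = read_cpu_ticks(pid)
    return cpu_fraction(before, after, seconds, ticks_per_second) >= threshold
```

Calling looks_runaway() periodically with the PID of the autonet shell would give a timestamp for when the spin begins, at which point strace can be attached to the process.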

Actual Results:  After several hours of normal operation, the autonet
script will suddenly slip into a state where it consumes 99% of the CPU
time and becomes unresponsive.  An strace will show bash making endless
calls to the wait4 system call, each returning ECHILD.
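
For readers unfamiliar with the symptom: wait4 fails with ECHILD when the calling process has no children left to wait for, so a loop that keeps retrying after that error can never make progress. The sketch below only illustrates that error condition and how a well-behaved reaping loop stops on it; it is not a reconstruction of the actual bash 2.04 code, and the function names are made up for the example.

```python
import errno
import os


def spawn_short_lived_child():
    """Fork a child that exits immediately; return its PID."""
    pid = os.fork()
    if pid == 0:
        os._exit(0)  # child: exit at once without running cleanup handlers
    return pid


def reap_all_children():
    """Call wait4 until the kernel reports ECHILD, then stop.

    Returns the number of children reaped.
    """
    reaped = 0
    while True:
        try:
            os.wait4(-1, 0)  # block until any child exits
        except ChildProcessError as exc:
            # errno is ECHILD: no children remain. A loop that retried
            # here instead of returning would call wait4 forever, which
            # is the pattern the strace output above shows.
            assert exc.errno == errno.ECHILD
            return reaped
        reaped += 1
```

Calling reap_all_children() a second time, with no children outstanding, returns 0 immediately because the very first wait4 fails with ECHILD.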

I suspect that similar results occur with other long-running or continuous
shell scripts, but have not yet pursued this.

Additional info:
Comment 1 Alex Stewart 2001-07-25 23:30:27 EDT
I have put updated bash RPMs up on http://www.foogod.com/software/autonet. 
These include all of the fixes from the 2.04 RPM, so they should be good for
general distribution unless there are any known issues in 2.05 that weren't in
2.04.

(someone might want to check whether the "exclude" patch is still needed if
anybody knows how to create a test case.  The bash code has been changing in
this area, so it may no longer be needed.  I have included it in the 2.05 RPM
just to be safe.)

-alex
Comment 2 Alex Stewart 2001-07-25 23:32:30 EDT
(err.. I meant "export" patch.. sigh.)
Comment 3 Bernhard Rosenkraenzer 2001-07-26 05:39:59 EDT
2.05 has been in rawhide for about 3 months.
Comment 4 Alex Stewart 2001-07-26 19:14:03 EDT
Ok, I missed the RPMs in the rawhide FTP directory before, however:

RAWHIDE is not an appropriate resolution for this bug.  This is not a new
feature request or an issue with some part of the system that nobody uses; this
is a confirmed BUG in one of the core components of the system, and a potentially
system-crashing one if it happens to hit the wrong process at the wrong time
(which it could do randomly).  I have encountered at least one other incident
with a runaway system script which I believe may be due to this bug (I was
unable to obtain enough info at the time to be sure, but it was similar).

How about an erratum or _something_ to let people know that this problem exists
(and that there's a fix available) BEFORE it screws over their production
systems without warning?
