133318 – bash hangs when multiple background processes started at once

Keywords:
Status:	CLOSED RAWHIDE
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	bash
Sub Component:
Version:	rawhide
Hardware:	i386
OS:	Linux
Priority:	medium
Severity:	medium
Target Milestone:	---
Assignee:	Tim Waugh
QA Contact:	Ben Levenson
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2004-09-23 04:55 UTC by Ellen Shull
Modified:	2007-11-30 22:10 UTC (History)
CC List:	0 users
Fixed In Version:	3.0-15
Clone Of:
Environment:
Last Closed:	2004-09-26 14:44:55 UTC
Type:	---
Embargoed:
Dependent Products:

Description Ellen Shull 2004-09-23 04:55:46 UTC

Description of problem: 
If you start two programs (kmail and konqueror in this case; I'm not 
sure whether it's specific to these two) in quick succession, bash 
goes into a loop on a waitpid call, using up all spare CPU cycles, and 
becomes unresponsive.  Only closing one of the programs you started 
will bring it back. 
 
Version-Release number of selected component (if applicable): 
bash-3.0-14 
 
How reproducible: 
Always, but only with very specific steps.  It feels very much like 
some kind of race condition to me. 
 
Steps to Reproduce: 
1.  Open a new bash shell.  In my case I open a new tab in 
konsole; I haven't tested whether that is necessary.  I have 
not managed to reproduce this without this step, but given the 
timing dependency I might just not have been fast enough those times. 
2.  Start up program #1 in the background.  The command I use is 
"kmail &>/dev/null &".  I'm not sure if it's necessary that kmail is 
used here, or if it has to be &, because kmail normally automatically 
daemonizes anyway; I just happened to start it with & when I 
discovered this because I wasn't thinking. 
3.  Quickly, before program #1 finishes loading (you will probably 
want to do this against a slow disk, and make sure program #1 and #2 
aren't in disk cache), start up program #2 in background.  In my case 
"konqueror &>/dev/null &".  If you do this after program #1 has 
finished loading, the problem doesn't seem to occur.  Also not sure 
if konq has to be program #2 for this to happen.  Konq does need & 
since it doesn't daemonize on its own. 
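
The steps above can be sketched as a small script.  This is only an 
illustration: `sleep` is a stand-in (an assumption, not from the report) 
for kmail and konqueror, and the function names are made up.  Run from a 
script it will most likely not trigger the hang, since the report 
describes an interactive prompt; to try the real thing, type the two 
commands from steps 2 and 3 at the prompt yourself. 

```shell
#!/bin/bash
# Sketch of the reproduction steps. `sleep` is a stand-in (assumption) for
# kmail and konqueror; the report does not say whether other programs
# trigger the hang.

# Start two background jobs back to back, discarding their output.
# `>/dev/null 2>&1 &` is the portable spelling of the `&>/dev/null &`
# used in the report.
start_two_jobs() {
    sleep "${1:-5}" >/dev/null 2>&1 &
    pid1=$!
    sleep "${1:-5}" >/dev/null 2>&1 &
    pid2=$!
}

# Succeed only while both jobs are still alive.
# `kill -0` sends no signal; it only checks that the process exists.
both_running() {
    kill -0 "$pid1" 2>/dev/null && kill -0 "$pid2" 2>/dev/null
}
```

Calling `start_two_jobs` and then `both_running` right away matches the 
situation in the report: both children exist and neither has exited yet. 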
 
Actual results: 
You've got something like 
[wes@ip68-110-7-34 ~]$ kmail &>/dev/null & 
[1] 4306 
[wes@ip68-110-7-34 ~]$ konqueror &>/dev/null & 
[2] 4307 
[wes@ip68-110-7-34 ~]$ 
 
in your shell, but the prompt is unresponsive to typing, and your CPU 
is pegged.  If you attach strace to the hung bash, you get a call 
like 
 
waitpid(-1, 0xfeffe774, WNOHANG|WUNTRACED) = 0 
 
over and over.  If you attach gdb to the hung bash, you get something 
like the following call stack: 
 
Sometimes you'll catch it with these two on the top: 
 
#0  0xf6fe9782 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2 
#1  0xf6f1ea63 in __waitpid_nocancel () from /lib/tls/libc.so.6 
 
and the frames below follow them, renumbered by +2; other times you 
get just the frames below, numbered as shown: 
 
#0  0x08076c38 in kill_pid () 
#1  0x08077250 in kill_pid () 
#2  <signal handler called> 
#3  0xf6fe9782 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2 
#4  0xf6f4be43 in __read_nocancel () from /lib/tls/libc.so.6 
#5  0x080b8be6 in rl_getc () 
#6  0x080b8ae2 in rl_read_key () 
#7  0x080aa104 in readline_internal_char () 
#8  0x080aa4b1 in readline () 
#9  0x0805ded1 in yy_input_name () 
#10 0x080f44f0 in ?? () 
#11 0x080873c5 in termination_unwind_protect () 
#12 0x0805fc66 in execute_prompt_command () 
#13 0x08060c64 in execute_prompt_command () 
#14 0x080636a0 in yyparse () 
#15 0x0805d9d4 in parse_command () 
#16 0x0805da79 in read_command () 
#17 0x0805dbe2 in reader_loop () 
#18 0x0805ce1c in main () 
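
For anyone wanting to collect the same strace and gdb output, the helper 
below prints the commands to run against the hung shell.  It is a rough 
sketch: the function name and the PID 4242 are made up, and it only 
prints the commands rather than running them, since attaching usually 
needs the same user or root.  `strace -p` and `gdb -p ... -batch -ex bt` 
are standard options of those tools. 

```shell
#!/bin/bash
# Print (do not run) the commands for inspecting a hung shell by PID.
# The PID is whatever `ps` reports for the hung bash; 4242 is made up.
inspect_cmds() {
    pid=$1
    # Attach strace to watch the repeated waitpid() calls.
    echo "strace -p $pid"
    # Attach gdb non-interactively and print a backtrace.
    echo "gdb -p $pid -batch -ex bt"
}

inspect_cmds 4242
```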
 
The programs started up load and run normally, however.  As soon as 
you quit one of them, the hung bash comes back to life. 
 
Expected results: 
bash shouldn't hang. 
 
Additional info: 
kmail from kdepim-3.3.0-1 
konqueror from kdebase-3.3.0-5 
glibc-2.3.3-54 
which are all current rawhide, as is everything else on the system.

Comment 1 Ellen Shull 2004-09-23 05:27:28 UTC

Er, quick addition/correction...  bash only comes back when you close 
program #2 (konqueror), which suggests that's what the waitpid is on.

Comment 2 Ellen Shull 2004-09-26 14:44:55 UTC

<< * Fri Sep 24 2004 Tim Waugh <twaugh> 3.0-15 
 
- Minor fix for job handling. >> 
 
That seems to fix it, so I'm closing this bug as rawhide.

Comment 3 Tim Waugh 2004-09-27 08:21:14 UTC

Thanks.  I hadn't got round to testing that it fixes this particular
bug, so it's nice that it does. :-)
