Bug 190459

Summary: bash sometimes deadlocks in futex(FUTEX_WAIT)
Product: Red Hat Enterprise Linux 3
Reporter: Andreas Luik <andreas.luik>
Component: bash
Assignee: Pete Graner <pgraner>
Status: CLOSED WONTFIX
QA Contact: Ben Levenson <benl>
Severity: medium
Docs Contact:
Priority: medium    
Version: 3.0   
Target Milestone: ---   
Target Release: ---   
Hardware: i386   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2007-10-19 18:44:46 UTC

Description Andreas Luik 2006-05-02 15:52:11 UTC
Description of problem:
When I rlogin to a remote host, the bash process started by login
sometimes deadlocks.
A second rlogin then usually works fine.  It happens rarely (about once a month).

pstree output is:
     ├─xinetd─┬─in.rlogind───login───
     │        ├─in.rlogind───login───bash───bash
     │        ├─in.rlogind───login───bash───pstree

ps output is:
root     26002  5210  0 16:52 ?        00:00:00 in.rlogind
root     26003 26002  0 16:52 ?        00:00:00 login -- vts            
vts      26004 26003  0 16:52 pts/1    00:00:00 -bash
vts      26005 26004  0 16:52 pts/1    00:00:00 -bash

[vts@sysman2 vts]$ strace -p 26004
Process 26004 attached - interrupt to quit
read(3,  <unfinished ...>
Process 26004 detached
[vts@sysman2 vts]$ strace -p 26005
Process 26005 attached - interrupt to quit
futex(0x308810, FUTEX_WAIT, 2, NULL <unfinished ...>
Process 26005 detached

The parent process (26004) waits reading from a pipe.  The child process
(26005) waits forever for the futex.
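
This is the state command substitution ends up in when the forked child
stalls: the parent shell reads the child's output from a pipe until
end-of-file, so a child that never exits keeps the parent in read()
indefinitely. A minimal C sketch of that pipe/fork pattern follows. It is
not bash's code, and the helper name capture_child_output is hypothetical.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from bash): fork a child that writes `msg`
 * to a pipe, and capture it in the parent, roughly as $(...) would.
 * Returns the number of bytes captured, or -1 on error.
 */
static ssize_t capture_child_output(const char *msg, char *out, size_t outsz)
{
    int fds[2];
    pid_t pid;
    size_t total = 0;
    ssize_t n;

    assert(msg != NULL && out != NULL && outsz > 0);

    if (pipe(fds) == -1)
        return -1;

    pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: write the output and exit. If the child blocked here
         * instead, the parent's read() below would never see EOF. */
        close(fds[0]);
        if (write(fds[1], msg, strlen(msg)) < 0)
            _exit(1);
        _exit(0);
    }

    /* Parent: close the write end, then read until EOF (read() == 0). */
    close(fds[1]);
    while (total < outsz - 1 &&
           (n = read(fds[0], out + total, outsz - 1 - total)) > 0)
        total += (size_t)n;
    out[total] = '\0';

    close(fds[0]);
    waitpid(pid, NULL, 0);
    return (ssize_t)total;
}
```

In the hang reported here, process 26004 corresponds to the parent side of
this loop and process 26005 to a child that never reaches its exit.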

The child process's backtrace is as follows:
#0  0x002bc0f9 in __lll_mutex_lock_wait () from /lib/tls/libc.so.6
#1  0x0024764c in _L_mutex_lock_10257 () from /lib/tls/libc.so.6
#2  0x0000000a in ?? ()
#3  0x0000001c in ?? ()
#4  0x00308360 in run_fp () from /lib/tls/libc.so.6
#5  0x00307a78 in __DTOR_END__ () from /lib/tls/libc.so.6
#6  0x00308360 in run_fp () from /lib/tls/libc.so.6
#7  0x00000001 in ?? ()
#8  0xbfffd4fc in ?? ()
#9  0x002430e6 in malloc () from /lib/tls/libc.so.6
#10 0x002430e6 in malloc () from /lib/tls/libc.so.6
#11 0x08090b07 in xmalloc ()
#12 0x0806ce7b in make_local_array_variable ()
#13 0x0806cefa in make_local_array_variable ()
#14 0x0806d19c in make_variable_value ()
#15 0x0806d230 in bind_variable ()
#16 0x0806c128 in sh_set_lines_and_columns ()
#17 0x08075e7d in initialize_job_control ()
#18 0x08075eb2 in initialize_job_control ()
#19 <signal handler called>
#20 0x00246ff0 in ptmalloc_unlock_all2 () from /lib/tls/libc.so.6
#21 0x0027bcc2 in fork () from /lib/tls/libc.so.6
#22 0x080739a3 in make_child ()
#23 0x0807b06c in command_substitute ()
#24 0x0807e103 in pat_subst ()
#25 0x0807ee92 in expand_words_shellexp ()
#26 0x0807f07c in expand_words_shellexp ()
#27 0x0807eaf9 in expand_words ()
#28 0x080696f6 in execute_command_internal ()
#29 0x08066d2a in execute_command_internal ()
#30 0x08066854 in execute_command ()
#31 0x0806900b in execute_command_internal ()
#32 0x08066f2d in execute_command_internal ()
#33 0x080946b2 in parse_and_execute ()
#34 0x080941c7 in eval_builtin ()
#35 0x0809434d in maybe_execute_file ()
#36 0x0805b553 in sh_exit ()
#37 0x0805acc1 in main ()
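
Reading the backtrace from the bottom up: frames #21 to #23 show fork()
called via make_child from command_substitute, frame #19 marks a signal
handler interrupting fork() inside ptmalloc_unlock_all2, and the handler's
path (sh_set_lines_and_columns, bind_variable, xmalloc) ends in malloc()
waiting on a libc mutex. malloc() is not async-signal-safe, so calling it
from a handler that interrupts libc's own malloc bookkeeping can block
forever. This looks like that case, although the symbol names come from a
binary without debug information and may be imprecise. One common way to
avoid the pattern is to let the handler only set a flag and do the real
work later from normal context. The sketch below illustrates that idea
under assumptions: SIGWINCH is a guess at the signal involved (suggested by
sh_set_lines_and_columns), and all function names are hypothetical, not
bash's.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Set by the handler, consumed by the main loop. */
static volatile sig_atomic_t winch_pending = 0;

/* Handler only records the event: no malloc, no stdio, no locks. */
static void on_winch(int signo)
{
    (void)signo;
    winch_pending = 1;
}

/* Install the handler. Returns 0 on success, -1 on error. */
static int install_winch_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_winch;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    return sigaction(SIGWINCH, &sa, NULL);
}

/*
 * Called from the main loop, outside signal context, where malloc()
 * is safe. Returns 1 if a pending resize was processed, 0 if nothing
 * was pending, -1 on allocation failure.
 */
static int process_pending_winch(void)
{
    char *line;
    int len;

    if (!winch_pending)
        return 0;
    winch_pending = 0;

    /* Stand-in for the real work (e.g. updating LINES and COLUMNS). */
    line = malloc(32);
    if (line == NULL)
        return -1;
    len = snprintf(line, 32, "window size changed");
    assert(len > 0 && len < 32);
    free(line);
    return 1;
}
```

A shell-like main loop would call install_winch_handler() once at startup
and process_pending_winch() between commands, so a signal that arrives
during fork() costs only a flag write.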


Version-Release number of selected component (if applicable):

RHEL3U7, bash-2.05b-41.5
Linux sysman2 2.4.21-40.EL #1 Thu Feb 2 22:32:00 EST 2006 i686 i686 i386 GNU/Linux

Both the local and the remote host run the same operating system version.
The problem does not depend on a particular host; we've seen it on
several machines (all Dell, though).



How reproducible:
Difficult to reproduce; it happens only occasionally.

Steps to Reproduce:
1. rlogin <remote host>

  
Actual results:

Hangs.


Expected results:

Succeeds.

Additional info:

Comment 1 RHEL Program Management 2007-10-19 18:44:46 UTC
This bug is filed against RHEL 3, which is in maintenance phase.
During the maintenance phase, only security errata and select mission
critical bug fixes will be released for enterprise products. Since
this bug does not meet those criteria, it is now being closed.

For more information on the RHEL errata support policy, please visit:
http://www.redhat.com/security/updates/errata/
 
If you feel this bug is indeed mission critical, please contact your
support representative. You may be asked to provide detailed
information on how this bug is affecting you.