Bug 97534

Summary: Deadlock in init's log() function
Product: [Retired] Red Hat Linux Reporter: James Olin Oden <james.oden>
Component: SysVinit Assignee: Bill Nottingham <notting>
Status: CLOSED RAWHIDE QA Contact: David Lawrence <dkl>
Severity: medium Docs Contact:
Priority: medium    
Version: 9 CC: mitr, rvokal, tao
Target Milestone: ---   
Target Release: ---   
Hardware: i686   
OS: Linux   
Whiteboard:
Fixed In Version: 2.85-4 Doc Type: Bug Fix
Last Closed: 2003-06-25 20:38:13 UTC Type: ---
Attachments:
Description                                                        Flags
Causes log() in init.c to block signals while talking to syslog.   none
Simple program showing how to cause a deadlock with syslog         none

Description James Olin Oden 2003-06-17 14:38:34 UTC
Description of problem:
If init catches a signal while in one of the syslog functions, and
if the signal handler in turn tries to send something to syslog,
then a deadlock will occur.

Version-Release number of selected component (if applicable):
2.84-13, 2.85-3

How reproducible:
It is completely timing related.  I have never reproduced it on a
single-processor machine.  On dual-processor machines with hyper-threading it
can be reproduced with the debug output turned on in init.h.  It also
occurs if for some reason init segfaults while calling syslog(), because the
SEGV handler tries to send a message to syslog.  This case does not require
debug output to be turned on.

Steps to Reproduce:
1.  Build init with DEBUG set to 1 in init.h and install it.
2.  Reboot the machine.
3.  After the reboot, try to run init 6.

From experimentation, you must be on an SMP machine.  All machines I have
tested on have been 2 GHz or faster, with the E7500 chipset (I don't think the
chipset has any bearing on the problem).

Actual results:
Init will hang at this point.  If you compiled it with INITDEBUG set to
1, you can run strace and gdb on init (not the init that is pid 1, but
the other init that now shows up in the process table).  strace
will show that init is waiting on a futex:
 
       futex(0x4212f1f4, FUTEX_WAIT, -1, NULL

And gdb will show that init was in the log() function:

        #0  0xffffe002 in ?? ()
        #1  0x0804a1ff in log ()
        #2  0x08049f17 in chld_handler ()
        #3  <signal handler called>
        #4  0xffffe000 in ?? ()
        #5  0x4202774e in sigaction () from /lib/tls/libc.so.6
        #6  0x420daa21 in vsyslog () from /lib/tls/libc.so.6
        #7  0x420da75f in syslog () from /lib/tls/libc.so.6
        #8  0x0804a20f in log ()
        #9  0x0804b1ba in read_inittab ()
        #10 0x0804bcc3 in fifo_new_level ()
        #11 0x0804bef6 in check_init_fifo ()
        #12 0x0804cb88 in init_main ()
        #13 0x0804cf40 in main ()
        #14 0x420156a4 in __libc_start_main () from /lib/tls/libc.so.6

Expected results:

That init would not deadlock (-;

Additional info:

The redhat-devel-list has a thread discussing this problem at:

   http://www.redhat.com/mailman/private/redhat-devel-list/2003-June/msg00036.html

Comment 1 James Olin Oden 2003-06-17 14:40:14 UTC
Created attachment 92444 [details]
Causes log() in init.c to block signals while talking to syslog.

The fix for this (at least one of the fixes) is to simply block signals while
talking to syslog().

Comment 2 James Olin Oden 2003-06-17 14:45:34 UTC
Created attachment 92445 [details]
Simple program showing how to cause a deadlock with syslog

This is a simple program that causes the same deadlock in syslog() that 
happens in init.c.  The strategy is simple:

   1) Fork().
   2) Parent sleeps a second.
   3) Child creates a signal handler for SIGUSR1.
   4) Child starts writing log messages to syslog over and over.
   5) Parent begins sending SIGUSR1 to the child.
   6) The child catches SIGUSR1, and in the signal handler tries to 
      write to syslog.  If it was in syslog when it caught the signal,
      a deadlock occurs.

I have only tested this one on a 1 GHz single-processor machine.

Comment 3 Bill Nottingham 2003-06-25 20:38:13 UTC
Fixed in 2.85-4, thanks!