Description of problem:
If init catches a signal while in one of the syslog functions, and
if the signal handler in turn tries to send something to syslog,
then a deadlock will occur.
Version-Release number of selected component (if applicable):
It's completely timing related. I have never reproduced it on a
single-processor machine. On dual-processor machines with hyper-threading it
can be reproduced with the debug output turned on in init.h. It also
occurs if for some reason init segfaults while calling syslog(), as the
segv handler tries to send a message to syslog. This case does not require
debug output to be turned on.
Steps to Reproduce:
1. Build init with DEBUG set to 1 in init.h and install it.
2. Reboot the machine.
3. After the reboot, try running init 6.
From experimentation, you must be on an SMP machine. All machines I have tested
on have been 2 GHz or higher, with the E7500 chipset (I don't think the
chipset has any bearing on the problem).
Init will hang at this point. If you compiled it with INITDEBUG set to
1, you can actually run strace and gdb on init (note: not the init that is
pid 1, but the other init that now shows up in the process table). strace
will show that init is waiting on a futex:
futex(0x4212f1f4, FUTEX_WAIT, -1, NULL
And gdb will show that init was in the log() function:
#0 0xffffe002 in ?? ()
#1 0x0804a1ff in log ()
#2 0x08049f17 in chld_handler ()
#3 <signal handler called>
#4 0xffffe000 in ?? ()
#5 0x4202774e in sigaction () from /lib/tls/libc.so.6
#6 0x420daa21 in vsyslog () from /lib/tls/libc.so.6
#7 0x420da75f in syslog () from /lib/tls/libc.so.6
#8 0x0804a20f in log ()
#9 0x0804b1ba in read_inittab ()
#10 0x0804bcc3 in fifo_new_level ()
#11 0x0804bef6 in check_init_fifo ()
#12 0x0804cb88 in init_main ()
#13 0x0804cf40 in main ()
#14 0x420156a4 in __libc_start_main () from /lib/tls/libc.so.6
That init would not deadlock (-;
The redhat-devel-list has a thread discussing this problem at:
Created attachment 92444
Causes log() in init.c to block signals while talking to syslog.
The fix for this (at least one of the fixes) is to simply block signals while
talking to syslog().
Created attachment 92445
Simple program showing how to cause a deadlock with syslog
This is a simple program that causes the same deadlock in syslog() that
happens in init.c. The strategy is simple:
1) Parent forks a child.
2) Parent sleeps a second.
3) Child creates a signal handler for SIGUSR1.
4) Child starts writing log messages to syslog over and over.
5) Parent begins sending SIGUSR1 to the child.
6) The child catches SIGUSR1, and in the signal handler tries to
write to syslog. If it was in syslog when it caught the signal
a deadlock occurs.
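The strategy above could look roughly like the following. This is a sketch, not the attached program: the function names, the 1 ms pause between signals, and the final SIGKILL (added so the call always returns, even if the child is stuck) are assumptions.

```c
#define _DEFAULT_SOURCE
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <syslog.h>
#include <unistd.h>

/* Deliberately unsafe: syslog() is not async-signal-safe. */
static void usr1_handler(int sig)
{
    (void)sig;
    syslog(LOG_INFO, "caught SIGUSR1");
}

/* Child: install the handler, then write log messages over and over. */
static void child_loop(void)
{
    struct sigaction sa;

    sa.sa_handler = usr1_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    sigaction(SIGUSR1, &sa, NULL);

    for (;;)
        syslog(LOG_INFO, "child logging");
}

/* Sends `signals` SIGUSR1s to a logging child, then kills and reaps it.
 * Returns 0 when the run completed, -1 if fork() failed. */
int run_repro(int signals)
{
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        child_loop();
        _exit(0);  /* not reached */
    }

    sleep(1);  /* give the child time to install its handler */
    for (int i = 0; i < signals; i++) {
        kill(pid, SIGUSR1);
        usleep(1000);
    }

    /* Assumption of this sketch: always clean up, since a deadlocked
     * child would never exit by itself. */
    kill(pid, SIGKILL);
    waitpid(pid, NULL, 0);
    return 0;
}
```

Note that the child writes to the system log continuously while it runs. To look at a hung child with strace or gdb, the final `kill(pid, SIGKILL)` would have to be delayed or removed.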
I have only tested this one on a 1 GHz single-processor machine.
Fixed in 2.85-4, thanks!