From Bugzilla Helper:
User-Agent: Mozilla/5.0 Galeon/1.2.5 (X11; Linux i686; U;) Gecko/20020606

Description of problem:
The /sbin/hotplug bash script is started by the kernel with SIGCHLD ignored. This causes initlog (and maybe some other things) to fail, because the wait() and waitpid() calls do not work if SIGCHLD is ignored. This in turn breaks other things, as pointed out in bug #64603.

Version-Release number of selected component (if applicable):
hotplug-2001_04_24-11

How reproducible:
Always

Steps to Reproduce:
1. Add "cat /proc/self/status > /tmp/hotplug.status" to /sbin/hotplug.
2. Do something that will cause a hotplug event, such as inserting a PCMCIA network card.
3. The SigIgn: field of /tmp/hotplug.status indicates that SIGCHLD (17) is ignored.

Actual Results:
Anything that is started from hotplug (directly or indirectly) and that relies on wait() or waitpid() will fail. Most notably, initlog, which is called from many init scripts (for example, if someone runs "service whatever restart" from /sbin/ifup-local), fails and leaves services in an incoherent state. See bug #64603.

Expected Results:
Hotplug should fix the ignored SIGCHLD before launching any other scripts, so that things such as initlog do not fail. Furthermore, POSIX requires that SIGCHLD not be ignored.

Additional info:
I don't know whether the proper fix belongs in the kernel or in hotplug, but this should clearly be solved. I have a wrapper that re-enables SIGCHLD, which I will attach next.
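For anyone repeating step 3, the SigIgn: field is a hexadecimal bitmask in which bit (n - 1) is set when signal n is ignored. The following is only a sketch of how that field could be decoded; the helper names `ignored_signals` and `sigchld_ignored` are made up for illustration, and the default signal number 17 is the value quoted in the report (x86 Linux), not something that holds on every architecture.

```python
def ignored_signals(status_text):
    """Return the set of signal numbers flagged in the SigIgn: field
    of a /proc/<pid>/status dump (passed in as a string)."""
    for line in status_text.splitlines():
        if line.startswith("SigIgn:"):
            # The field is a hex bitmask; bit (n - 1) stands for signal n.
            mask = int(line.split(":", 1)[1].strip(), 16)
            return {n for n in range(1, 65) if mask & (1 << (n - 1))}
    raise ValueError("no SigIgn: field found")


def sigchld_ignored(status_text, signum=17):
    """True if signal `signum` is ignored. 17 is SIGCHLD on x86 Linux
    (assumption taken from the report, differs on some architectures)."""
    return signum in ignored_signals(status_text)
```

Reading /tmp/hotplug.status into a string and passing it to `sigchld_ignored()` would then confirm whether the hotplug script was started with SIGCHLD ignored.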
Created attachment 62024: A wrapper that re-enables SIGCHLD and stdout and stderr.
I use the previously attached hotplug wrapper by adding

    if [ -x /usr/local/sbin/hotplugwrap ]; then
        AGENT="/usr/local/sbin/hotplugwrap $AGENT"
    fi

just above the exec $AGENT "$@" call towards the end of the /sbin/hotplug script. This has solved the problem I previously reported as bug #64603.
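The attachment itself is not reproduced in this thread, so the following is not the attached wrapper, only a rough sketch of the same idea: reset the SIGCHLD disposition to its default, then exec the real agent. The function names and the usage string are hypothetical. Because the Python interpreter sets SIGPIPE (and SIGXFSZ) to ignored at startup, the sketch resets those as well so the agent does not inherit them; the stdout/stderr handling mentioned for the attachment is left out because the thread does not say how it is done.

```python
import os
import signal
import sys


def reset_dispositions():
    """Set SIGCHLD back to SIG_DFL before exec, so the agent can use
    wait()/waitpid(). SIGPIPE and SIGXFSZ are reset too, because the
    Python interpreter ignores them at startup."""
    for name in ("SIGCHLD", "SIGPIPE", "SIGXFSZ"):
        sig = getattr(signal, name, None)
        if sig is not None:
            signal.signal(sig, signal.SIG_DFL)


def main(argv):
    if len(argv) < 2:
        print("usage: wrapper AGENT [ARGS...]", file=sys.stderr)
        return 2
    reset_dispositions()
    # Replace this process with the agent; execv only returns on failure
    # (by raising OSError).
    os.execv(argv[1], argv[1:])


if __name__ == "__main__":
    sys.exit(main(sys.argv))
```

Ignored dispositions survive exec while default ones stay default, which is why doing the reset in a small wrapper is enough for everything the agent starts afterwards.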
Arjan: should the kernel not start hotplug with SIGCHLD ignored, or should we add a wrapper?
Any news on this? I installed Red Hat 8.0 and the problem is still there.

As per signal(7), the default action for SIGCHLD is "ignore" (i.e. SIG_DFL would ignore the signal). So it would seem to be equivalent to setting SIGCHLD to SIG_IGN, but that is not the case. AFAIK, if SIGCHLD is set to SIG_IGN it is not possible to use wait() and waitpid(), since the child is reaped as soon as it exits. That is the root of the problem here.

I checked the kernel-2.4.18-10 source. All the users of hotplug call the hotplug helper through call_usermodehelper() in kmod.c (except S390, in misc/chandev.c). After setting up a new kernel thread, call_usermodehelper() calls exec_usermodehelper() to exec the user mode program. exec_usermodehelper() resets the signal handlers with flush_signal_handlers(). That function sets all signal handlers to SIG_DFL, *except* those which are already at SIG_IGN. So apparently, whatever task is calling call_usermodehelper() has SIGCHLD set to SIG_IGN (which would seem logical, since it does not want to get zombie children).

However, the user mode helper should be able to use wait() and waitpid(), so it should be started with SIGCHLD set to SIG_DFL. I have no idea about kernel development, but I would say that exec_usermodehelper() should set SIGCHLD to SIG_DFL just after the call to flush_signal_handlers(). Using a wrapper is just a temporary solution, bound to break in the future.
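The difference between SIG_DFL and SIG_IGN described above can be observed from user space. This is a small sketch, assuming a Linux system where fork() is available; `wait_for_child` is a made-up helper name and the exit status 7 is arbitrary.

```python
import os
import signal


def wait_for_child(disposition):
    """Fork a child that exits with status 7 and try to collect that status
    while SIGCHLD is set to `disposition`. Returns the exit status, or None
    if waitpid() fails with ECHILD because the child was already reaped."""
    previous = signal.signal(signal.SIGCHLD, disposition)
    try:
        pid = os.fork()
        if pid == 0:
            os._exit(7)  # child: exit immediately, skipping cleanup
        try:
            _, status = os.waitpid(pid, 0)
        except ChildProcessError:  # errno ECHILD
            return None
        return os.WEXITSTATUS(status)
    finally:
        # Put the caller's disposition back (parent only; the child never
        # reaches this point).
        if previous is not None:
            signal.signal(signal.SIGCHLD, previous)
```

On Linux, `wait_for_child(signal.SIG_DFL)` should return the child's exit status, while `wait_for_child(signal.SIG_IGN)` should return None because waitpid() fails with ECHILD. That lost status is the failure mode a program such as initlog would run into.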
Any program that plans to use SIGCLD must be sure to set the signal masks properly. If initlog gets it wrong then initlog is broken, and indeed in some cases it may break in scripts or from cron. hotplug setting SIGCLD might paper over bugs and be a good thing short term, but neither it nor the kernel is actually wrong in any way.
OK, I agree, I updated bug #64603. But still, wouldn't it be more convenient to set SIGCHLD to SIG_DFL for the user processes started by the kernel? Just my 2 cents.
Closing out bugs on older, no longer supported, releases. Apologies for any lack of response. Please reopen if problems persist on more current releases. initlog is no longer shipped in development, FWIW.