From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030701

Description of problem:
This problem was first noticed in a real-world test while benchmarking mail delivery through exim.

The following requirements must be met for the problem to occur:
- Non-NPTL-enabled kernel. For example, vanilla 2.4.21 (not Red Hat patched kernels).
- glibc 2.3.2. 2.3.1 does not exhibit the problem.
- Program must be compiled with -lpthread.

Attached is the example program, which can be compiled with:

gcc -o signal-crash-example signal-crash-example.c -lpthread

When the problem occurs, the process exits with a segmentation fault. The test program should produce this in under a second. Running it under strace seems to make the problem more likely to occur, and shows this trace:

rt_sigaction(SIGCHLD, {0x804bb68, [CHLD], SA_RESTORER|SA_RESTART, 0x804d8c8}, {SIG_DFL}, 8) = 0
--- SIGCHLD (Child exited) @ 0 (0) ---
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
+++ killed by SIGSEGV +++

Compiled with -g3, GDB shows this backtrace:

Starting program: /root/signal-crash-example
[New Thread 16384 (LWP 25497)]

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 16384 (LWP 25497)]
0x00000000 in ?? ()
(gdb) bt
#0  0x00000000 in ?? ()
#1  0x400294be in __pthread_sighandler () from /lib/i686/libpthread.so.0
#2  <signal handler called>
#3  0x4009652f in __libc_sigaction () from /lib/i686/libc.so.6
#4  0x40026a8a in sigaction () from /lib/i686/libpthread.so.0
#5  0x40096631 in sigaction () from /lib/i686/libc.so.6
#6  0x400963e3 in ssignal () from /lib/i686/libc.so.6
#7  0x08048567 in main (argc=1, argv=0xbfffee24) at signal-crash-example.c:34
#8  0x40083a07 in __libc_start_main () from /lib/i686/libc.so.6

Is it trying to jump to 0x00000000? That definitely won't work...
Here is the sample signal-crash-example.c code which reproduces the problem:

#include <sys/types.h>
#include <sys/wait.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <signal.h>

/* Empty handler: only its installation and removal matter here. */
void signalhandler(int sig)
{
}

int main(int argc, char *argv[])
{
    pid_t pid;
    int i;

    while (1) {
        /* Start 100 children that each exit after about 100 ms. */
        for (i = 0; i < 100; i++) {
            if ((pid = fork()) == -1) {
                perror("Fork error");
                exit(1);
            }
            if (pid == 0) {
                usleep(100000);
                signal(SIGCHLD, SIG_DFL);
                exit(0);
            }
        }

        usleep(100000);

        /* Toggle the SIGCHLD disposition while the children are exiting. */
        for (i = 0; i < 10000; i++) {
            signal(SIGCHLD, SIG_DFL);
            signal(SIGCHLD, signalhandler);
            signal(SIGCHLD, SIG_DFL);
            signal(SIGCHLD, signalhandler);
        }

        /* Reap every child before starting the next round. */
        while (wait(NULL) != -1)
            ;
    }
}

Version-Release number of selected component (if applicable):
glibc-2.3.2-27.9

How reproducible:
Always

Steps to Reproduce:
1. Compile the sample code above with the pthread flag (-lpthread)
2. Execute the binary

Actual Results:
Segmentation fault (core dumped)

Additional info:
This was also tested on Red Hat 8
Don't know why you claim that 2.3.1 doesn't exhibit the problem, I can very easily reproduce it on any linuxthreads I've tried (e.g. 2.2.4, 2.2.5, 2.3.2; the relevant code hasn't changed since at least 1998 when it was added to glibc). Signal handling is broken in way more ways in linuxthreads than just this one, which doesn't mean we won't look at this exact case, just that it is certainly not very high priority. For usable signal handling there is always NPTL.
Yes, there is always NPTL, but correct me if I'm wrong: you can only use NPTL with RH's kernels, not a custom kernel from kernel.org. I know that RH9 ships with a glibc that supports both LinuxThreads and NPTL and switches between them dynamically depending on the capabilities of the kernel. Is there a single patch available that we can apply to a regular kernel.org kernel so that we can use NPTL on custom-compiled kernels? If so, perhaps it would 'fix' this issue as well.
You miss the point. LinuxThreads and signals never mixed, never worked, and never will. If you need signals and threads you have to use NPTL. This is not some act of forcing you to use a RH kernel. The functionality simply wasn't available before. Either stop using signals or require NPTL. There is no other reliable way.
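A program that wants to "require NPTL" can ask glibc at run time which thread library it is using. The sketch below relies on confstr() with _CS_GNU_LIBPTHREAD_VERSION, a GNU extension available since glibc 2.3.2. The helper names thread_library_version and is_nptl are hypothetical, and the check assumes that NPTL builds report a string beginning with "NPTL".

```c
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: copy the thread library name and version reported
 * by glibc into `buf`. Returns 0 on success, -1 if the information is
 * unavailable or does not fit in `len` bytes.
 */
int thread_library_version(char *buf, size_t len)
{
#ifdef _CS_GNU_LIBPTHREAD_VERSION
    size_t n = confstr(_CS_GNU_LIBPTHREAD_VERSION, buf, len);
    if (n == 0 || n > len)
        return -1;
    return 0;
#else
    /* Not a GNU C library, or one too old to report this value. */
    (void)buf;
    (void)len;
    return -1;
#endif
}

/* Returns 1 if the reported string starts with "NPTL", 0 otherwise. */
int is_nptl(const char *version)
{
    return strncmp(version, "NPTL", 4) == 0;
}
```

A program could call thread_library_version() at startup and refuse to run, or avoid installing signal handlers, when is_nptl() returns 0. From a shell, `getconf GNU_LIBPTHREAD_VERSION` reports the same string.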
OK, that makes sense. Now, is there a patch that we can apply to a vanilla kernel which will enable us to utilize NPTL?? I've tried to look through RH kernel SRPMs to see if I could find a single NPTL kernel patch but so far have been unable to do so. TIA
E.g. http://people.redhat.com/mingo/nptl-patches/nptl-2.4.22-ac1-A2
Please try the test version of the RHL9 errata at ftp://people.redhat.com/jakub/glibc/errata/2.3.2-27.9.4/ and let us know how it works.
An errata has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2003-334.html
Created attachment 99367
A patch for RH AS 2.1

RH AS 2.1 has a similar problem. This patch is backported from mainline.