** This bug also affects RHEL4, so I am cloning it **

+++ This bug was initially created as a clone of Bug #146850 +++

Description of problem:
If cupsd receives a SIGHUP and is still reloading its configuration when a SIGCHLD arrives, it can segfault in sigchld_handler.

Version-Release number of selected component (if applicable):
cups-1.1.17-13.3.6

How reproducible:
Occasionally

Steps to Reproduce:
1. Send SIGHUP to cupsd

Actual results:
cupsd exits with signal 11

Expected results:
cupsd reloads its configuration and continues running

Additional info:
Here is the customer's report:

Core was generated by `cupsd'.
Program terminated with signal 11, Segmentation fault.
...
#0  sigchld_handler (sig=17) at main.c:775
775           if (job->state != NULL &&
(gdb) bt
#0  sigchld_handler (sig=17) at main.c:775
#1  <signal handler called>
#2  0xb7376edb in _int_free () from /lib/tls/libc.so.6
#3  0xb7375e68 in free () from /lib/tls/libc.so.6
#4  0xb74899f5 in _ipp_free_attr () from /usr/lib/libcups.so.2
#5  0xb748822a in ippDelete () from /usr/lib/libcups.so.2
#6  0x08068a50 in FreeAllJobs () at job.c:375
#7  0x08054951 in ReadConfiguration () at conf.c:177
#8  0x0805c5a4 in main (argc=1, argv=0xbfffb134) at main.c:411
#9  0xb731a768 in __libc_start_main () from /lib/tls/libc.so.6
#10 0x0804c401 in _start ()

From the backtrace above, it appears that cupsd terminated with a segmentation fault because of an invalid pointer dereference in sigchld_handler, the function that runs when cupsd receives SIGCHLD. The relevant part of sigchld_handler is:

770    /*
771     * Lookup the PID in the jobs list...
772     */
773
774     for (job = Jobs; job != NULL; job = job->next)
775       if (job->state != NULL &&
776           job->state->values[0].integer == IPP_JOB_PROCESSING)
777       {

cupsd appears to have crashed while walking the "Jobs" list:

(gdb) print job
$1 = (job_t *) 0x64

0x64 is not a valid job_t address, so I am sure that cupsd terminated with a segmentation fault when it dereferenced this pointer.
Following the backtrace further up, frame #6 is in FreeAllJobs. The source around that frame is:

(gdb) up 6
#6  0x08068a50 in FreeAllJobs () at job.c:375
375         ippDelete(job->attrs);
(gdb) list
370
371       for (job = Jobs; job; job = next)
372       {
373         next = job->next;
374
375         ippDelete(job->attrs);
376         free(job->filetypes);
377         free(job);
378       }
379

So FreeAllJobs was in the middle of walking the same "Jobs" list when the signal arrived. The values of "next" and "job" in this frame are:

(gdb) print next
$2 = (job_t *) 0x80c3918
(gdb) print job
$3 = (job_t *) 0x80c28c0

These pointers appear to be valid.

(gdb) down 6
#0  sigchld_handler (sig=17) at main.c:775
775           if (job->state != NULL &&

Back in frame #0, following the "Jobs" list one element at a time:

(gdb) print Jobs
$4 = (job_t *) 0x80ba600
(gdb) print Jobs->next
$5 = (struct job_str *) 0x80bb598
(gdb) print Jobs->next->next
$6 = (struct job_str *) 0x64

These values are the same when printed from frame #6. This confirms the cause suspected at first.

To summarize the cause:

1) cupsd received SIGHUP.
2) cupsd then did the following:
   [1] Sent SIGTERM (or SIGKILL) to all of its child processes
   [2] Started freeing every item linked on the "Jobs" list
3) The parent process, cupsd, received SIGCHLD.
4) Step 2)-[2] was interrupted and sigchld_handler, the SIGCHLD signal handler, started running.
5) sigchld_handler walks the "Jobs" list.

Whether the problem occurs depends on when the SIGCHLD interruption happens. If the signal arrives while the list is being freed, the "Jobs" list still contains pointers to entries that have already been freed. sigchld_handler then follows one of these invalid pointers while walking the list, and cupsd terminates abnormally.
Therefore, this is a timing problem (a race condition) that occurs when sigchld_handler is called while the "Jobs" list is being modified.

Proposed fix and reproduction test with a patch:
I ran the reproduction test with a patch that makes cupsd block SIGCHLD while it operates on the "Jobs" list, and confirmed that the problem no longer occurred.
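For illustration only, blocking SIGCHLD while the "Jobs" list is being freed could look roughly like the sketch below. This is not the actual patch: job_t here is a cut-down, hypothetical stand-in for the cupsd structure, and add_job and free_all_jobs_blocked are names made up for this example. Only sigemptyset, sigaddset and sigprocmask are real POSIX calls.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdlib.h>

/* Hypothetical, cut-down stand-in for cupsd's job_t: only the list link
 * and an id are modelled. */
typedef struct job_str {
    struct job_str *next;
    int id;
} job_t;

static job_t *Jobs = NULL;

/* Hypothetical helper: push a new job onto the head of the list.
 * Returns 0 on success, -1 if allocation fails. */
static int add_job(int id)
{
    job_t *job = malloc(sizeof(*job));
    if (job == NULL)
        return -1;
    job->id = id;
    job->next = Jobs;
    Jobs = job;
    return 0;
}

/* Free every job with SIGCHLD blocked, so that a SIGCHLD handler cannot
 * run while the list contains pointers to freed entries.
 * Returns 0 on success, -1 if the signal mask could not be changed. */
static int free_all_jobs_blocked(void)
{
    sigset_t block, saved;
    job_t *job, *next;

    sigemptyset(&block);
    sigaddset(&block, SIGCHLD);
    if (sigprocmask(SIG_BLOCK, &block, &saved) != 0)
        return -1;

    for (job = Jobs; job != NULL; job = next) {
        next = job->next;
        free(job);
    }
    Jobs = NULL; /* the list is consistent (empty) before unblocking */

    /* Restore the previous mask; a SIGCHLD that arrived in the meantime
     * is delivered after this call, when the list is empty. */
    return sigprocmask(SIG_SETMASK, &saved, NULL);
}
```

The important detail in this sketch is that Jobs is reset to NULL before the old signal mask is restored, so a handler that runs afterwards sees an empty list instead of freed entries.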
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2005-772.html