Created attachment 1457283
One of the coredumps

Description of problem:
Apache daemon dumps core.

Version-Release number of selected component (if applicable):
httpd-2.4.33-5.fc28

How reproducible:
Happens from time to time.

Steps to Reproduce:
1. Don't know

Actual results:
Coredump

Expected results:
No coredump

Additional info:
Please try 2.4.34 from updates-testing. If you can still reproduce, attach a backtrace (not a core dump).
PID: 13750 (httpd)
UID: 0 (root)
GID: 0 (root)
Signal: 11 (SEGV)
Timestamp: Sun 2018-07-29 03:39:05 CEST (1 day 14h ago)
Command Line: /usr/sbin/httpd -DFOREGROUND
Executable: /usr/sbin/httpd
Control Group: /system.slice/httpd.service
Unit: httpd.service
Slice: system.slice
Boot ID: afa847893dff4ef2bb75ab23404bd2f9
Machine ID: e11622fca54a439a9b3954265d001569
Hostname: manicminer.lan.zx-spectrum
Storage: /var/lib/systemd/coredump/core.httpd.0.afa847893dff4ef2bb75ab23404bd2f9.13750.1532828345000000.lz4
Message: Process 13750 (httpd) of user 0 dumped core.

Stack trace of thread 13750:
#0 0x00007f5c992ce6c0 n/a (n/a)
#1 0x00007f5cb14f7a68 apr_proc_fork (libapr-1.so.0)
#2 0x00007f5ca2c48f29 wsgi_start_process (mod_wsgi.so)
#3 0x00007f5ca2c4b598 wsgi_start_daemons (mod_wsgi.so)
#4 0x00007f5ca2c4bf0b wsgi_hook_init (mod_wsgi.so)
#5 0x0000564d58557083 ap_run_post_config (httpd)
#6 0x0000564d585327bf main (httpd)
#7 0x00007f5cb0d0f24b __libc_start_main (libc.so.6)
#8 0x0000564d5853291a _start (httpd)
*** Bug 1612994 has been marked as a duplicate of this bug. ***
I don't know what's going on here. It looks like a crash in mod_wsgi, though that package hasn't been updated in a while, so it may not be at fault. The references to dbus in bug 1612994 might be something random; it looks like some unmapped function is being called. Is someone registering an atfork handler? Can the reporters post their mod_wsgi package versions?
python2-mod_wsgi-4.5.20-4.fc28.x86_64
For my case in bug 1612994, it's python3-mod_wsgi-4.5.20-4.fc28.x86_64.
Of note, if you look at https://bugzilla.redhat.com/show_bug.cgi?id=1612994 it also shows:

[Tue Aug 07 00:01:03.744087 2018] [core:notice] [pid 25705:tid 140235695196416] AH00051: child pid 16914 exit signal Segmentation fault (11), possible coredump in /etc/httpd
[Tue Aug 07 00:01:03.744109 2018] [cgid:error] [pid 25705:tid 140235695196416] AH01239: cgid daemon process died, restarting

So it isn't just the mod_wsgi daemon processes that are crashing, but also the cgid daemon process. This suggests it is a broader issue and not specific to mod_wsgi.

The errors in the other report:

Aug 7 00:01:01 server1 kernel: [35331.845063] httpd[16900]: segfault at 7f8b1798d6c0 ip 00007f8b1798d6c0 sp 00007ffe71b65db8 error 14 in libdbus-1.so.3.19.7[7f8b183d4000+50000]

are therefore worth looking at, since that is a separate package again, and it may be triggering issues in any daemon subprocess that Apache manages through the APR routines for creating other processes. Both the mod_wsgi daemons and mod_cgid use these same APR routines.

https://apr.apache.org/docs/apr/2.0/group__apr__thread__proc.html#ga5a9d123afe81eaa97955fbe45704b662

So possibly there is some conflict.
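To make the kernel line above easier to read, here is a rough Python sketch that parses it. The helper name `decode_segfault` and the returned keys are made up for illustration. It assumes the usual x86 kernel message layout in which every numeric field, including the error code, is hexadecimal, and it uses the standard x86 page-fault error bits. Under those assumptions, "error 14" is 0x14: a user-mode instruction fetch from a page that is not present, with the faulting address equal to the instruction pointer. The address also lies outside the libdbus range quoted in the message, which would fit the idea that an unmapped function is being called rather than a fault inside libdbus itself.

```python
import re

# Assumed layout: "segfault at <addr> ip <ip> sp <sp> error <code> in <obj>[<base>+<size>]"
# with every numeric field in hexadecimal.
SEGFAULT_RE = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) "
    r"error (?P<error>[0-9a-f]+)"
    r"(?: in (?P<obj>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\])?"
)

# x86 page-fault error code bits (bit 0 clear means "page not present").
PF_BITS = {1: "write", 2: "user mode", 3: "reserved bit set", 4: "instruction fetch"}


def decode_segfault(line):
    """Parse one kernel segfault line; return None if it does not match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    addr = int(m.group("addr"), 16)
    ip = int(m.group("ip"), 16)
    error = int(m.group("error"), 16)
    flags = ["protection violation" if error & 1 else "page not present"]
    flags += [name for bit, name in PF_BITS.items() if error & (1 << bit)]
    in_mapping = None
    if m.group("base") is not None:
        base = int(m.group("base"), 16)
        size = int(m.group("size"), 16)
        in_mapping = base <= ip < base + size
    return {
        "error": error,
        "flags": flags,
        "fault_is_ip": addr == ip,          # CPU faulted fetching the instruction itself
        "ip_in_named_mapping": in_mapping,  # does ip fall inside the quoted range?
        "object": m.group("obj"),
    }


line = ("Aug 7 00:01:01 server1 kernel: [35331.845063] httpd[16900]: "
        "segfault at 7f8b1798d6c0 ip 00007f8b1798d6c0 sp 00007ffe71b65db8 "
        "error 14 in libdbus-1.so.3.19.7[7f8b183d4000+50000]")
result = decode_segfault(line)
print(result)
```

For the line from bug 1612994, `fault_is_ip` comes out true and `ip_in_named_mapping` comes out false, so the library name in the message should probably be treated as a hint only.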
Good spot, thanks Graham. Let's keep this on httpd until we get better data.

Can both reporters here give some context? Are these running servers which have recently started seeing crashes? If so, can you identify what packages were updated?

I would bet the house there is something calling pthread_atfork() to register an atfork handler and then getting unloaded, but I have no idea what.

grep atfork /etc/httpd/modules/*.so

Also, can someone try:

# systemctl stop httpd
# LD_DEBUG=all LD_DEBUG_OUTPUT=/tmp/httpd.ld.debug httpd -DFOREGROUND &
... wait a bit...
# httpd -k stop

then gzip /tmp/httpd.ld.debug.* and mail it to me, or attach here (privately if you prefer).
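For anyone unfamiliar with atfork handlers, the sketch below is only an analogy in Python, not what httpd or any module does. `os.register_at_fork()` is Python's counterpart to `pthread_atfork()`; the names `stale_handler`, `fork_once` and `calls` are invented for the example. It shows the property that matters for the theory above: once registered, the handler is invoked inside every later fork, and there is no call to unregister it. In C, if the code behind such a handler has been unmapped in the meantime, fork() would jump to an address that no longer exists, which would look like frame #0 `n/a` directly below `apr_proc_fork`.

```python
import os

calls = []


def stale_handler():
    # Stand-in for a handler that some library registered long ago.
    calls.append("before-fork handler ran")


# Python's analogue of pthread_atfork(); Unix only. Python offers no way
# to unregister the function again.
os.register_at_fork(before=stale_handler)


def fork_once():
    """Fork a child that exits immediately, then reap it."""
    pid = os.fork()
    if pid == 0:
        os._exit(0)  # child: leave right away without running cleanup
    os.waitpid(pid, 0)


fork_once()
fork_once()
print(calls)  # one entry per fork, recorded in the parent
```

The handler runs in the parent just before each fork, so the list grows by one entry per call to `fork_once()`.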
In my case, I installed IPA a few days ago. I had a problem with httpd (mpm_event) two days ago, and with httpd (mpm_worker) yesterday. In both cases the problem occurred when logrotate ran. I switched to mpm_prefork and it seems fine so far (logrotate just ran without problems).
Well, it started crashing right after upgrading from FC27 to FC28. I'm using eGroupware, a PHP-based, web application since ages.
(In reply to redhat from comment #10)
> I'm using eGroupware, a PHP-based, web application since ages.

I was wrong - mod_wsgi is used here with a private Firefox sync server.
Also seeing this problem ever since upgrading from F27 to F28:

Process 28690 (/usr/sbin/httpd) of user 0 dumped core.

Stack trace of thread 28690:
#0 0x00007f5b9deed700 n/a (n/a)
#1 0x00007f5bbf54ba68 apr_proc_fork (libapr-1.so.0)
#2 0x00007f5bafdebf4a procmgr_post_config (mod_fcgid.so)
#3 0x00007f5bafde5900 n/a (mod_fcgid.so)
#4 0x000055580f4b2573 ap_run_post_config (httpd)
#5 0x000055580f48d9cf main (httpd)
#6 0x00007f5bbed6311b __libc_start_main (libc.so.6)
#7 0x000055580f48db2a _start (httpd)

Process 28694 (/usr/sbin/httpd) of user 0 dumped core.

Stack trace of thread 28694:
#0 0x00007f5b9deed700 n/a (n/a)
#1 0x00007f5bbf54ba68 apr_proc_fork (libapr-1.so.0)
#2 0x00007f5bada86f29 wsgi_start_process (mod_wsgi.so)
#3 0x00007f5bada89598 wsgi_start_daemons (mod_wsgi.so)
#4 0x00007f5bada89f0b wsgi_hook_init (mod_wsgi.so)
#5 0x000055580f4b2573 ap_run_post_config (httpd)
#6 0x000055580f48d9cf main (httpd)
#7 0x00007f5bbed6311b __libc_start_main (libc.so.6)
#8 0x000055580f48db2a _start (httpd)

Process 28697 (/usr/sbin/httpd) of user 0 dumped core.

Stack trace of thread 28697:
#0 0x00007f5b9deed700 n/a (libcap-ng.so.0)
#1 0x00007f5bbf54ba68 apr_proc_fork (libapr-1.so.0)
#2 0x00007f5baf7769dc post_config (mod_dnssd.so)
#3 0x000055580f4b2573 ap_run_post_config (httpd)
#4 0x000055580f48d9cf main (httpd)
#5 0x00007f5bbed6311b __libc_start_main (libc.so.6)
#6 0x000055580f48db2a _start (httpd)

Process 28703 (/usr/sbin/httpd) of user 0 dumped core.

Stack trace of thread 28703:
#0 0x00007f5b9deed700 n/a (libcap-ng.so.0)
#1 0x00007f5bb2ee70af make_child (mod_mpm_event.so)
#2 0x00007f5bb2ee8024 event_run (mod_mpm_event.so)
#3 0x000055580f49530e ap_run_mpm (httpd)
#4 0x000055580f48da03 main (httpd)
#5 0x00007f5bbed6311b __libc_start_main (libc.so.6)
#6 0x000055580f48db2a _start (httpd)

Process 28701 (/usr/sbin/httpd) of user 0 dumped core.

Stack trace of thread 28701:
#0 0x00007f5b9deed700 n/a (libcap-ng.so.0)
#1 0x00007f5bb2ee70af make_child (mod_mpm_event.so)
#2 0x00007f5bb2ee8024 event_run (mod_mpm_event.so)
#3 0x000055580f49530e ap_run_mpm (httpd)
#4 0x000055580f48da03 main (httpd)
#5 0x00007f5bbed6311b __libc_start_main (libc.so.6)
#6 0x000055580f48db2a _start (httpd)

It's happening every time logrotate performs a reload on httpd.

python2-mod_wsgi-4.5.20-4.fc28.x86_64
httpd-2.4.34-3.fc28.x86_64
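The traces in this comment all die one frame below a process-creation call, across several unrelated modules. A small Python sketch like the following could tally frame #1 over a pile of such traces to make that pattern visible. The function names and the shortened `sample` text are illustrative only. It also assumes that traces copied straight from syslog carry newlines as `#012` (a `#` followed by three octal digits), and undoes that escaping first.

```python
import re
from collections import Counter

# Frame #1 of a systemd-coredump trace: "#1 0x<addr> <function> (<object>)"
FRAME1_RE = re.compile(r"#1\s+0x[0-9a-f]+\s+(\S+)\s+\(([^)]+)\)")


def decode_syslog_escapes(text):
    """Turn '#ooo' octal escapes (such as '#012' for newline) back into characters."""
    return re.sub(r"#([0-7]{3})", lambda m: chr(int(m.group(1), 8)), text)


def tally_callers(text):
    """Count (function, object) pairs found at frame #1 of every trace in text."""
    return Counter(FRAME1_RE.findall(decode_syslog_escapes(text)))


# Shortened excerpts of three of the traces above, in escaped syslog form.
sample = (
    "Stack trace of thread 28690: #012#0 0x00007f5b9deed700 n/a (n/a) "
    "#012#1 0x00007f5bbf54ba68 apr_proc_fork (libapr-1.so.0) "
    "#012#2 0x00007f5bafdebf4a procmgr_post_config (mod_fcgid.so)\n"
    "Stack trace of thread 28694: #012#0 0x00007f5b9deed700 n/a (n/a) "
    "#012#1 0x00007f5bbf54ba68 apr_proc_fork (libapr-1.so.0) "
    "#012#2 0x00007f5bada86f29 wsgi_start_process (mod_wsgi.so)\n"
    "Stack trace of thread 28703: #012#0 0x00007f5b9deed700 n/a (libcap-ng.so.0) "
    "#012#1 0x00007f5bb2ee70af make_child (mod_mpm_event.so)"
)

counts = tally_callers(sample)
for (function, obj), n in counts.most_common():
    print(n, function, obj)
```

On the excerpt, this reports `apr_proc_fork` twice and `make_child` once, which matches the reading that the crash happens while a new process is being created, whichever module asked for it.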
Just updated to FC29, httpd-2.4.37-3.fc29 and python2-mod_wsgi-4.6.4-2.fc29. It got even worse. Now I'm faced with an endless coredump loop:

Sun 2018-11-25 10:14:49 CET 21511 0 0 11 present /usr/sbin/httpd
Sun 2018-11-25 10:14:49 CET 21534 0 0 11 present /usr/sbin/httpd
Sun 2018-11-25 10:14:49 CET 21524 0 0 11 present /usr/sbin/httpd
Sun 2018-11-25 10:14:49 CET 21536 0 0 11 present /usr/sbin/httpd
Sun 2018-11-25 10:14:50 CET 21515 0 0 11 present /usr/sbin/httpd
Sun 2018-11-25 10:14:50 CET 21514 0 0 11 present /usr/sbin/httpd
Sun 2018-11-25 10:14:50 CET 21529 0 0 11 present /usr/sbin/httpd
Sun 2018-11-25 10:14:51 CET 21522 0 0 11 present /usr/sbin/httpd
Sun 2018-11-25 10:14:51 CET 21513 0 0 11 present /usr/sbin/httpd
Sun 2018-11-25 10:14:51 CET 21517 0 0 11 present /usr/sbin/httpd
Sun 2018-11-25 10:14:51 CET 21584 0 0 11 present /usr/sbin/httpd
Sun 2018-11-25 10:14:51 CET 21585 0 0 11 present /usr/sbin/httpd
Sun 2018-11-25 10:14:51 CET 21595 0 0 11 present /usr/sbin/httpd
Sun 2018-11-25 10:14:52 CET 21615 0 0 11 present /usr/sbin/httpd
Sun 2018-11-25 10:14:52 CET 21617 0 0 11 present /usr/sbin/httpd
Sun 2018-11-25 10:14:53 CET 21646 0 0 11 present /usr/sbin/httpd
Sun 2018-11-25 10:14:55 CET 21648 0 0 11 present /usr/sbin/httpd
Sun 2018-11-25 10:14:55 CET 21656 0 0 11 present /usr/sbin/httpd
Sun 2018-11-25 10:14:57 CET 21649 0 0 11 present /usr/sbin/httpd
Sun 2018-11-25 10:14:57 CET 21647 0 0 11 present /usr/sbin/httpd
Sun 2018-11-25 10:14:57 CET 21653 0 0 11 present /usr/sbin/httpd
Sun 2018-11-25 10:14:58 CET 21692 0 0 11 present /usr/sbin/httpd
Sun 2018-11-25 10:14:58 CET 21736 0 0 11 present /usr/sbin/httpd
-- Notice: 13 systemd-coredump@.service units are running, output may be incomplete.

Message: Process 22815 (httpd) of user 0 dumped core.

Stack trace of thread 22815:
#0 0x00007f017269c6b0 gdImageScale (libgd.so.3)
#1 0x00007ffe3bf71a73 n/a (n/a)
If you are seeing crashes with traces like:

Stack trace of thread 13750:
#0 0x00007f5c992ce6c0 n/a (n/a)
#1 0x00007f5cb14f7a68 apr_proc_fork (libapr-1.so.0)

on F29, please try https://bodhi.fedoraproject.org/updates/FEDORA-2018-4b635e1df4

There was an off-by-two memory corruption bug in a patch to mod_ssl which could break in ways similar to this.
There was a completely unrelated bug with extremely similar symptoms fixed in libcap-ng for Fedora 29 (bug 1680481) - if you saw regressions with crashes in fork going to Fedora 29 please ensure you have both the httpd update mentioned in comment 14 and also: https://bodhi.fedoraproject.org/updates/FEDORA-2019-2cc0b7524 And let me know if you can reproduce with current stable updates.
Yes, that update also fixes the problem described in this bug. You may close it.
Thanks for confirming the fix. *** This bug has been marked as a duplicate of bug 1680481 ***