Bug 1740775

Summary: libcap-ng segfault after dlclose (httpd crash on reload with php-imap)
Product: Red Hat Enterprise Linux 8
Reporter: Steve Grubb <sgrubb>
Component: libcap-ng
Assignee: Radovan Sroka <rsroka>
Status: CLOSED ERRATA
QA Contact: Martin Zelený <mzeleny>
Severity: high
Docs Contact:
Priority: high
Version: 8.1
CC: dapospis, mzeleny, omoris, pvrabec, rsroka, tjaros
Target Milestone: rc
Keywords: EasyFix, Regression, Triaged
Target Release: 8.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2020-04-28 16:47:07 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Patch to fix issue (Flags: none)

Description Steve Grubb 2019-08-13 15:31:56 UTC
This bug was initially created as a copy of Bug #1680481

I am copying this bug because: 
RHEL 8 has the same issue.


Description of problem:
Ever since the upgrade from Fedora 27 to 29, one box has had apache go into a segfault child-restart loop weekly when logrotate runs.  If I remove the php-imap rpm, the problem goes away.  From the coredumps it appears something is causing random corruption in apache/mod_php.

The bug only happens on a reload, never on a restart, of apache using systemd.  Apache seems 100% stable otherwise, as long as a reload is never issued.

The segfaults start immediately after the reload, no hit on the website is required (I blocked all access to 80/443 and bug still occurs).  It segfaults whether apache has been running a while (hours/days) or if I just restarted apache cleanly and no hits have occurred yet.

I think php-imap might be a red herring.  It seems to be required to trigger the bug, but we're not using it at all from what I can tell.  And it seems fine on another box.

I have run tons of verifies, reinstalls, etc. of the relevant rpms; it doesn't help, and things look normal otherwise.  The box has ECC and is Ryzen-based, but the bug also occurred on an identical (decommissioned) box when it was i686-based.  It was F29 that started the bug.

rpm -V `rpm -qa | grep -iP 'php|httpd|mod_'`
looks normal.


Version-Release number of selected component (if applicable):
php-7.2.15-1.fc29.x86_64
php-imap-7.2.15-1.fc29.x86_64
httpd-2.4.38-2.fc29.x86_64
* we are not using any php/httpd related rpms or modules from any other source than fedora repos


How reproducible:
On demand on this one box when php-imap is installed; I can generate coredumps at will.  I tried creating a similar (not identical) config on another box and I cannot reproduce it there yet.  I'm trying to figure out what is different.


Steps to Reproduce:
1. Install typical complement of php rpms, have apache in modphp mode
2. systemctl reload httpd.service

Actual results:
[Mon Feb 25 00:27:15.934157 2019] [mpm_prefork:notice] [pid 21721] AH00163: Apache/2.4.38 (Fedora) OpenSSL/1.1.1a PHP/7.2.15 mod_perl/2.0.10 Perl/v5.28.1 configured -- resuming normal operations
[Mon Feb 25 00:27:15.934173 2019] [core:notice] [pid 21721] AH00094: Command line: '/usr/sbin/httpd -D FOREGROUND'
[Mon Feb 25 00:27:16.938676 2019] [core:notice] [pid 21721] AH00051: child pid 21771 exit signal Segmentation fault (11), possible coredump in /etc/httpd
[Mon Feb 25 00:27:16.938760 2019] [core:notice] [pid 21721] AH00051: child pid 21773 exit signal Segmentation fault (11), possible coredump in /etc/httpd
[Mon Feb 25 00:27:16.938803 2019] [core:notice] [pid 21721] AH00051: child pid 21775 exit signal Segmentation fault (11), possible coredump in /etc/httpd
[Mon Feb 25 00:27:16.938845 2019] [core:notice] [pid 21721] AH00051: child pid 21777 exit signal Segmentation fault (11), possible coredump in /etc/httpd
[Mon Feb 25 00:27:16.938887 2019] [core:notice] [pid 21721] AH00051: child pid 21779 exit signal Segmentation fault (11), possible coredump in /etc/httpd
... many coredumps a second until parent is killed
... web server does not respond to requests during this time
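
To put a number on "many coredumps a second", the AH00051 lines above can be tallied per second. The following is only a rough sketch, not part of the original report: the function name segfaults_per_second and the regular expression are hypothetical, and the pattern assumes the error log keeps exactly the line format shown above.

```python
import re
from collections import Counter

# Assumption: child-exit lines look like the AH00051 entries quoted above, e.g.
# [Mon Feb 25 00:27:16.938676 2019] [core:notice] [pid 21721] AH00051: child pid 21771 exit signal Segmentation fault (11), ...
SEGFAULT_RE = re.compile(
    r"^\[(?P<stamp>\w{3} \w{3} +\d{1,2} \d{2}:\d{2}:\d{2})\.\d+ \d{4}\] "
    r"\[core:notice\] \[pid \d+\] AH00051: "
    r"child pid \d+ exit signal Segmentation fault \(11\)"
)


def segfaults_per_second(lines):
    """Count AH00051 segfault entries, keyed by timestamp truncated to the second."""
    counts = Counter()
    for line in lines:
        match = SEGFAULT_RE.match(line)
        if match:
            counts[match.group("stamp")] += 1
    return counts
```

Passing an open error log file object (any iterable of lines works) returns a Counter whose keys are timestamps such as "Mon Feb 25 00:27:16". The location of the log file depends on the local httpd configuration and is not stated in this report.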

Expected results:
normal operation, no coredumps, respond to requests


Additional info:
I removed rpms one by one until the problem disappeared, and it was php-imap that I stumbled upon.  Remove it and things work bug free.  Install it and the bug happens every time.

I did extensive strace (as far as I can on child processes, which isn't much except on shutdown, not startup) and gdb on cores.

Each coredump in a set (i.e. during the same crash test before I kill the parent) seems similar, but often not identical.  Interestingly, each new crash test (usually) produces wildly varying coredump results (different segfault points and arguments to functions).  This hints to me that major memory corruption is going on here.

Most often the segfault occurs in some function of gd (php-gd), but not always.  The args to functions are usually totally not sane (like images with 0 width and 2 billion height).

See attached sampling of various gdb results from coredumps.

Comment 1 Steve Grubb 2019-11-04 16:15:15 UTC
Created attachment 1632620 [details]
Patch to fix issue

Comment 4 Martin Zelený 2020-01-13 09:59:08 UTC
Verified manually. No regression introduced. The new version contains the proposed patch, libcap-ng-0.7.9-fixatfork.patch.

Comment 7 errata-xmlrpc 2020-04-28 16:47:07 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:1813