Bug 1313668 - httpd: segfault at 8 ip 00007fa45894fd80 sp 00007fa4415fedb0 error 4 in libpython2.7.so.1.0[7fa458855000+179000]
Status: CLOSED INSUFFICIENT_DATA
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: mod_wsgi
Version: 7.2
Assigned To: Web Stack Team
QA Contact: BaseOS QE - Apps
Duplicates: 1317376
Reported: 2016-03-02 02:43 EST by Sudhir Menon
Modified: 2017-08-01 10:57 EDT (History)
Last Closed: 2017-08-01 10:57:37 EDT
Type: Bug


Attachments
backtrace (1.27 KB, text/plain)
2016-03-02 02:45 EST, Sudhir Menon
httpd error_log (41.23 KB, text/plain)
2016-03-02 06:30 EST, Sudhir Menon
var_log_messages file in /var/spool/abrt folder (9.84 KB, text/plain)
2016-03-10 08:10 EST, Sudhir Menon
core_backtrace (6.41 KB, text/plain)
2016-03-10 08:12 EST, Sudhir Menon

Description Sudhir Menon 2016-03-02 02:43:16 EST
Description of problem: segfault at 8 ip 00007fa45894fd80 sp 00007fa4415fedb0 error 4 in libpython2.7.so.1.0[7fa458855000+179000]


Version-Release number of selected component (if applicable):
ipa-server-4.2.0-15.el7_2.3.x86_64
httpd-2.4.6-40.el7.x86_64
sssd-1.13.0-40.el7_2.1.x86_64

How reproducible: Seen twice on the test system.


Steps to Reproduce:
1. Install IPA-server on RHEL7.2.
2. Configure IPA-client on RHEL6.8 and join to IPA-RHEL7.2 server.
3. Create a winsync agreement with Windows AD 2012R2
4. Now add two-way AD trust with same windows 2012R2 AD
5. Now run ipa-winsync-migrate
6. Run dmesg command

Actual results:
2. IPA-Client-6.8 is connected to IPA-Server-7.2 without any errors.
3. Winsync agreement is successful.
4. Two-way AD trust is successfully added in IPA-Server-7.2 with Win2012R2.
5. ipa-winsync-migrate is successful without any errors.
6. dmesg -T shows segfault error.

[Tue Feb 16 11:48:48 2016] httpd[18112]: segfault at 8 ip 00007f3324743d80 sp 00007f330d3f2db0 error 4 in libpython2.7.so.1.0[7f3324649000+179000]
[Tue Feb 16 11:48:48 2016] httpd[18109]: segfault at 8 ip 00007f3324743d80 sp 00007f330d3f2db0 error 4 in libpython2.7.so.1.0[7f3324649000+179000]
[Tue Feb 16 15:25:15 2016] httpd[18820]: segfault at 8 ip 00007fc6f8846d80 sp 00007fc6e14f5db0 error 4 in libpython2.7.so.1.0[7fc6f874c000+179000]

[Tue Mar  1 15:58:51 2016] httpd[25638]: segfault at 8 ip 00007fa45894fd80 sp 00007fa4415fedb0 error 4 in libpython2.7.so.1.0[7fa458855000+179000]
[Tue Mar  1 16:32:41 2016] httpd[28235]: segfault at 8 ip 00007f5b9477fd80 sp 00007f5b7d42edb0 error 4 in libpython2.7.so.1.0[7f5b94685000+179000]

7. /var/spool/abrt folder contains the backtrace.
/var/spool/abrt/Python-2015-12-23-14:36:25-18427
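
The dmesg lines above can be decoded mechanically. The following is a minimal Python sketch, not part of the original report: the function name decode_segfault and the returned field names are made up for illustration, and the interpretation of the error value assumes the usual x86 page-fault error code bits (bit 0: page present, bit 1: write access, bit 2: user mode).

```python
import re

# Matches the x86 kernel segfault message format seen in the dmesg output above.
SEGFAULT_RE = re.compile(
    r"(?P<comm>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+) "
    r"in (?P<lib>[^\[\s]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_segfault(line):
    """Return a dict describing one segfault line, or None if it does not match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    err = int(m.group("err"), 16)  # the kernel prints the error code in hex
    return {
        "process": m.group("comm"),
        "pid": int(m.group("pid")),
        "fault_address": int(m.group("addr"), 16),
        "library": m.group("lib"),
        # Offset of the faulting instruction relative to the library mapping.
        "offset": int(m.group("ip"), 16) - int(m.group("base"), 16),
        "page_present": bool(err & 1),
        "access": "write" if err & 2 else "read",
        "user_mode": bool(err & 4),
    }
```

Applied to the five lines quoted above, the instruction pointer minus the mapping base gives the same offset, 0xfad80, every time, and "error 4" decodes to a user-mode read of a page that is not present at address 8, which is consistent with a read through a near-NULL pointer at one fixed location in libpython2.7.so.1.0. Resolving that offset to a function name would still need the matching debuginfo.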


Expected results:
httpd should not segfault.

Additional info: I happened to notice these messages in dmesg while verifying bug#1285852; they were logged at some point while executing the steps above.
Attaching the backtrace as well.
Comment 1 Sudhir Menon 2016-03-02 02:45 EST
Created attachment 1132160
backtrace
Comment 4 Sudhir Menon 2016-03-02 06:30 EST
Created attachment 1132258
httpd error_log
Comment 13 Christian Heimes 2016-03-14 09:23:03 EDT
This might be a known issue with mod_wsgi: https://github.com/GrahamDumpleton/mod_wsgi/issues/11

What version of mod_wsgi, Apache and Python are you using?
Comment 16 Christian Heimes 2016-03-15 09:08:12 EDT
Graham, you are the maintainer and expert on mod_wsgi. Can you have a look please? The machine is running mod_wsgi-3.4-12.el7.
Comment 17 Graham Dumpleton 2016-03-15 19:12:04 EDT
The big question is whether this is happening on any attempt to shut down a process or when handling a specific request.

If it is only on process shutdown there could be various issues that could cause it, including:

* The referenced issue 11 from mod_wsgi GitHub issue tracker.
* The fact that in a process where multithreading is used, you can't be certain that there aren't background threads still running when destroying the Python interpreter.
* Reference counting bugs in how Python is used, or misuse of Apache memory pools that only show up on process shutdown.

Without a stack trace from a core dump in which at least the function name symbols aren't stripped, it is impossible to even speculate.

It is therefore known that some corner cases can cause core dumps on process shutdown, and there is not much that can be done about them without good details on how to trigger the crash or where it occurs. Uncovering the causes can be very hard: even with the latest mod_wsgi I experience some crashes on shutdown in testing, but as soon as I attach gdb to debug them they no longer occur. I know they relate to incorrect use of memory, because memory checking libraries warn me first, but I am no closer to working it out.

If instead the core dump is occurring when a request is being handled, the problem is a bit easier to work out.

A common cause of processes crashing when handling requests is C extensions for Python modules that are not implemented to work in Python sub interpreters. The solution is to force the use of the main Python interpreter context using the 'WSGIApplicationGroup' directive for mod_wsgi in the Apache configuration. This may also necessitate using mod_wsgi daemon mode if running multiple Django web applications, since you can't run more than one of those in a process and thus need to delegate them to different mod_wsgi daemon process groups.

So for this case, based on the details given, I can't really identify any specific problem. If it is infrequent and the hosted web applications are otherwise working fine, I would suggest it relates to process shutdown issues.

If the application is impacted, I would start by looking at how mod_wsgi is configured: whether it uses embedded or daemon mode, and whether use of the main Python interpreter is being forced using 'WSGIApplicationGroup' with the value '%{GLOBAL}'.
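
As a rough aid for that configuration check, the following Python sketch scans Apache configuration text for the relevant mod_wsgi directives. It is an illustration only: the sample configuration, the group name app1, the paths and the function name summarize_wsgi_config are all hypothetical, and the matching is simplified (it is case-sensitive and ignores Include files and per-context scoping).

```python
import re

# Hypothetical sample configuration; "app1" and the paths are placeholders.
SAMPLE_CONF = """
WSGIDaemonProcess app1 processes=2 threads=15
WSGIScriptAlias /app1 /srv/app1/app.wsgi
<Location /app1>
    WSGIProcessGroup app1
    WSGIApplicationGroup %{GLOBAL}
</Location>
"""

# One mod_wsgi directive per line: the directive name, then its arguments.
DIRECTIVE_RE = re.compile(r"^[ \t]*(WSGI\w+)[ \t]+(.*?)[ \t]*$", re.MULTILINE)


def summarize_wsgi_config(text):
    """Report whether daemon mode and the main interpreter appear to be configured."""
    directives = {}
    for name, args in DIRECTIVE_RE.findall(text):
        directives.setdefault(name, []).append(args)
    return {
        # Daemon mode needs a daemon process group and a delegation to it.
        "daemon_mode": "WSGIDaemonProcess" in directives
        and "WSGIProcessGroup" in directives,
        # %{GLOBAL} selects the main (first) Python interpreter.
        "main_interpreter": "%{GLOBAL}" in directives.get("WSGIApplicationGroup", []),
        "directives": directives,
    }
```

Calling summarize_wsgi_config(SAMPLE_CONF) reports both daemon mode and the main interpreter as configured; a file that contains only a WSGIScriptAlias line would report neither, which corresponds to embedded mode with a sub interpreter.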
Comment 19 Petr Vobornik 2016-04-08 04:26:19 EDT
*** Bug 1317376 has been marked as a duplicate of this bug. ***
Comment 20 Petr Vobornik 2016-04-08 04:29:32 EDT
Moving to mod_wsgi. The workaround:
  'WSGIApplicationGroup' with the value '%{GLOBAL}'

is suboptimal.

The IPA management framework, the KDC proxy and the future Custodia service (to be introduced in 7.3) need to be separated.

Therefore I am proposing a fix on the mod_wsgi side.
Comment 21 Graham Dumpleton 2016-04-08 05:15:33 EDT
If you have shown that setting:

WSGIApplicationGroup %{GLOBAL}

fixes the problem being seen, then it means the problem is in a third party Python package which includes a C extension component which has not been implemented to work correctly in sub interpreters.

In other words, it has got nothing to do with mod_wsgi and that third party Python package needs to be fixed.

That said, the recommended way of using mod_wsgi is to use separate daemon process groups for each separate hosted application, and also to force use of the main interpreter in those respective processes using WSGIApplicationGroup with the value %{GLOBAL}.

Preferably, even if using embedded mode, you should still use WSGIApplicationGroup with that value. Using embedded mode is very much not recommended, though, because the typical Apache configuration is not set up properly for Python applications, and using embedded mode will result in poor performance and excessive memory usage. This is because the suboptimal configuration of Apache, not being tailored for Python web applications, exacerbates issues arising from arguable design flaws in Apache process management.

Why are you saying it is suboptimal when using it is the recommended way of doing things with mod_wsgi?

If you have been able to pin down further the scenario under which it occurs, please provide additional details. To date, insufficient information has been provided about how Apache has been configured to run the WSGI application using mod_wsgi, such as whether embedded mode or daemon mode is used, and how daemon mode and application groups are set up if it is used.
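
One way to collect the missing details is a small diagnostic WSGI application that reports how the request was dispatched. This is a sketch under assumptions: it relies on the mod_wsgi.process_group and mod_wsgi.application_group keys that mod_wsgi places in the WSGI environ, and on the understanding that an empty process group means embedded mode and an empty application group means the main interpreter. The output format is made up for illustration.

```python
def application(environ, start_response):
    """Report the mod_wsgi process group and application group for this request."""
    process_group = environ.get("mod_wsgi.process_group", "")
    application_group = environ.get("mod_wsgi.application_group", "")

    lines = [
        # Assumption: an empty process group indicates embedded mode.
        "mode: %s" % ("daemon (%s)" % process_group if process_group else "embedded"),
        # Assumption: an empty application group indicates the main interpreter.
        "interpreter: %s" % (application_group or "main (%{GLOBAL})"),
    ]
    body = "\n".join(lines).encode("utf-8")

    start_response("200 OK", [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

Mounted temporarily next to the affected application with the same process group and application group settings, it shows which combination the requests actually run under.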
Comment 22 Joe Orton 2016-06-02 11:50:33 EDT
Thanks a lot, Graham, for providing such detailed feedback!

Sudhir, others: assigning application crashes to a web server is a bit like blaming a C compiler bug for your application not behaving right. It's not impossible that it's true, but you need to do a lot of work to demonstrate it. Without narrowing this down to a very specific repro case there is not a lot we can do about it.
Comment 23 Graham Dumpleton 2016-06-02 15:35:57 EDT
FWIW, I have been working on changes to the very latest mod_wsgi which will, in the first instance, provide a flag that lets you say not to try to destroy the Python interpreter on process shutdown. It isn't a straightforward change, as I still need to go through the process of trying to stop threads and call Python atexit callbacks so an application has a chance to clean up. These are things normally done when the interpreter is destroyed.

With the change it would hopefully at least be possible to silence the process crashes, given that they appear to occur during destruction of the Python interpreter. These changes are unlikely to be back-portable to 3.4, and I would also hesitate to contemplate doing that for 4.3.