| Summary: | Get "signal Segmentation fault" error in log and return 500 when accessing python-2.7/2.6 app with medium gear & python-2.6 app with large gear | ||
|---|---|---|---|
| Product: | OpenShift Online | Reporter: | chunchen <chunchen> |
| Component: | Containers | Assignee: | mfisher |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | libra bugs <libra-bugs> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | ||
| Version: | 2.x | CC: | mfojtik, wsun, xtian |
| Target Milestone: | --- | ||
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | Bug Fix | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2014-01-24 03:25:19 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
Description
chunchen
2013-10-18 11:19:16 UTC
Could be reproduced with a large gear python-2.6 app as well:

```
[Mon Oct 21 05:58:05 2013] [notice] Apache/2.2.15 (Unix) mod_wsgi/3.2 Python/2.6.6 configured -- resuming normal operations
[Mon Oct 21 05:59:29 2013] [error] [client 127.1.246.1] Premature end of script headers: application
[Mon Oct 21 05:59:29 2013] [notice] child pid 14415 exit signal Segmentation fault (11)
```

This is due to the stack-size setting. I can reliably reproduce it: removing the setting from performance.conf.erb makes the failure disappear, and putting it back makes it return. Looking through a few core dumps, it appears the setting is causing writes off the end of the stack, which is what triggers the segfault.

The docs on WSGIDaemonProcess say the value for stack-size is in bytes. We appear to be making the following settings:

| Gear size | Memory | stack-size |
|---|---|---|
| small | 512MB | 8388 bytes |
| medium | 1024MB | 16777 bytes |
| large | 2048MB | 33554 bytes |

The system limits allow up to 10485760 bytes, and the default may well be that value. The manpage for pthread_attr_setstack says it fails if you try to set a stack size below 16384 bytes. I'll bet the only reason this setting "works" on small gears is that 8388 is low enough that setstack fails and the default ends up getting used. The Python docs claim the interpreter needs 32k stacks just to run: http://docs.python.org/2/library/thread.html

Also, the embedded formula produces values that are not aligned to 4k boundaries:

```
stack-size=<%= (((ENV['OPENSHIFT_GEAR_MEMORY_MB'].to_i * 0.8)/25) * 1024).to_i/2 %>
```

The value given is likely being rounded down to the nearest 4k page boundary.

Tweaking the stack size seems to risk blowing things up, and it doesn't matter much anyway: most of what the script itself does lives on the heap, which we have no control over. Perhaps a better alternative would be to tune down the number of threads we create instead. We currently set 25 for every gear size.
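The arithmetic above can be checked with a short sketch: a Python translation of the ERB formula (the helper name `wsgi_stack_size` is illustrative, not from the cartridge code), plus a demonstration that CPython itself refuses thread stacks this small (the Python docs state a 32 KiB minimum for `threading.stack_size`):

```python
import threading

def wsgi_stack_size(gear_memory_mb):
    """Python translation of the ERB stack-size formula in performance.conf.erb."""
    return int(((gear_memory_mb * 0.8) / 25) * 1024) // 2

for name, mem in [("small", 512), ("medium", 1024), ("large", 2048)]:
    size = wsgi_stack_size(mem)
    # size % 4096 is nonzero for every gear size: none are 4k-aligned
    print(name, size, "bytes, remainder mod 4k:", size % 4096)

# CPython enforces a 32 KiB minimum thread stack, so a value like the
# medium gear's 16777 bytes is rejected outright by the threading module:
try:
    threading.stack_size(16384)
except ValueError as exc:
    print("rejected:", exc)
```

Running it reproduces the table values (8388, 16777, 33554) and shows each is below or barely above the minimums the stack must satisfy.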
How about something like "10 threads per 1024MB" instead, which would result in the following table:

| Gear size | Memory | threads |
|---|---|---|
| small | 512MB | 5 |
| medium | 1024MB | 10 |
| large | 2048MB | 20 |

Hi Rob, thanks a **lot** for this investigation. Yeah, I agree that the better way would be to set the number of threads or the number of processes. I'll work on this today and open a PR.

Commit pushed to master at https://github.com/openshift/origin-server:

https://github.com/openshift/origin-server/commit/3cfc6954453fad0952cf53910bc69ea8e0d7abb7
Bug 1020841 - Tune python cartridge by increasing number of threads instead of stack-size

It's fixed; verified on devenv_3932.
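The proposed sizing rule reduces to a one-liner. A minimal sketch (the function name and the `max(1, ...)` floor are illustrative assumptions, not taken from the committed cartridge code):

```python
def wsgi_threads(gear_memory_mb, per_gb=10):
    """Proposed sizing: 10 mod_wsgi daemon threads per 1024MB of gear memory.

    The max(1, ...) floor is an illustrative guard so a hypothetical
    sub-102MB gear would still get one thread.
    """
    return max(1, gear_memory_mb * per_gb // 1024)

for name, mem in [("small", 512), ("medium", 1024), ("large", 2048)]:
    print(name, wsgi_threads(mem))
```

This reproduces the table above: 5 threads for small, 10 for medium, 20 for large.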