Red Hat Bugzilla – Bug 1020841
Get "signal Segmentation fault" error in log and return 500 when accessing python-2.7/2.6 app with medium gear & python-2.6 app with large gear
Last modified: 2016-09-29 22:15:18 EDT
Description of problem:
When creating a python-2.7/2.6 app with a medium gear size, accessing the app via a browser returns a 500 error page.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Create a python-2.7/2.6 app with medium gear size
rhc app create cpy27 python-2.7 -g medium --no-git
2. Access this app via browser and tail the log of this app
rhc tail cpy27
Actual results:
1) A 500 (Internal Server Error) page is returned.
2) The log shows the following errors:
[Fri Oct 18 06:11:20 2013] [error] [client 127.2.1.129] Premature end of script headers: application
[Fri Oct 18 06:11:20 2013] [notice] child pid 31629 exit signal Segmentation fault (11)
Expected results:
The python-2.7/2.6 app with medium gear size should be accessible via browser.
This can also be reproduced with a python-2.6 app on a large gear:
[Mon Oct 21 05:58:05 2013] [notice] Apache/2.2.15 (Unix) mod_wsgi/3.2 Python/2.6.6 configured -- resuming normal operations
[Mon Oct 21 05:59:29 2013] [error] [client 127.1.246.1] Premature end of script headers: application
[Mon Oct 21 05:59:29 2013] [notice] child pid 14415 exit signal Segmentation fault (11)
This is due to the stack-size setting. I can reliably test this by removing the setting from performance.conf.erb (no failure) or putting it back in (failure reappears).
Looking through a few core dumps, it appears the setting causes writes past the end of the stack, which is what triggers the segfault.
The docs on WSGIDaemonProcess mention that the value for stack-size is in bytes. We appear to be making the following settings:
Gear size   Memory   stack-size
small       512MB    8388 bytes
medium      1024MB   16777 bytes
large       2048MB   33554 bytes
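For reference, stack-size is one of the per-daemon options on mod_wsgi's WSGIDaemonProcess directive; a hypothetical directive carrying the medium-gear value would look roughly like this (process name and other option values are illustrative only, not what the cartridge actually generates):

```apache
# Illustrative only; the real directive is generated from performance.conf.erb.
WSGIDaemonProcess example processes=1 threads=25 stack-size=16777
```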
The system limits allow up to 10485760 bytes (10MB), and the default may well be that value.
Looking at the manpage for pthread_attr_setstack, it fails with EINVAL if you try to set a stack size below PTHREAD_STACK_MIN (16384 bytes on Linux). I'll bet the only reason this setting works on small gears is that 8388 is low enough that setstack fails and the default ends up getting used.
Python docs claim it needs 32k stacks just to run the interpreter.
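Both minimums are easy to see from CPython itself: threading.stack_size() refuses any request below the interpreter's 32 KiB floor, so both the small-gear and medium-gear values from the table above would be rejected outright. A quick sketch (assuming CPython on Linux):

```python
import threading

# CPython enforces a 32 KiB minimum thread stack size, so the 8388- and
# 16777-byte values computed by the cartridge template are both invalid.
for size in (8388, 16777, 32768):
    try:
        threading.stack_size(size)
        print(size, "accepted")
    except ValueError:
        print(size, "rejected")

threading.stack_size(0)  # restore the platform default
```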
Also, the embedded formula produces values that are not aligned to 4k boundaries.
stack-size=<%= (((ENV['OPENSHIFT_GEAR_MEMORY_MB'].to_i * 0.8)/25) * 1024).to_i/2 %>
The value given is likely being rounded down to the nearest 4k page boundary.
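The table values fall straight out of that ERB expression, and mirroring the Ruby arithmetic in Python also confirms that none of the results land on a 4k page boundary:

```python
def stack_size_bytes(mem_mb):
    # Mirrors the ERB formula: (((mem * 0.8) / 25) * 1024).to_i / 2
    # Ruby's to_i truncates and integer / floors, hence int() and // here.
    return int((mem_mb * 0.8) / 25 * 1024) // 2

for gear, mem in (("small", 512), ("medium", 1024), ("large", 2048)):
    size = stack_size_bytes(mem)
    print(f"{gear}: {size} bytes ({size % 4096} bytes off a 4k boundary)")
```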
Tweaking the stack size risks blowing things up, and it doesn't buy us much anyway: most of what the script itself does lives on the heap, which this setting doesn't control.
Perhaps a better alternative would be to tune down the number of threads we create instead.
We currently set 25 threads for every gear size. How about something like "10 threads per 1024MB" instead, which would result in the following table:
Gear size   Memory   threads
small       512MB    5
medium      1024MB   10
large       2048MB   20
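The proposed scaling can be sketched as follows (the exact expression the eventual fix used is not shown in this report, so the floor-with-minimum behavior is an assumption):

```python
def wsgi_threads(mem_mb, threads_per_gb=10):
    # "10 threads per 1024MB", floored, with at least 1 thread per gear.
    return max(1, mem_mb * threads_per_gb // 1024)

for gear, mem in (("small", 512), ("medium", 1024), ("large", 2048)):
    print(f"{gear}: {wsgi_threads(mem)} threads")
```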
Thanks a **lot** for this investigation. Yeah, I agree that the better way would be to set the number of threads or the number of processes. I'll work on this today and make a PR.
Commit pushed to master at https://github.com/openshift/origin-server
Bug 1020841 - Tune python cartridge by increasing number of threads instead of stack-size
It's fixed, verified on devenv_3932.