Bug 1120439
Summary: | MemoryError during /csv/action_export causes all subsequent HTTP requests to fail | |
---|---|---|---
Product: | [Retired] Beaker | Reporter: | Dan Callaghan <dcallagh>
Component: | general | Assignee: | Dan Callaghan <dcallagh>
Status: | CLOSED CURRENTRELEASE | QA Contact: | tools-bugs <tools-bugs>
Severity: | urgent | Docs Contact: |
Priority: | unspecified | |
Version: | 0.17 | CC: | aigao, asaha, dcallagh, pbunyan, rmancy
Target Milestone: | 0.17.3 | |
Target Release: | --- | |
Hardware: | Unspecified | |
OS: | Unspecified | |
Whiteboard: | | |
Fixed In Version: | | Doc Type: | Bug Fix
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2014-08-14 04:50:32 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Description
Dan Callaghan
2014-07-16 23:49:31 UTC
If the MemoryError was due to some large allocation failing, or enough of the stack has been unwound, then subsequent allocations may actually work.

This is quite a tricky one to reproduce in the gunicorn dev server. I was trying to fetch the systems CSV and gradually reducing rlimit_as until it no longer succeeded. The problem was that below 660000000 bytes, rather than hitting a MemoryError in Python land, the worker would abort with this bizarre message:

    libgcc_s.so.1 must be installed for pthread_cancel to work

That turned out to be because the MySQL client libraries do some pthread hackery which involves spawning a new thread that calls pthread_exit() on itself: http://osxr.org/mysql/source/mysys/my_thr_init.c#0054

In order to implement stack unwinding in pthread_exit(), glibc also has some hackery which dlopen()s libgcc_s.so so that it can use GCC's stack unwinding machinery: https://sourceware.org/ml/libc-help/2009-10/msg00023.html But that dlopen() was failing with ENOMEM because rlimit_as was already exceeded.

Anyway, it turns out I could reproduce the MemoryError by exporting the system key-values CSV with rlimit_as set to 700000000, I guess because the system key-values CSV is much larger and involves loading more stuff into the SQLAlchemy session. (This is using a production db dump.)

On Gerrit: http://gerrit.beaker-project.org/3216

This bug will stay at ON_QA until 0.17.3 passes smoke testing. We decided that independently verifying the fix was not feasible given how difficult it is to reproduce the exact failure scenario. While writing the patch I did verify on my development VM, with a production DB dump, that the worker process now aborts if session.close() fails due to MemoryError.

Beaker 0.17.3 has been released (https://beaker-project.org/docs/whats-new/release-0.17.html#beaker-0-17-3)
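For anyone trying to reproduce this, here is a minimal sketch of the rlimit_as setup described above. The limit values and the CSV export endpoint come from this report; using a gunicorn `post_fork` hook (and the file name `gunicorn_conf.py`) is only an assumption about how you might wire it up, not how the reporter actually did it.

```python
# gunicorn_conf.py -- hypothetical reproduction helper, not part of Beaker.
# Caps each worker's address space so that serving a large CSV export
# (e.g. the system key-values CSV) triggers MemoryError inside the worker.
import resource

# Values from the report: around 700000000 bytes reproduced the Python-level
# MemoryError; below about 660000000 the worker instead died in glibc while
# trying to dlopen() libgcc_s.so.1 for pthread_exit() stack unwinding.
RLIMIT_AS_BYTES = 700000000

def post_fork(server, worker):
    # Apply the address-space limit to the freshly forked worker process.
    resource.setrlimit(resource.RLIMIT_AS, (RLIMIT_AS_BYTES, RLIMIT_AS_BYTES))
```

With something like that in place, requesting the CSV export (for example, hitting /csv/action_export for the system key-values CSV against a large database) should hit the MemoryError path inside the worker rather than killing it outright.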
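The behaviour described above is that the worker aborts when even session cleanup runs out of memory, so the gunicorn master can replace it instead of leaving a wedged process to fail every subsequent request. A rough sketch of that behaviour, assuming a helper of this shape (this is not the actual change on Gerrit):

```python
import os

def close_session_or_abort(session):
    """Close the SQLAlchemy session; if even that raises MemoryError,
    the process is beyond recovery, so abort it and let the master
    (e.g. gunicorn) spawn a fresh worker."""
    try:
        session.close()
    except MemoryError:
        # Any further allocation may also fail, so don't try to limp along.
        os.abort()
```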