Bug 1854732

Summary: gnocchi_wsgi segfaults in envs with ceph
Product: Red Hat OpenStack
Reporter: Masayuki Igawa <migawa>
Component: gnocchi
Assignee: Matthias Runge <mrunge>
Status: CLOSED DUPLICATE
QA Contact: Leonid Natapov <lnatapov>
Severity: medium
Docs Contact:
Priority: medium
Version: 16.0 (Train)
CC: apevec, bhubbard, csibbitt, jjoyce, jschluet, lhh, lmadsen, mrunge, pkilambi
Target Milestone: async
Keywords: Triaged, ZStream
Target Release: 16.0 (Train on RHEL 8.1)
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2020-09-30 15:17:38 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Masayuki Igawa 2020-07-08 06:38:44 UTC
Description of problem:

In a RHOSP 16 setup with Ceph, the controller nodes show segfaults in gnocchi_wsgi.

<DATE> <HOSTNAME> kernel: gnocchi_wsgi   [10292]: segfault at 58166a87 ip 0000000058166a87 sp 00007fcc4a7fa870 error 14 in httpd[5636a4c6c000+86000]
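
For readers triaging similar messages, the fields in the kernel line above can be decoded with a short script. This is only a minimal sketch: it assumes the x86 Linux log format, in which the `error` field is printed in hexadecimal and carries the page-fault error-code bits. The names `parse_segfault_line` and `decode_error_code` are hypothetical helpers, not part of gnocchi or any Red Hat tooling.

```python
import re

# Assumed x86 Linux format:
#   comm[pid]: segfault at ADDR ip IP sp SP error CODE in OBJECT[BASE+SIZE]
# All numeric fields except the pid are hexadecimal.
SEGFAULT_RE = re.compile(
    r"(?P<comm>\S+)\s*\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<error>[0-9a-f]+)"
    r"(?: in (?P<object>[^\[\s]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\])?"
)


def decode_error_code(code):
    """Translate x86 page-fault error-code bits into short descriptions."""
    flags = ["protection violation" if code & 0x01 else "page not present"]
    if code & 0x02:
        flags.append("write access")
    flags.append("user mode" if code & 0x04 else "kernel mode")
    if code & 0x08:
        flags.append("reserved bit set")
    if code & 0x10:
        flags.append("instruction fetch")
    return flags


def parse_segfault_line(line):
    """Return the decoded fields of a segfault log line, or None if no match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    info = {"comm": m.group("comm"), "pid": int(m.group("pid"))}
    for key in ("addr", "ip", "sp", "error"):
        info[key] = int(m.group(key), 16)
    info["flags"] = decode_error_code(info["error"])
    if m.group("object"):
        base = int(m.group("base"), 16)
        size = int(m.group("size"), 16)
        info["object"] = m.group("object")
        # True only if the instruction pointer lies inside the reported mapping.
        info["ip_in_mapping"] = base <= info["ip"] < base + size
    return info


LINE = ("kernel: gnocchi_wsgi   [10292]: segfault at 58166a87 "
        "ip 0000000058166a87 sp 00007fcc4a7fa870 error 14 "
        "in httpd[5636a4c6c000+86000]")

if __name__ == "__main__":
    for key, value in parse_segfault_line(LINE).items():
        print(key, value)
```

Under that reading, `error 14` (0x14) is a user-mode instruction fetch from a page that is not present, and the faulting address equals the instruction pointer. The instruction pointer also falls outside the reported `httpd` range, which suggests a jump to an unmapped address rather than a fault inside httpd code itself.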


Version-Release number of selected component (if applicable):


How reproducible:
100%

Steps to Reproduce:
1. Set up gnocchi with Ceph
2. openstack metric measures show <ID>

Actual results:
The command returns an HTTP 500 error.
Segfaults occur.

Expected results:
No error occurs.

Additional info:
We are not 100% sure, but the issue seems to be specific to setups with Ceph.
We got core files from the crashes and tried to analyze them with gdb, but got very little information, as shown below.
~~~
warning: .dynamic section for "/usr/lib64/python3.6/site-packages/markupsafe/_speedups.cpython-36m-x86_64-linux-gnu.so" is not at the expected address (wrong library or version mismatch?)

warning: Could not load shared library symbols for 15 libraries, e.g. /etc/httpd/modules/mod_ssl.so.
Use the "info sharedlibrary" command to see the complete listing.
Do you need "set solib-search-path" or "set sysroot"?
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `gnocchi_wsgi    -DFOREGROUND'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00000000d4596a87 in ?? ()
[Current thread is 1 (Thread 0x7f46c5ffb700 (LWP 841))]
(gdb) bt
#0  0x00000000d4596a87 in ?? ()
#1  0x00007f46d77dd250 in ?? ()
#2  0x00007f46d4cbdb08 in ?? ()
#3  0x0000000000000034 in ?? ()
#4  0x00007f46d4d61db8 in ?? ()
#5  0x00007f46d4cb66c8 in ?? ()
#6  0x00007f46d4cfdac0 in ?? ()
#7  0x00007f47090304c0 in small_ints () from /lib64/libpython3.6m.so.1.0
#8  0x11f3d9dfdcdf7e00 in ?? ()
#9  0x0000000000000000 in ?? ()
(gdb) 
~~~

Comment 1 Chris Sibbitt 2020-07-08 15:08:17 UTC
Potentially related to https://bugzilla.redhat.com/show_bug.cgi?id=1727907

Comment 3 Masayuki Igawa 2020-07-14 00:28:07 UTC
Hi,

Does it look similar to this bug? https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/1813582
This is an Ubuntu bug, though.

Comment 5 Matthias Runge 2020-09-03 13:42:44 UTC
The solution is the same as for https://bugzilla.redhat.com/show_bug.cgi?id=1727907

Comment 8 Masayuki Igawa 2020-09-07 01:21:46 UTC
Hi Matthias,
Thank you for the information! I'll check the BZ.

Comment 12 Matthias Runge 2020-09-30 15:17:38 UTC
We are closing this for now. Please re-open if we need to fix it in async or as a hotfix.

*** This bug has been marked as a duplicate of bug 1727907 ***