Bug 1747901

Summary: ipactl command for ipa.service segfaults on process exit.
Product: [Fedora] Fedora
Reporter: Christian Heimes <cheimes>
Component: python3
Assignee: Charalampos Stratakis <cstratak>
Status: CLOSED RAWHIDE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Priority: unspecified
Version: rawhide
CC: awilliam, cstratak, dmalcolm, jpazdziora, lslebodn, m.cyprian, mhroncok, pviktori, rkuska, robatino, shcherbina.iryna, slavek.kabrda, tomspur, torsava, vstinner
Hardware: Unspecified   
OS: Unspecified   
Whiteboard: openqa
Last Closed: 2019-10-07 15:56:28 UTC
Type: Bug
Bug Blocks: 1705303    

Description Christian Heimes 2019-09-02 08:50:27 UTC
Description of problem:
The ipactl command used by ipa.service segfaults on process exit.

Version-Release number of selected component (if applicable):
freeipa-server-4.8.1-2.fc32.x86_64
python3-3.8.0~b4-1.fc32.x86_64

How reproducible:
always

Steps to Reproduce:
1. Install FreeIPA with ipa-server-install
2. /usr/sbin/ipactl start

Actual results:
Existing service file detected!
Assuming stale, cleaning and proceeding
Starting Directory Service
Starting krb5kdc Service
Starting kadmin Service
Starting httpd Service
Starting ipa-custodia Service
Starting pki-tomcatd Service
Starting ipa-otpd Service
ipa: INFO: The ipactl command was successful
Segmentation fault (core dumped)


Expected results:
No segfault

Additional info:
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7be354f in _PyFunction_Vectorcall (func=<function at remote 0x7fffe6ebfd30>, stack=0x7fffffffdb50, nargsf=1, kwnames=0x0) at /usr/src/debug/python3-3.8.0~b4-1.fc32.x86_64/Objects/call.c:395
395         Py_ssize_t nkwargs = (kwnames == NULL) ? 0 : PyTuple_GET_SIZE(kwnames);
(gdb) bt
#0  0x00007ffff7be354f in _PyFunction_Vectorcall (func=<function at remote 0x7fffe6ebfd30>, stack=0x7fffffffdb50, nargsf=1, kwnames=0x0) at /usr/src/debug/python3-3.8.0~b4-1.fc32.x86_64/Objects/call.c:395
#1  0x00007ffff7bb16b0 in _PyObject_Vectorcall (kwnames=0x0, nargsf=1, args=0x7fffffffdb50, callable=<function at remote 0x7fffe6ebfd30>) at /usr/src/debug/python3-3.8.0~b4-1.fc32.x86_64/Include/cpython/abstract.h:127
#2  _PyObject_FastCall (nargs=1, args=0x7fffffffdb50, func=<function at remote 0x7fffe6ebfd30>) at /usr/src/debug/python3-3.8.0~b4-1.fc32.x86_64/Include/cpython/abstract.h:147
#3  object_vacall (base=<optimized out>, callable=<function at remote 0x7fffe6ebfd30>, vargs=<optimized out>) at /usr/src/debug/python3-3.8.0~b4-1.fc32.x86_64/Objects/call.c:1186
#4  0x00007ffff7bb19e1 in PyObject_CallFunctionObjArgs (callable=callable@entry=<function at remote 0x7fffe6ebfd30>) at /usr/src/debug/python3-3.8.0~b4-1.fc32.x86_64/Objects/call.c:1259
#5  0x00007ffff7c3f7c3 in handle_callback (ref=<optimized out>, callback=<function at remote 0x7fffe6ebfd30>) at /usr/src/debug/python3-3.8.0~b4-1.fc32.x86_64/Objects/weakrefobject.c:877
#6  0x00007ffff7c01bf7 in PyObject_ClearWeakRefs (object=<optimized out>) at /usr/src/debug/python3-3.8.0~b4-1.fc32.x86_64/Objects/weakrefobject.c:922
#7  0x00007fffe905babc in ctypedescr_dealloc () from /usr/lib64/python3.8/site-packages/_cffi_backend.cpython-38-x86_64-linux-gnu.so
#8  0x00007fffe9058825 in cfield_dealloc () from /usr/lib64/python3.8/site-packages/_cffi_backend.cpython-38-x86_64-linux-gnu.so
#9  0x00007ffff7ba4c65 in _Py_DECREF (filename=<synthetic pointer>, lineno=541, op=<optimized out>) at /usr/src/debug/python3-3.8.0~b4-1.fc32.x86_64/Objects/dictobject.c:2000
#10 _Py_XDECREF (op=<optimized out>) at /usr/src/debug/python3-3.8.0~b4-1.fc32.x86_64/Include/object.h:541
#11 free_keys_object (keys=0x55555609f590) at /usr/src/debug/python3-3.8.0~b4-1.fc32.x86_64/Objects/dictobject.c:580
#12 dictkeys_decref (dk=0x55555609f590) at /usr/src/debug/python3-3.8.0~b4-1.fc32.x86_64/Objects/dictobject.c:324
#13 dict_dealloc (mp=0x7fffe6dd5340) at /usr/src/debug/python3-3.8.0~b4-1.fc32.x86_64/Objects/dictobject.c:1994
#14 0x00007fffe905bb25 in ctypedescr_dealloc () from /usr/lib64/python3.8/site-packages/_cffi_backend.cpython-38-x86_64-linux-gnu.so
#15 0x00007ffff7ba5058 in _Py_DECREF (filename=<optimized out>, lineno=541, op=<optimized out>) at /usr/src/debug/python3-3.8.0~b4-1.fc32.x86_64/Include/object.h:478
#16 _Py_XDECREF (op=<optimized out>) at /usr/src/debug/python3-3.8.0~b4-1.fc32.x86_64/Include/object.h:541
#17 tupledealloc (op=0x7fffe6e3bbb0) at /usr/src/debug/python3-3.8.0~b4-1.fc32.x86_64/Objects/tupleobject.c:247
#18 0x00007ffff7ba5058 in _Py_DECREF (filename=<optimized out>, lineno=541, op=<optimized out>) at /usr/src/debug/python3-3.8.0~b4-1.fc32.x86_64/Include/object.h:478
#19 _Py_XDECREF (op=<optimized out>) at /usr/src/debug/python3-3.8.0~b4-1.fc32.x86_64/Include/object.h:541
#20 tupledealloc (op=0x7fffe6e55b80) at /usr/src/debug/python3-3.8.0~b4-1.fc32.x86_64/Objects/tupleobject.c:247
#21 0x00007ffff7c16b76 in _Py_DECREF (filename=0x7ffff7d2e6a8 "/builddir/build/BUILD/Python-3.8.0b4/Objects/typeobject.c", lineno=1110, op=<optimized out>) at /usr/src/debug/python3-3.8.0~b4-1.fc32.x86_64/Include/object.h:478
#22 clear_slots (self=<optimized out>, type=0x55555559d0d0) at /usr/src/debug/python3-3.8.0~b4-1.fc32.x86_64/Objects/typeobject.c:1110
#23 subtype_dealloc (self=<KeyedRef at remote 0x7fffe6deb220>) at /usr/src/debug/python3-3.8.0~b4-1.fc32.x86_64/Objects/typeobject.c:1262
#24 0x00007ffff7ba4c65 in _Py_DECREF (filename=<synthetic pointer>, lineno=541, op=<optimized out>) at /usr/src/debug/python3-3.8.0~b4-1.fc32.x86_64/Objects/dictobject.c:2000
#25 _Py_XDECREF (op=<optimized out>) at /usr/src/debug/python3-3.8.0~b4-1.fc32.x86_64/Include/object.h:541
#26 free_keys_object (keys=0x555556078d30) at /usr/src/debug/python3-3.8.0~b4-1.fc32.x86_64/Objects/dictobject.c:580
#27 dictkeys_decref (dk=0x555556078d30) at /usr/src/debug/python3-3.8.0~b4-1.fc32.x86_64/Objects/dictobject.c:324
#28 dict_dealloc (mp=0x7fffe6eba6c0) at /usr/src/debug/python3-3.8.0~b4-1.fc32.x86_64/Objects/dictobject.c:1994
#29 0x00007ffff7b98a4b in _Py_DECREF (filename=<synthetic pointer>, lineno=541, op=<optimized out>) at /usr/src/debug/python3-3.8.0~b4-1.fc32.x86_64/Include/object.h:478
#30 _Py_XDECREF (op=<optimized out>) at /usr/src/debug/python3-3.8.0~b4-1.fc32.x86_64/Include/object.h:541
#31 cell_dealloc (op=0x7fffe6ebe2e0) at /usr/src/debug/python3-3.8.0~b4-1.fc32.x86_64/Objects/cellobject.c:84
#32 0x00007ffff7ba5058 in _Py_DECREF (filename=<optimized out>, lineno=541, op=<optimized out>) at /usr/src/debug/python3-3.8.0~b4-1.fc32.x86_64/Include/object.h:478
#33 _Py_XDECREF (op=<optimized out>) at /usr/src/debug/python3-3.8.0~b4-1.fc32.x86_64/Include/object.h:541
#34 tupledealloc (op=0x7fffe6ebe310) at /usr/src/debug/python3-3.8.0~b4-1.fc32.x86_64/Objects/tupleobject.c:247
#35 0x00007ffff7b991c2 in _Py_DECREF (filename=0x7ffff7d2e460 "/builddir/build/BUILD/Python-3.8.0b4/Objects/funcobject.c", lineno=584, op=<optimized out>) at /usr/src/debug/python3-3.8.0~b4-1.fc32.x86_64/Include/object.h:478
#36 func_clear (op=0x7fffe6ebfd30) at /usr/src/debug/python3-3.8.0~b4-1.fc32.x86_64/Objects/funcobject.c:584
#37 0x00007ffff7ba5f84 in delete_garbage (state=<optimized out>, state=<optimized out>, old=<optimized out>, collectable=<optimized out>) at /usr/src/debug/python3-3.8.0~b4-1.fc32.x86_64/Modules/gcmodule.c:929
#38 collect (generation=2, n_collected=0x0, n_uncollectable=0x0, nofail=1, state=0x7ffff7dd9598 <_PyRuntime+344>) at /usr/src/debug/python3-3.8.0~b4-1.fc32.x86_64/Modules/gcmodule.c:1106
#39 0x00007ffff7cac8c2 in _PyGC_CollectNoFail () at /usr/src/debug/python3-3.8.0~b4-1.fc32.x86_64/Modules/gcmodule.c:1848
#40 0x00007ffff7cacb74 in PyImport_Cleanup () at /usr/src/debug/python3-3.8.0~b4-1.fc32.x86_64/Python/import.c:541
#41 0x00007ffff7cacf56 in Py_FinalizeEx () at /usr/src/debug/python3-3.8.0~b4-1.fc32.x86_64/Python/pylifecycle.c:1226
#42 0x00007ffff7cad088 in Py_Exit (sts=0) at /usr/src/debug/python3-3.8.0~b4-1.fc32.x86_64/Python/pylifecycle.c:2248
#43 0x00007ffff7cad0cf in handle_system_exit () at /usr/src/debug/python3-3.8.0~b4-1.fc32.x86_64/Python/pythonrun.c:658
#44 0x00007ffff7cad219 in _PyErr_PrintEx (set_sys_last_vars=1, tstate=0x55555555bfb0) at /usr/src/debug/python3-3.8.0~b4-1.fc32.x86_64/Python/pythonrun.c:755
#45 PyErr_PrintEx (set_sys_last_vars=set_sys_last_vars@entry=1) at /usr/src/debug/python3-3.8.0~b4-1.fc32.x86_64/Python/pythonrun.c:755
#46 0x00007ffff7cad23a in PyErr_Print () at /usr/src/debug/python3-3.8.0~b4-1.fc32.x86_64/Python/pythonrun.c:761
#47 0x00007ffff7b91b42 in PyRun_SimpleFileExFlags (fp=<optimized out>, filename=<optimized out>, closeit=<optimized out>, flags=0x7fffffffe178) at /usr/src/debug/python3-3.8.0~b4-1.fc32.x86_64/Python/pythonrun.c:434
#48 0x00007ffff7cae36f in pymain_run_file (cf=0x7fffffffe178, config=0x55555555b410) at /usr/src/debug/python3-3.8.0~b4-1.fc32.x86_64/Modules/main.c:383
#49 pymain_run_python (exitcode=0x7fffffffe170) at /usr/src/debug/python3-3.8.0~b4-1.fc32.x86_64/Modules/main.c:567
#50 Py_RunMain () at /usr/src/debug/python3-3.8.0~b4-1.fc32.x86_64/Modules/main.c:646
#51 0x00007ffff7cae559 in Py_BytesMain (argc=<optimized out>, argv=<optimized out>) at /usr/src/debug/python3-3.8.0~b4-1.fc32.x86_64/Modules/main.c:700
#52 0x00007ffff7e21193 in __libc_start_main () from /lib64/libc.so.6
#53 0x000055555555508e in _start ()


Sep 02 04:34:25 host-10-0-138-15.ipa.example systemd[1]: ipa.service: Main process exited, code=dumped, status=11/SEGV
Sep 02 04:34:25 host-10-0-138-15.ipa.example systemd[1]: ipa.service: Failed with result 'core-dump'.
Sep 02 04:34:25 host-10-0-138-15.ipa.example systemd[1]: Failed to start Identity, Policy, Audit.
Sep 02 04:34:25 host-10-0-138-15.ipa.example audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=ipa comm="systemd" exe="/usr/lib/systemd/systemd" ho>
Sep 02 04:34:25 host-10-0-138-15.ipa.example systemd[1]: ipa.service: Consumed 2.168s CPU time.
Sep 02 04:34:25 host-10-0-138-15.ipa.example audit[25892]: ANOM_ABEND auid=0 uid=0 gid=0 ses=4 subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 pid=25892 comm="ipa-server-inst" exe="/usr/bin/python>
Sep 02 04:34:25 host-10-0-138-15.ipa.example kernel: ipa-server-inst[25892]: segfault at 18 ip 00007f45139cf54f sp 00007ffcdc4e47b0 error 4 in libpython3.8.so.1.0[7f45138dd000+1be000]
Sep 02 04:34:25 host-10-0-138-15.ipa.example kernel: Code: a8 e9 92 d5 f5 ff e9 8d d5 f5 ff 41 56 41 55 41 54 49 89 f4 55 53 4c 8b 57 10 48 89 d3 48 8b 77 18 4c 8b 47 20 48 0f ba f3 3f <41> 8b 42 18 48 85 c9>
Sep 02 04:34:25 host-10-0-138-15.ipa.example systemd[1]: Started Process Core Dump (PID 29946/UID 0).
Sep 02 04:34:25 host-10-0-138-15.ipa.example audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=systemd-coredump@19-29946-0 comm="systemd" exe="/usr>
Sep 02 04:34:25 host-10-0-138-15.ipa.example systemd-coredump[29945]: Process 29922 (ipactl) of user 0 dumped core.
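The kernel line in the journal above can be decoded by hand. The following is a minimal sketch, not part of the original report: the helper name decode_segfault and its result keys are made up for illustration, and the bit meanings assume the x86 page-fault error code layout (bit 0: page present, bit 1: write access, bit 2: user mode). It suggests a user-mode read at address 0x18, which is consistent with reading a field through a NULL pointer.

```python
import re

# Kernel message copied from the journal output in this report.
KERNEL_LINE = (
    "ipa-server-inst[25892]: segfault at 18 ip 00007f45139cf54f "
    "sp 00007ffcdc4e47b0 error 4 in libpython3.8.so.1.0[7f45138dd000+1be000]"
)

# All numeric fields in this message are hexadecimal.
PATTERN = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) "
    r"sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+) "
    r"in (?P<obj>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_segfault(line):
    """Hypothetical helper: split a kernel segfault line into its parts."""
    m = PATTERN.search(line)
    if m is None:
        raise ValueError("not a kernel segfault line")
    err = int(m["err"], 16)
    base = int(m["base"], 16)
    ip = int(m["ip"], 16)
    return {
        "fault_address": int(m["addr"], 16),
        "object": m["obj"],
        # Offset of the faulting instruction inside the mapped library.
        "ip_offset": ip - base,
        "inside_mapping": 0 <= ip - base < int(m["size"], 16),
        # x86 page-fault error code bits (assumed layout, see lead-in).
        "page_present": bool(err & 1),
        "write_access": bool(err & 2),
        "user_mode": bool(err & 4),
    }


info = decode_segfault(KERNEL_LINE)
print(hex(info["fault_address"]), hex(info["ip_offset"]), info["object"])
```

The low 12 bits of the instruction pointer (0x54f) match the address gdb reports for _PyFunction_Vectorcall in frame #0, as expected when both runs crash at the same instruction.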

Comment 1 Christian Heimes 2019-09-02 09:33:12 UTC
It looks like the func object is the local remove() function of the weakref.WeakValueDictionary() implementation, https://github.com/python/cpython/blob/353053d9ad08fea0e205e6c008b8a4350c0188e6/Lib/weakref.py#L90-L112

All function object fields except func_qualname and vectorcall are already NULL.

(gdb) print ((PyFunctionObject*)func).func_annotations 
$19 = 0x0
(gdb) print ((PyFunctionObject*)func).func_closure 
$20 = 0x0
(gdb) print ((PyFunctionObject*)func).func_code
$21 = 0x0
(gdb) print ((PyFunctionObject*)func).func_defaults 
$22 = 0x0
(gdb) print ((PyFunctionObject*)func).func_dict
$23 = 0x0
(gdb) print ((PyFunctionObject*)func).func_doc
$24 = 0x0
(gdb) print ((PyFunctionObject*)func).func_globals 
$25 = 0x0
(gdb) print ((PyFunctionObject*)func).func_kwdefaults 
$26 = 0x0
(gdb) print ((PyFunctionObject*)func).func_module 
$27 = 0x0
(gdb) print ((PyFunctionObject*)func).func_name
$28 = 0x0
(gdb) print ((PyFunctionObject*)func).func_qualname 
$29 = 'WeakValueDictionary.__init__.<locals>.remove'
(gdb) print ((PyFunctionObject*)func).func_weakreflist 
$30 = 0x0
(gdb) print ((PyFunctionObject*)func).vectorcall 
$31 = (vectorcallfunc) 0x7ffff7be3530 <_PyFunction_Vectorcall>
(gdb) print func.ob_refcnt 
$32 = 135
(gdb) print func.ob_type 
$33 = (struct _typeobject *) 0x7ffff7dd4e80 <PyFunction_Type>
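For readers unfamiliar with the mechanism, here is a minimal sketch of how weak reference callbacks and WeakValueDictionary behave in normal operation. It is not the code path that crashes; the Payload class and the on_dead callback are made-up names for illustration. The remove() function mentioned above plays the same role as on_dead here: it is invoked when a referenced value dies, so that the dictionary entry can be dropped.

```python
import gc
import weakref


class Payload:
    """Arbitrary weak-referenceable object used only for this illustration."""


events = []


def on_dead(dead_ref):
    # Called by the interpreter once the referent has been destroyed;
    # at that point the weak reference no longer resolves to an object.
    events.append(dead_ref() is None)


obj = Payload()
ref = weakref.ref(obj, on_dead)  # keep `ref` alive so the callback can fire

cache = weakref.WeakValueDictionary()
cache["key"] = obj
assert "key" in cache

del obj       # drop the only strong reference
gc.collect()  # not required on CPython (refcounting), harmless elsewhere

assert events == [True]
assert "key" not in cache
```

In the crash reported here, a callback of this kind is invoked during interpreter shutdown after the function object itself has already been cleared.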

Comment 2 Christian Heimes 2019-09-02 09:42:43 UTC
The crash occurs because PyFunction_GET_CODE(func) returns NULL and _PyFunction_Vectorcall() dereferences the code object without first checking it for NULL.
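As a rough analogy only, the failure mode can be modelled in pure Python. This is a toy sketch, not CPython's implementation and not the upstream fix: ToyFunction, call_unchecked and call_checked are hypothetical names, and a Python exception stands in for the segfault.

```python
class ToyFunction:
    """Hypothetical stand-in for a function object; not a CPython API."""

    def __init__(self, code):
        self.code = code  # plays the role of func_code

    def clear(self):
        # Plays the role of a tp_clear slot: drop references to break cycles.
        self.code = None


def call_unchecked(func, *args):
    # Mirrors the unguarded path: uses func.code without checking it.
    return func.code(*args)


def call_checked(func, *args):
    # Guarded variant: refuse to run a function that was already cleared.
    if func.code is None:
        raise RuntimeError("function was already cleared")
    return func.code(*args)


f = ToyFunction(lambda x: x + 1)
assert call_checked(f, 1) == 2

f.clear()  # the object is still referenced, but its contents are gone

try:
    call_unchecked(f, 1)
except TypeError as exc:
    outcome = "unchecked: " + type(exc).__name__
else:
    outcome = "unchecked: no error"
print(outcome)
```

In C there is no exception to fall back on, so the unguarded access reads through a NULL pointer and the process dies with SIGSEGV.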

Comment 3 Christian Heimes 2019-09-03 08:26:37 UTC
Victor and Pablo have analysed the issue and think that it might be related to a problem in CFFI, too. I have opened https://bitbucket.org/cffi/cffi/issues/416/python-38-segfault-cfield_type-does-not against CFFI. Armin Rigo is looking into the problem. I'll open a RHBZ against CFFI after CFFI upstream decides it's a bug.

Comment 4 Victor Stinner 2019-09-04 09:02:17 UTC
I'm still investigating the complex garbage collector issue. It seems like the root issue is a bug in Python itself.

Comment 5 Christian Heimes 2019-09-05 07:13:48 UTC
*** Bug 1747913 has been marked as a duplicate of this bug. ***

Comment 6 Adam Williamson 2019-09-06 00:18:38 UTC
openQA is seeing this too, now that https://bugzilla.redhat.com/show_bug.cgi?id=1745450 is fixed in Rawhide. This blocks F32 Beta, as it prevents FreeIPA server deployment/operation from working.

Comment 7 Christian Heimes 2019-09-09 17:11:44 UTC
Victor Stinner has pushed a workaround for the segfault upstream, https://github.com/python/cpython/pull/15787

The patch does not address the root cause, but it gets rid of the segfault.

Comment 8 Victor Stinner 2019-09-12 10:14:36 UTC
I also pushed another change which fixes one of the root issues (there are multiple root issues, it's a complex bug): I removed the func_clear() function, which was added in Python 3.8.
https://github.com/python/cpython/pull/15826

The fix will be part of the next release, Python 3.8.0rc1 (expected release date: 2019-09-30).
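A script that wants to know whether it is running on an interpreter that predates this fix could compare sys.version_info against 3.8.0rc1. This is a minimal sketch under a stated assumption: it treats every 3.8 pre-release before rc1 as potentially affected, whereas this report only confirms the crash on 3.8.0b4. The function name may_be_affected is made up for illustration.

```python
import sys

# First release expected to contain the fix, per the comment above.
FIRST_FIXED = (3, 8, 0, "candidate", 1)
# Assumption: any 3.8 pre-release before rc1 may be affected.
FIRST_38 = (3, 8, 0, "alpha", 0)


def may_be_affected(version_info=None):
    """Return True for 3.8 pre-releases older than 3.8.0rc1.

    Release levels compare correctly as strings because
    "alpha" < "beta" < "candidate" < "final" alphabetically.
    """
    v = tuple(sys.version_info if version_info is None else version_info)
    return FIRST_38 <= v < FIRST_FIXED


print(may_be_affected((3, 8, 0, "beta", 4)))
print(may_be_affected())
```

The first call checks the interpreter version from this report (python3-3.8.0~b4); the second checks whichever interpreter runs the script.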

Comment 9 Jan Pazdziora (Red Hat) 2019-10-06 06:39:47 UTC
With python3-3.8.0~rc1-1.fc32.x86_64 I no longer see the failure: https://travis-ci.org/adelton/freeipa-container/jobs/594119521

Comment 10 Adam Williamson 2019-10-07 15:56:12 UTC
Yeah, openQA server deployment is working again now too. Though it seems webUI login always fails, I'll need to look into that.