Bug 1977410
Summary: | Python scripts crash with ANOM_ABEND when SELinux is enabled
---|---
Product: | Red Hat Enterprise Linux 7
Component: | libffi
Version: | 7.9
Status: | CLOSED WONTFIX
Severity: | urgent
Priority: | high
Reporter: | James Chamberlain <james.chamberlain>
Assignee: | DJ Delorie <dj>
QA Contact: | qe-baseos-tools-bugs
CC: | amike, codonell, fweimer, jwright, pandrade, pviktori, vstinner
Target Milestone: | rc
Keywords: | Triaged
Hardware: | Unspecified
OS: | Unspecified
Doc Type: | If docs needed, set a value
Last Closed: | 2022-05-13 17:31:54 UTC
Type: | Bug
Created attachment 1795886: strace when running the sample code.
What is this script supposed to do? Why are you using internal private API from _ctypes (rather than ctypes)?

Ah, I see: it's a simplified example, and I can reproduce with public API as well.

I can reproduce the issue with Python 2.7.18 and 3.9.5 on Fedora 34, but also with the Python development branch (future 3.11).

The crash occurs at "callback = None" in the parent process. It seems to be a crash in ffi_closure_free() if the process uses fork().

The closure is allocated by Python _ctypes_alloc_callback() -> ffi_closure_alloc(48, &code) -> libffi dlmalloc() -> libffi sys_alloc() -> libffi dlmmap().

dlmmap() (src/closures.c of libffi) allocates memory using mmap():
---
24038 openat(AT_FDCWD, "/tmp/ffiiuLnmM", O_RDWR|O_CREAT|O_EXCL, 0600) = 3
24038 unlink("/tmp/ffiiuLnmM") = 0
24038 ftruncate(3, 4096) = 0
24038 mmap(NULL, 4096, PROT_READ|PROT_EXEC, MAP_SHARED, 3, 0) = 0x7f9a71d08000
24038 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_SHARED, 3, 0) = 0x7f9a71d07000
---

The crash occurs in PyCFuncPtr_dealloc(): PyCFuncPtr_dealloc -> PyCFuncPtr_clear -> PyCData_clear -> CThunkObject_dealloc -> ffi_closure_free -> abort().
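The mmap() calls in the strace above map an unlinked temporary file with MAP_SHARED. As a rough illustration of why that matters once fork() is involved, here is a minimal sketch using only the Python standard library (no libffi). The helper name child_write_visible_in_parent is made up for this example, and the sketch assumes a Linux system with a writable temporary directory. It shows that a write made by a forked child is visible to the parent through a MAP_SHARED file mapping, but not through a MAP_PRIVATE one.

```python
import mmap
import os
import tempfile


def child_write_visible_in_parent(flags):
    """Map a 4096-byte unlinked temp file, fork, let the child write one byte,
    and return the byte the parent reads afterwards.

    Hypothetical helper for illustration only; it is not part of libffi or Python.
    """
    with tempfile.TemporaryFile() as f:       # anonymous/unlinked temp file
        f.truncate(4096)
        m = mmap.mmap(f.fileno(), 4096, flags=flags,
                      prot=mmap.PROT_READ | mmap.PROT_WRITE)
        m[0:1] = b"P"                         # parent writes before fork()
        pid = os.fork()
        if pid == 0:                          # child process
            m[0:1] = b"C"                     # child modifies the mapping
            os._exit(0)                       # exit without running cleanup
        os.waitpid(pid, 0)                    # parent waits for the child
        data = m[0:1]
        m.close()
        return data


shared = child_write_visible_in_parent(mmap.MAP_SHARED)
private = child_write_visible_in_parent(mmap.MAP_PRIVATE)
print(shared, private)   # b'C' b'P'
```

With MAP_SHARED the parent observes the child's write; with MAP_PRIVATE each process keeps its own copy after fork(). Allocator bookkeeping stored in a shared mapping is exposed to the same effect.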
(gdb) where
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:49
#1 0x00007ffff7c4f8a4 in __GI_abort () at abort.c:79
#2 0x00007ffff7fb1307 in ffi_closure_free.cold () from /lib64/libffi.so.6
#3 0x00007fffea3b74cc in CThunkObject_dealloc (myself=<_ctypes.CThunkObject at remote 0x7fffea412510>) at /home/vstinner/python/main/Modules/_ctypes/callbacks.c:23
#4 0x000000000047ca4c in _Py_Dealloc (op=<_ctypes.CThunkObject at remote 0x7fffea412510>) at Objects/object.c:2246
#5 0x000000000045fa17 in _Py_DECREF (filename=0x750b40 "./Include/object.h", lineno=569, op=<_ctypes.CThunkObject at remote 0x7fffea412510>) at ./Include/object.h:502
#6 0x000000000045fa65 in _Py_XDECREF (op=<_ctypes.CThunkObject at remote 0x7fffea412510>) at ./Include/object.h:569
#7 0x00000000004616ad in free_keys_object (keys=0x7fffea4122c0) at Objects/dictobject.c:597
#8 0x0000000000460ba2 in dictkeys_decref (dk=0x7fffea4122c0) at Objects/dictobject.c:310
#9 0x0000000000465368 in dict_dealloc (mp=0x7fffea5b5a90) at Objects/dictobject.c:1946
#10 0x000000000047ca4c in _Py_Dealloc (op={'0': <_ctypes.CThunkObject at remote 0x7fffea412510>}) at Objects/object.c:2246
#11 0x00007fffea3aacd9 in _Py_DECREF (filename=0x7fffea3c2180 "/home/vstinner/python/main/Modules/_ctypes/_ctypes.c", lineno=2770, op={'0': <_ctypes.CThunkObject at remote 0x7fffea412510>}) at ./Include/object.h:502
#12 0x00007fffea3b01aa in PyCData_clear (self=0x7fffea632e40) at /home/vstinner/python/main/Modules/_ctypes/_ctypes.c:2770
#13 0x00007fffea3b3c6e in PyCFuncPtr_clear (self=0x7fffea632e40) at /home/vstinner/python/main/Modules/_ctypes/_ctypes.c:4262
#14 0x00007fffea3b3c88 in PyCFuncPtr_dealloc (self=0x7fffea632e40) at /home/vstinner/python/main/Modules/_ctypes/_ctypes.c:4268
#15 0x0000000000494483 in subtype_dealloc (self=<CFunctionType at remote 0x7fffea632e40>) at Objects/typeobject.c:1480
#16 0x000000000047ca4c in _Py_Dealloc (op=<CFunctionType at remote 0x7fffea632e40>) at Objects/object.c:2246
#17 0x000000000045fa17 in _Py_DECREF (filename=0x750b40 "./Include/object.h", lineno=569, op=<CFunctionType at remote 0x7fffea632e40>) at ./Include/object.h:502
#18 0x000000000045fa65 in _Py_XDECREF (op=<CFunctionType at remote 0x7fffea632e40>) at ./Include/object.h:569
# Python: callback = None
#19 0x0000000000462bea in insertdict (mp=0x7fffea5b5710, key='callback', hash=1245351814164185, value=None) at Objects/dictobject.c:1020
#20 0x0000000000463f4f in PyDict_SetItem (op={...}, key='callback', value=None) at Objects/dictobject.c:1493
#21 0x00000000005184b6 in _PyEval_EvalFrameDefault (tstate=0x8f4ea0, f=Frame 0x7fffea6be3d0, for file /home/vstinner/bz1977410.py, line 26, in <module> (), throwflag=0) at Python/ceval.c:2617

Zoom into the CThunkObject_dealloc() crash:

Program received signal SIGABRT, Aborted.
__GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:49
49 return ret;
(...)
#1 0x00007ffff7c4f8a4 in __GI_abort () at abort.c:79
79 raise (SIGABRT);
#2 0x00007ffff7fb1307 in dlfree (mem=<optimized out>) at ../src/dlmalloc.c:4345
4345 USAGE_ERROR_ACTION(fm, p);
#3 ffi_closure_free (ptr=<optimized out>) at ../src/closures.c:615
615 dlfree (ptr);
#4 0x00007fffea3b74cc in CThunkObject_dealloc (myself=<_ctypes.CThunkObject at remote 0x7fffea412510>) at /home/vstinner/python/main/Modules/_ctypes/callbacks.c:23
23 Py_ffi_closure_free(self->pcl_write);

ffi_closure_alloc() function:

578  /* Allocate a chunk of memory with the given size. Returns a pointer
579     to the writable address, and sets *CODE to the executable
580     corresponding virtual address. */
581  void *
582  ffi_closure_alloc (size_t size, void **code)
583  {
584    void *ptr;
585
586    if (!code)
587      return NULL;
588
589    ptr = dlmalloc (size);
590
591    if (ptr)
592      {
593        msegmentptr seg = segment_holding (gm, ptr);
594
595        *code = add_segment_exec_offset (ptr, seg);
596      }
597
598    return ptr;
599  }

ffi_closure_free() function:

601  /* Release a chunk of memory allocated with ffi_closure_alloc. If
602     FFI_CLOSURE_FREE_CODE is nonzero, the given address can be the
603     writable or the executable address given. Otherwise, only the
604     writable address can be provided here. */
605  void
606  ffi_closure_free (void *ptr)
607  {
608  #if FFI_CLOSURE_FREE_CODE
609    msegmentptr seg = segment_holding_code (gm, ptr);
610
611    if (seg)
612      ptr = sub_segment_exec_offset (ptr, seg);
613  #endif
614
615    dlfree (ptr); // <==== CRASH HERE
616  }

Created attachment 1799983: ffi_closure_fork.c
I can reproduce the abort() without Python: try the attached ffi_closure_fork.c, which only uses libffi.

The root issue is that fork() doesn't duplicate the memory page; the memory is shared between the parent and the child process. When the child process calls ffi_closure_free(closure), it writes into a memory mapping shared with its parent. In libffi, the dlmmap() function in src/closures.c creates a memory mapping using a temporary file. See the syscalls below.

---

The reproducer calls ffi_closure_alloc(48), then fork(), then calls ffi_closure_free(closure) in the child process, and then calls ffi_closure_free(closure) in the parent process => ffi_closure_free(closure) in the parent process calls abort().

Debug on Fedora 34.

$ grep selinux /proc/mounts
selinuxfs /sys/fs/selinux selinuxfs rw,nosuid,noexec,relatime 0 0

syscalls made by ffi_closure_alloc(48) according to strace:
---
# check if SELinux is enabled
179976 statfs("/selinux", 0x7ffc03513a40) = -1 ENOENT (No such file or directory)
179976 openat(AT_FDCWD, "/proc/mounts", O_RDONLY) = 3
179976 newfstatat(3, "", {st_mode=S_IFREG|0444, st_size=0, ...}, AT_EMPTY_PATH) = 0
179976 read(3, "proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0\nsysfs /sys sysfs rw,seclabel,nosuid,nodev,noexec,relatime 0 0\ndevtmpfs /dev devtmpfs rw,seclabel,nosuid,size=16327432k,nr_inodes=4081858,mode=755,inode64 0 0\nsecurityfs /sys/kernel/security securityfs rw,nosuid,nodev,noexec,relatime 0 0\ntmpfs /dev/shm tmpfs rw,seclabel,nosuid,nodev,inode64 0 0\ndevpts /dev/pts devpts rw,seclabel,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000 0 0\ntmpfs /run tmpfs rw,seclabel,nosuid,nodev,size=6539344k,nr_inodes=819200,mode=755,inode64 0 0\ncgroup2 /sys/fs/cgroup cgroup2 rw,seclabel,nosuid,nodev,noexec,relatime,nsdelegate,memory_recursiveprot 0 0\npstore /sys/fs/pstore pstore rw,seclabel,nosuid,nodev,noexec,relatime 0 0\nefivarfs /sys/firmware/efi/efivars efivarfs rw,nosuid,nodev,noexec,relatime 0 0\nnone /sys/fs/bpf bpf rw,nosuid,nodev,noexec,relatime,mode=700 0 0\n/dev/nvme0n1p3 / btrfs rw,seclabel,relatime,ssd,space_cache,subvolid=257,subvol=/root 0 0\nselinuxfs /sys/fs/selinux selinuxfs rw,nosuid,noexe"..., 1024) = 1024
179976 close(3) = 0
# Create a temporary file of 4096 bytes and remove it
179976 openat(AT_FDCWD, "/tmp/ffiBpWKix", O_RDWR|O_CREAT|O_EXCL, 0600) = 3
179976 unlink("/tmp/ffiBpWKix") = 0
179976 ftruncate(3, 4096) = 0
# Create two memory mappings on this file
179976 mmap(NULL, 4096, PROT_READ|PROT_EXEC, MAP_SHARED, 3, 0) = 0x7f3fc9be1000
179976 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_SHARED, 3, 0) = 0x7f3fc9be0000
---

ffi_closure_alloc(48) returns closure=0x7f3fc9be0010 (PROT_READ|PROT_WRITE mapping) and code=0x7f3fc9be1010 (PROT_READ|PROT_EXEC mapping).

ffi_closure_free(closure) in the parent process calls dlfree(closure), which falls into the "erroraction:" label, which calls USAGE_ERROR_ACTION(fm, p) (abort() in short).

dlfree(closure) falls into the erroraction label because the "if (RTCHECK(ok_address(fm, p) && ok_cinuse(p)))" condition is false, because ok_cinuse(p) is false:

(gdb) p *(mchunkptr)((char*)mem-2*sizeof(size_t))
$12 = { prev_foot = 0, head = 4025, fd = 0x0, bk = 0x0 }
(gdb) p ((mchunkptr)((char*)mem-2*sizeof(size_t)))->head & 2
$13 = 0

In the parent process, after ffi_closure_alloc(): (((mchunkptr)((char*)closure - 2*sizeof(size_t)))->head & 2) equals 2: ok_cinuse(p) is true.

In the parent process, after the child process completes, before it calls ffi_closure_free(closure): (((mchunkptr)((char*)closure - 2*sizeof(size_t)))->head & 2) equals 0: ok_cinuse(p) is false.

The root issue is that fork() doesn't duplicate the memory page: when the child process calls ffi_closure_free(closure), it writes into the same memory mapping.

Note: On Debian 10.9 without SELinux, the reproducer doesn't crash.

After some investigation, I found that this issue is fairly simple to fix, but it needs a new libffi build that just ignores whether SELinux is enabled.
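To make the gdb output above easier to read, here is a small sketch that decodes a dlmalloc chunk "head" word such as the 4025 shown in $12. It assumes the usual dlmalloc convention where bit 0 marks "previous chunk in use" and bit 1 marks "current chunk in use" (the bit tested by "head & 2"), and that only those two low bits are flags. The names decode_head and FLAG_MASK are made up for this illustration.

```python
# Assumed dlmalloc flag layout; only the "& 2" test is confirmed by the gdb session.
PINUSE_BIT = 1   # previous chunk is in use
CINUSE_BIT = 2   # current chunk is in use (what ok_cinuse() checks)
FLAG_MASK = PINUSE_BIT | CINUSE_BIT


def decode_head(head):
    """Split a chunk head word into its size and in-use flags (illustrative helper)."""
    return {
        "size": head & ~FLAG_MASK,
        "pinuse": bool(head & PINUSE_BIT),
        "cinuse": bool(head & CINUSE_BIT),
    }


# Head value seen in the parent after the child already freed the closure.
after_child_free = decode_head(4025)
print(after_child_free)   # {'size': 4024, 'pinuse': True, 'cinuse': False}

# The same word with the in-use bit set, purely to show what ok_cinuse() expects.
# This is not the value the parent actually held before the fork.
print(decode_head(4025 | CINUSE_BIT)["cinuse"])   # True
```

Under those assumptions, the head word 4025 describes a free chunk of 4024 bytes, which is consistent with the child's dlfree() having already returned the closure's chunk to the allocator inside the shared page.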
Based on the discussion for RHEL 6 at bz#707944, it would be required to run Python in an SELinux domain with execmem enabled. There is another related bug report at bz#1249685.

Going a bit further, and also related to bz#707944, there was the RHEL 6 bz#1558164. The above relate to two problems: one was incorrect SELinux detection in RHEL 6 (never fixed), and the other was filling up /tmp and not being able to create a mapping file to work around the SELinux denial. Because of the incorrect SELinux detection, it was enough to run Python in a domain with execmem enabled.

Long story short, in RHEL 7 we have:

$ getsebool deny_execmem
deny_execmem --> off

I confess I did not fully check the history and reason of that boolean, or the reason it is disabled by default; likely some issue with libffi as well.

So, while doing some tests with this libffi patch:
"""
$ cat ~/rpmbuild/SOURCES/libffi-map.patch
diff -up libffi-3.0.13/src/closures.c.orig libffi-3.0.13/src/closures.c
--- libffi-3.0.13/src/closures.c.orig 2021-07-23 12:47:24.543323289 -0400
+++ libffi-3.0.13/src/closures.c 2021-07-23 12:48:14.317075341 -0400
@@ -506,7 +506,7 @@ dlmmap (void *start, size_t length, int
       return ptr;
     }

-  if (execfd == -1 && !is_selinux_enabled ())
+  if (execfd == -1)// && !is_selinux_enabled ())
     {
       ptr = mmap (start, length, prot | PROT_EXEC, flags, fd, offset);
"""
it just works, because it first attempts a normal MAP_PRIVATE | PROT_EXEC mapping and, only if that fails, falls back to the file-backed mapping, which triggers the problem initially reported in this bugzilla. With the above patch it just works, unless I run:

# setsebool deny_execmem=on

which forces the file-backed allocation of MAP_SHARED memory, and the problem of "fork without exec" in Python happens again, causing a crash.
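The behaviour described in the comment above can be summarised as a small decision table. The sketch below is only a model of that description, not libffi code: the function name choose_mapping and its parameters are hypothetical, and private_exec_allowed stands in for "the kernel/SELinux policy lets a MAP_PRIVATE | PROT_EXEC mapping succeed" (roughly, deny_execmem is off).

```python
def choose_mapping(selinux_enabled, private_exec_allowed, patched):
    """Model which kind of mapping dlmmap() ends up using, per the comment above.

    Hypothetical illustration; the real logic lives in libffi's src/closures.c.
    """
    # Unpatched libffi skips the private mapping whenever SELinux is detected;
    # the patch removes that check so the private mapping is always tried first.
    try_private_first = patched or not selinux_enabled
    if try_private_first and private_exec_allowed:
        return "private-exec"        # MAP_PRIVATE | PROT_EXEC, not shared across fork()
    return "file-backed-shared"      # MAP_SHARED temp file, shared across fork()


# Stock RHEL 7 libffi with SELinux enabled: goes straight to the shared mapping.
print(choose_mapping(selinux_enabled=True, private_exec_allowed=True, patched=False))
# Patched libffi with deny_execmem off: private mapping succeeds.
print(choose_mapping(selinux_enabled=True, private_exec_allowed=True, patched=True))
# Patched libffi with deny_execmem on: falls back to the shared mapping again.
print(choose_mapping(selinux_enabled=True, private_exec_allowed=False, patched=True))
```

Only the "file-backed-shared" outcome is exposed to the fork-without-exec problem discussed in this bug, which matches the observation that the patch helps until deny_execmem is turned on.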
I'm concerned about turning deny_execmem off based on the comments in install_selinux(8):

  If you want to deny user domains applications to map a memory region as both executable and writable, this is dangerous and the executable should be reported in bugzilla, you must turn on the deny_execmem boolean. Enabled by default.

Could someone comment on the implications?

It is already off by default, apparently since RHEL 7.2. Turning it on would prevent any kind of JIT from working. Searching around, the first hit I see is bz#1726682, related to Firefox crashing. The second one is bz#1393320, with issues with Ruby. Likely there would be PCRE issues as well since, AFAIK, PCRE, or some versions of it, generates JIT code for some pattern-matching logic. Likely there are several other JITs around; the most common one is Java:
https://lists.fedoraproject.org/archives/list/selinux@lists.fedoraproject.org/thread/2K7OADKIORFWWPZ63BYEY5TXHRZ2YPV3/

In selinux-policy.spec for RHEL 7 I see:

* Tue Nov 8 2011 Dan Walsh <dwalsh> 3.10.0-55.2
- Remove allow_execmem boolean and replace with deny_execmem boolean

and apparently it was made default in this commit:
"""
commit 0e96e5baafa986f56e6afaf9676ae855dd5650f2
Author: Zdenek Pytela <zpytela>
Date: Mon May 11 19:12:15 2020 +0200

    Update the patches to take into account the objects hash length change

    Update the policy-rhel-7.9-base.patch and policy-rhel-7.9-contrib.patch patch files to take into account the objects hash length has increased by one. The shortened index hash values have length dependent on number of objects in the repo so cannot be changed other way than using full hash values.
    Related: rhbz#1820298

commit 224f0564cebc2d84c3653a585e4b65dd007d85e8
Author: Zdenek Pytela <zpytela>
Date: Mon Mar 23 09:25:47 2020 +0100

    * Mon Mar 23 2020 Zdenek Pytela <zpytela> - 3.13.1-267
    - Allow chronyd_t domain to exec shell
    Resolves: rhbz#1775573
    - Allow pmie daemon to send signal pcmd daemon
    Resolves: rhbz#1770123
    - Allow auditd poweroff or switch to single mode
    Resolves: rhbz#1780332
"""
without comments about making it default; apparently it was just pulled in from some batch of updates.

Red Hat Enterprise Linux 7 is in the Maintenance Support 2 phase, and we generally only review urgent priority bug fixes. This particular issue has been present since the start of RHEL 7, because libffi does not support fork without exec using the current set of supported ffi closures in RHEL 7. We will continue to review this issue for RHEL 8. In RHEL 9 we expect to be able to use the static trampolines present in libffi 3.4 to solve this problem.

We have reviewed this issue and will not be fixing this in Red Hat Enterprise Linux 7.
Created attachment 1795885: Sample code to reproduce the problem.

Description of problem:
When SELinux is enabled, even in permissive mode, the attached code crashes. When SELinux is disabled, it runs fine. The only message I see in audit.log is:

type=ANOM_ABEND msg=audit(1624551198.965:416): auid=47927 uid=47927 gid=47927 ses=14 subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 pid=4090 comm="test.py" reason="memory violation" sig=6

I'm not seeing any AVC messages, even after running "semodule -DB". I spotted a Bugzilla report (1249685) from a few years back which mentioned python-cffi, but the system doesn't have that package installed. Adjusting the deny_execmem boolean didn't have any effect either, which I'm taking as a good thing, as the warnings about that were sufficiently dire.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Enable SELinux, in either enforcing or permissive mode.
2. Run attached code.
3. Code crashes.

Actual results:

Expected results:

Additional info:
I have talked with the Red Hat SELinux Userspace team, who have reproduced the issue on Fedora 34 with Python 3.9. They suspect the issue may be in the Python _ctypes module.