This bug has been migrated to another issue tracking site. It has been closed here and may no longer be being monitored.

If you would like to get updates for this issue, or to participate in it, you may do so at Red Hat Issue Tracker .
RHEL Engineering is moving the tracking of its product development work on RHEL 6 through RHEL 9 to Red Hat Jira (issues.redhat.com). If you're a Red Hat customer, please continue to file support cases via the Red Hat customer portal. If you're not, please head to the "RHEL project" in Red Hat Jira and file new tickets here. Individual Bugzilla bugs in the statuses "NEW", "ASSIGNED", and "POST" are being migrated throughout September 2023. Bugs of Red Hat partners with an assigned Engineering Partner Manager (EPM) are migrated in late September as per pre-agreed dates. Bugs against components "kernel", "kernel-rt", and "kpatch" are only migrated if still in "NEW" or "ASSIGNED". If you cannot log in to RH Jira, please consult article #7032570. That failing, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user management inquiry. The email creates a ServiceNow ticket with Red Hat. Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and set with "MigratedToJIRA" in "Keywords". The link to the successor Jira issue will be found under "Links", have a little "two-footprint" icon next to it, and direct you to the "RHEL project" in Red Hat Jira (issue links are of type "https://issues.redhat.com/browse/RHEL-XXXX", where "X" is a digit). This same link will be available in a blue banner at the top of the page informing you that that bug has been migrated.
Bug 1871869 - setroubleshootd crashes when ffi closure cannot execute
Summary: setroubleshootd crashes when ffi closure cannot execute
Keywords:
Status: CLOSED MIGRATED
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: pygobject3
Version: 8.2
Hardware: All
OS: Linux
high
high
Target Milestone: rc
: 8.0
Assignee: Kalev Lember
QA Contact: Desktop QE
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2020-08-24 14:04 UTC by Renaud Métrich
Modified: 2023-09-15 16:53 UTC (History)
11 users (show)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-09-15 16:53:55 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker   RHEL-4084 0 None Migrated None 2023-09-15 16:53:52 UTC
Red Hat Knowledge Base (Solution) 5478591 0 None None None 2020-10-09 10:39:24 UTC

Internal Links: 1875340

Description Renaud Métrich 2020-08-24 14:04:21 UTC
Description of problem:

On customer's systems running latest packages for RHEL8.2 experience Python crashing 100% of the time when executing setroubleshootd.
Core dump analysis shows that it crashes due to GLib callback executing dereferencing a NULL pointer:
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
Core was generated by `/usr/libexec/platform-python -Es /usr/sbin/setroubleshootd -f '.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  g_callable_info_free_closure (callable_info=0x55fc1fcf02d0, closure=0x0)
    at girepository/girffi.c:426
426	  g_free (wrapper->ffi_closure.cif->arg_types);
[Current thread is 1 (Thread 0x7f584fde6740 (LWP 322982))]
(gdb) bt
#0  0x00007f584bb85814 in g_callable_info_free_closure (callable_info=0x55fc1fcf02d0, closure=0x0) at girepository/girffi.c:426
#1  0x00007f584bdd6b94 in _pygi_invoke_closure_free (data=0x55fc1fcefa80) at pygi-closure.c:638
#2  0x00007f584b652013 in g_source_callback_unref (cb_data=0x55fc25460090) at gmain.c:1546
#3  0x00007f584b652013 in g_source_callback_unref (cb_data=0x55fc25460090) at gmain.c:1539
#4  0x00007f584b653824 in g_source_destroy_internal (source=0x55fc22a85170, context=0x55fc2299d6c0, have_lock=1) at gmain.c:1236
#5  0x00007f584b655728 in g_main_dispatch (context=0x55fc2299d6c0) at gmain.c:3200
#6  0x00007f584b655728 in g_main_context_dispatch (context=context@entry=0x55fc2299d6c0) at gmain.c:3829
#7  0x00007f584b655a48 in g_main_context_iterate (context=0x55fc2299d6c0, block=block@entry=1, dispatch=dispatch@entry=1, self=<optimized out>) at gmain.c:3902
#8  0x00007f584b655d72 in g_main_loop_run (loop=0x55fc2545f520) at gmain.c:4098
#9  0x00007f584b40504e in ffi_call_unix64 () at ../src/x86/unix64.S:76
#10 0x00007f584b404a0f in ffi_call (cif=cif@entry=0x55fc2546e458, fn=<optimized out>, rvalue=<optimized out>, rvalue@entry=0x7fff451e4f38, avalue=<optimized out>) at ../src/x86/ffi64.c:525
#11 0x00007f584bdd9425 in pygi_invoke_c_callable (function_cache=0x55fc2546e3b0, state=<optimized out>, py_args=<optimized out>, py_kwargs=<optimized out>) at pygi-invoke.c:691
#12 0x00007f584bddaf3c in pygi_function_cache_invoke (function_cache=<optimized out>, py_args=py_args@entry=(<MainLoop at remote 0x7f583e11aa08>,), py_kwargs=py_kwargs@entry=0x0) at pygi-cache.c:863
#13 0x00007f584bdd9b4c in pygi_callable_info_invoke
    (info=<optimized out>, py_args=py_args@entry=(<MainLoop at remote 0x7f583e11aa08>,), kwargs=kwargs@entry=0x0, cache=<optimized out>, user_data=user_data@entry=0x0) at pygi-invoke.c:734
#14 0x00007f584bdd9b83 in _wrap_g_callable_info_invoke (self=<optimized out>, py_args=py_args@entry=(<MainLoop at remote 0x7f583e11aa08>,), kwargs=kwargs@entry=0x0) at pygi-invoke.c:771
#15 0x00007f584bdce973 in _callable_info_call (self=0x7f583e120928, args=(), kwargs=0x0) at pygi-info.c:561
#16 0x00007f584f2b9bfc in _PyObject_FastCallDict (func=<gi.FunctionInfo at remote 0x7f583e120928>, args=0x55fc2547a1b8, nargs=<optimized out>, kwargs=0x0)
    at /usr/src/debug/python3-3.6.8-23.el8.x86_64/Objects/abstract.c:2313
[...]
#34 0x00007f584f2942ee in run_file (p_cf=0x7fff451e59fc, filename=0x55fc1fadce30 L"/usr/sbin/setroubleshootd", fp=0x55fc1fadc360) at /usr/src/debug/python3-3.6.8-23.el8.x86_64/Modules/main.c:344
#35 0x00007f584f2942ee in Py_Main (argc=<optimized out>, argv=0x55fc1fadc2a0) at /usr/src/debug/python3-3.6.8-23.el8.x86_64/Modules/main.c:814
#36 0x000055fc1e0d0b96 in main (argc=5, argv=<optimized out>) at /usr/src/debug/python3-3.6.8-23.el8.x86_64/Programs/python.c:102

(gdb) p wrapper
$1 = (GIClosureWrapper *) 0x0

(gdb) info symbol g_callable_info_free_closure
g_callable_info_free_closure in section .text of /lib64/libgirepository-1.0.so.1

(gdb) 
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------

We have wrapper being NULL, so dereferencing fails ...
The "wrapper" comes from frame 1 ("invoke_closure->closure") which is NULL here (this isn't expected at all).



Stracing client tool "sealert" and server tool "setroubleshootd" which both execute GLib internally, I can see the following checking when "GLib.py" and "GLib.cpython-36.pyc" are used):

sealert:
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
1140907 08:23:05.819605 stat("/usr/lib64/python3.6/site-packages/gi/overrides/GLib.py", {st_dev=makedev(253, 2), st_ino=17246829, st_mode=S_IFREG|0644, st_nlink=1, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=64, st_size=30093, st_atime=1598248902 /* 2020-08-24T01:01:42.361343008-0500 */, st_atime_nsec=361343008, st_mtime=1534094777 /* 2018-08-12T12:26:17-0500 */, st_mtime_nsec=0, st_ctime=1581368844 /* 2020-02-10T15:07:24.765005379-0600 */, st_ctime_nsec=765005379}) = 0 <0.000012>
1140907 08:23:05.819792 stat("/usr/lib64/python3.6/site-packages/gi/overrides/GLib.py", {st_dev=makedev(253, 2), st_ino=17246829, st_mode=S_IFREG|0644, st_nlink=1, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=64, st_size=30093, st_atime=1598248902 /* 2020-08-24T01:01:42.361343008-0500 */, st_atime_nsec=361343008, st_mtime=1534094777 /* 2018-08-12T12:26:17-0500 */, st_mtime_nsec=0, st_ctime=1581368844 /* 2020-02-10T15:07:24.765005379-0600 */, st_ctime_nsec=765005379}) = 0 <0.000011>
1140907 08:23:05.819906 stat("/usr/lib64/python3.6/site-packages/gi/overrides/GLib.py", {st_dev=makedev(253, 2), st_ino=17246829, st_mode=S_IFREG|0644, st_nlink=1, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=64, st_size=30093, st_atime=1598248902 /* 2020-08-24T01:01:42.361343008-0500 */, st_atime_nsec=361343008, st_mtime=1534094777 /* 2018-08-12T12:26:17-0500 */, st_mtime_nsec=0, st_ctime=1581368844 /* 2020-02-10T15:07:24.765005379-0600 */, st_ctime_nsec=765005379}) = 0 <0.000011>

1140907 08:23:05.819975 openat(AT_FDCWD, "/usr/lib64/python3.6/site-packages/gi/overrides/__pycache__/GLib.cpython-36.pyc", O_RDONLY|O_CLOEXEC) = 3</usr/lib64/python3.6/site-packages/gi/overrides/__pycache__/GLib.cpython-36.pyc> <0.006222>

1140907 08:23:05.826259 fstat(3</usr/lib64/python3.6/site-packages/gi/overrides/__pycache__/GLib.cpython-36.pyc>, {st_dev=makedev(253, 2), st_ino=25727846, st_mode=S_IFREG|0644, st_nlink=1, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=48, st_size=23091, st_atime=1598248866 /* 2020-08-24T01:01:06.915509494-0500 */, st_atime_nsec=915509494, st_mtime=1534094795 /* 2018-08-12T12:26:35-0500 */, st_mtime_nsec=0, st_ctime=1581368844 /* 2020-02-10T15:07:24.773005380-0600 */, st_ctime_nsec=773005380}) = 0 <0.000009>

1140907 08:23:05.826343 lseek(3</usr/lib64/python3.6/site-packages/gi/overrides/__pycache__/GLib.cpython-36.pyc>, 0, SEEK_CUR) = 0 <0.000009>
1140907 08:23:05.826393 fstat(3</usr/lib64/python3.6/site-packages/gi/overrides/__pycache__/GLib.cpython-36.pyc>, {st_dev=makedev(253, 2), st_ino=25727846, st_mode=S_IFREG|0644, st_nlink=1, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=48, st_size=23091, st_atime=1598248866 /* 2020-08-24T01:01:06.915509494-0500 */, st_atime_nsec=915509494, st_mtime=1534094795 /* 2018-08-12T12:26:35-0500 */, st_mtime_nsec=0, st_ctime=1581368844 /* 2020-02-10T15:07:24.773005380-0600 */, st_ctime_nsec=773005380}) = 0 <0.000009>
1140907 08:23:05.826459 read(3</usr/lib64/python3.6/site-packages/gi/overrides/__pycache__/GLib.cpython-36.pyc>, ..., 23092) = 23091 <0.000021>
1140907 08:23:05.826567 read(3</usr/lib64/python3.6/site-packages/gi/overrides/__pycache__/GLib.cpython-36.pyc>, "", 1) = 0 <0.000009>
1140907 08:23:05.826621 close(3</usr/lib64/python3.6/site-packages/gi/overrides/__pycache__/GLib.cpython-36.pyc>) = 0 <0.000017>
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------

Here we see that GLib.py is stat'ed to see if it's older than compiled GLib.cpython-36.pyc code. Since it's older, it uses the compiled code.


setroubleshootd:
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
1140911 08:23:07.385134 stat("/usr/lib64/python3.6/site-packages/gi/overrides/GLib.py", {st_dev=makedev(253, 2), st_ino=17246829, st_mode=S_IFREG|0644, st_nlink=1, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=64, st_size=30093, st_atime=1598248902 /* 2020-08-24T01:01:42.361343008-0500 */, st_atime_nsec=361343008, st_mtime=1534094777 /* 2018-08-12T12:26:17-0500 */, st_mtime_nsec=0, st_ctime=1581368844 /* 2020-02-10T15:07:24.765005379-0600 */, st_ctime_nsec=765005379}) = 0 <0.000025>
1140911 08:23:07.385362 stat("/usr/lib64/python3.6/site-packages/gi/overrides/GLib.py", {st_dev=makedev(253, 2), st_ino=17246829, st_mode=S_IFREG|0644, st_nlink=1, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=64, st_size=30093, st_atime=1598248902 /* 2020-08-24T01:01:42.361343008-0500 */, st_atime_nsec=361343008, st_mtime=1534094777 /* 2018-08-12T12:26:17-0500 */, st_mtime_nsec=0, st_ctime=1581368844 /* 2020-02-10T15:07:24.765005379-0600 */, st_ctime_nsec=765005379}) = 0 <0.000024>
1140911 08:23:07.385496 stat("/usr/lib64/python3.6/site-packages/gi/overrides/GLib.py", {st_dev=makedev(253, 2), st_ino=17246829, st_mode=S_IFREG|0644, st_nlink=1, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=64, st_size=30093, st_atime=1598248902 /* 2020-08-24T01:01:42.361343008-0500 */, st_atime_nsec=361343008, st_mtime=1534094777 /* 2018-08-12T12:26:17-0500 */, st_mtime_nsec=0, st_ctime=1581368844 /* 2020-02-10T15:07:24.765005379-0600 */, st_ctime_nsec=765005379}) = 0 <0.000024>

1140911 08:23:07.385584 openat(AT_FDCWD, "/usr/lib64/python3.6/site-packages/gi/overrides/__pycache__/GLib.cpython-36.pyc", O_RDONLY|O_CLOEXEC) = 3</usr/lib64/python3.6/site-packages/gi/overrides/__pycache__/GLib.cpython-36.pyc> <0.000038>

1140911 08:23:07.385671 fstat(3</usr/lib64/python3.6/site-packages/gi/overrides/__pycache__/GLib.cpython-36.pyc>, {st_dev=makedev(253, 2), st_ino=25727846, st_mode=S_IFREG|0644, st_nlink=1, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=48, st_size=23091, st_atime=1598248866 /* 2020-08-24T01:01:06.915509494-0500 */, st_atime_nsec=915509494, st_mtime=1534094795 /* 2018-08-12T12:26:35-0500 */, st_mtime_nsec=0, st_ctime=1581368844 /* 2020-02-10T15:07:24.773005380-0600 */, st_ctime_nsec=773005380}) = 0 <0.000020>

1140911 08:23:07.385752 lseek(3</usr/lib64/python3.6/site-packages/gi/overrides/__pycache__/GLib.cpython-36.pyc>, 0, SEEK_CUR) = 0 <0.000015>
1140911 08:23:07.385810 fstat(3</usr/lib64/python3.6/site-packages/gi/overrides/__pycache__/GLib.cpython-36.pyc>, {st_dev=makedev(253, 2), st_ino=25727846, st_mode=S_IFREG|0644, st_nlink=1, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=48, st_size=23091, st_atime=1598248866 /* 2020-08-24T01:01:06.915509494-0500 */, st_atime_nsec=915509494, st_mtime=1534094795 /* 2018-08-12T12:26:35-0500 */, st_mtime_nsec=0, st_ctime=1581368844 /* 2020-02-10T15:07:24.773005380-0600 */, st_ctime_nsec=773005380}) = 0 <0.000020>
1140911 08:23:07.385891 read(3</usr/lib64/python3.6/site-packages/gi/overrides/__pycache__/GLib.cpython-36.pyc>, ..., 23092) = 23091 <0.000036>
1140911 08:23:07.386007 read(3</usr/lib64/python3.6/site-packages/gi/overrides/__pycache__/GLib.cpython-36.pyc>, "", 1) = 0 <0.000020>
1140911 08:23:07.386080 close(3</usr/lib64/python3.6/site-packages/gi/overrides/__pycache__/GLib.cpython-36.pyc>) = 0 <0.000028>

STAT ON GLib.py AGAIN:
1140911 08:23:09.232603 stat("/usr/lib64/python3.6/site-packages/gi/overrides/GLib.py", {st_dev=makedev(253, 2), st_ino=17246829, st_mode=S_IFREG|0644, st_nlink=1, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=64, st_size=30093, st_atime=1598248902 /* 2020-08-24T01:01:42.361343008-0500 */, st_atime_nsec=361343008, st_mtime=1534094777 /* 2018-08-12T12:26:17-0500 */, st_mtime_nsec=0, st_ctime=1581368844 /* 2020-02-10T15:07:24.765005379-0600 */, st_ctime_nsec=765005379}) = 0 <0.000014>

1140911 08:23:09.232681 openat(AT_FDCWD, "/usr/lib64/python3.6/site-packages/gi/overrides/GLib.py", O_RDONLY|O_CLOEXEC) = 9</usr/lib64/python3.6/site-packages/gi/overrides/GLib.py> <0.004274>
1140911 08:23:09.237015 fstat(9</usr/lib64/python3.6/site-packages/gi/overrides/GLib.py>, {st_dev=makedev(253, 2), st_ino=17246829, st_mode=S_IFREG|0644, st_nlink=1, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=64, st_size=30093, st_atime=1598248902 /* 2020-08-24T01:01:42.361343008-0500 */, st_atime_nsec=361343008, st_mtime=1534094777 /* 2018-08-12T12:26:17-0500 */, st_mtime_nsec=0, st_ctime=1581368844 /* 2020-02-10T15:07:24.765005379-0600 */, st_ctime_nsec=765005379}) = 0 <0.000008>
1140911 08:23:09.237091 ioctl(9</usr/lib64/python3.6/site-packages/gi/overrides/GLib.py>, TCGETS, 0x7ffd6ce70db0) = -1 ENOTTY (Inappropriate ioctl for device) <0.000008>
1140911 08:23:09.237143 lseek(9</usr/lib64/python3.6/site-packages/gi/overrides/GLib.py>, 0, SEEK_CUR) = 0 <0.000007>
1140911 08:23:09.237188 read(9</usr/lib64/python3.6/site-packages/gi/overrides/GLib.py>, ..., 4096) = 4096 <0.000008>
...
1140911 08:23:09.237751 read(9</usr/lib64/python3.6/site-packages/gi/overrides/GLib.py>, "", 8192) = 0 <0.000007>
1140911 08:23:09.237793 close(9</usr/lib64/python3.6/site-packages/gi/overrides/GLib.py>) = 0 <0.000014>
1140911 08:23:09.237856 write(4<anon_inode:[eventfd]>, "\1\0\0\0\0\0\0\0", 8) = 8 <0.000009>
1140911 08:23:09.237895 --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0x18} ---

-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------

Here we see that setroubleshootd also stats GLib.py and starts using compiled GLib.cpython-36.pyc code.
However, later we see GLib.py is read for some unknown reason even though the last "stat()" at timestamp 08:23:09.232603 (see "STAT ON GLib.py AGAIN" token above) is identical to all other stats, hence there is no modification of GLib.py that could explain Python "decides" to read it now and stop using GLib.cpython-36.pyc code.


My assumption is that something in Python implementation makes it believe the file changed, sources it again, which makes the pending objects in memory still referencing compiled code become dangling and causing the SEGV when the GLib callback is called.


Version-Release number of selected component (if applicable):

platform-python-3.6.8-23.el8.x86_64


How reproducible:

ALWAYS on customer site, on all of his RHEL8 systems, I'm not managing to reproduce at all.


Steps to Reproduce:
1. Execute "sealert -l \*"

Actual results:

setroubleshootd crashes

Comment 1 Petr Viktorin (pviktori) 2020-08-26 15:13:48 UTC
> We have wrapper being NULL, so dereferencing fails ...
> The "wrapper" comes from frame 1 ("invoke_closure->closure") which is NULL here (this isn't expected at all).

The backtrace looks like it's deep in GLib; I'm changing the component so the glib maintainers can take a look.
Let me know if I can help on the Python side; feel free to assign it back.

> Here we see that setroubleshootd also stats GLib.py and starts using compiled GLib.cpython-36.pyc code.
> However, later we see GLib.py is read for some unknown reason even though the last "stat()" at timestamp 08:23:09.232603 (see "STAT ON GLib.py AGAIN" token above) is identical to all other stats, hence there is no modification of GLib.py that could explain Python "decides" to read it now and stop using GLib.cpython-36.pyc code.

One reason Python reads the .py file is to display an error message: the backtrace is displayed with lines of the original source (.py), even if the bytecode cache (.pyc) is used for execution.
Reading the file doesn't necessarily mean that the already imported module is replaced by newly compiled code.

Comment 4 Renaud Métrich 2020-09-03 09:34:27 UTC
I can now reproduce at will.
All you need is to set up "noexec" on /tmp, /var/tmp and /dev/shm mount points:
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
tmpfs	/tmp		tmpfs   defaults,nodev,nosuid,noexec 0 0
tmpfs	/var/tmp	tmpfs   defaults,nodev,nosuid,noexec 0 0
tmpfs	/dev/shm	tmpfs	nodev,nosuid,noexec	0 0
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------

The strace of setroubleshootd shows it tries to create a temporary file under these locations to execute code, which fails due to having "noexec":
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
$ grep EPERM /tmp/setroubleshootd.strace 
5950  11:08:44.098864 mmap(NULL, 4096, PROT_READ|PROT_EXEC, MAP_SHARED, 6</tmp/#36349 (deleted)>, 0) = -1 EPERM (Operation not permitted) <0.000004>
5950  11:08:44.098928 mmap(NULL, 4096, PROT_READ|PROT_EXEC, MAP_SHARED, 6</var/tmp/#36350 (deleted)>, 0) = -1 EPERM (Operation not permitted) <0.000003>
5950  11:08:44.099070 mmap(NULL, 4096, PROT_READ|PROT_EXEC, MAP_SHARED, 6</dev/shm/#36351 (deleted)>, 0) = -1 EPERM (Operation not permitted) <0.000003>
5950  11:08:44.099462 mmap(NULL, 4096, PROT_READ|PROT_EXEC, MAP_SHARED, 7</dev/mqueue/ffiqJVA2R (deleted)>, 0) = -1 EPERM (Operation not permitted) <0.000004>
5950  11:08:44.099961 mmap(NULL, 4096, PROT_READ|PROT_EXEC, MAP_SHARED, 7</dev/mqueue/ffiWsuidr (deleted)>, 0) = -1 EPERM (Operation not permitted) <0.000003>
5950  11:08:44.101134 mmap(NULL, 4096, PROT_READ|PROT_EXEC, MAP_SHARED, 9</tmp/#36368 (deleted)>, 0) = -1 EPERM (Operation not permitted) <0.000003>
5950  11:08:44.101197 mmap(NULL, 4096, PROT_READ|PROT_EXEC, MAP_SHARED, 9</var/tmp/#36370 (deleted)>, 0) = -1 EPERM (Operation not permitted) <0.000003>
5950  11:08:44.101255 mmap(NULL, 4096, PROT_READ|PROT_EXEC, MAP_SHARED, 9</dev/shm/#36371 (deleted)>, 0) = -1 EPERM (Operation not permitted) <0.000002>
5950  11:08:44.101605 mmap(NULL, 4096, PROT_READ|PROT_EXEC, MAP_SHARED, 10</dev/mqueue/ffiaUxeo0 (deleted)>, 0) = -1 EPERM (Operation not permitted) <0.000003>
5950  11:08:44.101992 mmap(NULL, 4096, PROT_READ|PROT_EXEC, MAP_SHARED, 10</dev/mqueue/ffiqa6dzz (deleted)>, 0) = -1 EPERM (Operation not permitted) <0.000003>
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------

This happens in the middle of parsing /proc/<pid>/mounts just after binding+listening on /var/run/setroubleshoot/setroubleshoot_server:
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
5950  11:08:44.098372 bind(5<UNIX:[36342]>, {sa_family=AF_UNIX, sun_path="/var/run/setroubleshoot/setroubleshoot_server"}, 47) = 0 <0.000025>
 :
5950  11:08:44.098442 listen(5<UNIX:[36342,"/var/run/setroubleshoot/setroubleshoot_server"]>, 5) = 0 <0.000003>
 :
5950  11:08:44.098683 openat(AT_FDCWD, "/proc/mounts", O_RDONLY) = 6</proc/5950/mounts> <0.000010>
5950  11:08:44.098707 fstat(6</proc/5950/mounts>, {st_dev=makedev(0, 4), st_ino=36348, st_mode=S_IFREG|0444, st_nlink=1, st_uid=986, st_gid=982, st_blksize=1024, st_blocks=0, st_size=0, st_atime=1599124124 /* 2020-09-03T11:08:44.098161186+0200 */, st_atime_nsec=98161186, st_mtime=1599124124 /* 2020-09-03T11:08:44.098161186+0200 */, st_mtime_nsec=98161186, st_ctime=1599124124 /* 2020-09-03T11:08:44.098161186+0200 */, st_ctime_nsec=98161186}) = 0 <0.000002>
 :
5950  11:08:44.098810 close(6</proc/5950/mounts>) = 0 <0.000003>
5950  11:08:44.098825 openat(AT_FDCWD, "/tmp", O_RDWR|O_EXCL|O_CLOEXEC|O_TMPFILE, 0700) = 6</tmp/#36349 (deleted)> <0.000012>
5950  11:08:44.098851 ftruncate(6</tmp/#36349 (deleted)>, 4096) = 0 <0.000003>
5950  11:08:44.098864 mmap(NULL, 4096, PROT_READ|PROT_EXEC, MAP_SHARED, 6</tmp/#36349 (deleted)>, 0) = -1 EPERM (Operation not permitted) <0.000004>
 :
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------

Then /etc/mtab is read and temporary file is created again, which ends up having the "could not allocate closure" message:
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
 :
5950  11:08:44.100143 sendmsg(6<UNIX:[36354]>, {msg_name={sa_family=AF_UNIX, sun_path="/run/systemd/journal/socket"}, msg_namelen=29, msg_iov=[{iov_base="GLIB_OLD_LOG_API", iov_len=16}, {iov_base="=", iov_len=1}, {iov_base="1", iov_len=1}, {iov_base="\n", iov_len=1}, {iov_base="MESSAGE", iov_len=7}, {iov_base="\n", iov_len=1}, {iov_base="\33\0\0\0\0\0\0\0", iov_len=8}, {iov_base="could not allocate closure\n", iov_len=27}, {iov_base="\n", iov_len=1}, {iov_base="PRIORITY", iov_len=8}, {iov_base="=", iov_len=1}, {iov_base="4", iov_len=1}, {iov_base="\n", iov_len=1}], msg_iovlen=13, msg_controllen=0, msg_flags=0}, MSG_NOSIGNAL) = 74 <0.000360>
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------

# strings /usr/lib64/libgirepository-1.0.so.1.0.0 | grep "could not alloca"
could not allocate closure

Comment 6 Renaud Métrich 2020-09-03 10:23:35 UTC
Sorry, moving this back to gygobject3 for the following reason:

1. We have g_callable_info_prepare_closure (gobject-introspection package) just call ffi_closure_alloc()

-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
364 ffi_closure *
365 g_callable_info_prepare_closure (GICallableInfo       *callable_info,
366                                  ffi_cif              *cif,
367                                  GIFFIClosureCallback  callback,
368                                  gpointer              user_data)
369 {
 :
380   closure = ffi_closure_alloc (sizeof (GIClosureWrapper), &exec_ptr);
381   if (!closure)
382     {
383       g_warning ("could not allocate closure\n");
384       return NULL;
385     }
 :
407   /* Return exec_ptr, which points to the same underlying memory as
408    * closure, but via an executable-non-writable mapping.
409    */
410   return exec_ptr;
411 }
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------

We return NULL because ffi_closure_alloc() failed.

2. Caller of g_callable_info_prepare_closure() is _pygi_make_native_closure()

-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
653 PyGICClosure*
654 _pygi_make_native_closure (GICallableInfo* info,
655                            GIScopeType scope,
656                            PyObject *py_function,
657                            gpointer py_user_data)
658 {
 :
675     fficlosure =
676         g_callable_info_prepare_closure (info, &closure->cif, _pygi_closure_handle,
677                                          closure);
678     closure->closure = fficlosure;
 :
684     return closure;
685 }
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------

On line 678, we see the assignment to "closure" field without taking care of the result (fficlosure == NULL here).

I would expect the function to "return NULL" on "fficlosure == NULL" condition (after freeing "closure" of course).

Otherwise a condition should be evaluated to not call blindly g_callable_info_free_closure() in _pygi_invoke_closure_free() on line 639:
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
634 void _pygi_invoke_closure_free (gpointer data)
635 {
636     PyGICClosure* invoke_closure = (PyGICClosure *) data;
637 
638     g_callable_info_free_closure (invoke_closure->info,
639                                   invoke_closure->closure);
 :
650 }
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------

Comment 10 Matthias Clasen 2021-04-16 13:43:28 UTC
This has to be some kind of permission issue. Any selinux denials in the logs?

Here is the relevant code from ffi_closure_alloc:


  /* Primary mapping is RW, but request permission to switch to PROT_EXEC later. */
  prot = PROT_READ | PROT_WRITE | PROT_MPROTECT(PROT_EXEC);
  dataseg = mmap(NULL, rounded_size, prot, MAP_ANON | MAP_PRIVATE, -1, 0);
  if (dataseg == MAP_FAILED)
    return NULL;

  /* Create secondary mapping and switch it to RX. */
  codeseg = mremap(dataseg, rounded_size, NULL, rounded_size, MAP_REMAPDUP);
  if (codeseg == MAP_FAILED) {
    munmap(dataseg, rounded_size);
    return NULL;
  }
  if (mprotect(codeseg, rounded_size, PROT_READ | PROT_EXEC) == -1) {
    munmap(codeseg, rounded_size);
    munmap(dataseg, rounded_size);
    return NULL;
  }


One of htese calls must fail.

Comment 11 Renaud Métrich 2021-05-05 14:48:09 UTC
There are no AVC denials, it's due to having "nosuid,noexec" on /tmp, /var/tmp and /dev/shm.

We have numerous customers having this kind of setup, this is apparently mandatory to meet STIG requirements.
See C#4 for the reproducer and C#6 for proposal.

Comment 12 juspence 2022-01-29 21:35:39 UTC
I'm also hitting this bug on my Thinkpad laptop running RHEL 8.5 CSB. It's very easy to reproduce, just add the three lines below to /etc/fstab:

tmpfs                   /dev/shm                tmpfs   defaults,nodev,noexec,nosuid        0 0
tmpfs                   /tmp                tmpfs   defaults,nodev,noexec,nosuid        0 0
/var/tmp                   /var/tmp         none          bind,nodev,noexec,nosuid        0 0

Then remount these paths with the new options:

$ for path in '/dev/shm' '/tmp' '/var/tmp'; do sudo mount -o remount $path; done

Then run strace on setroubleshootd, and you will see the call to mmap() that fails:

$ sudo strace --decode-fds=path setroubleshootd --nofork 2>&1 | egrep '/tmp|/shm'
openat(AT_FDCWD, "/tmp", O_RDWR|O_EXCL|O_CLOEXEC|O_TMPFILE, 0700) = 6</tmp/#90450 (deleted)>
ftruncate(6</tmp/#90450 (deleted)>, 4096) = 0
mmap(NULL, 4096, PROT_READ|PROT_EXEC, MAP_SHARED, 6</tmp/#90450 (deleted)>, 0) = -1 EPERM (Operation not permitted)
close(6</tmp/#90450 (deleted)>)         = 0
openat(AT_FDCWD, "/var/tmp", O_RDWR|O_EXCL|O_CLOEXEC|O_TMPFILE, 0700) = 6</var/tmp/#8417060 (deleted)>
ftruncate(6</var/tmp/#8417060 (deleted)>, 4096) = 0
mmap(NULL, 4096, PROT_READ|PROT_EXEC, MAP_SHARED, 6</var/tmp/#8417060 (deleted)>, 0) = -1 EPERM (Operation not permitted)
close(6</var/tmp/#8417060 (deleted)>)   = 0
openat(AT_FDCWD, "/dev/shm", O_RDWR|O_EXCL|O_CLOEXEC|O_TMPFILE, 0700) = 6</dev/shm/#93200 (deleted)>
ftruncate(6</dev/shm/#93200 (deleted)>, 4096) = 0
mmap(NULL, 4096, PROT_READ|PROT_EXEC, MAP_SHARED, 6</dev/shm/#93200 (deleted)>, 0) = -1 EPERM (Operation not permitted)
close(6</dev/shm/#93200 (deleted)>)     = 0

These mmap calls do not fail when /dev/shm, /tmp, and /var/tmp are mounted with "exec" instead of "noexec". Comment #4 also has more details. Is there any other info I can provide to help move this ticket forward? Or please let me know if I can help test a fix.

To explain the use case for nounting these paths as "noexec", comment #11 is correct that it's compliance-related. The "SCAP Workbench" program is used to scan a system for known misconfigurations and other issues, based on security hardening content / checklists provided by the "scap-security-guide" package. Many of the available configuration profiles, including "DISA STIG with GUI for Red Hat Enterprise Linux 8", require world-writable folders to be mounted with the "noexec" option.

The DISA profile is the one that US government agencies would likely use (and which I followed, triggering this bug). But there are many other profiles which probably have the same rule, e.g. NIST (other US government), HIPAA (US healthcare industry), PCI-DSS (US financial / retail / payment card industry), ANSSI (French government), and ACSC (Australian government). So this bug is probably affecting a lot of systems which have certain security requirements.

Comment 14 Martin Pitt 2022-06-15 11:26:30 UTC
TBH I have no idea about what all that does. My last pygi commit was 8 years ago, and I didn't work on these core FFI bits.

At first sight, this is too superficial, though -- you can of course try and ignore the ffi_closure_alloc() failure, and see what the tests say (assuming they fail on a system with noexec /tmp/). If they succeed afterwards, great, but that would surprise me -- these closures get created for a reason.

pygobject should of course be more defensive against that, and add a g_return_if_fail() if the ffi allocation fails. But it looks like this needs to be fixed in libffi, so that ffi_closure_alloc() works on such a system?

Comment 15 Tomas Popela 2022-06-15 11:35:55 UTC
(In reply to Martin Pitt from comment #14)
> TBH I have no idea about what all that does. My last pygi commit was 8 years
> ago, and I didn't work on these core FFI bits.

Ah, but thank you for your opinion!

> At first sight, this is too superficial, though -- you can of course try and
> ignore the ffi_closure_alloc() failure, and see what the tests say (assuming
> they fail on a system with noexec /tmp/). If they succeed afterwards, great,
> but that would surprise me -- these closures get created for a reason.
> 
> pygobject should of course be more defensive against that, and add a
> g_return_if_fail() if the ffi allocation fails. But it looks like this needs
> to be fixed in libffi, so that ffi_closure_alloc() works on such a system?

Any opinion dj or fweimer?

Comment 16 Renaud Métrich 2022-06-15 11:36:53 UTC
Hello,

Please note that the fact that libffi cannot execute when all temporary file systems are marked with "nodev" has been fixed through BZ #1875340 (RHEL 8.6.0), so now it's usually expected to run fine.
Still it would be nice to harden the code to not crash on ffi_closure_alloc() failure, who knows, there may be some scenario where this may happen.
We can hence downgrade the priority, but please try fixing this anyway.

Comment 17 Florian Weimer 2022-06-15 12:03:51 UTC
(In reply to Tomas Popela from comment #15)
> > pygobject should of course be more defensive against that, and add a
> > g_return_if_fail() if the ffi allocation fails. But it looks like this needs
> > to be fixed in libffi, so that ffi_closure_alloc() works on such a system?
> 
> Any opinion dj or fweimer?

libffi upstream has fixed this by offering a build options for static closures which do not depend at all on run-time code generation anymore. This work is not really backportable. We could maybe fix this with a private libffi version just for use with girepository in Red Hat Enterprise Linux 8.

Comment 18 RHEL Program Management 2023-09-15 16:51:48 UTC
Issue migration from Bugzilla to Jira is in process at this time. This will be the last message in Jira copied from the Bugzilla bug.

Comment 19 RHEL Program Management 2023-09-15 16:53:55 UTC
This BZ has been automatically migrated to the issues.redhat.com Red Hat Issue Tracker. All future work related to this report will be managed there.

Due to differences in account names between systems, some fields were not replicated.  Be sure to add yourself to Jira issue's "Watchers" field to continue receiving updates and add others to the "Need Info From" field to continue requesting information.

To find the migrated issue, look in the "Links" section for a direct link to the new issue location. The issue key will have an icon of 2 footprints next to it, and begin with "RHEL-" followed by an integer.  You can also find this issue by visiting https://issues.redhat.com/issues/?jql= and searching the "Bugzilla Bug" field for this BZ's number, e.g. a search like:

"Bugzilla Bug" = 1234567

In the event you have trouble locating or viewing this issue, you can file an issue by sending mail to rh-issues. You can also visit https://access.redhat.com/articles/7032570 for general account information.


Note You need to log in before you can comment on or make changes to this bug.