Bug 1873975

Summary: libacl users segfault when built with LTO
Product: [Fedora] Fedora Reporter: Dan Horák <dan>
Component: acl Assignee: Kamil Dudka <kdudka>
Status: CLOSED ERRATA QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: rawhide CC: hannsj_uhl, jmoskovc, kdudka, law, mruprich, steved, svashisht
Target Milestone: ---   
Target Release: ---   
Hardware: s390x   
OS: Unspecified   
Whiteboard:
Fixed In Version: acl-2.2.53-9.fc34 acl-2.2.53-9.fc33 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2020-09-25 16:40:37 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 467765    

Description Dan Horák 2020-08-31 08:50:24 UTC
I have noticed segfaults in various commands since the latest acl-2.2.53-8.fc33 (from the F-33 mass rebuild) was installed.

Two examples captured by coredumpctl during a system update to the latest rawhide:
- /usr/bin/systemd-tmpfiles --create
- setfacl -Rnm g:wheel:rx,d:g:wheel:rx,g:adm:rx,d:g:adm:rx /var/log/journal/

Both commands are related to upgrading systemd and are probably run from some scriptlet.

I suspected LTO, which is enabled for the -8 build, and indeed: when acl is rebuilt without LTO, there are no such crashes.

Another issue is that the acl build probably uses libacl from the system rather than the newly built one, so it is not possible to rebuild the package due to test failures in the %check section. My procedure on a local system was therefore to downgrade to -6 (pre-mass-rebuild, without LTO), do a new build with LTO disabled, and update to the newly built rpms.

And as Jeff (in CC) has already shown for other packages, a runtime failure with LTO enabled can be caused by a well-hidden bug in the package itself (undefined behaviour or similar).

An example of the segfault:

[sharkcz@devel10 acl]$ coredumpctl info 1136432
           PID: 1136432 (setfacl)
           UID: 0 (root)
           GID: 0 (root)
        Signal: 11 (SEGV)
     Timestamp: Thu 2020-08-27 07:59:07 EDT (3 days ago)
  Command Line: setfacl -Rnm g:wheel:rx,d:g:wheel:rx,g:adm:rx,d:g:adm:rx /var/log/journal/
    Executable: /usr/bin/setfacl
 Control Group: /user.slice/user-1000.slice/session-113.scope
          Unit: session-113.scope
         Slice: user-1000.slice
       Session: 113
     Owner UID: 1000 (sharkcz)
       Boot ID: 9c6c8b8d0a15416d9402291365a93a0a
    Machine ID: 9f494311b8fe4625a05e6f0acd9c4b3f
      Hostname: devel10.s390.XXX
       Storage: /var/lib/systemd/coredump/core.setfacl.0.9c6c8b8d0a15416d9402291365a93a0a.1136432.1598529547000000000000.zst (inaccessible)
       Message: Process 1136432 (setfacl) of user 0 dumped core.
                
                Stack trace of thread 1136432:
                #0  0x000003ffb2d01c9c __acl_entry_pp_compare (libacl.so.1 + 0x1c9c)
                #1  0x000003ffb2b4c588 msort_with_tmp.part.0 (libc.so.6 + 0x4c588)
                #2  0x000003ffb2b4c46e msort_with_tmp.part.0 (libc.so.6 + 0x4c46e)
                #3  0x000003ffb2b4c46e msort_with_tmp.part.0 (libc.so.6 + 0x4c46e)
                #4  0x000003ffb2b4c8ba qsort_r (libc.so.6 + 0x4c8ba)
                #5  0x000003ffb2b4cc26 qsort (libc.so.6 + 0x4cc26)
                #6  0x000003ffb2d01d8a __acl_reorder_obj_p (libacl.so.1 + 0x1d8a)
                #7  0x000003ffb2d06024 acl_get_file (libacl.so.1 + 0x6024)
                #8  0x000002aa0d7053dc do_set (setfacl + 0x53dc)
                #9  0x000002aa0d706dd8 walk_tree_rec.constprop.0 (setfacl + 0x6dd8)
                #10 0x000002aa0d707346 walk_tree.part.0.constprop.0 (setfacl + 0x7346)
                #11 0x000002aa0d7075a0 next_file (setfacl + 0x75a0)
                #12 0x000002aa0d70308a main (setfacl + 0x308a)
                #13 0x000003ffb2b2bb4a __libc_start_main (libc.so.6 + 0x2bb4a)
                #14 0x000002aa0d7047f4 _start (setfacl + 0x47f4)


Version-Release number of selected component (if applicable):
acl-2.2.53-8.fc33

Comment 1 Michal Ruprich 2020-08-31 09:33:05 UTC
I found a somewhat easier reproducer while rebuilding rsync. The rsync test suite checks whether various ACLs were transferred correctly, and it fails on setfacl -k. Steps to reproduce:

1. # cd /tmp/
2. tmp # mkdir testdir
3. tmp # setfacl -k testdir/
4. tmp # setfacl -dm u::7,g::5,o:5 testdir/
5. tmp # cd testdir/
6. testdir # mkdir inner_dir
7. testdir # setfacl -k inner_dir/
Segmentation fault (core dumped)

Looking at the coredump Dan has provided, mine is practically the same. This was tested on s390x, as stated in the Hardware field of this bz.

Comment 2 Dan Horák 2020-08-31 10:52:28 UTC
diff -up acl-2.2.53/test/Makemodule.am.ld acl-2.2.53/test/Makemodule.am
--- acl-2.2.53/test/Makemodule.am.ld	2020-08-31 06:42:03.695409810 -0400
+++ acl-2.2.53/test/Makemodule.am	2020-08-31 06:44:57.905409810 -0400
@@ -33,5 +33,5 @@ libtestlookup_la_SOURCES = test/test_pas
 libtestlookup_la_CFLAGS = -DBASEDIR=\"$(abs_srcdir)\"
 libtestlookup_la_LDFLAGS = -rpath $(abs_builddir)
 
-AM_TESTS_ENVIRONMENT = PATH="$(abs_top_builddir):$$PATH";
+AM_TESTS_ENVIRONMENT = export PATH="$(abs_top_builddir):$$PATH" LD_LIBRARY_PATH="$(abs_top_builddir)/.libs:$$LD_LIBRARY_PATH";
 TEST_LOG_COMPILER = $(srcdir)/test/runwrapper


^^^ enables running the tests with the freshly built libacl

Comment 3 Kamil Dudka 2020-08-31 15:45:39 UTC
Thank you for reporting the bug!  The code breaks strict aliasing rules.  I am working on a fix.
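
For readers unfamiliar with this class of bug: the backtrace in the description goes through qsort() into __acl_entry_pp_compare, a comparator for an array of pointers. The exact violation is not quoted in this report, so the following is only a generic sketch of an aliasing-safe comparator of that shape, not the libacl code and not the upstream fix. The names `struct entry`, `entry_p_compare`, `entry_pp_compare` and `sort_entries` are hypothetical. The point it illustrates is that each array element is read through the same pointer type it was stored with.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an ACL entry; NOT the libacl type. */
struct entry {
    int tag;   /* primary sort key */
    int id;    /* secondary sort key */
};

/* Order two entries by (tag, id). */
static int entry_p_compare(const struct entry *a, const struct entry *b)
{
    if (a->tag != b->tag)
        return (a->tag > b->tag) - (a->tag < b->tag);
    return (a->id > b->id) - (a->id < b->id);
}

/*
 * qsort() hands the comparator pointers to the array elements.
 * The elements are `struct entry *`, so each argument is read as
 * `struct entry *const *`: a const-qualified view of the very type
 * the array was written with, which strict aliasing permits.
 */
static int entry_pp_compare(const void *a, const void *b)
{
    struct entry *const *pa = a;
    struct entry *const *pb = b;
    return entry_p_compare(*pa, *pb);
}

/* Sort an array of n entry pointers in place. */
void sort_entries(struct entry **v, size_t n)
{
    assert(v != NULL || n == 0);
    if (n > 1)
        qsort(v, n, sizeof *v, entry_pp_compare);
}
```

A caller would fill a `struct entry *` array and pass it to `sort_entries(v, n)`. Storing the elements as one pointer type and reading them back as an unrelated pointer type is the kind of mismatch that an optimizer is allowed to assume never happens, and such assumptions tend to surface once LTO lets it see across translation units.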

Comment 5 Fedora Update System 2020-08-31 16:22:30 UTC
FEDORA-2020-aa8f1b7735 has been submitted as an update to Fedora 33. https://bodhi.fedoraproject.org/updates/FEDORA-2020-aa8f1b7735

Comment 6 Fedora Update System 2020-08-31 18:58:37 UTC
FEDORA-2020-aa8f1b7735 has been pushed to the Fedora 33 testing repository.
Soon you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --advisory=FEDORA-2020-aa8f1b7735`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2020-aa8f1b7735

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 7 Kamil Dudka 2020-08-31 20:34:06 UTC
upstream commit: http://git.savannah.nongnu.org/cgit/acl.git/commit/?id=cad5d695

Comment 8 Fedora Update System 2020-09-25 16:40:37 UTC
FEDORA-2020-aa8f1b7735 has been pushed to the Fedora 33 stable repository.
If the problem still persists, please make a note of it in this bug report.