Bug 1653068

Summary: Systemd core dumps on boot when mounting a raid device to a symlinked destination dir
Product: [Fedora] Fedora Reporter: bk
Component: selinux-policyAssignee: Lukas Vrabec <lvrabec>
Status: CLOSED ERRATA QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high Docs Contact:
Priority: unspecified    
Version: 29CC: bk, dwalsh, lnykryn, lvrabec, mgrepl, msekleta, plautrba, ssahani, s, systemd-maint, walters, zbyszek
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-12-21 19:39:32 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Attachments:
Description Flags
raw AVC messages from audit log none

Description bk 2018-11-25 12:53:48 UTC
Description of problem:
When booting, systemd core dumps when it tries to mount an mdadm array.
The array can be mounted without problems when it is not listed in /etc/fstab and is mounted manually after boot.


Version-Release number of selected component (if applicable):
Fedora 29 Atomic Host:
-------------------------------------
[root@frogstar tmp]# rpm-ostree status
State: idle
AutomaticUpdates: disabled
Deployments:
● ostree://fedora-atomic:fedora/29/x86_64/atomic-host
                   Version: 29.20181113.0 (2018-11-13T00:42:40Z)
                    Commit: 89bfa708d349a5856cc5cd3be441c07e1f96d0be2aa97e2b676f6004e7f6abed
              GPGSignature: Valid signature by 5A03B4DD8254ECA02FDA1637A20AA56B429476B4

  ostree://fedora-atomic:fedora/29/x86_64/atomic-host
                   Version: 29.20181025.1 (2018-10-25T14:46:54Z)
                    Commit: 4a999b4b303b47468ff1464051a14fd075d2e7b8bb647584b7cc80fed48cf27b
              GPGSignature: Valid signature by 5A03B4DD8254ECA02FDA1637A20AA56B429476B4
-------------------------------------



How reproducible:
1) 
I created a linear raid device with mdadm over four drives:
mdadm --create /dev/md/jbod --level=linear --raid-devices=4 /dev/sda /dev/sdb /dev/sdc /dev/sdd
2)
Created a GPT table and one primary partition spanning the whole device, resulting in /dev/md/jbod1
3)
Created xfs filesystem in /dev/md/jbod1, all default options
4)
Added a line in /etc/fstab to mount /dev/md/jbod1 at boot


Actual results:
System does not boot, systemd core dumps. Excerpt from log:
-------------------------------------
Nov 25 11:36:50 frogstar.fritz.box kernel: md127: detected capacity change from 0 to 48000005046272
Nov 25 11:36:50 frogstar.fritz.box kernel:  md127: p1
Nov 25 11:36:50 frogstar.fritz.box kernel: intel_rapl: Found RAPL domain package
Nov 25 11:36:50 frogstar.fritz.box kernel: intel_rapl: Found RAPL domain core
Nov 25 11:36:50 frogstar.fritz.box kernel: intel_rapl: Found RAPL domain uncore
Nov 25 11:36:50 frogstar.fritz.box kernel: intel_rapl: Found RAPL domain dram
Nov 25 11:36:50 frogstar.fritz.box kernel: checking generic (4000000000 1d5000) vs hw (4000000000 10000000)
Nov 25 11:36:50 frogstar.fritz.box kernel: fb: switching to inteldrmfb from EFI VGA
Nov 25 11:36:50 frogstar.fritz.box kernel: Console: switching to colour dummy device 80x25
Nov 25 11:36:50 frogstar.fritz.box kernel: [drm] Replacing VGA console driver
Nov 25 11:36:50 frogstar.fritz.box kernel: [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
Nov 25 11:36:50 frogstar.fritz.box kernel: [drm] Driver supports precise vblank timestamp query.
Nov 25 11:36:50 frogstar.fritz.box kernel: i915 0000:00:02.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=io+mem:owns=io+mem
Nov 25 11:36:50 frogstar.fritz.box kernel: [drm] Finished loading DMC firmware i915/kbl_dmc_ver1_04.bin (v1.4)
Nov 25 11:36:50 frogstar.fritz.box systemd[1]: Found device /dev/md/jbod1.
Nov 25 11:36:50 frogstar.fritz.box audit[1]: AVC avc:  denied  { getattr } for  pid=1 comm="systemd" path="/mnt" dev="dm-0" ino=8388757 scontext=system_u:system_r:init_t:s0 tcontext=system_u:object_r:mnt_t:s0 tclass=lnk_file permissive=0
Nov 25 11:36:50 frogstar.fritz.box audit[832]: ANOM_ABEND auid=4294967295 uid=0 gid=0 ses=4294967295 subj=system_u:system_r:init_t:s0 pid=832 comm="systemd" exe="/usr/lib/systemd/systemd" sig=6 res=1
Nov 25 11:36:50 frogstar.fritz.box systemd-coredump[833]: Due to PID 1 having crashed coredump collection will now be turned off.
Nov 25 11:36:50 frogstar.fritz.box systemd[1]: Caught <ABRT>, dumped core as pid 832.
Nov 25 11:36:50 frogstar.fritz.box systemd[1]: Freezing execution.
Nov 25 11:36:50 frogstar.fritz.box kernel: systemd: 32 output lines suppressed due to ratelimiting
Nov 25 11:36:50 frogstar.fritz.box systemd-coredump[833]: Process 832 (systemd) of user 0 dumped core.
                                                          
                                                          Stack trace of thread 832:
                                                          #0  0x00007f92c615d83b kill (libc.so.6)
                                                          #1  0x000055db5163deda n/a (systemd)
                                                          #2  0x00007f92c62fdf70 __restore_rt (libpthread.so.0)
                                                          #3  0x00007f92c615d53f raise (libc.so.6)
                                                          #4  0x00007f92c6147895 abort (libc.so.6)
                                                          #5  0x00007f92c61a0957 __libc_message (libc.so.6)
                                                          #6  0x00007f92c61a728c malloc_printerr (libc.so.6)
                                                          #7  0x00007f92c61a77e4 munmap_chunk (libc.so.6)
                                                          #8  0x000055db516aedaa n/a (systemd)
                                                          #9  0x000055db516da271 n/a (systemd)
                                                          #10 0x000055db516aa24e n/a (systemd)
                                                          #11 0x000055db5165f73c n/a (systemd)
                                                          #12 0x000055db516626da n/a (systemd)
                                                          #13 0x000055db5167958c n/a (systemd)
                                                          #14 0x00007f92c65badd2 n/a (libsystemd-shared-239.so)
                                                          #15 0x00007f92c65bcc75 sd_event_dispatch (libsystemd-shared-239.so)
                                                          #16 0x00007f92c65bce00 sd_event_run (libsystemd-shared-239.so)
                                                          #17 0x000055db5167f13c n/a (systemd)
                                                          #18 0x000055db5163978b n/a (systemd)
                                                          #19 0x00007f92c6149413 __libc_start_main (libc.so.6)
                                                          #20 0x000055db5163b41e n/a (systemd)
-------------------------------------

With atomic host release 29.20181025.1, the same happens, but the callstack seems to have more debug info:
-------------------------------------
Nov 24 15:19:59 frogstar.fritz.box systemd[1]: Found device /dev/md/jbod1.
Nov 24 15:19:59 frogstar.fritz.box audit[1]: AVC avc:  denied  { getattr } for  pid=1 comm="systemd" path="/mnt" dev="dm-0" ino=8388757 scontext=system_u:system_r:init_t:s0 tcontext=system_u:object_r:mnt_t:s0 tclass=lnk_file permissive=0
Nov 24 15:19:59 frogstar.fritz.box audit[821]: ANOM_ABEND auid=4294967295 uid=0 gid=0 ses=4294967295 subj=system_u:system_r:init_t:s0 pid=821 comm="systemd" exe="/usr/lib/systemd/systemd" sig=6 res=1
Nov 24 15:19:59 frogstar.fritz.box systemd-coredump[822]: Due to PID 1 having crashed coredump collection will now be turned off.
Nov 24 15:19:59 frogstar.fritz.box systemd[1]: Caught <ABRT>, dumped core as pid 821.
Nov 24 15:19:59 frogstar.fritz.box systemd[1]: Freezing execution.
Nov 24 15:19:59 frogstar.fritz.box kernel: systemd: 30 output lines suppressed due to ratelimiting
Nov 24 15:19:59 frogstar.fritz.box systemd-coredump[822]: Process 821 (systemd) of user 0 dumped core.
                                                          
                                                          Stack trace of thread 821:
                                                          #0  0x00007f37873ba83b kill (libc.so.6)
                                                          #1  0x000055fdca408e0a crash (systemd)
                                                          #2  0x00007f378755af70 __restore_rt (libpthread.so.0)
                                                          #3  0x00007f37873ba53f raise (libc.so.6)
                                                          #4  0x00007f37873a4895 abort (libc.so.6)
                                                          #5  0x00007f37873fd957 __libc_message (libc.so.6)
                                                          #6  0x00007f378740428c malloc_printerr (libc.so.6)
                                                          #7  0x00007f37874047e4 munmap_chunk (libc.so.6)
                                                          #8  0x000055fdca479b5a unit_fail_if_noncanonical (systemd)
                                                          #9  0x000055fdca4a50c1 mount_start (systemd)
                                                          #10 0x000055fdca474ece unit_start (systemd)
                                                          #11 0x000055fdca42a6fc job_perform_on_unit (systemd)
                                                          #12 0x000055fdca42d66a job_run_and_invalidate (systemd)
                                                          #13 0x000055fdca44474c manager_dispatch_run_queue (systemd)
                                                          #14 0x00007f378780d472 source_dispatch (libsystemd-shared-239.so)
                                                          #15 0x00007f378780f315 sd_event_dispatch (libsystemd-shared-239.so)
                                                          #16 0x00007f378780f4a0 sd_event_run (libsystemd-shared-239.so)
                                                          #17 0x000055fdca44a085 manager_loop (systemd)
                                                          #18 0x000055fdca4046bb main (systemd)
                                                          #19 0x00007f37873a6413 __libc_start_main (libc.so.6)
                                                          #20 0x000055fdca40634a _start (systemd)
-------------------------------------


Expected results:
The added line in /etc/fstab causes the device to be automatically mounted at boot, system boots to login prompt.


Additional info:
1)
Instead of using the whole drive for the raid (/dev/sda, /dev/sdb, etc.) I also tried to use partitions (/dev/sda1, /dev/sdb1, etc), but that made no difference.

2)
The raid device /dev/md/jbod1 is a symlink to /dev/md127p1. Specifiny the target directly in fstab did not make a difference.

3)
The raid device is mounted to /mnt/data. /mnt is a symlink to /var/mnt.
When I specify /var/mnt/data instead of /mnt/data in /etc/fstab, IT WORKS without problems!
So I guess there's a problem with the symlink handling in the systemd code. I saw in function "unit_fail_if_noncanonical" a method "chase_symlinks" is called. But this doesn't appear in the callstack, so I'm not sure if the problem is in that function.

4)
Perhaps the problem is reproducible without the whole mdadm array stuff if just the symlink is the problem, haven't tested that.

Comment 1 Colin Walters 2018-11-30 14:13:01 UTC
systemd shouldn't dump core on this, but this is the root cause:

Nov 24 15:19:59 frogstar.fritz.box audit[1]: AVC avc:  denied  { getattr } for  pid=1 comm="systemd" path="/mnt" dev="dm-0" ino=8388757

Comment 2 Lukas Vrabec 2018-12-12 15:50:23 UTC
Could you please attach complete raw AVC msg from ausearch? 

Thanks,
Lukas.

Comment 3 bk 2018-12-13 19:11:50 UTC
Created attachment 1514174 [details]
raw AVC messages from audit log

Comment 4 bk 2018-12-13 19:13:41 UTC
I attached the AVC messages, but I'm not sure if I did this right. It's the first time I use ausearch, please let me know if other parameters are needed.

Comment 5 Zbigniew Jędrzejewski-Szmek 2018-12-14 07:18:56 UTC
Oopsie. https://github.com/systemd/systemd/pull/11157

Comment 6 Fedora Update System 2018-12-19 03:36:23 UTC
systemd-239-7.git3bf819c.fc29 has been pushed to the Fedora 29 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2018-a26479d04a

Comment 7 Fedora Update System 2018-12-21 19:39:32 UTC
systemd-239-7.git3bf819c.fc29 has been pushed to the Fedora 29 stable repository. If problems still persist, please make note of it in this bug report.