Bug 2159446

Summary: systemd-cryptsetup@ instances depend on systemd-udevd, hence must start AFTER systemd-udevd
Product: Red Hat Enterprise Linux 8
Reporter: Renaud Métrich <rmetrich>
Component: systemd
Assignee: systemd maint <systemd-maint>
Status: NEW
QA Contact: Frantisek Sumsal <fsumsal>
Severity: medium
Priority: medium
Version: 8.7
CC: dtardon, lherbolt, scorreia, systemd-maint-list
Target Milestone: rc
Keywords: Bugfix, Triaged
Target Release: ---
Hardware: All
OS: Linux
Doc Type: If docs needed, set a value
Type: Bug

Description Renaud Métrich 2023-01-09 15:56:18 UTC
Description of problem:

/usr/lib/systemd/systemd-cryptsetup relies on the libcryptsetup.so library, which itself makes use of the libdevmapper.so library.

Inside libdevmapper, there is code to detect whether *udev* is in use.
When it is, libdevmapper relies on udev to create the /dev/mapper/luks-XXX symlinks pointing to the "../dm-Y" devices.

During a small window, however, the code can conclude that udev is not in use, because /run/udev/control does not exist yet.
libdevmapper then creates the /dev/mapper/luks-XXX node itself, which causes the systemd-udevd rule in charge of creating the symlink to fail, e.g.:
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
systemd-udevd[XXX]: conflicting device node '/dev/mapper/luks-7b4618b9-6a49-4ee1-838a-7691839dde78' found, link to '/dev/dm-13' will not be created
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
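To check by hand whether a system is inside that window, the following is a minimal sketch. It assumes the presence of /run/udev/control is the only signal that matters, as described above; the helper name udev_seems_active is made up for illustration and is not part of libdevmapper or systemd.

```shell
# Hypothetical helper: reports whether udev appears to be active, using the
# presence of the control socket as the only signal (assumption taken from
# the description above, not from the libdevmapper sources).
udev_seems_active() {
    control="${1:-/run/udev/control}"
    [ -e "$control" ]
}

if udev_seems_active; then
    echo "udev control socket present"
else
    echo "udev control socket missing: libdevmapper may create nodes itself"
fi
```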

udev is not detected because systemd-udevd and the systemd-cryptsetup@ instances start concurrently; a systemd-cryptsetup@ instance can even start a few clock ticks before systemd-udevd (field 22 of /proc/PID/stat is the start time in clock ticks):
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
# cat /proc/$(pgrep systemd-udevd)/stat | awk '{ print $22 }'
795

# ps -eaf | grep cryptsetup
root        1576       1  4 15:40 ?        00:00:01 /usr/lib/systemd/systemd-cryptsetup attach luksvdb8 /dev/disk/by-uuid/7b4618b9-6a49-4ee1-838a-7691839dde78 none discard
...

# cat /proc/1576/stat | awk '{ print $22 }'
763
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
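The comparison above can be scripted. This is only a sketch: the helper name start_ticks is made up for illustration. It strips everything up to the last ") " before counting fields, so a process name containing spaces cannot shift the columns; the start time is then the 20th remaining field.

```shell
# Hypothetical helper: print field 22 (starttime, in clock ticks) of a
# /proc/<pid>/stat file. The "pid (comm) " prefix is removed first, which
# leaves starttime as the 20th remaining field.
start_ticks() {
    sed 's/^.*) //' "$1" | awk '{ print $20 }'
}

# Example usage (the PIDs are placeholders taken from the output above):
# udevd=$(start_ticks "/proc/$(pgrep -o systemd-udevd)/stat")
# crypt=$(start_ticks /proc/1576/stat)
# [ "$crypt" -lt "$udevd" ] && echo "systemd-cryptsetup started before systemd-udevd"
```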

To close this window, an "After=systemd-udevd.service" ordering needs to be generated in the systemd-cryptsetup@ units.
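Until the generated units carry that ordering, a local drop-in might serve as a workaround. The sketch below is untested against this bug and rests on assumptions: the helper name, the drop-in directory and the file name after-udevd.conf are made up for illustration, and it assumes that a template-level drop-in is applied to the generated systemd-cryptsetup@ instances.

```shell
# Hypothetical workaround helper: writes a drop-in that orders every
# systemd-cryptsetup@ instance after systemd-udevd.service.
# Directory and file name are assumptions, not the eventual fix.
write_udevd_ordering_dropin() {
    dir="${1:-/etc/systemd/system/systemd-cryptsetup@.service.d}"
    mkdir -p "$dir" || return 1
    cat > "$dir/after-udevd.conf" << 'EOF'
[Unit]
After=systemd-udevd.service
EOF
}

# As root: write_udevd_ordering_dropin && systemctl daemon-reload
```

After= only orders the units; it does not pull systemd-udevd.service into the transaction, which is fine here since udevd is started during boot anyway.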

Version-Release number of selected component (if applicable):

systemd-239

How reproducible:

Often

Steps to Reproduce:
1. Install a system, then add a disk with 8 LUKS devices (there is no need to create file systems on them)
2. Set them up to unlock automatically (using a key file or Clevis) and add them to /etc/crypttab for automatic decryption

    Adding the _netdev option facilitates troubleshooting, because it is then possible to log in through sshd while unlocking happens.
    In that case, don't forget to enable "remote-cryptsetup.target"...

3. Boot until the issue occurs, or delay the startup of systemd-udevd as a hack

    -------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
    # mkdir -p /etc/systemd/system/systemd-udevd.service.d
    # cat > /etc/systemd/system/systemd-udevd.service.d/delay.conf << EOF
    [Service]
    ExecStartPre=/bin/sleep 30
    EOF
    -------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------

Actual results:

Journal shows "conflicting device node" and some /dev/mapper/luks-XXX devices are not symlinks:
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
# ll /dev/mapper/
total 0
...
lrwxrwxrwx 1 root root        8 Jan  9 16:53 luksvdb1 -> ../dm-10
lrwxrwxrwx 1 root root        7 Jan  9 16:53 luksvdb2 -> ../dm-7
brw-rw---- 1 root disk 253,   6 Jan  9 16:52 luksvdb3
lrwxrwxrwx 1 root root        7 Jan  9 16:53 luksvdb5 -> ../dm-3
lrwxrwxrwx 1 root root        7 Jan  9 16:53 luksvdb6 -> ../dm-8
brw-rw---- 1 root disk 253,   4 Jan  9 16:52 luksvdb7
lrwxrwxrwx 1 root root        7 Jan  9 16:53 luksvdb8 -> ../dm-5
...

# systemctl status systemd-udevd
...
Jan 09 16:53:06 vm-clevis86 systemd-udevd[6542]: conflicting device node '/dev/mapper/luksvdb3' found, link to '/dev/>
Jan 09 16:53:06 vm-clevis86 systemd-udevd[6530]: conflicting device node '/dev/mapper/luksvdb7' found, link to '/dev/>
...
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
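To spot affected entries without reading the listing by eye, something like the following can be used. It is a sketch only: the helper name list_conflicting_nodes is made up for illustration, and it assumes every entry in /dev/mapper other than "control" is expected to be a symlink.

```shell
# Hypothetical helper: print the names of entries that are not symlinks,
# i.e. nodes that were presumably created by libdevmapper instead of udev.
list_conflicting_nodes() {
    dir="${1:-/dev/mapper}"
    for entry in "$dir"/*; do
        [ -e "$entry" ] || continue               # unmatched glob or dangling link
        [ "${entry##*/}" = control ] && continue  # the control node is never a symlink
        [ -L "$entry" ] || printf '%s\n' "${entry##*/}"
    done
}

# Example usage: list_conflicting_nodes /dev/mapper
```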

Expected results:

The /dev/mapper/luks-XXX entries are always symlinks created by udevd

Comment 1 David Tardon 2023-01-10 13:46:28 UTC
Interesting... I assume you're using kernel disk names (like /dev/sd*) in your reproducer, right? Because if symlinks (like /dev/disk/by-uuid/*) are used, the dependency on systemd-udevd is implicit: systemd-cryptsetup@.service does have BindsTo= and After= to the underlying device, which only becomes active after udevd has created the symlink. That may be the reason nobody has noticed this earlier...

Comment 2 Renaud Métrich 2023-02-01 13:06:11 UTC
*** Bug 2162676 has been marked as a duplicate of this bug. ***

Comment 3 Renaud Métrich 2023-02-01 13:14:23 UTC
Hi David,

As explained, there is a race between libdevmapper and systemd-udevd to create the /dev/mapper/<node> path.
If libdevmapper executes before systemd-udevd is up, it creates the node as a device.

Apparently the BindsTo= on the "by-uuid" path somehow doesn't help, even though it's systemd-udevd that creates this link.