After upgrading to kernel 5.19.4 from 5.18.19 on Fedora 36, I am no longer able to start systemd-nspawn machines with LUKS encrypted storage (XFS in my case):

Starting systemd-nspawn - Container mymachine...
loop0: detected capacity change from 0 to 20975616
 loop0: p1
Failed to dissect image '/var/lib/machines/mymachine.raw': Connection timed out
systemd-nspawn: Main process exited, code=exited, status=1/FAILURE
systemd-nspawn: Failed with result 'exit-code'.
Failed to start systemd-nspawn - Container mymachine.
Posting this in case you are using an NVIDIA driver. I saw your post on the Fedora 5.19.4 test page; another problem report there references a bug in which the request for a LUKS password also fails to appear on 5.19.4. They think that is due to the NVIDIA driver, and for one of them it works again after switching to the nouveau driver. https://bugzilla.redhat.com/show_bug.cgi?id=2111555
Thank you. I am not using nvidia or nouveau.
Additional information from another machine suggesting the kernel update to 5.19.4 may be unrelated. On the other machine running kernel-5.18.19-200.fc36.x86_64 and systemd-250.8-1.fc36.x86_64 I have the same issue, though a bit more is reported regarding the inability to find a suitable filesystem in the LUKS-encrypted mymachine.raw file.

Starting systemd-nspawn - Container mymachine...
Setting RLIMIT_CPU to infinity.
Setting RLIMIT_FSIZE to infinity.
Setting RLIMIT_DATA to infinity.
Setting RLIMIT_STACK to 8388608:infinity.
Setting RLIMIT_CORE to 0:infinity.
Setting RLIMIT_RSS to infinity.
Setting RLIMIT_NPROC to 128096.
Setting RLIMIT_NOFILE to 1024:524288.
Setting RLIMIT_MEMLOCK to 65536.
Setting RLIMIT_AS to infinity.
Setting RLIMIT_LOCKS to infinity.
Setting RLIMIT_SIGPENDING to 128096.
Setting RLIMIT_MSGQUEUE to 819200.
Setting RLIMIT_NICE to 0.
Setting RLIMIT_RTPRIO to 0.
Setting RLIMIT_RTTIME to infinity.
Settings are trusted: yes
Found cgroup2 on /sys/fs/cgroup/, full unified hierarchy
Opened '/var/lib/machines/mymachine.raw' in O_RDWR access mode, with O_DIRECT enabled.
loop0: detected capacity change from 0 to 20975616
loop0: Will wait up to 45s for 'loop0' to initialize…
loop0: Successfully waited for device 'loop0' to initialize for 21.274ms.
/var/lib/machines/mymachine.raw: Couldn't identify a suitable partition table or file system.
Note that the disk image needs to
    a) either contain only a single MBR partition of type 0x83 that is marked bootable
    b) or contain a single GPT partition of type 0FC63DAF-8483-4772-8E79-3D69D8477DE4
    c) or follow https://systemd.io/DISCOVERABLE_PARTITIONS
    d) or contain a file system without a partition table
in order to be bootable with systemd-nspawn.
systemd-nspawn: Main process exited, code=exited, status=1/FAILURE
systemd-nspawn: Failed with result 'exit-code'.
Failed to start systemd-nspawn - Container mymachine.

I am able to manually open the machine with losetup and cryptsetup luksOpen, etc.
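For anyone trying to confirm the same thing, the manual route with losetup and cryptsetup looks roughly like the sketch below. These are not necessarily the exact commands used here: the mapper name and mount point are hypothetical, and the p1 suffix is taken from the "loop0: p1" line in the first log (pass an empty suffix if the LUKS header sits directly on the image). The script only prints the steps so they can be reviewed before being run as root.

```shell
#!/bin/sh
# Sketch only: prints the manual unlock steps instead of running them.
# IMAGE is the path from the logs; MAPPER_NAME and MOUNTPOINT are hypothetical.
IMAGE=/var/lib/machines/mymachine.raw
MAPPER_NAME=mymachine-luks
MOUNTPOINT=/mnt/mymachine

print_manual_open_steps() {
    # $1: loop device reported by losetup, e.g. /dev/loop0
    # $2: partition suffix, e.g. p1; empty if LUKS covers the whole image
    loopdev=$1
    part=$2
    # Attach the image to a free loop device and scan it for partitions.
    echo "losetup --find --show --partscan $IMAGE"
    # Unlock the LUKS volume; this is where the passphrase is requested.
    echo "cryptsetup luksOpen ${loopdev}${part} $MAPPER_NAME"
    # Mount the unlocked filesystem (XFS in this report).
    echo "mount /dev/mapper/$MAPPER_NAME $MOUNTPOINT"
}

print_manual_open_steps /dev/loop0 p1
```

If these steps work by hand while nspawn fails, that suggests the image itself is fine and the problem lies in how the passphrase is requested.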
The password prompt never appears when I run "machinectl start mymachine". All of my hosts are running systemd-250.8-1.fc36.x86_64.
After additional testing, I am eventually able to get the machine to start, but only with multiple ssh sessions open to the (headless) host machine. From session A, "machinectl start mymachine" fails. After waiting for the timeout, I run "machinectl start mymachine" from session B, where it succeeds (the decryption prompt is displayed in session B), and session A shows:

Broadcast message from root.com (Sat 2022-09-10 14:23:50 CDT):

Password entry required for 'Please enter image passphrase:' (PID 5072).
Please enter password with the systemd-tty-ask-password-agent tool.
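Since the broadcast points at systemd-tty-ask-password-agent, here is a small sketch for checking from a second session whether a passphrase request is actually pending while "machinectl start" appears to hang. It assumes pending requests show up as ask.* files under /run/systemd/ask-password, as described by the systemd password agent interface; the function name is made up for illustration.

```shell
#!/bin/sh
# Directory where systemd places pending password request files (ask.*).
# Overridable so the helper can be pointed at another directory.
ASK_DIR=${ASK_DIR:-/run/systemd/ask-password}

count_pending_requests() {
    # Prints the number of ask.* files in directory $1 (0 if none exist).
    dir=$1
    n=0
    for f in "$dir"/ask.*; do
        # With no matches the glob stays literal, so check for existence.
        if [ -e "$f" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

pending=$(count_pending_requests "$ASK_DIR")
echo "pending password requests: $pending"
if [ "$pending" -gt 0 ]; then
    # --query asks for the passphrase on the current TTY and forwards it.
    echo "answer them with: systemd-tty-ask-password-agent --query"
fi
```

A non-zero count while no prompt is visible would fit the theory that the request is created but never shown on the calling terminal.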
So it looks like the problem is in showing the decryption password prompt; machinectl/nspawn just waits for the password and fails after a timeout. Could you run the following commands:

# strace -p 1 -f -y -s 500 -o /tmp/pid1.log &
# strace -f -y -s 500 -o /tmp/machinectl.log machinectl start mymachine

and attach /tmp/pid1.log and /tmp/machinectl.log here?
Created attachment 1912666
PID 1 Log

Attaching the requested files. However, after multiple attempts, every time I start via strace, machinectl shows the decryption prompt.
Created attachment 1912667
machinectl log
(In reply to Anthony Messina from comment #6)
> Attaching the requested files. However, after attempting multiple times,
> everytime I start via strace, machinectl shows the decryption prompt.

Bummer. I was hoping to see where exactly it times out... Maybe the failure can still be reproduced if just one of PID1 and machinectl is run under strace?
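To make the two single-strace variants concrete, here is a sketch that prints the command lines for each. The strace flags and log paths are the ones from the earlier request in this thread; the helper name is made up, and nothing is executed, so the output can be copied into a root shell one variant at a time.

```shell
#!/bin/sh
# Prints the commands for tracing only one side of the interaction.
single_strace_variant() {
    case $1 in
        pid1)
            # Trace only PID 1 in the background, start the machine untraced.
            echo "strace -p 1 -f -y -s 500 -o /tmp/pid1.log &"
            echo "machinectl start mymachine"
            ;;
        machinectl)
            # Trace only machinectl; PID 1 runs untraced.
            echo "strace -f -y -s 500 -o /tmp/machinectl.log machinectl start mymachine"
            ;;
        *)
            echo "usage: single_strace_variant pid1|machinectl" >&2
            return 1
            ;;
    esac
}

single_strace_variant pid1
single_strace_variant machinectl
```

If the prompt goes missing in only one of the two variants, that narrows down which side the tracing is perturbing.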
I'll keep watching for this to reappear, but so far I have not seen it since upgrading to Fedora 37 with systemd-251.8-586.fc37.x86_64.
I do not see this issue in Fedora 37. Closing.