Bug 2121883 - systemd-nspawn: Failed to dissect image '/var/lib/machines/mymachine.raw': Connection timed out (after upgrading to kernel-5.19.4)
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: systemd
Version: 36
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: systemd-maint
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2022-08-26 23:54 UTC by Anthony Messina
Modified: 2022-11-29 17:28 UTC (History)

Fixed In Version:
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-11-29 17:28:29 UTC
Type: Bug
Embargoed:


Attachments
PID 1 Log (1.95 MB, application/gzip)
2022-09-18 09:19 UTC, Anthony Messina
machinectl log (28.84 KB, application/gzip)
2022-09-18 09:20 UTC, Anthony Messina

Description Anthony Messina 2022-08-26 23:54:38 UTC
After upgrading to kernel 5.19.4 from 5.18.19 on Fedora 36, I am no longer able to start systemd-nspawn machines with LUKS encrypted storage (XFS in my case):

Starting systemd-nspawn - Container mymachine...
loop0: detected capacity change from 0 to 20975616
 loop0: p1
Failed to dissect image '/var/lib/machines/mymachine.raw': Connection timed out
systemd-nspawn: Main process exited, code=exited, status=1/FAILURE
systemd-nspawn: Failed with result 'exit-code'.
Failed to start systemd-nspawn - Container mymachine.

Comment 1 wayne6001 2022-08-27 00:48:12 UTC
Posting this in case you are using an nvidia driver.  Saw your post on the fedora 5.19.4 test page and there's another problem report referencing a bug report that also involves not seeing the request for a LUKS password in 5.19.4.  They think that is due to an nvidia driver.  It works again for one of them if they switch to the nouveau driver. https://bugzilla.redhat.com/show_bug.cgi?id=2111555

Comment 2 Anthony Messina 2022-08-27 14:47:45 UTC
Thank you.  I am not using nvidia or nouveau.

Comment 3 Anthony Messina 2022-08-28 22:03:02 UTC
Additional information from another machine suggesting the kernel update to 5.19.4 may be unrelated.  On the other machine running kernel-5.18.19-200.fc36.x86_64 and systemd-250.8-1.fc36.x86_64 I have the same issue, though a bit more is reported regarding the inability to find a suitable filesystem in the LUKS-encrypted mymachine.raw file.

Starting systemd-nspawn - Container mymachine...
Setting RLIMIT_CPU to infinity.
Setting RLIMIT_FSIZE to infinity.
Setting RLIMIT_DATA to infinity.
Setting RLIMIT_STACK to 8388608:infinity.
Setting RLIMIT_CORE to 0:infinity.
Setting RLIMIT_RSS to infinity.
Setting RLIMIT_NPROC to 128096.
Setting RLIMIT_NOFILE to 1024:524288.
Setting RLIMIT_MEMLOCK to 65536.
Setting RLIMIT_AS to infinity.
Setting RLIMIT_LOCKS to infinity.
Setting RLIMIT_SIGPENDING to 128096.
Setting RLIMIT_MSGQUEUE to 819200.
Setting RLIMIT_NICE to 0.
Setting RLIMIT_RTPRIO to 0.
Setting RLIMIT_RTTIME to infinity.
Settings are trusted: yes
Found cgroup2 on /sys/fs/cgroup/, full unified hierarchy
Opened '/var/lib/machines/mymachine.raw' in O_RDWR access mode, with O_DIRECT enabled.
loop0: detected capacity change from 0 to 20975616
loop0: Will wait up to 45s for 'loop0' to initialize…
loop0: Successfully waited for device 'loop0' to initialize for 21.274ms.
/var/lib/machines/mymachine.raw: Couldn't identify a suitable partition table or file system.
Note that the disk image needs to
    a) either contain only a single MBR partition of type 0x83 that is marked bootable
    b) or contain a single GPT partition of type 0FC63DAF-8483-4772-8E79-3D69D8477DE4
    c) or follow https://systemd.io/DISCOVERABLE_PARTITIONS
    d) or contain a file system without a partition table
in order to be bootable with systemd-nspawn.
systemd-nspawn: Main process exited, code=exited, status=1/FAILURE
systemd-nspawn: Failed with result 'exit-code'.
Failed to start systemd-nspawn - Container mymachine.

I am able to manually open the machine with losetup, cryptsetup luksOpen, etc.  However, the password prompt never appears when issuing "machinectl start mymachine".

All of my hosts are running systemd-250.8-1.fc36.x86_64
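
For reference, the manual workaround mentioned above (losetup followed by cryptsetup luksOpen) might look roughly like the sketch below. It is an illustration only, not the reporter's exact commands: it assumes the LUKS volume is the first partition of the image, and the function name, mapper name, mount point, and DRY_RUN switch are all hypothetical.

```shell
#!/bin/sh
# Sketch of the manual workaround: attach the image to a loop device,
# unlock the LUKS volume, and mount it.
# Assumptions (not stated in the report): the LUKS volume is partition 1
# of the image, and the mapper name / mount point are illustrative values.
# Set DRY_RUN=1 to print the commands instead of running them; running
# them for real requires root.

manual_open() {
    image=$1   # e.g. /var/lib/machines/mymachine.raw
    name=$2    # device-mapper name (hypothetical)
    mnt=$3     # existing mount point (hypothetical)

    if [ -n "${DRY_RUN:-}" ]; then
        # No device is allocated in dry-run mode; /dev/loopN is a placeholder.
        echo "losetup --find --show --partscan $image"
        echo "cryptsetup luksOpen /dev/loopNp1 $name"
        echo "mount /dev/mapper/$name $mnt"
        return 0
    fi

    # Attach the image and scan its partition table; prints e.g. /dev/loop0.
    loopdev=$(losetup --find --show --partscan "$image") || return 1
    # Prompts for the passphrase on the current terminal.
    cryptsetup luksOpen "${loopdev}p1" "$name" || return 1
    mount "/dev/mapper/$name" "$mnt"
}
```

Calling `manual_open /var/lib/machines/mymachine.raw mymachine /mnt` with DRY_RUN=1 set prints the three commands, which can be reviewed before running anything as root.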

Comment 4 Anthony Messina 2022-09-10 19:30:44 UTC
After additional testing, I am able to eventually get the machine to start, though I have to have multiple ssh sessions open to the host machine (headless).  From session A, machinectl start mymachine fails.  Then after I wait for the timeout, now from session B, machinectl start mymachine succeeds (displays the decryption prompt in session B), and in session A:

Broadcast message from root.com (Sat 2022-09-10 14:23:50 CDT):

Password entry required for 'Please enter image passphrase:' (PID 5072).
Please enter password with the systemd-tty-ask-password-agent tool.
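
As the broadcast message suggests, a pending request can be answered from another root shell with systemd-tty-ask-password-agent. A minimal sketch follows, assuming the request is still pending when the commands run. The function name and the DRY_RUN switch are illustrative; only the tool's --list and --query options are used.

```shell
#!/bin/sh
# Sketch: list and answer pending password requests from a root shell.
# Set DRY_RUN=1 to print the commands instead of executing them.

run() {
    # Print the command when DRY_RUN is set, otherwise execute it.
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

answer_pending() {
    # Show which requests are currently waiting for a password.
    run systemd-tty-ask-password-agent --list || return 1
    # Prompt for each pending request on the calling terminal.
    run systemd-tty-ask-password-agent --query
}
```

This only works around the missing prompt; it does not explain why the prompt was not shown in the session that ran machinectl.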

Comment 5 David Tardon 2022-09-15 07:47:54 UTC
So it looks like the problem is in showing the decryption password prompt; machinectl/nspawn just waits for the password and fails after a timeout. Could you run the following commands:

# strace -p 1 -f -y -s 500 -o /tmp/pid1.log &
# strace -f -y -s 500 -o /tmp/machinectl.log machinectl start mymachine

and attach /tmp/pid1.log and /tmp/machinectl.log here?

Comment 6 Anthony Messina 2022-09-18 09:19:29 UTC
Created attachment 1912666
PID 1 Log

Attaching the requested files.  However, after multiple attempts, every time I start via strace, machinectl shows the decryption prompt.

Comment 7 Anthony Messina 2022-09-18 09:20:11 UTC
Created attachment 1912667
machinectl log

Comment 8 David Tardon 2022-11-08 16:03:28 UTC
(In reply to Anthony Messina from comment #6)
> Attaching the requested files.  However, after attempting multiple times,
> every time I start via strace, machinectl shows the decryption prompt.

Bummer. I was hoping to see where exactly it times out... Maybe the failure can still be reproduced if just one of PID1 and machinectl is run under strace?
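
One way to try this, reusing the strace options from comment 5, is sketched below. The function name, the DRY_RUN switch, and the one-second delay are illustrative assumptions, not something the thread prescribes.

```shell
#!/bin/sh
# Sketch: trace only one of PID 1 and machinectl.
# Usage: trace_one pid1|machinectl MACHINE
# Set DRY_RUN=1 to print the commands instead of running them (root needed otherwise).

trace_one() {
    target=$1
    machine=$2
    case $target in
        pid1)
            if [ -n "${DRY_RUN:-}" ]; then
                echo "strace -p 1 -f -y -s 500 -o /tmp/pid1.log &"
                echo "machinectl start $machine"
                return 0
            fi
            # Trace PID 1 in the background; machinectl itself runs untraced.
            strace -p 1 -f -y -s 500 -o /tmp/pid1.log &
            tracer=$!
            sleep 1   # crude: give strace a moment to attach
            machinectl start "$machine"
            status=$?
            # Stop the tracer so it detaches from PID 1.
            kill "$tracer" 2>/dev/null
            wait "$tracer" 2>/dev/null
            return "$status"
            ;;
        machinectl)
            if [ -n "${DRY_RUN:-}" ]; then
                echo "strace -f -y -s 500 -o /tmp/machinectl.log machinectl start $machine"
                return 0
            fi
            # Trace only machinectl; PID 1 runs untraced.
            strace -f -y -s 500 -o /tmp/machinectl.log machinectl start "$machine"
            ;;
        *)
            echo "usage: trace_one pid1|machinectl MACHINE" >&2
            return 2
            ;;
    esac
}
```

Running `trace_one pid1 mymachine` and `trace_one machinectl mymachine` in separate attempts would show whether tracing either side alone is enough to make the prompt appear.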

Comment 9 Anthony Messina 2022-11-23 13:50:49 UTC
I'll keep watching for this to reappear, but so far I have not seen it since upgrading to Fedora 37 with systemd-251.8-586.fc37.x86_64.

Comment 10 Anthony Messina 2022-11-29 17:28:29 UTC
I do not see this issue in Fedora 37. Closing.

