Bug 2170544 - "Unable to find a persistent overlay" warning shown on boot of recent live images with persistence enabled, various problems with booted system
Keywords:
Status: NEW
Alias: None
Product: Fedora
Classification: Fedora
Component: dracut
Version: 38
Hardware: All
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Neal Gompa
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard: openqa
Depends On:
Blocks: 2139918
Reported: 2023-02-16 16:53 UTC by Adam Williamson
Modified: 2023-02-22 17:33 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Type: Bug
Embargoed:


Attachments
journalctl log from first boot (364.36 KB, text/plain)
2023-02-17 21:54 UTC, Frederick Grose
live.debug log for create-overlay (10.20 KB, text/plain)
2023-02-20 23:41 UTC, Frederick Grose
live.debug log for dmsquash-live-root (33.98 KB, text/plain)
2023-02-20 23:42 UTC, Frederick Grose
journalctl log for auto-overlay boot (496.40 KB, text/plain)
2023-02-20 23:43 UTC, Frederick Grose

Description Adam Williamson 2023-02-16 16:53:12 UTC
Fedora 38 and Rawhide live images since yesterday's compose show a warning message on boot: "Unable to find a persistent overlay; using a temporary one. All root filesystem changes will be lost on shutdown. Press [Enter] to continue."

This is a result of recent changes to the overlay handling. Neal asked me to file the bug here, so here it is.

Arguably this ought to be a release blocker, but for an oversight in the criteria. We have this wording regarding *installed system* boot:

"In all of the above cases, the boot should proceed without any unexpected user intervention being required."

but we don't, for some reason, have the same stipulation for *deployment media* boot, which seems weird. If we're against unexpected interaction in the one case, we should be against it in the other. I'll maybe bring this up on test@.

Comment 1 Adam Williamson 2023-02-16 17:42:47 UTC
Well, seems it's worse than that. In openQA, even after hitting enter, boot is failing most of the time, or occasionally reaching a login prompt instead of a desktop.

On a local VM, boot does seem to work, but `livesys.service` fails to start, and opening a terminal shows the wrong prompt (bash-5.2$), which indicates the user account wasn't set up properly. ausearch shows a flood of AVCs like this:

----
time->Thu Feb 16 17:35:55 2023
type=AVC msg=audit(1676586955.315:143): avc:  denied  { create } for  pid=1065 comm="useradd" name="liveuser" scontext=system_u:system_r:kernel_t:s0 tcontext=unconfined_u:object_r:user_home_dir_t:s0 tclass=dir permissive=0
----
time->Thu Feb 16 17:35:55 2023
type=AVC msg=audit(1676586955.652:150): avc:  denied  { create } for  pid=1 comm="systemd" name="#42" scontext=system_u:system_r:kernel_t:s0 tcontext=system_u:object_r:var_run_t:s0 tclass=chr_file permissive=0
----
time->Thu Feb 16 17:35:55 2023
type=AVC msg=audit(1676586955.688:151): avc:  denied  { create } for  pid=1 comm="systemd" name="#43" scontext=system_u:system_r:kernel_t:s0 tcontext=system_u:object_r:var_run_t:s0 tclass=chr_file permissive=0
----
time->Thu Feb 16 17:35:55 2023
type=AVC msg=audit(1676586955.940:157): avc:  denied  { create } for  pid=1171 comm="rm" name="#4a" scontext=system_u:system_r:kernel_t:s0 tcontext=system_u:object_r:var_run_t:s0 tclass=chr_file permissive=0
----

(there are lots more, those are just the first few).

If I edit the boot params and drop all the overlay-related ones, all services start successfully and the console prompt is normal.
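
For triaging a flood of denials like the one above, the records can be grouped by source type, target type, object class and permission. The following is only a sketch and not part of the original report: the sample text is the four records quoted above, and the names SAMPLE, AVC_RE, context_type and summarize are made up for illustration.

```python
import re
from collections import Counter

# The four AVC records quoted in this comment (timestamps lines omitted).
SAMPLE = """\
type=AVC msg=audit(1676586955.315:143): avc:  denied  { create } for  pid=1065 comm="useradd" name="liveuser" scontext=system_u:system_r:kernel_t:s0 tcontext=unconfined_u:object_r:user_home_dir_t:s0 tclass=dir permissive=0
type=AVC msg=audit(1676586955.652:150): avc:  denied  { create } for  pid=1 comm="systemd" name="#42" scontext=system_u:system_r:kernel_t:s0 tcontext=system_u:object_r:var_run_t:s0 tclass=chr_file permissive=0
type=AVC msg=audit(1676586955.688:151): avc:  denied  { create } for  pid=1 comm="systemd" name="#43" scontext=system_u:system_r:kernel_t:s0 tcontext=system_u:object_r:var_run_t:s0 tclass=chr_file permissive=0
type=AVC msg=audit(1676586955.940:157): avc:  denied  { create } for  pid=1171 comm="rm" name="#4a" scontext=system_u:system_r:kernel_t:s0 tcontext=system_u:object_r:var_run_t:s0 tclass=chr_file permissive=0
"""

# Matches the fields of a "denied" AVC record as they appear in the sample.
AVC_RE = re.compile(
    r'avc:\s+denied\s+\{ (?P<perms>[^}]*) \}.*?'
    r'comm="(?P<comm>[^"]*)".*?'
    r'scontext=(?P<scontext>\S+)\s+tcontext=(?P<tcontext>\S+)\s+tclass=(?P<tclass>\S+)'
)


def context_type(context):
    """Return the type field of a user:role:type:level SELinux context."""
    return context.split(":")[2]


def summarize(text):
    """Count denials per (source type, target type, class, permissions)."""
    counts = Counter()
    for line in text.splitlines():
        match = AVC_RE.search(line)
        if match is None:
            continue  # separators, time-> lines, non-AVC records
        key = (
            context_type(match["scontext"]),
            context_type(match["tcontext"]),
            match["tclass"],
            match["perms"].strip(),
        )
        counts[key] += 1
    return counts


if __name__ == "__main__":
    for key, count in summarize(SAMPLE).most_common():
        print(count, *key)
```

On the four quoted records, every denial has the source type kernel_t, including the ones raised by useradd and systemd.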

Comment 2 Adam Williamson 2023-02-16 17:51:23 UTC
These effects happen whether the image is attached as an emulated optical drive or emulated USB stick. Haven't tested on metal with a real USB stick yet.

Comment 3 Adam Williamson 2023-02-16 18:27:49 UTC
Same behaviour on a real system with a real USB stick (prompt on boot, failed services, bad console prompt).

Comment 4 Adam Williamson 2023-02-17 07:34:39 UTC
The change has been reverted upstream and downstream for now, but leaving the bug report open as that's obviously not the long-term fix.

Comment 5 Frederick Grose 2023-02-17 21:54:01 UTC
Created attachment 1944865
journalctl log from first boot

Comment 6 Frederick Grose 2023-02-17 21:55:27 UTC
Problems I've noticed so far:

These prevent the autooverlay boot:
1. dracut-live is missing the ../90overlayfs module -- where is that composed?
2. src/pylorax/creator.py is missing dmsquash-live-autooverlay in DRACUT_DEFAULT

With those corrections made via a patched livecd-creator --base-on --shell edit session,
I was able to boot a Fedora SoaS 38 LiveUSB dd'ed from the .iso with autooverlay by setting enforcing=0 in the kernel command line.

Even so,
1. dmraid-activation.service failed because /etc/init.d/functions is missing
2. polkit.service fails to start due to access denials
3. avahi-daemon.service fails to start due to access denials

See the attached journalctl log.

Comment 7 Frederick Grose 2023-02-17 22:15:27 UTC
This also should be part of the first list of items preventing autooverlay boot:

3. Irregular GPT partition table, see these lines:
[    4.530459] fedora kernel: scsi 8:0:0:0: Direct-Access     Generic  Flash Disk       8.07 PQ: 0 ANSI: 4
[    4.530903] fedora kernel: sd 8:0:0:0: Attached scsi generic sg3 type 0
[    4.531804] fedora kernel: sd 8:0:0:0: [sdc] 3934208 512-byte logical blocks: (2.01 GB/1.88 GiB)
[    4.532557] fedora kernel: sd 8:0:0:0: [sdc] Write Protect is off
[    4.532582] fedora kernel: sd 8:0:0:0: [sdc] Mode Sense: 23 00 00 00
[    4.533258] fedora kernel: sd 8:0:0:0: [sdc] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
[    4.551675] fedora kernel: GPT:Primary header thinks Alt. header is not at the end of the disk.
[    4.551690] fedora kernel: GPT:2468151 != 3934207
[    4.551697] fedora kernel: GPT:Alternate GPT header not at the end of the disk.
[    4.551705] fedora kernel: GPT:2468151 != 3934207
[    4.551712] fedora kernel: GPT: Use GNU Parted to correct GPT errors.

I had to add --fix to the parted --script parameters in 90dmsquash-live-autooverlay/create-overlay.sh
at line 74:
74    freeSpaceStart=$(parted --script --fix ${blockDevice} unit % print free \
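
The numbers in the GPT warning can be sanity-checked with a little arithmetic. A minimal sketch, assuming the first number in "GPT:2468151 != 3934207" is the backup-header LBA recorded in the primary header and the second is the last LBA of the device; the function name gpt_gap and the dictionary keys are illustrative only.

```python
import re

# Relevant lines from the kernel log quoted above (prefixes trimmed).
KERNEL_LOG = """\
sd 8:0:0:0: [sdc] 3934208 512-byte logical blocks: (2.01 GB/1.88 GiB)
GPT:Primary header thinks Alt. header is not at the end of the disk.
GPT:2468151 != 3934207
"""


def gpt_gap(log):
    """Compare the recorded backup-header LBA with the device's last LBA."""
    size = re.search(r"(\d+) (\d+)-byte logical blocks", log)
    mismatch = re.search(r"GPT:(\d+) != (\d+)", log)
    if not size or not mismatch:
        return None
    total_blocks, block_size = int(size[1]), int(size[2])
    recorded_lba, last_lba = int(mismatch[1]), int(mismatch[2])
    return {
        # The kernel's "last LBA" should be the block count minus one.
        "last_lba_matches_disk": last_lba == total_blocks - 1,
        # Bytes covered up to and including the recorded backup header.
        "image_bytes": (recorded_lba + 1) * block_size,
        # Bytes on the device after the recorded backup header.
        "unused_bytes": (last_lba - recorded_lba) * block_size,
    }


if __name__ == "__main__":
    print(gpt_gap(KERNEL_LOG))
```

For the log above, the last LBA matches the reported device size (3934208 blocks), and 750620672 bytes (about 716 MiB) of the stick lie beyond the backup-header position recorded in the dd'ed image.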

Comment 8 Frederick Grose 2023-02-20 23:41:34 UTC
Created attachment 1945372
live.debug log for create-overlay

Comment 9 Frederick Grose 2023-02-20 23:42:45 UTC
Created attachment 1945373
live.debug log for dmsquash-live-root

Comment 10 Frederick Grose 2023-02-20 23:43:58 UTC
Created attachment 1945374
journalctl log for auto-overlay boot

Comment 11 Frederick Grose 2023-02-21 00:11:48 UTC
https://github.com/dracutdevs/dracut/pull/2215
includes changes to allow autooverlay booting with
 * enforcing=0
 * an edited image to include the 90overlayfs dracut module in dracut-live,

   Where is dracut-live composed?

 * and adjusting the dracut configuration arguments to include dmsquash-live-autooverlay
   (https://github.com/weldr/lorax/pull/1308/files#r1112398922 shows the change needed.)

The attached journalctl log for Fedora-Workstation-Live-x86_64-38-20230215.n.0.iso
(updated with dnf upgrade) shows that SELinux configuration is
still lacking.

    Who can debug the SELinux denials?

Comment 12 Adam Williamson 2023-02-22 17:33:00 UTC
Dropping this from the freeze exception (FE) list as it's been reverted for now.

