Since util-linux-2.39-0.1.fc39 was built, almost all openQA Rawhide tests are failing because /var appears to be mounted read-only after installing that update and rebooting. See: https://openqa.fedoraproject.org/tests/overview?distri=fedora&groupid=2&version=39&build=Update-FEDORA-2023-261bb867d0 Most tests fail trying to run dnf, because /var/log is not writeable and dnf wants to write there. KDE tests fail because SDDM can't write to /var/lib/sddm. GNOME tests just get stuck at the boot screen, presumably for a similar reason, and so on.
So, I've reproduced this locally. The reproducer is simply to install a recent Rawhide nightly (like https://kojipkgs.fedoraproject.org/compose/rawhide/Fedora-Rawhide-20230321.n.0/compose/Everything/x86_64/iso/Fedora-Everything-netinst-x86_64-Rawhide-20230321.n.0.iso), download the builds from the util-linux task, update them all, and reboot. There isn't actually a /var partition (in the openQA test or my local reproducer), of course. It's / that's mounted ro. The output of `mount` shows it as such:
    /dev/vda3 on / type btrfs (ro,relatime,seclabel,compress=zstd:1,discard=async,space_cache=v2,subvolid=257,subvol=/root)
The entry in /etc/fstab is just:
    UUID=(someuuid) / btrfs subvol=root,compress=zstd:1 0 0
`findmnt --target /` output is very similar to `mount` output:
    / /dev/vda3[/root] btrfs ro,relatime,seclabel,compress=zstd:1,discard=async,space_cache=v2,subvolid=257,subvol=/root
Manually remounting as rw works fine, with `mount -o remount,rw /`. After that it shows as mounted rw and I can touch files in /var/log. Other filesystems, like /home, are correctly mounted rw. But of course the mechanism there is very different: / is the only filesystem subject to the 'switch root' operation, where it's initially mounted as /sysroot in the dracut environment and then gets moved to / in the final booted system.
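For anyone wanting to script the check on a test machine, here is a minimal sketch. The helper name `mount_mode` is made up for illustration; it only inspects a comma-separated option string like the ones printed by `mount` and `findmnt`, and treats anything without a standalone `ro` as read-write.

```shell
#!/bin/sh
# Hypothetical helper: print "ro" or "rw" for a comma-separated mount
# option string. Only a standalone "ro" counts, so an option such as
# "errors=remount-ro" is not mistaken for a read-only mount.
mount_mode() {
    case ",$1," in
        *,ro,*) echo ro ;;
        *)      echo rw ;;
    esac
}

# Usage on a live system (findmnt ships with util-linux):
#   mount_mode "$(findmnt -no OPTIONS /)"
```

On an affected install this would be expected to print `ro` for / until the manual `mount -o remount,rw /` is done.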
This is an automatic F39 Beta blocker per "Bugs which entirely prevent the composition of one or more of the release-blocking images required to be built for a currently-pending (pre-)release" (and probably some of the other criteria too).
I think I see where the problem is. systemd uses "mount -o remount" to remount the root filesystem read-write. It does not specify "rw" because that flag is expected to come from fstab (or to be the default). Unfortunately, the code in libmount does not explicitly add "rw" to the mount options if "ro" is not specified. This is not a problem for the classic mount(2) syscall because it does not differentiate between the superblock and the VFS node, and a remount without "ro" is always interpreted as read-write in both layers. The new syscalls require two steps to set all layers read-write: fsconfig() to reconfigure the superblock, and mount_setattr() to modify the VFS node (mountpoint) flags. Unfortunately, mount_setattr() is not triggered because there is no flag to apply (rw is missing). You need "findmnt -no TARGET,FS-OPTIONS,VFS-OPTIONS" to see both sets of flags.
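The two layers can also be read straight from the kernel, without findmnt. A rough sketch, assuming the documented /proc/self/mountinfo layout (per-mount options in field 6, superblock options as the last field after the " - " separator). The function name `layer_modes` is hypothetical, and it only reports the ro/rw state of each layer, nothing else.

```shell
#!/bin/sh
# Hypothetical helper: for every mount, print the mountpoint followed by
# the read-only state of the VFS node (per-mount options) and of the
# superblock. Reads /proc/self/mountinfo unless another file is given.
layer_modes() {
    awk '
    function mode(opts) { return (("," opts ",") ~ /,ro,/) ? "ro" : "rw" }
    {
        # Optional fields start at field 7 and end at a lone "-".
        # After it come: filesystem type, source, superblock options.
        for (i = 7; i <= NF; i++) if ($i == "-") break
        print $5, "vfs=" mode($6), "sb=" mode($(i + 3))
    }' "${1:-/proc/self/mountinfo}"
}

# Usage: layer_modes | grep '^/ '
```

If the analysis above is right, an affected system would presumably show a read-write superblock and a read-only VFS node for /, since only the fsconfig() step ran.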
Perhaps systemd doesn't specify it because it won't *always* be rw? You *can* run with an ro root partition, and indeed our ostree-based flavors do that (though I don't know if they actually use the same switch root mechanism in the same way).
For me this results in the boot getting to the Fedora logo and waiting forever. The login screen never appears on a graphical boot. Booting at runlevel 3, logging in and running startx works. The remount works whether it is done before or after startx.
darrell: in case you didn't know, we untagged this build from rawhide as this bug is pretty bad. if you somehow got it, it's fine to just downgrade to util-linux-2.38.1-4.fc38 until Karel can fix this.
In my case I've worked around it by adding 'rw,' before the defaults entry in the options field for the / file system in /etc/fstab.
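A sketch of that workaround as a filter, for anyone who wants to apply it to several test images. The function name `fstab_add_rw` is made up; it prints the modified fstab to stdout rather than editing in place, and it leaves an explicit `ro` or `rw` alone, since a read-only root can be intentional.

```shell
#!/bin/sh
# Hypothetical helper: print the given fstab with "rw," prepended to the
# options field of the / entry, unless ro or rw is already set there.
# Note: the changed line is re-joined with single spaces by awk.
fstab_add_rw() {
    awk '
    # Pass comments and blank lines through untouched.
    /^[ \t]*#/ || NF == 0 { print; next }
    $2 == "/" && ("," $4 ",") !~ /,(rw|ro),/ { $4 = "rw," $4 }
    { print }
    ' "$1"
}

# Usage: fstab_add_rw /etc/fstab > /tmp/fstab.new   # review before copying back
```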
for the record I have untagged the build from "eln" to unblock them as well
This still seems to be a problem with the new util-linux-2.39-0.3.fc39, at least to some extent. Oddly, the KDE and Workstation tests passed this time, but Server and podman tests still fail with read-only filesystem issues: https://openqa.fedoraproject.org/tests/overview?build=Update-FEDORA-2023-c437cf6929&version=39&distri=fedora&groupid=2 The difference may be filesystem related. The failing tests use a Server base image, which probably has an xfs root filesystem. The passing tests probably have btrfs root filesystems.
Any suggestions on how to debug it locally? It would be nice to have output from "LIBMOUNT_DEBUG=all" ...
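One way to collect that output, as a minimal sketch: the wrapper name `with_libmount_debug` and the log path are made up. It assumes libmount writes its debug messages to stderr and that the command is run as root, since the variable may be ignored when mount runs setuid.

```shell
#!/bin/sh
# Hypothetical wrapper: run a command with LIBMOUNT_DEBUG=all set and
# save its stderr to the given log file.
with_libmount_debug() {
    log=$1
    shift
    LIBMOUNT_DEBUG=all "$@" 2> "$log"
}

# Usage (as root), mirroring the remount without an explicit "rw":
#   with_libmount_debug /tmp/remount-debug.log mount -o remount /
```

The log should go somewhere that is still writable on the broken system. /tmp is usually a separate tmpfs on Fedora, but that is an assumption worth checking first.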
I'm able to reproduce this in a virtual machine.
util-linux-2.39-0.4.fc39 with bugfix (I hope) pushed.
Time to close this one, right?
Yeah, I haven't seen any problems with the new version. Thanks!