Bug 2430281 - Many Rawhide installs fail since 20260113.n.0 with sudden, mysterious dnf transaction failure
Summary: Many Rawhide installs fail since 20260113.n.0 with sudden, mysterious dnf transaction failure
Keywords:
Status: CLOSED DUPLICATE of bug 2429501
Alias: None
Product: Fedora
Classification: Fedora
Component: grub2
Version: rawhide
Hardware: All
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: ---
Assignee: Nicolas Frayer
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard: openqa
Depends On:
Blocks: BetaBlocker, F44BetaBlocker
Reported: 2026-01-16 02:39 UTC by Adam Williamson
Modified: 2026-01-19 13:30 UTC

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2026-01-19 13:30:21 UTC
Type: Bug
Embargoed:


Attachments
dnf log from an affected install (706.93 KB, text/plain), 2026-01-16 02:42 UTC, Adam Williamson
packaging.log from an affected install (372.90 KB, text/plain), 2026-01-16 02:42 UTC, Adam Williamson
anaconda.log from an affected install (62.85 KB, text/plain), 2026-01-16 02:43 UTC, Adam Williamson
journal from an affected install (2.63 MB, text/plain), 2026-01-16 02:44 UTC, Adam Williamson

Description Adam Williamson 2026-01-16 02:39:08 UTC
Since Fedora-Rawhide-20260113.n.0, many (but not all, weirdly) openQA install tests are failing. The dnf install process seems to fail suddenly, with no obvious reason why:

DEBUG:anaconda.modules.payloads.payload.dnf.transaction_progress:Configuring - unknown, systemd-0:259-1.fc44.x86_64, %triggerin
DEBUG:anaconda.modules.payloads.payload.dnf.transaction_progress:Configuring - unknown, systemd-0:259-1.fc44.x86_64, %triggerin
DEBUG:anaconda.modules.payloads.payload.dnf.transaction_progress:Configuring - unknown, systemd-0:259-1.fc44.x86_64, %triggerin
DEBUG:anaconda.modules.payloads.payload.dnf.transaction_progress:Configuring - unknown, systemd-0:259-1.fc44.x86_64, %triggerin
DEBUG:anaconda.modules.payloads.payload.dnf.transaction_progress:Configuring - unknown, systemd-0:259-1.fc44.x86_64, %triggerin
DEBUG:anaconda.modules.payloads.payload.dnf.transaction_progress:Configuring - unknown, systemd-0:259-1.fc44.x86_64, %triggerin
DEBUG:anaconda.modules.payloads.payload.dnf.transaction_progress:Configuring - unknown, systemd-udev-0:259-1.fc44.x86_64, %triggerin
DEBUG:anaconda.modules.payloads.payload.dnf.transaction_progress:Configuring - unknown, systemd-udev-0:259-1.fc44.x86_64, %triggerin
DEBUG:anaconda.modules.payloads.payload.dnf.transaction_progress:Configuring - unknown, man-db-0:2.13.1-2.fc43.x86_64, %triggerin
DEBUG:anaconda.modules.payloads.payload.dnf.transaction_progress:Done - False
DEBUG:anaconda.modules.payloads.payload.dnf.dnf_manager:The transaction finished with 5 (Rpm transaction failed.)
DEBUG:anaconda.modules.payloads.payload.dnf.dnf_manager:The transaction has ended.
ERROR:anaconda.modules.payloads.payload.dnf.transaction_progress:The transaction process has ended with errors.
DEBUG:anaconda.modules.payloads.payload.dnf.dnf_manager:The transaction process exited with 0.

In dnf.log we just see:

2026-01-15T19:01:18+0000 [4368] INFO RPM callback start %triggerin scriptlet "systemd-0:259-1.fc44.x86_64"
2026-01-15T19:01:18+0000 [4368] INFO RPM callback stop %triggerin scriptlet "systemd-0:259-1.fc44.x86_64" return code 0
2026-01-15T19:01:18+0000 [4368] INFO RPM callback start %triggerin scriptlet "systemd-0:259-1.fc44.x86_64"
2026-01-15T19:01:18+0000 [4368] INFO RPM callback stop %triggerin scriptlet "systemd-0:259-1.fc44.x86_64" return code 0
2026-01-15T19:01:18+0000 [4368] INFO RPM callback start %triggerin scriptlet "systemd-0:259-1.fc44.x86_64"
2026-01-15T19:01:19+0000 [4368] INFO RPM callback stop %triggerin scriptlet "systemd-0:259-1.fc44.x86_64" return code 0
2026-01-15T19:01:19+0000 [4368] INFO RPM callback start %triggerin scriptlet "systemd-0:259-1.fc44.x86_64"
2026-01-15T19:01:19+0000 [4368] INFO RPM callback stop %triggerin scriptlet "systemd-0:259-1.fc44.x86_64" return code 0
2026-01-15T19:01:19+0000 [4368] INFO RPM callback start %triggerin scriptlet "systemd-0:259-1.fc44.x86_64"
2026-01-15T19:01:19+0000 [4368] INFO [scriptlet] Running in chroot, ignoring command 'daemon-reload'
2026-01-15T19:01:19+0000 [4368] INFO [scriptlet] Running in chroot, ignoring command 'reload-or-restart'
2026-01-15T19:01:19+0000 [4368] INFO RPM callback stop %triggerin scriptlet "systemd-0:259-1.fc44.x86_64" return code 0
2026-01-15T19:01:19+0000 [4368] INFO RPM callback start %triggerin scriptlet "systemd-0:259-1.fc44.x86_64"
2026-01-15T19:01:19+0000 [4368] INFO [scriptlet] Running in chroot, ignoring command 'reload'
2026-01-15T19:01:19+0000 [4368] INFO [scriptlet] Running in chroot, ignoring command 'list-units'
2026-01-15T19:01:19+0000 [4368] INFO RPM callback stop %triggerin scriptlet "systemd-0:259-1.fc44.x86_64" return code 0
2026-01-15T19:01:19+0000 [4368] INFO RPM callback start %triggerin scriptlet "systemd-udev-0:259-1.fc44.x86_64"
2026-01-15T19:01:19+0000 [4368] INFO RPM callback stop %triggerin scriptlet "systemd-udev-0:259-1.fc44.x86_64" return code 0
2026-01-15T19:01:19+0000 [4368] INFO RPM callback start %triggerin scriptlet "systemd-udev-0:259-1.fc44.x86_64"
2026-01-15T19:01:19+0000 [4368] INFO [scriptlet] Running in chroot, ignoring command 'set-property'
2026-01-15T19:01:19+0000 [4368] INFO RPM callback stop %triggerin scriptlet "systemd-udev-0:259-1.fc44.x86_64" return code 0
2026-01-15T19:01:19+0000 [4368] INFO RPM callback start %triggerin scriptlet "man-db-0:2.13.1-2.fc43.x86_64"
2026-01-15T19:01:19+0000 [4368] INFO RPM callback stop %triggerin scriptlet "man-db-0:2.13.1-2.fc43.x86_64" return code 0

i.e. there are no obvious errors there.
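
One way to double-check that claim against the full dnf.log is to scan it mechanically rather than by eye. The following is a minimal sketch in Python, assuming the log keeps the `RPM callback start/stop ... return code N` line shape shown in the excerpt above. The function name `find_suspect_scriptlets` is made up for illustration and is not part of dnf or anaconda.

```python
import re

# Matches lines such as:
#   ... INFO RPM callback stop %triggerin scriptlet "man-db-0:2.13.1-2.fc43.x86_64" return code 0
# The "return code" part is only present on "stop" lines.
CALLBACK = re.compile(
    r'RPM callback (?P<event>start|stop) (?P<kind>%\w+) scriptlet '
    r'"(?P<nevra>[^"]+)"(?: return code (?P<rc>-?\d+))?'
)


def find_suspect_scriptlets(lines):
    """Return (failed, unfinished) scriptlets found in dnf.log lines.

    failed:     list of (kind, nevra, rc) for stops with a non-zero return code.
    unfinished: list of (kind, nevra) for starts that never saw a matching stop.
    """
    failed = []
    open_starts = []
    for line in lines:
        m = CALLBACK.search(line)
        if not m:
            continue
        key = (m["kind"], m["nevra"])
        if m["event"] == "start":
            open_starts.append(key)
            continue
        # A stop closes the matching start, if one was recorded.
        if key in open_starts:
            open_starts.remove(key)
        if m["rc"] is not None and int(m["rc"]) != 0:
            failed.append((key[0], key[1], int(m["rc"])))
    return failed, open_starts
```

Run over the excerpt above, this should return two empty lists, which matches the observation that nothing in dnf.log itself flags a failing or interrupted scriptlet.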

This seems to affect almost all installs from the Server DVD or Server netinst (which defaults to the Server package set). It doesn't affect installs from the Everything netinst (which defaults to the minimal package set). Weirdly, it *does* affect installs of the minimal package set from the Server DVD. Unaffected tests are an odd bunch with no obvious common property. The one pattern that nearly fits is that ext4 tests seem to pass... except shrink_ext4 doesn't.

I'm filing against dnf5 as we seem to be clearly in dnf here, but dnf5 didn't change in the affected compose. Things potentially below it that did change are glibc (from glibc-2.42.9000-17.fc44 to glibc-2.42.9000-21.fc44) and util-linux (from util-linux-2.41.3-8.fc44 to util-linux-2.41.3-11.fc44). The full changelog is at https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org/message/APMVBOC7TOPPWHM76S6NYE4OONY6GICC/ .

This is an obvious release blocker, as it causes multiple blocking install paths to fail.

Comment 1 Adam Williamson 2026-01-16 02:41:10 UTC
Possibly the triggerin scriptlet for man-db is just the end of the transaction, and the failure happened earlier? But I can't see anything in the log that obviously indicates an earlier failure...

Comment 2 Adam Williamson 2026-01-16 02:42:01 UTC
Created attachment 2122319
dnf log from an affected install

Comment 3 Adam Williamson 2026-01-16 02:42:47 UTC
Created attachment 2122320
packaging.log from an affected install

Comment 4 Adam Williamson 2026-01-16 02:43:26 UTC
Created attachment 2122321
anaconda.log from an affected install

Comment 5 Adam Williamson 2026-01-16 02:44:40 UTC
Created attachment 2122322
journal from an affected install

Comment 6 Adam Williamson 2026-01-16 02:59:52 UTC
Huh. When I tried this locally, I didn't get the RPM transaction failure; instead I got a bootloader install failure, with this error from grub2-probe:

Jan 16 02:56:28 localhost.localdomain org.fedoraproject.Anaconda.Modules.Storage[2180]: INFO:program:Running in chroot '/mnt/sysroot'... grub2-mkconfig -o /boot/grub2/grub.cfg
Jan 16 02:56:29 localhost.localdomain org.fedoraproject.Anaconda.Modules.Storage[2180]: INFO:program:/usr/bin/grub2-editenv: warning: cannot probe fs for hd0,gpt2: ../grub-core/kern/fs.c:grub_fs_probe:123:unknown filesystem.
Jan 16 02:56:29 localhost.localdomain org.fedoraproject.Anaconda.Modules.Storage[2180]: INFO:program:Generating grub configuration file ...
Jan 16 02:56:29 localhost.localdomain org.fedoraproject.Anaconda.Modules.Storage[2180]: INFO:program:/usr/bin/grub2-probe: error: ../grub-core/kern/fs.c:grub_fs_probe:123:unknown filesystem.
Jan 16 02:56:29 localhost.localdomain org.fedoraproject.Anaconda.Modules.Storage[2180]: DEBUG:program:Return code of grub2-mkconfig: 1
Jan 16 02:56:29 localhost.localdomain org.fedoraproject.Anaconda.Modules.Storage[2180]: ERROR:anaconda.modules.storage.bootloader.installation:Bootloader installation has failed: failed to write boot loader configuration

I do note that in the DNF logs from the openQA failures we see similar errors in the kernel scripts, though the script exits 0 (presumably because it's written to always exit 0 even on failure, as it should be):

2026-01-15T18:59:57+0000 [4368] INFO RPM callback start %posttrans scriptlet "kernel-core-0:6.18.0-65.fc44.x86_64"
2026-01-15T18:59:57+0000 [4368] INFO [scriptlet] grub2-probe: error: ../grub-core/kern/fs.c:grub_fs_probe:123:unknown filesystem.
2026-01-15T18:59:57+0000 [4368] INFO [scriptlet] grub2-editenv: warning: cannot probe fs for hostdisk//dev/vda,msdos1: ../grub-core/kern/fs.c:grub_fs_probe:123:unknown filesystem.
2026-01-15T19:00:11+0000 [4368] INFO [scriptlet] grep: /var/tmp/dracut.dLeRqmJ/initramfs/etc/shadow: No such file or directory
2026-01-15T19:01:04+0000 [4368] INFO [scriptlet] grub2-editenv: warning: cannot probe fs for hostdisk//dev/vda,msdos1: ../grub-core/kern/fs.c:grub_fs_probe:123:unknown filesystem.
2026-01-15T19:01:06+0000 [4368] INFO [scriptlet] grub2-editenv: warning: cannot probe fs for hostdisk//dev/vda,msdos1: ../grub-core/kern/fs.c:grub_fs_probe:123:unknown filesystem.
2026-01-15T19:01:06+0000 [4368] INFO [scriptlet] grub2-editenv: warning: cannot probe fs for hostdisk//dev/vda,msdos1: ../grub-core/kern/fs.c:grub_fs_probe:123:unknown filesystem.
2026-01-15T19:01:06+0000 [4368] INFO RPM callback stop %posttrans scriptlet "kernel-core-0:6.18.0-65.fc44.x86_64" return code 0
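
Because the scriptlet exits 0, these errors are easy to miss when only return codes are checked. Below is a rough Python sketch for surfacing them, assuming scriptlet output appears as `INFO [scriptlet] ...` lines between the matching start and stop callbacks, as in the excerpt above. The name `masked_errors` is hypothetical, and matching on the literal string `error:` is a simplification chosen for this example.

```python
import re

START = re.compile(r'RPM callback start (%\w+) scriptlet "([^"]+)"')
STOP = re.compile(r'RPM callback stop (%\w+) scriptlet "([^"]+)" return code (-?\d+)')
OUTPUT = re.compile(r'INFO \[scriptlet\] (.*)$')


def masked_errors(lines):
    """Yield (kind, nevra, messages) for scriptlets that exited 0 but printed 'error:'."""
    current = None  # (kind, nevra) of the scriptlet currently running, if any
    messages = []   # 'error:' lines collected for the current scriptlet
    for line in lines:
        line = line.rstrip("\n")
        if (m := START.search(line)):
            current, messages = (m[1], m[2]), []
        elif (m := STOP.search(line)):
            if current == (m[1], m[2]) and int(m[3]) == 0 and messages:
                yield current[0], current[1], messages
            current, messages = None, []
        elif current and (m := OUTPUT.search(line)) and "error:" in m[1]:
            messages.append(m[1])
```

Applied to the lines above, this should report the kernel-core %posttrans scriptlet together with its grub2-probe "unknown filesystem" message. The grub2-editenv lines are warnings and are deliberately not collected.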

Comment 7 Petr Pisar 2026-01-16 11:55:09 UTC
Please provide a list of the packages Anaconda tries to install, the exact identifier of a compose which exhibits this behavior, and the identifier of a compose which does not.

Comment 8 Adam Williamson 2026-01-16 16:14:32 UTC
The bug already has all that information. The summary says "since 20260113.n.0", meaning Fedora-Rawhide-20260113.n.0 is the first affected compose, thus Fedora-Rawhide-20260112.n.0 was not affected. Both the dnf and packaging log files attached include the list of packages installed.

Comment 9 Adam Williamson 2026-01-16 21:29:19 UTC
Aha. I think this is xfsprogs.

A reproducer is to run `chroot /mnt/sysroot grub2-probe --target=fs /boot/grub2` after the install has completed or failed. On broken images, this gives "grub2-probe: error: ../grub-core/kern/fs.c:grub_fs_probe:123:unknown filesystem." On working images, it gives "xfs" (when /boot is XFS, obviously). I suspect affected tests are ones where /boot is XFS (versus ones where it's ext4). And xfsprogs changed in the affected compose:

Package:      xfsprogs-6.18.0-1.fc44
Old package:  xfsprogs-6.17.0-1.fc44
Summary:      Utilities for managing the XFS filesystem
RPMs:         xfsprogs xfsprogs-devel xfsprogs-xfs_extras xfsprogs-xfs_scrub
Size:         7.61 MiB
Size change:  56.90 KiB
Changelog:
  * Mon Jan 12 2026 Pavel Reichl <preichl> - 6.18.0-1
  - Rebase to v6.18
  - Related: rhbz#2426297

Comment 10 Adam Williamson 2026-01-16 21:30:13 UTC
I'm not *totally* sure why we sometimes get an RPM transaction failure and sometimes a bootloader install failure, but, well, it seems clear we need to at least fix this problem first, then if we still have RPM transaction issues we can deal with that.

Comment 11 Adam Williamson 2026-01-16 21:57:15 UTC
Yup, confirmed. The problem is: grub2-probe can't handle XFS partitions created with xfsprogs 6.18.0 (with default options). Reproducer:

1. Create a test partition, say /dev/vda4
2. mkfs.xfs /dev/vda4
3. mkdir -p /mnt/xfs
4. mount /dev/vda4 /mnt/xfs
5. grub2-probe --target=fs /mnt/xfs/

If xfsprogs 6.17.0 is installed when mkfs.xfs happens, this works. If xfsprogs 6.18.0 is installed, it fails, with the "unknown filesystem" error.
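
For checking several images or mount points in a row, step 5 can be wrapped in a small script. This is only a sketch under stated assumptions: it assumes grub2-probe writes the "unknown filesystem" message to stderr and exits non-zero on failure, and the helper names `classify_probe` and `probe_fs` and the result strings are invented here for illustration. It needs to run as root on a system with the grub2 tools installed.

```python
import subprocess


def classify_probe(returncode, stdout, stderr):
    """Map a grub2-probe result to 'ok:<fs>', 'unknown-filesystem', or 'other-failure'."""
    if returncode == 0:
        return "ok:" + stdout.strip()
    if "unknown filesystem" in stderr:
        # The symptom described in this bug.
        return "unknown-filesystem"
    return "other-failure"


def probe_fs(path, probe="grub2-probe"):
    """Run `grub2-probe --target=fs PATH` and classify the outcome."""
    result = subprocess.run(
        [probe, "--target=fs", path],
        capture_output=True,
        text=True,
        check=False,  # a non-zero exit is an expected outcome here, not an exception
    )
    return classify_probe(result.returncode, result.stdout, result.stderr)
```

Given the behavior described above, `probe_fs("/mnt/xfs/")` would be expected to return `unknown-filesystem` for a filesystem created with xfsprogs 6.18.0 and `ok:xfs` for one created with 6.17.0.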

ISTR we've seen this before when xfs changed default options. In the 6.18.0 change list, https://git.kernel.org/pub/scm/fs/xfs/xfsprogs-dev.git/commit/?id=54aad16b4b9b923442b4042afaba4438ca1aa868 looks like the obvious suspect.

It looks like https://cgit.git.savannah.gnu.org/cgit/grub.git/commit/grub-core/fs/xfs.c?id=1ed2628b560cedac7fd1a696985ab85b24541a8e was the last go-round of this. CCing grub folks...

Comment 12 Adam Williamson 2026-01-16 22:04:28 UTC
Ah, actually - I think https://cgit.git.savannah.gnu.org/cgit/grub.git/commit/grub-core/fs/xfs.c?id=1ed2628b560cedac7fd1a696985ab85b24541a8e is the *fix* for this, and we don't have it downstream yet. We just need to add it to our backports.

Comment 13 Petr Pisar 2026-01-19 08:43:17 UTC
Yes, the xfsprogs change correlates with the indicated compose (20260113.n.0):

Tue Jan 13 03:26:25 2026 xfsprogs-6.18.0-1.fc44 untagged from f44-updates-testing by bodhi
Tue Jan 13 03:26:25 2026 xfsprogs-6.18.0-1.fc44 untagged from f44-updates-candidate by bodhi
Tue Jan 13 03:27:01 2026 xfsprogs-6.18.0-1.fc44 tagged into f44 by bodhi [still active]

Comment 14 Marta Lewandowska 2026-01-19 13:30:21 UTC

*** This bug has been marked as a duplicate of bug 2429501 ***

