Created attachment 2039189 [details]
Serial console output booting successfully with 6.8.10-200.fc39.x86_64

1. Please describe the problem:

After a kernel upgrade, my system no longer boots:

```
[  125.404282] XFS (md127): Ending clean mount
[ ***  ] Job sysroot.mount/start running (2min / 2min 17s)
[  125.670498] XFS (md127): Unmounting Filesystem 1780b7fe-4500-4d0a-a3b3-fc3849983039
[FAILED] Failed to mount sysroot.mount - /sysroot.
See 'systemctl status sysroot.mount' for details.
...
/root# systemctl status sysroot.mount | cat
× sysroot.mount - /sysroot
     Loaded: loaded (/proc/cmdline; generated)
     Active: failed (Result: timeout) since Sun 2024-07-07 10:32:18 UTC; 44s ago
      Where: /sysroot
       What: /dev/disk/by-uuid/1780b7fe-4500-4d0a-a3b3-fc3849983039
       Docs: man:fstab(5)
             man:systemd-fstab-generator(8)
        CPU: 1.448s

Jul 07 10:30:19 desiato.infradead.org systemd[1]: Mounting sysroot.mount - /sysroot...
Jul 07 10:31:04 desiato.infradead.org systemd[1]: sysroot.mount: Mounting timed out. Terminating.
Jul 07 10:31:49 desiato.infradead.org systemd[1]: sysroot.mount: Mount process timed out. Killing.
Jul 07 10:31:49 desiato.infradead.org systemd[1]: sysroot.mount: Killing process 744 (mount) with signal SIGKILL.
Jul 07 10:32:18 desiato.infradead.org systemd[1]: sysroot.mount: Mount process exited, code=killed, status=15/TERM
Jul 07 10:32:18 desiato.infradead.org systemd[1]: sysroot.mount: Failed with result 'timeout'.
Jul 07 10:32:18 desiato.infradead.org systemd[1]: Failed to mount sysroot.mount - /sysroot.
Jul 07 10:32:18 desiato.infradead.org systemd[1]: sysroot.mount: Consumed 1.448s CPU time.
```

2. What is the Version-Release number of the kernel:

6.9.7-100.fc39.x86_64

3. Did it work previously in Fedora? If so, what kernel version did the issue *first* appear? Old kernels are available for download at https://koji.fedoraproject.org/koji/packageinfo?packageID=8 :

Yes. 6.9.4-100.fc39.x86_64 was the first to fail; 6.8.10-200.fc39 is the latest I have working.

4. Can you reproduce this issue? If so, please provide the steps to reproduce the issue below:

Boot any 6.9 kernel.

5. Does this problem occur with the latest Rawhide kernel? To install the Rawhide kernel, run ``sudo dnf install fedora-repos-rawhide`` followed by ``sudo dnf update --enablerepo=rawhide kernel``:

6. Are you running any modules that are not shipped directly with Fedora's kernel?:

No.

7. Please attach the kernel logs. You can get the complete kernel log for a boot with ``journalctl --no-hostname -k > dmesg.txt``. If the issue occurred on a previous boot, use the journalctl ``-b`` flag.
Created attachment 2039190 [details]
Serial console output for failed boot with 6.9.7-100.fc39.x86_64 (including SysRq-T)

```
[   47.560567] task:mount           state:D stack:0     pid:744   tgid:744   ppid:1      flags:0x00004002
[   47.560567] Call Trace:
[   47.592564]  <TASK>
[   47.592564]  __schedule+0x410/0x1710
[   47.592564]  schedule+0x27/0xf0
[   47.592564]  schedule_timeout+0x15d/0x170
[   47.592564]  __down_common+0x119/0x220
[   47.592564]  down+0x47/0x60
[   47.592564]  xfs_buf_lock+0x31/0xe0 [xfs]
[   47.592564]  xfs_buf_find_lock+0x5f/0x190 [xfs]
[   47.592564]  xfs_buf_get_map+0x1e5/0xb30 [xfs]
[   47.624590]  ? blk_finish_plug+0x28/0x40
[   47.624590]  xfs_buf_read_map+0x62/0x310 [xfs]
[   47.624590]  ? xfs_btree_read_buf_block+0xa7/0x120 [xfs]
[   47.637593]  xfs_trans_read_buf_map+0x12e/0x310 [xfs]
[   47.637593]  ? xfs_btree_read_buf_block+0xa7/0x120 [xfs]
[   47.637593]  ? __xfs_buf_submit+0xa7/0x1e0 [xfs]
[   47.637593]  xfs_btree_read_buf_block+0xa7/0x120 [xfs]
[   47.637593]  xfs_btree_lookup_get_block+0xa6/0x1f0 [xfs]
[   47.637593]  ? __pfx_xfs_btree_count_blocks_helper+0x10/0x10 [xfs]
[   47.670917]  xfs_btree_visit_block+0x101/0x170 [xfs]
[   47.670917]  ? __pfx_xfs_btree_count_blocks_helper+0x10/0x10 [xfs]
[   47.670917]  xfs_btree_visit_blocks+0xd4/0x150 [xfs]
[   47.670917]  xfs_finobt_calc_reserves+0xbe/0x170 [xfs]
[   47.670917]  ? xfs_refcountbt_calc_reserves+0xf6/0x120 [xfs]
[   47.670917]  xfs_ag_resv_init+0xb7/0x260 [xfs]
[   47.702570]  xfs_fs_reserve_ag_blocks+0x3d/0xf0 [xfs]
[   47.702570]  xfs_mountfs+0x63e/0xa50 [xfs]
[   47.702570]  ? __pfx_xfs_fstrm_free_func+0x10/0x10 [xfs]
[   47.702570]  xfs_fs_fill_super+0x6ae/0x970 [xfs]
[   47.702570]  ? __pfx_xfs_fs_fill_super+0x10/0x10 [xfs]
[   47.702570]  get_tree_bdev+0x123/0x1c0
[   47.702570]  vfs_get_tree+0x26/0xf0
[   47.734776]  vfs_cmd_create+0x65/0xf0
[   47.734776]  ? __do_sys_fsconfig+0x24d/0x620
[   47.734776]  __do_sys_fsconfig+0x43c/0x620
[   47.734776]  do_syscall_64+0x82/0x170
[   47.734776]  ? __do_sys_prctl+0x43/0xa10
[   47.734776]  ? syscall_exit_to_user_mode+0x75/0x230
[   47.734776]  ? do_syscall_64+0x8f/0x170
[   47.734776]  ? syscall_exit_to_user_mode+0x75/0x230
[   47.768591]  ? do_syscall_64+0x8f/0x170
[   47.768591]  ? do_user_addr_fault+0x30f/0x6c0
[   47.768591]  ? exc_page_fault+0x7e/0x180
[   47.781597]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[   47.781597] RIP: 0033:0x7f1410dbbaae
[   47.781597] RSP: 002b:00007ffe895f8258 EFLAGS: 00000246 ORIG_RAX: 00000000000001af
[   47.781597] RAX: ffffffffffffffda RBX: 0000563a84497e30 RCX: 00007f1410dbbaae
[   47.804594] RDX: 0000000000000000 RSI: 0000000000000006 RDI: 0000000000000003
[   47.804594] RBP: 00007ffe895f83a0 R08: 0000000000000000 R09: 00007f1410e7db20
[   47.804594] R10: 0000000000000000 R11: 0000000000000246 R12: 00007f1410f02b60
[   47.804594] R13: 0000000000000000 R14: 0000563a8449f500 R15: 0000000000000065
[   47.804594]  </TASK>
```
```
[root@desiato ~]# lsblk
NAME        MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINTS
sda           8:0    0   7.3T  0 disk
├─sda1        8:1    0   512M  0 part
│ └─md126     9:126  0 511.9M  0 raid1 /boot/efi
├─sda2        8:2    0     2G  0 part  [SWAP]
└─sda3        8:3    0   7.3T  0 part
  └─md127     9:127  0  29.1T  0 raid5 /
sdb           8:16   0   7.3T  0 disk
├─sdb1        8:17   0   512M  0 part
│ └─md126     9:126  0 511.9M  0 raid1 /boot/efi
├─sdb2        8:18   0     2G  0 part  [SWAP]
└─sdb3        8:19   0   7.3T  0 part
  └─md127     9:127  0  29.1T  0 raid5 /
sdc           8:32   0   7.3T  0 disk
├─sdc1        8:33   0   512M  0 part
│ └─md126     9:126  0 511.9M  0 raid1 /boot/efi
├─sdc2        8:34   0     2G  0 part  [SWAP]
└─sdc3        8:35   0   7.3T  0 part
  └─md127     9:127  0  29.1T  0 raid5 /
sdd           8:48   0   7.3T  0 disk
├─sdd1        8:49   0   512M  0 part
│ └─md126     9:126  0 511.9M  0 raid1 /boot/efi
├─sdd2        8:50   0     2G  0 part  [SWAP]
└─sdd3        8:51   0   7.3T  0 part
  └─md127     9:127  0  29.1T  0 raid5 /
sde           8:64   0   7.3T  0 disk
├─sde1        8:65   0   512M  0 part
│ └─md126     9:126  0 511.9M  0 raid1 /boot/efi
├─sde2        8:66   0     2G  0 part  [SWAP]
└─sde3        8:67   0   7.3T  0 part
  └─md127     9:127  0  29.1T  0 raid5 /
[root@desiato ~]# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4] [raid1]
md126 : active raid1 sda1[0] sde1[7] sdd1[8] sdc1[6] sdb1[5]
      524224 blocks super 1.0 [5/5] [UUUUU]

md127 : active raid5 sda3[7] sdc3[8] sde3[5] sdd3[9] sdb3[6]
      31245086720 blocks super 1.2 level 5, 512k chunk, algorithm 2 [5/5] [UUUUU]
      bitmap: 7/59 pages [28KB], 65536KB chunk

unused devices: <none>
```
Created attachment 2039191 [details]
Serial console output for failed boot with rawhide 6.10.0-0.rc6.20240703gite9d22f7a6655.53.fc41 kernel

5. Does this problem occur with the latest Rawhide kernel?

Yes.
I have just bisected a similar (same?) problem; here are my findings:

With a filesystem that contains a very large number of hardlinks, the time to mount the filesystem skyrockets to around 15 minutes on 6.9.11-200.fc40.x86_64, compared to around 1 second on 6.8.10-300.fc40.x86_64. This of course makes booting drop into emergency mode if the filesystem is in /etc/fstab. A git bisect nails the offending commit as 14dd46cf31f4aaffcf26b00de9af39d01ec8d547.

The filesystem is a collection of daily snapshots of a live filesystem collected over a number of years, organized as a store of unique files that are reflinked to inodes carrying the actual {owner, group, permission, mtime}, and these inodes are hardlinked into the daily snapshot trees. The numbers for the filesystem are:

Total file size: 3.6e+12 bytes
Unique files:    12.4e+06
Reflink inodes:  18.6e+06
Hardlinks:       15.7e+09

Timings on the two kernels:

6.8.10-300.fc40.x86_64:
```
# time mount /dev/vg1/test /test
real    0m0.835s
user    0m0.002s
sys     0m0.014s
```

6.9.11-200.fc40.x86_64:
```
# time mount /dev/vg1/test /test
real    15m36.508s
user    0m0.000s
sys     0m27.628s
```
(iotop reports 1-4 MB/s, mostly in two threads)
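The gap between ~1 second and ~15 minutes is wide enough that the good/bad call at each bisect step can be automated. A minimal sketch: `classify` is a hypothetical helper (my own naming, not part of any tool), and the 10-second cut-off is an assumption chosen to sit well between the two measured times; the real verdicts came from `time mount /dev/vg1/test /test` on each bisected kernel.

```shell
# Hypothetical helper for automating the bisect verdict: classify a kernel
# as good or bad from the wall-clock mount time in seconds. The 10s
# threshold is an assumption; the measured runs were ~1s vs ~15m36s.
classify() {
    if [ "$1" -gt 10 ]; then
        echo bad
    else
        echo good
    fi
}

# After timing the mount on the kernel under test, one would run
# `git bisect "$(classify "$seconds")"` with the measured seconds:
classify 1      # ~1s mount on 6.8.10   -> prints "good"
classify 936    # ~15m36s on 6.9.11     -> prints "bad"
```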
Probably this patch fixes the problem:
https://lore.kernel.org/lkml/ZrslIPV6%2Fqk6cLVy@dread.disaster.area/

xfs: xfs_finobt_count_blocks() walks the wrong btree

From: Dave Chinner <dchinner>

As a result of the factoring in commit 14dd46cf31f4 ("xfs: split xfs_inobt_init_cursor"), mount started taking a long time on a user's filesystem. For Anders, this made mount times regress from under a second to over 15 minutes for a filesystem with only 30 million inodes in it.

Anders bisected it down to the above commit, but even then the bug was not obvious. In this commit, over 20 calls to xfs_inobt_init_cursor() were modified, and some were modified to call a new function named xfs_finobt_init_cursor(). If it took you a moment of rereading those function names to see what the rename was, then you have realised why this bug wasn't spotted during review. And it wasn't spotted on inspection even after the bisect pointed at this commit - a single missing "f" isn't the easiest thing for a human eye to notice....

The result is that xfs_finobt_count_blocks() now incorrectly calls xfs_inobt_init_cursor(), so it walks the inobt instead of the finobt. Hence when there are lots of allocated inodes in a filesystem, mount takes a -long- time to run because it now walks the massive allocated inode btrees instead of the small, nearly empty free inode btrees. It also means all the finobt space reservations are wrong, so mount could potentially give ENOSPC on a kernel upgrade.

In hindsight, commit 14dd46cf31f4 should have been two commits - the first to convert the finobt callers to the new API, the second to modify the xfs_inobt_init_cursor() API for the inobt callers. That would have made the bug very obvious during review.
Fixes: 14dd46cf31f4 ("xfs: split xfs_inobt_init_cursor")
Reported-by: Anders Blomdell <anders.blomdell>
Signed-off-by: Dave Chinner <dchinner>
---
 fs/xfs/libxfs/xfs_ialloc_btree.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

```
diff --git a/fs/xfs/libxfs/xfs_ialloc_btree.c b/fs/xfs/libxfs/xfs_ialloc_btree.c
index 496e2f72a85b..797d5b5f7b72 100644
--- a/fs/xfs/libxfs/xfs_ialloc_btree.c
+++ b/fs/xfs/libxfs/xfs_ialloc_btree.c
@@ -749,7 +749,7 @@ xfs_finobt_count_blocks(
 	if (error)
 		return error;
 
-	cur = xfs_inobt_init_cursor(pag, tp, agbp);
+	cur = xfs_finobt_init_cursor(pag, tp, agbp);
 	error = xfs_btree_count_blocks(cur, tree_blocks);
 	xfs_btree_del_cursor(cur, error);
 	xfs_trans_brelse(tp, agbp);
```
Thanks. This still isn't in the latest -stable kernels.
Notified -stable + GKH on 2024-09-02, so it will hopefully arrive in the stable tree (6.10) soon:
https://lore.kernel.org/stable/348d0bbc-050e-4d3d-9956-2dc8888188f1@gmail.com/T/#t
FEDORA-2024-c4b4460831 (kernel-6.10.8-200.fc40) has been submitted as an update to Fedora 40. https://bodhi.fedoraproject.org/updates/FEDORA-2024-c4b4460831
FEDORA-2024-c2e066226b (kernel-6.10.8-100.fc39) has been submitted as an update to Fedora 39. https://bodhi.fedoraproject.org/updates/FEDORA-2024-c2e066226b
Patch still not applied in 6.10.8, so bug is still present.
> Patch still not applied in 6.10.8, so bug is still present.

It was applied to the Fedora build of 6.10.8:
https://koji.fedoraproject.org/koji/buildinfo?buildID=2541219 (F40)

See its changelog:
```
* Wed Sep 04 2024 Augusto Caringi <acaringi> [6.10.8-0]
- Add to BugsFixed (Augusto Caringi)
- xfs: xfs_finobt_count_blocks() walks the wrong btree (Dave Chinner)
- Linux v6.10.8
...
```
-> it is "xfs: xfs_finobt_count_blocks() walks the wrong btree (Dave Chinner)"

It is also in the F39 build:
https://koji.fedoraproject.org/koji/buildinfo?buildID=2541217
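For anyone wanting to verify their own installed kernel the same way, the changelog check can be scripted. A small sketch under assumptions: `has_fix` is a hypothetical helper of my own naming, and the usage comment assumes a Fedora system where the changelog comes from `rpm -q --changelog` (the package may be `kernel` or `kernel-core` depending on the install).

```shell
# has_fix: report whether a saved changelog file mentions the backported
# patch, matched by its subject line (taken from the changelog above).
# Hypothetical helper, not part of any Fedora tooling.
has_fix() {
    if grep -qF 'xfs_finobt_count_blocks() walks the wrong btree' "$1"; then
        echo "fix present"
    else
        echo "fix missing"
    fi
}

# On Fedora, one would capture the changelog first and then check it
# (package name `kernel` is an assumption; it may be `kernel-core`):
#   rpm -q --changelog kernel > /tmp/kernel-changelog.txt
#   has_fix /tmp/kernel-changelog.txt
```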
Thanks! https://koji.fedoraproject.org/koji/buildinfo?buildID=2541219 works!
FEDORA-2024-c4b4460831 has been pushed to the Fedora 40 testing repository.
Soon you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2024-c4b4460831`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2024-c4b4460831

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.
FEDORA-2024-c2e066226b has been pushed to the Fedora 39 testing repository.
Soon you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2024-c2e066226b`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2024-c2e066226b

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.
FEDORA-2024-c4b4460831 (kernel-6.10.8-200.fc40) has been pushed to the Fedora 40 stable repository. If problem still persists, please make note of it in this bug report.
FEDORA-2024-c2e066226b (kernel-6.10.8-100.fc39) has been pushed to the Fedora 39 stable repository. If problem still persists, please make note of it in this bug report.