Bug 2394998 - during btrfs scrub, Freezing user space processes failed
Summary: during btrfs scrub, Freezing user space processes failed
Keywords:
Status: NEW
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 43
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: fedora-kernel-btrfs
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2025-09-13 21:54 UTC by Chris Murphy
Modified: 2025-10-24 01:47 UTC
CC List: 22 users

Fixed In Version:
Clone Of:
Environment:
Last Closed:
Type: Bug
Embargoed:


Attachments
full dmesg (110.77 KB, text/plain)
2025-09-13 21:54 UTC, Chris Murphy

Description Chris Murphy 2025-09-13 21:54:16 UTC
Created attachment 2106546
full dmesg


6.17.0-0.rc5.42.fc43.x86_64
btrfs-progs-6.16-1.fc42.x86_64

[ 8088.052124] kernel: BTRFS info (device dm-1): scrub: started on devid 1
[ 9662.647055] kernel: Lockdown: systemd-logind: hibernation is restricted; see man kernel_lockdown.7
[ 9662.689046] kernel: Lockdown: systemd-logind: hibernation is restricted; see man kernel_lockdown.7
[ 9662.793052] kernel: wlp0s20f3: deauthenticating from a4:22:49:b2:cb:a6 by local choice (Reason: 3=DEAUTH_LEAVING)
[ 9727.984200] kernel: PM: suspend entry (deep)
[ 9727.991082] kernel: Filesystems sync: 0.007 seconds
[ 9748.172951] kernel: Freezing user space processes
[ 9748.173350] kernel: Freezing user space processes failed after 20.001 seconds (1 tasks refusing to freeze, wq_busy=0):
[ 9748.173520] kernel: task:btrfs           state:D stack:0     pid:15156 tgid:15155 ppid:4043   task_flags:0x440140 flags:0x00004006
[ 9748.173653] kernel: Call Trace:
[ 9748.173768] kernel:  <TASK>
[ 9748.173884] kernel:  __schedule+0x2f9/0x7b0
[ 9748.174026] kernel:  schedule+0x27/0x80
[ 9748.174166] kernel:  io_schedule+0x46/0x70
[ 9748.174295] kernel:  blk_mq_get_tag+0x11d/0x2d0
[ 9748.174444] kernel:  ? __pfx_autoremove_wake_function+0x10/0x10
[ 9748.174545] kernel:  __blk_mq_alloc_requests+0xb0/0x2b0
[ 9748.174651] kernel:  blk_mq_submit_bio+0x2c3/0x890
[ 9748.174764] kernel:  __submit_bio+0x74/0x280
[ 9748.174855] kernel:  __submit_bio_noacct+0x90/0x210
[ 9748.174925] kernel:  btrfs_submit_chunk+0x1a2/0x6c0
[ 9748.175027] kernel:  ? __pfx_scrub_read_endio+0x10/0x10
[ 9748.175118] kernel:  btrfs_submit_bbio+0x1a/0x30
[ 9748.175184] kernel:  submit_initial_group_read+0x8a/0x1d0
[ 9748.175264] kernel:  scrub_simple_mirror+0x26f/0x310
[ 9748.175372] kernel:  scrub_stripe+0x512/0x7a0
[ 9748.175445] kernel:  scrub_chunk+0xd0/0x170
[ 9748.175508] kernel:  scrub_enumerate_chunks+0x319/0x710
[ 9748.175571] kernel:  btrfs_scrub_dev+0x225/0x660
[ 9748.175641] kernel:  btrfs_ioctl+0xe77/0x15d0
[ 9748.175710] kernel:  __x64_sys_ioctl+0x94/0xe0
[ 9748.175779] kernel:  do_syscall_64+0x82/0x2c0
[ 9748.175848] kernel:  ? __lruvec_stat_mod_folio+0x85/0xd0
[ 9748.175919] kernel:  ? xas_load+0x11/0x100
[ 9748.176032] kernel:  ? xas_find+0x83/0x1b0
[ 9748.176116] kernel:  ? next_uptodate_folio+0xa0/0x350
[ 9748.176186] kernel:  ? filemap_map_pages+0x35c/0x5a0
[ 9748.176255] kernel:  ? memcg1_check_events+0x60/0x1d0
[ 9748.176325] kernel:  ? do_read_fault+0x107/0x260
[ 9748.176393] kernel:  ? handle_pte_fault+0x118/0x240
[ 9748.176461] kernel:  ? do_fault+0x150/0x260
[ 9748.176523] kernel:  ? __handle_mm_fault+0x551/0x6a0
[ 9748.176591] kernel:  ? count_memcg_events+0xd6/0x220
[ 9748.176670] kernel:  ? handle_mm_fault+0x248/0x360
[ 9748.176740] kernel:  ? do_user_addr_fault+0x21a/0x690
[ 9748.176803] kernel:  ? exc_page_fault+0x74/0x180
[ 9748.176873] kernel:  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 9748.176943] kernel: RIP: 0033:0x7f4a739060ed
[ 9748.176996] kernel: RSP: 002b:00007f4a737aec50 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 9748.177102] kernel: RAX: ffffffffffffffda RBX: 000055e4140b79e0 RCX: 00007f4a739060ed
[ 9748.177181] kernel: RDX: 000055e4140b79e0 RSI: 00000000c400941b RDI: 0000000000000003
[ 9748.177251] kernel: RBP: 00007f4a737aeca0 R08: 0000000000000020 R09: 31203a6b63617473
[ 9748.177330] kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 00007f4a737af6c0
[ 9748.177399] kernel: R13: 00007ffe1aba7a10 R14: 00007f4a737afcdc R15: 00007ffe1aba7b17
[ 9748.177461] kernel:  </TASK>
[ 9748.177531] kernel: OOM killer enabled.
[ 9748.177593] kernel: Restarting tasks: Starting
[ 9748.177678] kernel: Restarting tasks: Done
[ 9748.177746] kernel: random: crng reseeded on system resumption
[ 9748.318065] kernel: PM: suspend exit


The storage stack is: USB flash drive -> dm-crypt -> Btrfs


Upstream report: https://lore.kernel.org/linux-btrfs/d93b2a2d-6ad9-4c49-809f-11d769a6f30a@app.fastmail.com/T/#u

Comment 1 Chris Murphy 2025-09-15 23:53:08 UTC
Upstream is aware of the issue. It's not a regression, and a couple of solutions are being considered.

Comment 2 Chris Murphy 2025-10-12 22:16:54 UTC
Per the upstream response (https://lore.kernel.org/linux-btrfs/20251012085256.8628-1-safinaskar@gmail.com/), switching the component to systemd.

See also bug and patch: https://github.com/systemd/systemd/issues/38337

The problem doesn't happen on Fedora 43, but it does happen on Fedora 42 with systemd-257.9-2.fc42.

Comment 3 David Tardon 2025-10-16 13:20:58 UTC
(In reply to Chris Murphy from comment #2)
> See also bug and patch: https://github.com/systemd/systemd/issues/38337

The fix for this is already included in v257.8...

Comment 4 Chris Murphy 2025-10-24 01:47:20 UTC
OK, so systemd isn't the right component after all. Switching back to kernel.

