Bug 2213967 - large folio related page cache iteration hang
Keywords:
Status: NEW
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 39
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2023-06-10 13:26 UTC by Ivan Mironov
Modified: 2024-01-15 12:23 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2024-01-15 12:17:28 UTC
Type: ---
Embargoed:



Description Ivan Mironov 2023-06-10 13:26:59 UTC
I have two machines with similar hardware, similar workload and similar disk partitioning using XFS, both running Fedora 37.

Some time ago I encountered a kernel crash on the first machine (kernel 6.3.4-101.fc37.x86_64):

	watchdog: BUG: soft lockup - CPU#14 stuck for 26s! [rocksdb:low:37079]
	Modules linked in: nft_masq nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 wireguard curve25519_x86_64 libcurve25519_generic ip6_udp_tunnel udp_tunnel rfkill tcp_bbr ip_set nf_tables nfnetlink tun nct6775 nct6775_core hwmon_vid vfat fat ipmi_ssif intel_rapl_msr intel_rapl_common edac_mce_amd snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi kvm_amd snd_hda_codec snd_hda_core snd_hwdep kvm snd_pcm snd_timer acpi_ipmi ipmi_si cdc_ether joydev snd usbnet ipmi_devintf irqbypass wmi_bmof soundcore i2c_piix4 k10temp rapl mii ipmi_msghandler fuse loop xfs raid1 crct10dif_pclmul nvme crc32_pclmul crc32c_intel igb polyval_clmulni polyval_generic nvme_core ghash_clmulni_intel ast ccp dca sha512_ssse3 wmi sp5100_tco nvme_common i2c_algo_bit
	CPU: 14 PID: 37079 Comm: rocksdb:low Not tainted 6.3.4-101.fc37.x86_64 #1
	Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./X570D4U, BIOS P1.20 05/19/2021
	RIP: 0010:xas_load+0x34/0x50
	Code: 22 ff ff ff 48 89 c2 83 e2 03 48 83 fa 02 75 08 48 3d 00 10 00 00 77 07 5b 5d c3 cc cc cc cc 0f b6 4b 10 48 8d 68 fe 38 48 fe <72> ec 48 89 ee 48 89 df e8 cf fd ff ff 80 7d 00 00 75 c7 eb d9 0f
	RSP: 0018:ffffb0a0c17cfb08 EFLAGS: 00000246
	RAX: ffff9a47734716d2 RBX: ffffb0a0c17cfb20 RCX: 0000000000000000
	RDX: 0000000000000002 RSI: ffff9a4356eb8248 RDI: ffffb0a0c17cfb20
	RBP: ffff9a47734716d0 R08: 0000000000000000 R09: 000000000000121c
	R10: ffff9a4c705a06b0 R11: 0000000000000000 R12: 000000000000d405
	R13: 000000000000d403 R14: 000000000000d403 R15: ffffb0a0c17cfdb8
	FS:  00007f3b517ff6c0(0000) GS:ffff9a5e3ed80000(0000) knlGS:0000000000000000
	CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
	CR2: 00005609c3547510 CR3: 00000007537ce000 CR4: 0000000000750ee0
	PKRU: 55555554
	Call Trace:
	 <TASK>
	 filemap_get_read_batch+0x179/0x270
	 filemap_get_pages+0xab/0x6a0
	 ? filemap_get_pages+0xab/0x6a0
	 ? _copy_to_iter+0xc4/0x650
	 filemap_read+0xdf/0x350
	 xfs_file_buffered_read+0x4f/0xd0 [xfs]
	 xfs_file_read_iter+0x74/0xe0 [xfs]
	 vfs_read+0x240/0x310
	 __x64_sys_pread64+0x98/0xd0
	 do_syscall_64+0x5f/0x90
	 ? __x64_sys_pread64+0xa8/0xd0
	 ? syscall_exit_to_user_mode+0x1b/0x40
	 ? do_syscall_64+0x6b/0x90
	 ? irqtime_account_irq+0x40/0xc0
	 ? __irq_exit_rcu+0x4b/0xf0
	 entry_SYSCALL_64_after_hwframe+0x72/0xdc
	RIP: 0033:0x7f3b6743c227
	Code: 08 89 3c 24 48 89 4c 24 18 e8 b5 e3 f8 ff 4c 8b 54 24 18 48 8b 54 24 10 41 89 c0 48 8b 74 24 08 8b 3c 24 b8 11 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 31 44 89 c7 48 89 04 24 e8 05 e4 f8 ff 48 8b
	RSP: 002b:00007f3b517f9330 EFLAGS: 00000293 ORIG_RAX: 0000000000000011
	RAX: ffffffffffffffda RBX: 000000000000121c RCX: 00007f3b6743c227
	RDX: 000000000000121c RSI: 00007f3b4ea25800 RDI: 000000000000008c
	RBP: 00007f3b517f9480 R08: 0000000000000000 R09: 00007f3b517f94c0
	R10: 000000000d403fdd R11: 0000000000000293 R12: 000000000000121c
	R13: 000000000d403fdd R14: 00007f3b4ea25800 R15: 0000000000000000
	 </TASK>

Today a similar crash happened on the second machine (kernel 6.3.5-100.fc37.x86_64):

	kernel: watchdog: BUG: soft lockup - CPU#28 stuck for 26s! [rocksdb:low:2195]
	kernel: Modules linked in: tls nf_conntrack_netbios_ns nf_conntrack_broadcast nft_masq nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 wireguard curve25519_x86_64 libcurve25519_generic ip6_udp_tunnel udp_tunnel tcp_bbr rfkill ip_set nf_tables nfnetlink nct6775 nct6775_core tun hwmon_vid jc42 vfat fat ipmi_ssif intel_rapl_msr intel_rapl_common amd64_edac edac_mce_amd kvm_amd snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec snd_hda_core kvm snd_hwdep snd_pcm snd_timer cdc_ether irqbypass acpi_ipmi snd usbnet wmi_bmof rapl ipmi_si k10temp soundcore i2c_piix4 joydev mii ipmi_devintf ipmi_msghandler fuse loop xfs uas usb_storage raid1 hid_cp2112 igb crct10dif_pclmul ast crc32_pclmul nvme crc32c_intel polyval_clmulni dca polyval_generic i2c_algo_bit nvme_core ghash_clmulni_intel ccp sha512_ssse3 wmi sp5100_tco nvme_common
	kernel: CPU: 28 PID: 2195 Comm: rocksdb:low Not tainted 6.3.5-100.fc37.x86_64 #1
	kernel: Hardware name: To Be Filled By O.E.M. X570D4U/X570D4U, BIOS T1.29b 05/17/2022
	kernel: RIP: 0010:xas_load+0x45/0x50
	kernel: Code: 3d 00 10 00 00 77 07 5b 5d c3 cc cc cc cc 0f b6 4b 10 48 8d 68 fe 38 48 fe 72 ec 48 89 ee 48 89 df e8 cf fd ff ff 80 7d 00 00 <75> c7 eb d9 0f 1f 80 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90
	kernel: RSP: 0018:ffffaab80392fb40 EFLAGS: 00000246
	kernel: RAX: fffff69f82a7c000 RBX: ffffaab80392fb58 RCX: 0000000000000000
	kernel: RDX: 0000000000000010 RSI: ffff94a4268a6480 RDI: ffffaab80392fb58
	kernel: RBP: ffff94a4268a6480 R08: 0000000000000000 R09: 000000000000424a
	kernel: R10: ffff94af1ec69ab0 R11: 0000000000000000 R12: 0000000000001610
	kernel: R13: 000000000000160c R14: 000000000000160c R15: ffffaab80392fdf0
	kernel: FS:  00007f49f7bfe6c0(0000) GS:ffff94b63f100000(0000) knlGS:0000000000000000
	kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
	kernel: CR2: 00007f01446e9000 CR3: 000000014a4be000 CR4: 0000000000750ee0
	kernel: PKRU: 55555554
	kernel: Call Trace:
	kernel:  <IRQ>
	kernel:  ? watchdog_timer_fn+0x1a8/0x210
	kernel:  ? __pfx_watchdog_timer_fn+0x10/0x10
	kernel:  ? __hrtimer_run_queues+0x112/0x2b0
	kernel:  ? hrtimer_interrupt+0xf8/0x230
	kernel:  ? __sysvec_apic_timer_interrupt+0x61/0x130
	kernel:  ? sysvec_apic_timer_interrupt+0x6d/0x90
	kernel:  </IRQ>
	kernel:  <TASK>
	kernel:  ? asm_sysvec_apic_timer_interrupt+0x1a/0x20
	kernel:  ? xas_load+0x45/0x50
	kernel:  filemap_get_read_batch+0x179/0x270
	kernel:  filemap_get_pages+0xab/0x6a0
	kernel:  ? touch_atime+0x48/0x1b0
	kernel:  ? filemap_read+0x33f/0x350
	kernel:  filemap_read+0xdf/0x350
	kernel:  xfs_file_buffered_read+0x4f/0xd0 [xfs]
	kernel:  xfs_file_read_iter+0x74/0xe0 [xfs]
	kernel:  vfs_read+0x240/0x310
	kernel:  __x64_sys_pread64+0x98/0xd0
	kernel:  do_syscall_64+0x5f/0x90
	kernel:  ? native_flush_tlb_local+0x34/0x40
	kernel:  ? flush_tlb_func+0x10d/0x240
	kernel:  ? do_syscall_64+0x6b/0x90
	kernel:  ? sched_clock_cpu+0xf/0x190
	kernel:  ? irqtime_account_irq+0x40/0xc0
	kernel:  ? __irq_exit_rcu+0x4b/0xf0
	kernel:  entry_SYSCALL_64_after_hwframe+0x72/0xdc
	kernel: RIP: 0033:0x7f4a0c23c227
	kernel: Code: 08 89 3c 24 48 89 4c 24 18 e8 b5 e3 f8 ff 4c 8b 54 24 18 48 8b 54 24 10 41 89 c0 48 8b 74 24 08 8b 3c 24 b8 11 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 31 44 89 c7 48 89 04 24 e8 05 e4 f8 ff 48 8b
	kernel: RSP: 002b:00007f49f7bf8310 EFLAGS: 00000293 ORIG_RAX: 0000000000000011
	kernel: RAX: ffffffffffffffda RBX: 000000000000424a RCX: 00007f4a0c23c227
	kernel: RDX: 000000000000424a RSI: 00007f04294a35c0 RDI: 00000000000004be
	kernel: RBP: 00007f49f7bf8460 R08: 0000000000000000 R09: 00007f49f7bf84a0
	kernel: R10: 000000000160c718 R11: 0000000000000293 R12: 000000000000424a
	kernel: R13: 000000000160c718 R14: 00007f04294a35c0 R15: 0000000000000000
	kernel:  </TASK>
	...
	kernel: ------------[ cut here ]------------
	kernel: kernel BUG at fs/inode.c:612!
	kernel: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
	kernel: CPU: 21 PID: 2195 Comm: rocksdb:low Tainted: G             L     6.3.5-100.fc37.x86_64 #1
	kernel: Hardware name: To Be Filled By O.E.M. X570D4U/X570D4U, BIOS T1.29b 05/17/2022
	kernel: RIP: 0010:clear_inode+0x76/0x80
	kernel: Code: 2d a8 40 75 2b 48 8b 93 28 01 00 00 48 8d 83 28 01 00 00 48 39 c2 75 1a 48 c7 83 98 00 00 00 60 00 00 00 5b 5d c3 cc cc cc cc <0f> 0b 0f 0b 0f 0b 0f 0b 0f 0b 90 90 90 90 90 90 90 90 90 90 90 90
	kernel: RSP: 0018:ffffaab80392fe58 EFLAGS: 00010002
	kernel: RAX: 0000000000000000 RBX: ffff94af1ec69938 RCX: 0000000000000000
	kernel: RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff94af1ec69ab8
	kernel: RBP: ffff94af1ec69ab8 R08: ffffaab80392fd38 R09: 0000000000000002
	kernel: R10: 0000000000000001 R11: 0000000000000005 R12: ffffffffc08b9860
	kernel: R13: ffff94af1ec69938 R14: 00000000ffffff9c R15: ffff94979dd5da40
	kernel: FS:  00007f49f7bfe6c0(0000) GS:ffff94b63ef40000(0000) knlGS:0000000000000000
	kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
	kernel: CR2: 00007eefca8e2000 CR3: 000000014a4be000 CR4: 0000000000750ee0
	kernel: PKRU: 55555554
	kernel: Call Trace:
	kernel:  <TASK>
	kernel:  ? die+0x36/0x90
	kernel:  ? do_trap+0xda/0x100
	kernel:  ? clear_inode+0x76/0x80
	kernel:  ? do_error_trap+0x6a/0x90
	kernel:  ? clear_inode+0x76/0x80
	kernel:  ? exc_invalid_op+0x50/0x70
	kernel:  ? clear_inode+0x76/0x80
	kernel:  ? asm_exc_invalid_op+0x1a/0x20
	kernel:  ? clear_inode+0x76/0x80
	kernel:  ? clear_inode+0x1d/0x80
	kernel:  evict+0x1b8/0x1d0
	kernel:  do_unlinkat+0x174/0x320
	kernel:  __x64_sys_unlink+0x42/0x70
	kernel:  do_syscall_64+0x5f/0x90
	kernel:  ? __irq_exit_rcu+0x4b/0xf0
	kernel:  entry_SYSCALL_64_after_hwframe+0x72/0xdc
	kernel: RIP: 0033:0x7f4a0c23faab
	kernel: Code: f0 ff ff 73 01 c3 48 8b 0d 82 63 0d 00 f7 d8 64 89 01 48 83 c8 ff c3 0f 1f 84 00 00 00 00 00 f3 0f 1e fa b8 57 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 55 63 0d 00 f7 d8 64 89 01 48
	kernel: RSP: 002b:00007f49f7bfab58 EFLAGS: 00000206 ORIG_RAX: 0000000000000057
	kernel: RAX: ffffffffffffffda RBX: 00007f49f7bfac38 RCX: 00007f4a0c23faab
	kernel: RDX: 00007f49f7bfadd0 RSI: 00007f4a0bc2fd30 RDI: 00007f49dd3c32d0
	kernel: RBP: 0000000000000002 R08: 0000000000000000 R09: 0000000000000000
	kernel: R10: ffffffffffffdf58 R11: 0000000000000206 R12: 0000000000280bc0
	kernel: R13: 00007f4a0bca77b8 R14: 00007f49f7bfadd0 R15: 00007f49f7bfadd0
	kernel:  </TASK>
	kernel: Modules linked in: tls nf_conntrack_netbios_ns nf_conntrack_broadcast nft_masq nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 wireguard curve25519_x86_64 libcurve25519_generic ip6_udp_tunnel udp_tunnel tcp_bbr rfkill ip_set nf_tables nfnetlink nct6775 nct6775_core tun hwmon_vid jc42 vfat fat ipmi_ssif intel_rapl_msr intel_rapl_common amd64_edac edac_mce_amd kvm_amd snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec snd_hda_core kvm snd_hwdep snd_pcm snd_timer cdc_ether irqbypass acpi_ipmi snd usbnet wmi_bmof rapl ipmi_si k10temp soundcore i2c_piix4 joydev mii ipmi_devintf ipmi_msghandler fuse loop xfs uas usb_storage raid1 hid_cp2112 igb crct10dif_pclmul ast crc32_pclmul nvme crc32c_intel polyval_clmulni dca polyval_generic i2c_algo_bit nvme_core ghash_clmulni_intel ccp sha512_ssse3 wmi sp5100_tco nvme_common
	kernel: ---[ end trace 0000000000000000 ]---
	kernel: RIP: 0010:clear_inode+0x76/0x80
	kernel: Code: 2d a8 40 75 2b 48 8b 93 28 01 00 00 48 8d 83 28 01 00 00 48 39 c2 75 1a 48 c7 83 98 00 00 00 60 00 00 00 5b 5d c3 cc cc cc cc <0f> 0b 0f 0b 0f 0b 0f 0b 0f 0b 90 90 90 90 90 90 90 90 90 90 90 90
	kernel: RSP: 0018:ffffaab80392fe58 EFLAGS: 00010002
	kernel: RAX: 0000000000000000 RBX: ffff94af1ec69938 RCX: 0000000000000000
	kernel: RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff94af1ec69ab8
	kernel: RBP: ffff94af1ec69ab8 R08: ffffaab80392fd38 R09: 0000000000000002
	kernel: R10: 0000000000000001 R11: 0000000000000005 R12: ffffffffc08b9860
	kernel: R13: ffff94af1ec69938 R14: 00000000ffffff9c R15: ffff94979dd5da40
	kernel: FS:  00007f49f7bfe6c0(0000) GS:ffff94b63ef40000(0000) knlGS:0000000000000000
	kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
	kernel: CR2: 00007eefca8e2000 CR3: 000000014a4be000 CR4: 0000000000750ee0
	kernel: PKRU: 55555554
	kernel: note: rocksdb:low[2195] exited with irqs disabled
	kernel: note: rocksdb:low[2195] exited with preempt_count 1

Reproducible: Sometimes

Steps to Reproduce:
I am not sure how to reproduce this, but it seems to be related to a multithreaded, mostly-write RocksDB workload on XFS on NVMe.

This never happened on kernel 6.2, so it looks like a regression.

Actual Results:
BUG: soft lockup

Expected Results:  
No soft lockup, no crashes, just normal operation.
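
As a reading aid for the two traces above: ORIG_RAX 0x11 is pread64 on x86_64, so in the user-space register dump RDX holds the byte count and R10 the file offset. The sketch below converts those values into page cache indices, assuming 4 KiB base pages. The function name `page_index_range` and the dictionary layout are illustrative only. The resulting indices coincide with the kernel-side R13/R14 and R12 values in each trace, which is consistent with the lookup looping over the range of the stuck read, though it does not by itself identify the cause.

```python
PAGE_SHIFT = 12  # assumption: 4 KiB base pages, as on x86_64


def page_index_range(offset, length):
    """Return (first, last) page cache index touched by reading `length` bytes at `offset`."""
    first = offset >> PAGE_SHIFT
    last = (offset + length - 1) >> PAGE_SHIFT
    return first, last


# (offset from user-space R10, length from user-space RDX) of the stuck pread64 calls
reads = {
    "6.3.4-101.fc37": (0x0D403FDD, 0x121C),
    "6.3.5-100.fc37": (0x0160C718, 0x424A),
}

for kernel, (offset, length) in reads.items():
    first, last = page_index_range(offset, length)
    print(f"{kernel}: pages {first:#x}..{last:#x}")
# 6.3.4-101.fc37: pages 0xd403..0xd405
# 6.3.5-100.fc37: pages 0x160c..0x1610
```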

Comment 1 Ivan Mironov 2023-06-14 14:13:05 UTC
It happened again, this time on Fedora 38 with kernel 6.3.6-200.fc38.x86_64:

[ 1088.788665] watchdog: BUG: soft lockup - CPU#23 stuck for 27s! [rocksdb:low:1855]
[ 1088.788689] Modules linked in: nft_masq nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct wireguard curve25519_x86_64 libcurve25519_generic ip6_udp_tunnel udp_tunnel nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 rfkill tcp_bbr ip_set nf_tables nfnetlink tun nct6775 nct6775_core hwmon_vid vfat ipmi_ssif fat intel_rapl_msr intel_rapl_common edac_mce_amd snd_hda_intel kvm_amd snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec snd_hda_core kvm snd_hwdep snd_pcm acpi_ipmi snd_timer ipmi_si snd cdc_ether usbnet ipmi_devintf irqbypass wmi_bmof soundcore k10temp i2c_piix4 mii ipmi_msghandler rapl joydev fuse loop xfs raid1 igb nvme ast crct10dif_pclmul crc32_pclmul dca crc32c_intel nvme_core i2c_algo_bit polyval_clmulni polyval_generic ghash_clmulni_intel ccp sp5100_tco nvme_common wmi sha512_ssse3
[ 1088.788742] CPU: 23 PID: 1855 Comm: rocksdb:low Not tainted 6.3.6-200.fc38.x86_64 #1
[ 1088.788744] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./X570D4U, BIOS P1.20 05/19/2021
[ 1088.788746] RIP: 0010:xas_descend+0xa/0x70
[ 1088.788755] Code: 07 48 c1 e8 20 48 89 57 08 c3 cc cc cc cc 66 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 0f b6 0e 48 8b 57 08 48 d3 ea <83> e2 3f 89 d0 48 83 c0 04 48 8b 44 c6 08 48 89 77 18 48 89 c1 83
[ 1088.788756] RSP: 0018:ffffbe2701ecfbc8 EFLAGS: 00000246
[ 1088.788758] RAX: ffff974ae7875daa RBX: ffffbe2701ecfbe8 RCX: 0000000000000000
[ 1088.788760] RDX: 000000000000f601 RSI: ffff974ae7875da8 RDI: ffffbe2701ecfbe8
[ 1088.788761] RBP: ffff974ae7875da8 R08: 0000000000000000 R09: 0000000000000ea7
[ 1088.788762] R10: ffff9749f24fdab0 R11: 0000000000000000 R12: 000000000000f601
[ 1088.788763] R13: 000000000000f600 R14: 000000000000f600 R15: ffffbe2701ecfe80
[ 1088.788764] FS:  00007f3a499ff6c0(0000) GS:ffff97683efc0000(0000) knlGS:0000000000000000
[ 1088.788766] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1088.788767] CR2: 00007ef226b27020 CR3: 00000001067f8000 CR4: 0000000000750ee0
[ 1088.788768] PKRU: 55555554
[ 1088.788769] Call Trace:
[ 1088.788772]  <IRQ>
[ 1088.788777]  ? watchdog_timer_fn+0x1a8/0x210
[ 1088.788782]  ? __pfx_watchdog_timer_fn+0x10/0x10
[ 1088.788784]  ? __hrtimer_run_queues+0x112/0x2b0
[ 1088.788787]  ? hrtimer_interrupt+0xf8/0x230
[ 1088.788790]  ? __sysvec_apic_timer_interrupt+0x61/0x130
[ 1088.788793]  ? sysvec_apic_timer_interrupt+0x6d/0x90
[ 1088.788796]  </IRQ>
[ 1088.788796]  <TASK>
[ 1088.788797]  ? asm_sysvec_apic_timer_interrupt+0x1a/0x20
[ 1088.788802]  ? xas_descend+0xa/0x70
[ 1088.788804]  xas_load+0x41/0x50
[ 1088.788807]  filemap_get_read_batch+0x179/0x270
[ 1088.788810]  filemap_get_pages+0xab/0x690
[ 1088.788813]  ? touch_atime+0x48/0x1b0
[ 1088.788816]  ? filemap_read+0x33f/0x350
[ 1088.788818]  filemap_read+0xdf/0x350
[ 1088.788822]  xfs_file_buffered_read+0x4f/0xd0 [xfs]
[ 1088.788945]  xfs_file_read_iter+0x74/0xe0 [xfs]
[ 1088.789030]  vfs_read+0x240/0x310
[ 1088.789034]  __x64_sys_pread64+0x98/0xd0
[ 1088.789037]  do_syscall_64+0x60/0x90
[ 1088.789040]  entry_SYSCALL_64_after_hwframe+0x72/0xdc
[ 1088.789044] RIP: 0033:0x7f3a71721115
[ 1088.789065] Code: e8 48 89 75 f0 89 7d f8 48 89 4d e0 e8 84 99 f8 ff 4c 8b 55 e0 48 8b 55 e8 41 89 c0 48 8b 75 f0 8b 7d f8 b8 11 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 2b 44 89 c7 48 89 45 f8 e8 d7 99 f8 ff 48 8b
[ 1088.789066] RSP: 002b:00007f3a499f9390 EFLAGS: 00000293 ORIG_RAX: 0000000000000011
[ 1088.789068] RAX: ffffffffffffffda RBX: 00007f3a499f94e0 RCX: 00007f3a71721115
[ 1088.789069] RDX: 0000000000000ea7 RSI: 00007f3a43452000 RDI: 0000000000000427
[ 1088.789070] RBP: 00007f3a499f93b0 R08: 0000000000000000 R09: 00007f3a499f9528
[ 1088.789071] R10: 000000000f600496 R11: 0000000000000293 R12: 000000000f600496
[ 1088.789072] R13: 0000000000000ea7 R14: 00007f3a43452000 R15: 00007f3a3d20b940
[ 1088.789074]  </TASK>
[ 1116.788144] watchdog: BUG: soft lockup - CPU#23 stuck for 53s! [rocksdb:low:1855]
[ 1116.788177] Modules linked in: nft_masq nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct wireguard curve25519_x86_64 libcurve25519_generic ip6_udp_tunnel udp_tunnel nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 rfkill tcp_bbr ip_set nf_tables nfnetlink tun nct6775 nct6775_core hwmon_vid vfat ipmi_ssif fat intel_rapl_msr intel_rapl_common edac_mce_amd snd_hda_intel kvm_amd snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec snd_hda_core kvm snd_hwdep snd_pcm acpi_ipmi snd_timer ipmi_si snd cdc_ether usbnet ipmi_devintf irqbypass wmi_bmof soundcore k10temp i2c_piix4 mii ipmi_msghandler rapl joydev fuse loop xfs raid1 igb nvme ast crct10dif_pclmul crc32_pclmul dca crc32c_intel nvme_core i2c_algo_bit polyval_clmulni polyval_generic ghash_clmulni_intel ccp sp5100_tco nvme_common wmi sha512_ssse3
[ 1116.788225] CPU: 23 PID: 1855 Comm: rocksdb:low Tainted: G             L     6.3.6-200.fc38.x86_64 #1
[ 1116.788228] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./X570D4U, BIOS P1.20 05/19/2021
[ 1116.788229] RIP: 0010:xas_start+0x50/0xc0
[ 1116.788238] Code: 48 89 c1 83 e1 03 48 83 f9 02 75 08 48 3d 00 10 00 00 77 12 48 85 d2 75 1a 48 c7 47 18 00 00 00 00 c3 cc cc cc cc 0f b6 48 fe <48> d3 ea 48 83 fa 3f 76 e6 48 c7 47 18 01 00 00 00 31 c0 c3 cc cc
[ 1116.788240] RSP: 0018:ffffbe2701ecfbc8 EFLAGS: 00000282
[ 1116.788242] RAX: ffff97493820824a RBX: ffffbe2701ecfbe8 RCX: 000000000000000c
[ 1116.788244] RDX: 000000000000f601 RSI: ffff974ae7875da8 RDI: ffffbe2701ecfbe8
[ 1116.788245] RBP: 000000000000f601 R08: 0000000000000000 R09: 0000000000000ea7
[ 1116.788246] R10: ffff9749f24fdab0 R11: 0000000000000000 R12: 000000000000f601
[ 1116.788247] R13: 000000000000f600 R14: 000000000000f600 R15: ffffbe2701ecfe80
[ 1116.788248] FS:  00007f3a499ff6c0(0000) GS:ffff97683efc0000(0000) knlGS:0000000000000000
[ 1116.788250] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1116.788252] CR2: 00007ef226b27020 CR3: 00000001067f8000 CR4: 0000000000750ee0
[ 1116.788253] PKRU: 55555554
[ 1116.788254] Call Trace:
[ 1116.788257]  <IRQ>
[ 1116.788261]  ? watchdog_timer_fn+0x1a8/0x210
[ 1116.788267]  ? __pfx_watchdog_timer_fn+0x10/0x10
[ 1116.788268]  ? __hrtimer_run_queues+0x112/0x2b0
[ 1116.788272]  ? hrtimer_interrupt+0xf8/0x230
[ 1116.788274]  ? __sysvec_apic_timer_interrupt+0x61/0x130
[ 1116.788277]  ? sysvec_apic_timer_interrupt+0x6d/0x90
[ 1116.788279]  </IRQ>
[ 1116.788279]  <TASK>
[ 1116.788280]  ? asm_sysvec_apic_timer_interrupt+0x1a/0x20
[ 1116.788285]  ? xas_start+0x50/0xc0
[ 1116.788287]  xas_load+0xe/0x50
[ 1116.788289]  filemap_get_read_batch+0x179/0x270
[ 1116.788293]  filemap_get_pages+0xab/0x690
[ 1116.788295]  ? touch_atime+0x48/0x1b0
[ 1116.788298]  ? filemap_read+0x33f/0x350
[ 1116.788300]  filemap_read+0xdf/0x350
[ 1116.788304]  xfs_file_buffered_read+0x4f/0xd0 [xfs]
[ 1116.788404]  xfs_file_read_iter+0x74/0xe0 [xfs]
[ 1116.788474]  vfs_read+0x240/0x310
[ 1116.788477]  __x64_sys_pread64+0x98/0xd0
[ 1116.788479]  do_syscall_64+0x60/0x90
[ 1116.788482]  entry_SYSCALL_64_after_hwframe+0x72/0xdc
[ 1116.788485] RIP: 0033:0x7f3a71721115
[ 1116.788506] Code: e8 48 89 75 f0 89 7d f8 48 89 4d e0 e8 84 99 f8 ff 4c 8b 55 e0 48 8b 55 e8 41 89 c0 48 8b 75 f0 8b 7d f8 b8 11 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 2b 44 89 c7 48 89 45 f8 e8 d7 99 f8 ff 48 8b
[ 1116.788507] RSP: 002b:00007f3a499f9390 EFLAGS: 00000293 ORIG_RAX: 0000000000000011
[ 1116.788508] RAX: ffffffffffffffda RBX: 00007f3a499f94e0 RCX: 00007f3a71721115
[ 1116.788510] RDX: 0000000000000ea7 RSI: 00007f3a43452000 RDI: 0000000000000427
[ 1116.788510] RBP: 00007f3a499f93b0 R08: 0000000000000000 R09: 00007f3a499f9528
[ 1116.788511] R10: 000000000f600496 R11: 0000000000000293 R12: 000000000f600496
[ 1116.788512] R13: 0000000000000ea7 R14: 00007f3a43452000 R15: 00007f3a3d20b940
[ 1116.788515]  </TASK>
[ 1123.770238] rcu: INFO: rcu_preempt self-detected stall on CPU
[ 1123.770243] rcu: 	23-....: (59997 ticks this GP) idle=aaac/1/0x4000000000000000 softirq=172344/172344 fqs=13940
[ 1123.770248] rcu: 	(t=60000 jiffies g=479441 q=95799 ncpus=32)
[ 1123.770250] CPU: 23 PID: 1855 Comm: rocksdb:low Tainted: G             L     6.3.6-200.fc38.x86_64 #1
[ 1123.770253] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./X570D4U, BIOS P1.20 05/19/2021
[ 1123.770254] RIP: 0010:xas_start+0xa/0xc0
[ 1123.770262] Code: c0 eb a5 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 48 8b 57 18 48 89 d0 83 e0 03 <74> 5c 48 81 fa 05 c0 ff ff 76 06 48 83 f8 02 74 46 48 8b 07 48 8b
[ 1123.770264] RSP: 0018:ffffbe2701ecfbc8 EFLAGS: 00000206
[ 1123.770266] RAX: 0000000000000003 RBX: ffffbe2701ecfbe8 RCX: 0000000000000000
[ 1123.770268] RDX: 0000000000000003 RSI: ffff974ae7875da8 RDI: ffffbe2701ecfbe8
[ 1123.770269] RBP: 000000000000f601 R08: 0000000000000000 R09: 0000000000000ea7
[ 1123.770270] R10: ffff9749f24fdab0 R11: 0000000000000000 R12: 000000000000f601
[ 1123.770271] R13: 000000000000f600 R14: 000000000000f600 R15: ffffbe2701ecfe80
[ 1123.770272] FS:  00007f3a499ff6c0(0000) GS:ffff97683efc0000(0000) knlGS:0000000000000000
[ 1123.770274] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1123.770275] CR2: 00007ef226b27020 CR3: 00000001067f8000 CR4: 0000000000750ee0
[ 1123.770276] PKRU: 55555554
[ 1123.770277] Call Trace:
[ 1123.770279]  <IRQ>
[ 1123.770282]  ? rcu_dump_cpu_stacks+0xc4/0x100
[ 1123.770287]  ? rcu_sched_clock_irq+0x4f2/0x1170
[ 1123.770289]  ? sched_slice+0x87/0x140
[ 1123.770293]  ? task_tick_fair+0x2fc/0x400
[ 1123.770295]  ? trigger_load_balance+0x72/0x350
[ 1123.770298]  ? update_process_times+0x74/0xb0
[ 1123.770301]  ? tick_sched_handle+0x22/0x60
[ 1123.770304]  ? tick_sched_timer+0x67/0x80
[ 1123.770306]  ? __pfx_tick_sched_timer+0x10/0x10
[ 1123.770308]  ? __hrtimer_run_queues+0x112/0x2b0
[ 1123.770310]  ? hrtimer_interrupt+0xf8/0x230
[ 1123.770312]  ? __sysvec_apic_timer_interrupt+0x61/0x130
[ 1123.770315]  ? sysvec_apic_timer_interrupt+0x6d/0x90
[ 1123.770317]  </IRQ>
[ 1123.770318]  <TASK>
[ 1123.770318]  ? asm_sysvec_apic_timer_interrupt+0x1a/0x20
[ 1123.770326]  ? xas_start+0xa/0xc0
[ 1123.770329]  xas_load+0xe/0x50
[ 1123.770331]  filemap_get_read_batch+0x179/0x270
[ 1123.770335]  filemap_get_pages+0xab/0x690
[ 1123.770337]  ? touch_atime+0x48/0x1b0
[ 1123.770341]  ? filemap_read+0x33f/0x350
[ 1123.770342]  filemap_read+0xdf/0x350
[ 1123.770347]  xfs_file_buffered_read+0x4f/0xd0 [xfs]
[ 1123.770477]  xfs_file_read_iter+0x74/0xe0 [xfs]
[ 1123.770560]  vfs_read+0x240/0x310
[ 1123.770564]  __x64_sys_pread64+0x98/0xd0
[ 1123.770566]  do_syscall_64+0x60/0x90
[ 1123.770569]  entry_SYSCALL_64_after_hwframe+0x72/0xdc
[ 1123.770572] RIP: 0033:0x7f3a71721115
[ 1123.770595] Code: e8 48 89 75 f0 89 7d f8 48 89 4d e0 e8 84 99 f8 ff 4c 8b 55 e0 48 8b 55 e8 41 89 c0 48 8b 75 f0 8b 7d f8 b8 11 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 2b 44 89 c7 48 89 45 f8 e8 d7 99 f8 ff 48 8b
[ 1123.770596] RSP: 002b:00007f3a499f9390 EFLAGS: 00000293 ORIG_RAX: 0000000000000011
[ 1123.770598] RAX: ffffffffffffffda RBX: 00007f3a499f94e0 RCX: 00007f3a71721115
[ 1123.770599] RDX: 0000000000000ea7 RSI: 00007f3a43452000 RDI: 0000000000000427
[ 1123.770600] RBP: 00007f3a499f93b0 R08: 0000000000000000 R09: 00007f3a499f9528
[ 1123.770601] R10: 000000000f600496 R11: 0000000000000293 R12: 000000000f600496
[ 1123.770601] R13: 0000000000000ea7 R14: 00007f3a43452000 R15: 00007f3a3d20b940
[ 1123.770603]  </TASK>
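
The log above reports the same task on the same CPU twice, 27s and then 53s into the stall, which suggests one continuing hang rather than two separate ones. When triaging longer logs, the watchdog header lines can be grouped mechanically. The following is a minimal sketch; the regular expression is written against the message format seen in this report only, and the helper name `group_lockups` is hypothetical.

```python
import re
from collections import defaultdict

# Matches e.g. "watchdog: BUG: soft lockup - CPU#23 stuck for 27s! [rocksdb:low:1855]"
LOCKUP_RE = re.compile(
    r"watchdog: BUG: soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<comm>.+):(?P<pid>\d+)\]"
)


def group_lockups(lines):
    """Map (comm, pid, cpu) to the list of reported stuck durations in seconds."""
    groups = defaultdict(list)
    for line in lines:
        m = LOCKUP_RE.search(line)
        if m:
            key = (m["comm"], int(m["pid"]), int(m["cpu"]))
            groups[key].append(int(m["secs"]))
    return dict(groups)


sample = [
    "[ 1088.788665] watchdog: BUG: soft lockup - CPU#23 stuck for 27s! [rocksdb:low:1855]",
    "[ 1116.788144] watchdog: BUG: soft lockup - CPU#23 stuck for 53s! [rocksdb:low:1855]",
]
print(group_lockups(sample))
# {('rocksdb:low', 1855, 23): [27, 53]}
```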

Comment 2 Ivan Mironov 2023-06-15 16:03:34 UTC
It happened yet again on Fedora 38 with kernel 6.3.6-200.fc38.x86_64. The machine is unusable in this state, so I will try to downgrade to 6.2.16-300.fc38.

Comment 3 Ivan Mironov 2023-06-18 15:26:35 UTC
It looks like I am not the only one affected by this: https://www.spinics.net/lists/kernel/msg4783004.html

Comment 4 Ivan Mironov 2023-07-03 14:05:22 UTC
Yet another possibly related bug report: https://bugzilla.kernel.org/show_bug.cgi?id=217572

I am not 100% sure that this happens only on 6.3.* kernels, but downgrading to 6.2.16-300.fc38.x86_64 fixed it for me. The other possibility is that it was caused by some very specific conditions that happened to exist only before 2023-06-15.

Comment 5 Dave Chinner 2023-07-05 21:58:26 UTC
This is not an XFS bug, nor is it in any way related to recent XFS issues in early 6.3 kernels.  As I commented in https://bugzilla.kernel.org/show_bug.cgi?id=217572:

"No, that has nothing to do with the problem you are seeing on 6.1.31
kernels. That was a fix for a regression introduced in 6.3-rc1, and
hence does not exist in 6.1.y kernels.

The problem you are tripping over appears to be a livelock in the
page cache iterator infrastructure, not an issue with the filesystem
itself. This has been seen occasionally (maybe once every couple of
months of testing across the entire dev community) during testing
since large folios were enabled in the page cache, but nobody has
been able to reproduce it reliably enough to be able to isolate the
root cause and fix it yet.

If you can reproduce it reliably and quickly, then putting together
a recipe that we can use to trigger it would be a great help."

The issue has been around since ~5.17 (IIRC) and it is largely impossible to reproduce, so any help you can provide in crafting a reliable reproducer that we can use to diagnose the root cause and test the fix would be appreciated.

-Dave.
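
One possible starting point for such a recipe is sketched below. It only imitates the access pattern visible in this report (buffered writes to a growing file, concurrent pread calls from several threads, then unlink) and is not known to trigger the hang. The function name `stress` and every size and count are arbitrary placeholders, and the target directory is assumed to live on the XFS filesystem under test.

```python
import os
import random
import tempfile
import threading


def stress(directory, file_size=8 << 20, readers=4, reads_per_thread=2000, max_read=32 << 10):
    """Grow a file with buffered writes while threads pread() random ranges, then unlink it.

    Returns the total number of bytes read.
    """
    path = os.path.join(directory, "stress.dat")
    fd = os.open(path, os.O_CREAT | os.O_RDWR | os.O_TRUNC, 0o600)
    chunk = os.urandom(64 << 10)
    os.write(fd, chunk)  # seed the file so readers always find data
    totals = [0] * readers

    def writer():
        written = len(chunk)
        while written < file_size:
            written += os.write(fd, chunk)

    def reader(slot):
        rng = random.Random(slot)
        for _ in range(reads_per_thread):
            size = os.fstat(fd).st_size
            offset = rng.randrange(size)
            # pread() does not move the file position used by the writer
            totals[slot] += len(os.pread(fd, rng.randint(1, max_read), offset))

    threads = [threading.Thread(target=writer)]
    threads += [threading.Thread(target=reader, args=(i,)) for i in range(readers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    os.close(fd)
    os.unlink(path)
    return sum(totals)


if __name__ == "__main__":
    # Pass dir="/path/on/xfs" to TemporaryDirectory to target a specific filesystem.
    with tempfile.TemporaryDirectory() as d:
        print(stress(d), "bytes read")
```

Running many iterations with larger files and more reader threads, under memory pressure, would be closer to the RocksDB workload described in this report, but that is a guess rather than a confirmed trigger.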

Comment 6 Aoife Moloney 2023-11-23 01:46:29 UTC
This message is a reminder that Fedora Linux 37 is nearing its end of life.
Fedora will stop maintaining and issuing updates for Fedora Linux 37 on 2023-12-05.
It is Fedora's policy to close all bug reports from releases that are no longer
maintained. At that time this bug will be closed as EOL if it remains open with a
'version' of '37'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, change the 'version' 
to a later Fedora Linux version. Note that the version field may be hidden.
Click the "Show advanced fields" button if you do not see it.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora Linux 37 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora Linux, you are encouraged to change the 'version' to a later version
prior to this bug being closed.

Comment 7 Aoife Moloney 2024-01-15 12:17:28 UTC
Fedora Linux 37 entered end-of-life (EOL) status on 2023-12-05.

Fedora Linux 37 is no longer maintained, which means that it
will not receive any further security or bug fix updates. As a result we
are closing this bug.

If you can reproduce this bug against a currently maintained version of Fedora Linux
please feel free to reopen this bug against that version. Note that the version
field may be hidden. Click the "Show advanced fields" button if you do not see
the version field.

If you are unable to reopen this bug, please file a new report against an
active release.

Thank you for reporting this bug and we are sorry it could not be fixed.

Comment 8 Ivan Mironov 2024-01-15 12:23:05 UTC
Still reproducible on Fedora 39.

