Bug 2329581 - Installs frequently crash on a kernel GPF in VM with Nehalem CPU when doing grub2-mkconfig since kernel-6.13.0-0.rc0.20241125git9f16d5e6f220.8.fc42
Summary: Installs frequently crash on a kernel GPF in VM with Nehalem CPU when doing g...
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: rawhide
Hardware: All
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard: openqa
Depends On:
Blocks:
 
Reported: 2024-11-30 00:32 UTC by Adam Williamson
Modified: 2025-01-27 22:32 UTC

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2025-01-27 22:32:33 UTC
Type: Bug
Embargoed:


Attachments
video of a likely-related issue in UEFI installs (992.91 KB, application/octet-stream)
2024-12-02 18:28 UTC, Adam Williamson


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Linux Kernel | 219554 | 0 | P3 | NEW | Kernel 6.13 crashes when doing grub2-mkconfig during Fedora install in VM with Nehalem CPU config | 2024-12-03 19:28:31 UTC

Description Adam Williamson 2024-11-30 00:32:40 UTC
Since kernel-6.13.0-0.rc0.20241125git9f16d5e6f220.8.fc42 appeared in Rawhide, openQA live install tests on BIOS have been failing frequently. The initial symptom is that anaconda hangs in its bootloader install phase. The system is completely stuck, so openQA cannot get to a console to upload logs. I tweaked openQA to log kernel messages to the serial console, and it appears that we're hitting a GPF when this happens. Full GPF message extract:

[  762.160401] Oops: general protection fault, probably for non-canonical address 0xcd3e3ac14fbb0ec6: 0000 [#1] PREEMPT SMP PTI
[  762.164040] CPU: 1 UID: 0 PID: 3900 Comm: modprobe Not tainted 6.13.0-0.rc0.20241126git7eef7e306d3c.10.fc42.x86_64 #1
[  762.168442] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-3.fc41 04/01/2014
[  762.172250] RIP: 0010:__kmalloc_cache_noprof+0x1e0/0x3e0
[  762.174080] Code: 84 08 ff ff ff 48 85 db 0f 84 ff fe ff ff 48 8b 03 48 c1 e8 36 41 39 c2 0f 85 ef fe ff ff 41 8b 44 24 28 49 8b 34 24 48 01 f8 <48> 8b 18 48 89 c1 49 33 9c 24 b8 00 00 00 48 89 f8 48 0f c9 48 31
[  762.181141] RSP: 0018:ffffb9e4c378ba20 EFLAGS: 00010286
[  762.182795] RAX: cd3e3ac14fbb0ec6 RBX: fffff15ac410c040 RCX: 0000000000000006
[  762.185835] RDX: 000000000f384001 RSI: 000000000003c040 RDI: cd3e3ac14fbb0e96
[  762.188016] RBP: ffffb9e4c378ba70 R08: ffff8c0483673000 R09: 0000000000000000
[  762.190200] R10: 00000000ffffffff R11: 6e72757465725f65 R12: ffff8c0580042200
[  762.192377] R13: 0000000000000dc0 R14: 0000000000000058 R15: ffffffffa46568b8
[  762.194625] FS:  00007f9440a36740(0000) GS:ffff8c05bbd00000(0000) knlGS:0000000000000000
[  762.197060] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  762.198846] CR2: 00007fe75fe67e30 CR3: 0000000001ed6000 CR4: 00000000000006f0
[  762.201026] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  762.203213] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  762.205395] Call Trace:
[  762.206200]  <TASK>
[  762.206903]  ? __die_body.cold+0x19/0x27
[  762.208149]  ? die_addr+0x3c/0x60
[  762.209208]  ? exc_general_protection+0x17d/0x400
[  762.210706]  ? asm_exc_general_protection+0x26/0x30
[  762.212212]  ? eventfs_create_dir+0x78/0x190
[  762.213589]  ? __kmalloc_cache_noprof+0x1e0/0x3e0
[  762.215326]  ? eventfs_create_dir+0x78/0x190
[  762.216721]  eventfs_create_dir+0x78/0x190
[  762.217995]  event_create_dir+0xc2/0x430
[  762.219213]  trace_module_notify+0x1fd/0x240
[  762.220582]  notifier_call_chain+0x5d/0xd0
[  762.221844]  blocking_notifier_call_chain_robust+0x65/0xc0
[  762.223567]  load_module+0x1cce/0x23e0
[  762.224751]  ? __do_sys_init_module+0x17a/0x1b0
[  762.226145]  __do_sys_init_module+0x17a/0x1b0
[  762.227470]  do_syscall_64+0x82/0x160
[  762.228577]  ? __count_memcg_events+0xc0/0x180
[  762.229851]  ? count_memcg_events.constprop.0+0x1a/0x30
[  762.231309]  ? handle_mm_fault+0x21b/0x330
[  762.232517]  ? do_user_addr_fault+0x55a/0x7b0
[  762.233807]  ? exc_page_fault+0x7e/0x180
[  762.234938]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[  762.236268] RIP: 0033:0x7f944030069e
[  762.237260] Code: 48 8b 0d 75 37 0f 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 49 89 ca b8 af 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 42 37 0f 00 f7 d8 64 89 01 48
[  762.242088] RSP: 002b:00007ffcb41abd48 EFLAGS: 00000246 ORIG_RAX: 00000000000000af
[  762.244092] RAX: ffffffffffffffda RBX: 000055e9cb561d90 RCX: 00007f944030069e
[  762.245893] RDX: 000055e9aaa51715 RSI: 0000000000751c7e RDI: 00007f943eb21010
[  762.247667] RBP: 00007ffcb41abe00 R08: 000055e9cb561010 R09: 0000000000000007
[  762.249378] R10: 0000000000000001 R11: 0000000000000246 R12: 000055e9aaa51715
[  762.251190] R13: 0000000000040000 R14: 000055e9cb561ec0 R15: 0000000000000000
[  762.252954]  </TASK>
[  762.253461] Modules linked in: xfs(+) libfc scsi_transport_fc iscsi_ibft dm_crypt vfat fat dm_round_robin dm_multipath raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx raid1 raid0 uinput snd_seq_dummy snd_hrtimer nf_conntrack_netbios_ns nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 rfkill ip_set nf_tables qrtr snd_hda_codec_generic snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device snd_pcm pktcdvd snd_timer ppdev snd soundcore parport_pc pcspkr parport i2c_piix4 joydev i2c_smbus binfmt_misc nfnetlink vsock_loopback vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vsock vmw_vmci overlay squashfs isofs crc32c_intel sha512_ssse3 sha256_ssse3 sha1_ssse3 virtio_blk floppy virtio_net virtio_scsi net_failover failover ata_generic pata_acpi virtio_gpu virtio_dma_buf serio_raw nvme_tcp
[  762.253590]  nvme_fabrics nvme_keyring nvme_core nvme_auth sunrpc be2iscsi bnx2i cnic uio cxgb4i cxgb4 tls cxgb3i cxgb3 mdio libcxgbi libcxgb qla4xxx iscsi_boot_sysfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi loop fuse i2c_dev qemu_fw_cfg virtio_console
[  762.277591] ---[ end trace 0000000000000000 ]---
[  762.278469] RIP: 0010:__kmalloc_cache_noprof+0x1e0/0x3e0
[  762.279570] Code: 84 08 ff ff ff 48 85 db 0f 84 ff fe ff ff 48 8b 03 48 c1 e8 36 41 39 c2 0f 85 ef fe ff ff 41 8b 44 24 28 49 8b 34 24 48 01 f8 <48> 8b 18 48 89 c1 49 33 9c 24 b8 00 00 00 48 89 f8 48 0f c9 48 31
[  762.283164] RSP: 0018:ffffb9e4c378ba20 EFLAGS: 00010286
[  762.284223] RAX: cd3e3ac14fbb0ec6 RBX: fffff15ac410c040 RCX: 0000000000000006
[  762.285611] RDX: 000000000f384001 RSI: 000000000003c040 RDI: cd3e3ac14fbb0e96
[  762.286948] RBP: ffffb9e4c378ba70 R08: ffff8c0483673000 R09: 0000000000000000
[  762.288276] R10: 00000000ffffffff R11: 6e72757465725f65 R12: ffff8c0580042200
[  762.289672] R13: 0000000000000dc0 R14: 0000000000000058 R15: ffffffffa46568b8
[  762.291002] FS:  00007f9440a36740(0000) GS:ffff8c05bbd00000(0000) knlGS:0000000000000000
[  762.292485] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  762.293626] CR2: 00007fe75fe67e30 CR3: 0000000001ed6000 CR4: 00000000000006f0
[  762.294886] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  762.296170] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  762.303122] ------------[ cut here ]------------
[  762.303943] UBSAN: shift-out-of-bounds in lib/radix-tree.c:88:31
[  762.305842] shift exponent 254 is too large for 64-bit type 'long unsigned int'
[  762.308224] CPU: 0 UID: 0 PID: 3902 Comm: modprobe Tainted: G      D           -------  ---  6.13.0-0.rc0.20241126git7eef7e306d3c.10.fc42.x86_64 #1
[  762.312288] Tainted: [D]=DIE
[  762.313226] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-3.fc41 04/01/2014
[  762.315954] Call Trace:
[  762.316776]  <TASK>
[  762.317428]  dump_stack_lvl+0x5d/0x80
[  762.318621]  ubsan_epilogue+0x5/0x30
[  762.319685]  __ubsan_handle_shift_out_of_bounds.cold+0x61/0xe6
[  762.321338]  __radix_tree_lookup.cold+0x16/0x56
[  762.324022]  find_extent_buffer_nolock+0x33/0x70
[  762.325350]  find_extent_buffer+0x12/0xb0
[  762.326575]  read_block_for_search+0x10b/0x400
[  762.327881]  btrfs_search_slot+0x33d/0x10c0
[  762.329101]  btrfs_lookup_dir_item+0x98/0xf0
[  762.330337]  btrfs_lookup_dentry+0xff/0x650
[  762.331606]  ? d_alloc_parallel+0x237/0x400
[  762.332781]  btrfs_lookup+0x12/0x30
[  762.333749]  __lookup_slow+0x89/0x130
[  762.334756]  ? __legitimize_path+0x2a/0x60
[  762.335863]  walk_component+0xdb/0x150
[  762.336897]  path_lookupat+0x6a/0x1a0
[  762.337899]  filename_lookup+0xf2/0x200
[  762.338953]  ? __pfx_page_put_link+0x10/0x10
[  762.340113]  vfs_statx+0x79/0xe0
[  762.341015]  ? strncpy_from_user+0x24/0x100
[  762.342147]  vfs_fstatat+0x6b/0xa0
[  762.343081]  __do_sys_newfstatat+0x3c/0x80
[  762.344114]  do_syscall_64+0x82/0x160
[  762.345061]  ? __count_memcg_events+0xc0/0x180
[  762.346179]  ? count_memcg_events.constprop.0+0x1a/0x30
[  762.347471]  ? handle_mm_fault+0x21b/0x330
[  762.348537]  ? do_user_addr_fault+0x55a/0x7b0
[  762.349653]  ? exc_page_fault+0x7e/0x180
[  762.350657]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[  762.351876] RIP: 0033:0x7f273e6ee87e
[  762.352749] Code: 0f 1f 40 00 48 8b 15 91 55 10 00 f7 d8 64 89 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa 41 89 ca b8 06 01 00 00 0f 05 <3d> 00 f0 ff ff 77 0b 31 c0 c3 0f 1f 84 00 00 00 00 00 48 8b 15 59
[  762.357063] RSP: 002b:00007ffe04476c38 EFLAGS: 00000246 ORIG_RAX: 0000000000000106
[  762.358842] RAX: ffffffffffffffda RBX: 00007ffe04477d90 RCX: 00007f273e6ee87e
[  762.360470] RDX: 00007ffe04476c60 RSI: 000055e5f65ebe40 RDI: 00000000ffffff9c
[  762.362094] RBP: 00007ffe04477d30 R08: 000055e5f65ebea0 R09: 00007f273e7f4b20
[  762.363678] R10: 0000000000000000 R11: 0000000000000246 R12: 000055e5f65ebe40
[  762.365211] R13: 000055e5f65eb2a0 R14: 00007ffe04477d88 R15: 0000000000000000
[  762.366805]  </TASK>
[  762.367332] ---[ end trace ]---
[  762.368087] Oops: general protection fault, probably for non-canonical address 0xf00002000fd8175: 0000 [#2] PREEMPT SMP PTI
[  762.369778] ------------[ cut here ]------------
[  762.370386] CPU: 0 UID: 0 PID: 3902 Comm: modprobe Tainted: G      D           -------  ---  6.13.0-0.rc0.20241126git7eef7e306d3c.10.fc42.x86_64 #1
[  762.371459] UBSAN: shift-out-of-bounds in lib/xarray.c:146:16
[  762.374193] Tainted: [D]=DIE
[  762.375599] shift exponent 249 is too large for 64-bit type 'long unsigned int'
[  762.376152] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-3.fc41 04/01/2014
[  762.376155] RIP: 0010:find_extent_buffer_nolock+0x3b/0x70
[  762.377972] CPU: 1 UID: 0 PID: 36 Comm: kcompactd0 Tainted: G      D           -------  ---  6.13.0-0.rc0.20241126git7eef7e306d3c.10.fc42.x86_64 #1
[  762.379705] Code: ff 8b 8b f0 0c 00 00 83 f9 3f 0f 87 7f 90 ae 00 48 89 ee 48 8d bb 60 0a 00 00 48 d3 ee e8 ed 28 a9 00 48 89 c3 48 85 c0 74 20 <8b> 40 2c 85 c0 74 19 8d 50 01 f0 0f b1 53 2c 75 f2 e8 5f 67 b1 ff
[  762.380957] Tainted: [D]=DIE
[  762.383429] RSP: 0018:ffffb9e4c377b818 EFLAGS: 00010202
[  762.387687] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-3.fc41 04/01/2014
[  762.388205] 
[  762.388207] RAX: 0f00002000fd8149 RBX: 0f00002000fd8149 RCX: 00000000000000fe
[  762.389340] Call Trace:
[  762.390928] RDX: 0000000000000004 RSI: ffff8c05bbc21900 RDI: ffffffffa5ee9659
[  762.391268]  <TASK>
[  762.392550] RBP: 0000000005418000 R08: ffff8c0580984a68 R09: ffff8c05842b0270
[  762.393083]  dump_stack_lvl+0x5d/0x80
[  762.394335] R10: ffffb9e4c377b4f0 R11: 646e65205b2d2d2d R12: ffff8c0482cdfaf0
[  762.394842]  ubsan_epilogue+0x5/0x30
[  762.395862] R13: 00000000000001c7 R14: 0000000000003b0c R15: 0000000000000001
[  762.396644]  __ubsan_handle_shift_out_of_bounds.cold+0x61/0xe6
[  762.397679] FS:  00007f273ecd0740(0000) GS:ffff8c05bbc00000(0000) knlGS:0000000000000000
[  762.398395]  xas_create.cold+0x15/0x7b
[  762.399405] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  762.400633]  xas_store+0x5d/0x8d0
[  762.401789] CR2: 00007ffe04476d08 CR3: 000000001761a000 CR4: 00000000000006f0
[  762.402532]  __folio_migrate_mapping+0x27b/0x6f0
[  762.403337] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  762.404021]  ? __call_rcu_common.constprop.0+0xb6/0x7f0
[  762.405056] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  762.405974]  __migrate_folio.isra.0+0x71/0x100
[  762.406987] Call Trace:
[  762.406989]  <TASK>
[  762.408027]  move_to_new_folio+0x64/0x1a0
[  762.409066]  ? __die_body.cold+0x19/0x27
[  762.409910]  migrate_pages_batch+0x972/0xd10
[  762.410258]  ? die_addr+0x3c/0x60
[  762.410701]  ? __pfx_compaction_free+0x10/0x10
[  762.411254]  ? exc_general_protection+0x17d/0x400
[  762.412013]  ? lru_gen_add_folio+0x302/0x4b0
[  762.412670]  ? asm_exc_general_protection+0x26/0x30
[  762.413270]  migrate_pages+0xae8/0xde0
[  762.413976]  ? find_extent_buffer_nolock+0x3b/0x70
[  762.414834]  ? __pfx_compaction_alloc+0x10/0x10
[  762.415430]  ? find_extent_buffer_nolock+0x33/0x70
[  762.416303]  ? __pfx_compaction_free+0x10/0x10
[  762.416875]  find_extent_buffer+0x12/0xb0
[  762.417742]  compact_zone+0xa1a/0x1110
[  762.418365]  read_block_for_search+0x10b/0x400
[  762.419244]  ? psi_group_change+0x1c6/0x4b0
[  762.419908]  btrfs_search_slot+0x33d/0x10c0
[  762.420531]  compact_node+0xb1/0x140
[  762.421071]  btrfs_lookup_dir_item+0x98/0xf0
[  762.421761]  kcompactd+0x35b/0x4b0
[  762.422336]  btrfs_lookup_dentry+0xff/0x650
[  762.422973]  ? __pfx_autoremove_wake_function+0x10/0x10
[  762.423471]  ? d_alloc_parallel+0x237/0x400
[  762.424100]  ? __pfx_kcompactd+0x10/0x10
[  762.424627]  btrfs_lookup+0x12/0x30
[  762.425204]  kthread+0xd2/0x100
[  762.425985]  __lookup_slow+0x89/0x130
[  762.426612]  ? __pfx_kthread+0x10/0x10
[  762.427184]  ? __legitimize_path+0x2a/0x60
[  762.427736]  ret_from_fork+0x34/0x50
[  762.428181]  walk_component+0xdb/0x150
[  762.428763]  ? __pfx_kthread+0x10/0x10
[  762.429286]  path_lookupat+0x6a/0x1a0
[  762.429904]  ret_from_fork_asm+0x1a/0x30
[  762.430402]  filename_lookup+0xf2/0x200
[  762.430994]  </TASK>
[  762.431562]  ? __pfx_page_put_link+0x10/0x10
[  762.432072] ---[ end trace ]---
[  762.434701]  vfs_statx+0x79/0xe0
[  762.435160]  ? strncpy_from_user+0x24/0x100
[  762.435789]  vfs_fstatat+0x6b/0xa0
[  762.436269]  __do_sys_newfstatat+0x3c/0x80
[  762.436920]  do_syscall_64+0x82/0x160
[  762.437437]  ? __count_memcg_events+0xc0/0x180
[  762.438102]  ? count_memcg_events.constprop.0+0x1a/0x30
[  762.438878]  ? handle_mm_fault+0x21b/0x330
[  762.439451]  ? do_user_addr_fault+0x55a/0x7b0
[  762.440143]  ? exc_page_fault+0x7e/0x180
[  762.440756]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[  762.441457] RIP: 0033:0x7f273e6ee87e
[  762.442006] Code: 0f 1f 40 00 48 8b 15 91 55 10 00 f7 d8 64 89 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa 41 89 ca b8 06 01 00 00 0f 05 <3d> 00 f0 ff ff 77 0b 31 c0 c3 0f 1f 84 00 00 00 00 00 48 8b 15 59
[  762.444677] RSP: 002b:00007ffe04476c38 EFLAGS: 00000246 ORIG_RAX: 0000000000000106
[  762.445781] RAX: ffffffffffffffda RBX: 00007ffe04477d90 RCX: 00007f273e6ee87e
[  762.446805] RDX: 00007ffe04476c60 RSI: 000055e5f65ebe40 RDI: 00000000ffffff9c
[  762.447853] RBP: 00007ffe04477d30 R08: 000055e5f65ebea0 R09: 00007f273e7f4b20
[  762.448899] R10: 0000000000000000 R11: 0000000000000246 R12: 000055e5f65ebe40
[  762.449956] R13: 000055e5f65eb2a0 R14: 00007ffe04477d88 R15: 0000000000000000
[  762.450983]  </TASK>
[  762.451299] Modules linked in: xfs(+) libfc scsi_transport_fc iscsi_ibft dm_crypt vfat fat dm_round_robin dm_multipath raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx raid1 raid0 uinput snd_seq_dummy snd_hrtimer nf_conntrack_netbios_ns nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 rfkill ip_set nf_tables qrtr snd_hda_codec_generic snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device snd_pcm pktcdvd snd_timer ppdev snd soundcore parport_pc pcspkr parport i2c_piix4 joydev i2c_smbus binfmt_misc nfnetlink vsock_loopback vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vsock vmw_vmci overlay squashfs isofs crc32c_intel sha512_ssse3 sha256_ssse3 sha1_ssse3 virtio_blk floppy virtio_net virtio_scsi net_failover failover ata_generic pata_acpi virtio_gpu virtio_dma_buf serio_raw nvme_tcp
[  762.451368]  nvme_fabrics nvme_keyring nvme_core nvme_auth sunrpc be2iscsi bnx2i cnic uio cxgb4i cxgb4 tls cxgb3i cxgb3 mdio libcxgbi libcxgb qla4xxx iscsi_boot_sysfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi loop fuse i2c_dev qemu_fw_cfg virtio_console
[  762.467324] ---[ end trace 0000000000000000 ]---
[  762.468042] RIP: 0010:__kmalloc_cache_noprof+0x1e0/0x3e0
[  762.468852] Code: 84 08 ff ff ff 48 85 db 0f 84 ff fe ff ff 48 8b 03 48 c1 e8 36 41 39 c2 0f 85 ef fe ff ff 41 8b 44 24 28 49 8b 34 24 48 01 f8 <48> 8b 18 48 89 c1 49 33 9c 24 b8 00 00 00 48 89 f8 48 0f c9 48 31
[  762.471497] RSP: 0018:ffffb9e4c378ba20 EFLAGS: 00010286
[  762.472264] RAX: cd3e3ac14fbb0ec6 RBX: fffff15ac410c040 RCX: 0000000000000006
[  762.473317] RDX: 000000000f384001 RSI: 000000000003c040 RDI: cd3e3ac14fbb0e96
[  762.474400] RBP: ffffb9e4c378ba70 R08: ffff8c0483673000 R09: 0000000000000000
[  762.475451] R10: 00000000ffffffff R11: 6e72757465725f65 R12: ffff8c0580042200
[  762.476540] R13: 0000000000000dc0 R14: 0000000000000058 R15: ffffffffa46568b8
[  762.477589] FS:  00007f273ecd0740(0000) GS:ffff8c05bbc00000(0000) knlGS:0000000000000000
[  762.478797] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  762.479666] CR2: 00007ffe04476d08 CR3: 000000001761a000 CR4: 00000000000006f0
[  762.480698] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  762.481754] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  762.496883] ------------[ cut here ]------------
[  762.497632] UBSAN: shift-out-of-bounds in ./include/linux/xarray.h:1604:27
[  762.498680] shift exponent 249 is too large for 64-bit type 'long unsigned int'
[  762.499751] CPU: 0 UID: 0 PID: 3904 Comm: modprobe Tainted: G      D           -------  ---  6.13.0-0.rc0.20241126git7eef7e306d3c.10.fc42.x86_64 #1
[  762.501637] Tainted: [D]=DIE
[  762.502047] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-3.fc41 04/01/2014
[  762.503288] Call Trace:
[  762.503689]  <TASK>
[  762.503996]  dump_stack_lvl+0x5d/0x80
[  762.504562]  ubsan_epilogue+0x5/0x30
[  762.505069]  __ubsan_handle_shift_out_of_bounds.cold+0x61/0xe6
[  762.505920]  filemap_get_entry.cold+0x16/0x20
[  762.506577]  __filemap_get_folio+0x2e/0x2f0
[  762.507162]  ? filemap_add_folio+0xc4/0xe0
[  762.507779]  alloc_extent_buffer+0x34f/0xa00
[  762.508379]  read_block_for_search+0x1f8/0x400
[  762.509046]  btrfs_search_slot+0x33d/0x10c0
[  762.509700]  btrfs_lookup_dir_item+0x98/0xf0
[  762.510300]  btrfs_lookup_dentry+0xff/0x650
[  762.510969]  ? d_alloc_parallel+0x237/0x400
[  762.511604]  btrfs_lookup+0x12/0x30
[  762.512096]  __lookup_slow+0x89/0x130
[  762.512655]  ? __legitimize_path+0x2a/0x60
[  762.513231]  walk_component+0xdb/0x150
[  762.513802]  path_lookupat+0x6a/0x1a0
[  762.514318]  filename_lookup+0xf2/0x200
[  762.514917]  ? __pfx_page_put_link+0x10/0x10
[  762.515564]  vfs_statx+0x79/0xe0
[  762.516022]  ? strncpy_from_user+0x24/0x100
[  762.516666]  vfs_fstatat+0x6b/0xa0
[  762.517149]  __do_sys_newfstatat+0x3c/0x80
[  762.517782]  do_syscall_64+0x82/0x160
[  762.518300]  ? __count_memcg_events+0xc0/0x180
[  762.518965]  ? count_memcg_events.constprop.0+0x1a/0x30
[  762.519734]  ? handle_mm_fault+0x21b/0x330
[  762.520309]  ? do_user_addr_fault+0x55a/0x7b0
[  762.520978]  ? exc_page_fault+0x7e/0x180
[  762.522692]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[  762.523647] RIP: 0033:0x7fe85ceee87e
[  762.524155] Code: 0f 1f 40 00 48 8b 15 91 55 10 00 f7 d8 64 89 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa 41 89 ca b8 06 01 00 00 0f 05 <3d> 00 f0 ff ff 77 0b 31 c0 c3 0f 1f 84 00 00 00 00 00 48 8b 15 59
[  762.526829] RSP: 002b:00007ffdc5a4fe68 EFLAGS: 00000246 ORIG_RAX: 0000000000000106
[  762.527932] RAX: ffffffffffffffda RBX: 00007ffdc5a50fc0 RCX: 00007fe85ceee87e
[  762.528956] RDX: 00007ffdc5a4fe90 RSI: 000055730d30bf40 RDI: 00000000ffffff9c
[  762.529978] RBP: 00007ffdc5a50f60 R08: 000055730d30bfa0 R09: 00007fe85cff4b20
[  762.531000] R10: 0000000000000000 R11: 0000000000000246 R12: 000055730d30bf40
[  762.532021] R13: 000055730d30b2a0 R14: 00007ffdc5a50fb8 R15: 0000000000000000
[  762.533068]  </TASK>
[  762.533400] ---[ end trace ]---

The first update where this started happening was the kernel-6.13.0-0.rc0.20241125git9f16d5e6f220.8.fc42 update itself. Unfortunately I didn't spot that this was a new problem and just restarted the tests a few times till they passed, which allowed the update to go stable. Now it's happening frequently on every Rawhide update. I can't find a single case of it happening *before* the kernel-6.13.0-0.rc0.20241125git9f16d5e6f220.8.fc42 update.
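
For anyone triaging similar serial logs: the extract above contains several separate kernel reports (two oopses and several UBSAN reports), partly interleaved. Below is a minimal Python sketch for pulling out just the first line of each report. The function name and the list of header prefixes are my own choices for illustration and are not exhaustive; the sample lines are copied from the log above.

```python
import re

# Leading "[  762.160401] " style timestamp printed by the kernel.
TIMESTAMP = re.compile(r"^\[\s*\d+\.\d+\]\s*")

# Prefixes that start a kernel report. Illustrative, not a complete list.
HEADERS = ("Oops: ", "UBSAN: ", "kernel BUG at ", "WARNING: ")


def crash_headers(log_text):
    """Return the first line of each kernel report, timestamps stripped."""
    found = []
    for line in log_text.splitlines():
        body = TIMESTAMP.sub("", line)
        if body.startswith(HEADERS):
            found.append(body)
    return found


sample = """\
[  762.160401] Oops: general protection fault, probably for non-canonical address 0xcd3e3ac14fbb0ec6: 0000 [#1] PREEMPT SMP PTI
[  762.205395] Call Trace:
[  762.303943] UBSAN: shift-out-of-bounds in lib/radix-tree.c:88:31
[  762.368087] Oops: general protection fault, probably for non-canonical address 0xf00002000fd8175: 0000 [#2] PREEMPT SMP PTI
"""

for header in crash_headers(sample):
    print(header)
```

Running this over a whole serial log gives a quick view of how many distinct reports there were and in which order they fired.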

Comment 1 Adam Williamson 2024-12-01 01:24:34 UTC
Update: the crash doesn't always happen in the same code, so the trace event code seen in the first trace likely has nothing to do with it. Here are two other traces:

[  663.835239] Call Trace:
[  663.835798]  <TASK>
[  663.836339]  ? __die_body.cold+0x19/0x27
[  663.837232]  ? die_addr+0x3c/0x60
[  663.837969]  ? exc_general_protection+0x17d/0x400
[  663.839060]  ? asm_exc_general_protection+0x26/0x30
[  663.840212]  ? event_init+0x1d/0x70
[  663.840965]  trace_module_notify+0x19a/0x240
[  663.841965]  notifier_call_chain+0x5d/0xd0
[  663.842885]  blocking_notifier_call_chain_robust+0x65/0xc0
[  663.844113]  load_module+0x1cce/0x23e0
[  663.845068]  ? __do_sys_init_module+0x17a/0x1b0
[  663.846355]  __do_sys_init_module+0x17a/0x1b0
[  663.847320]  do_syscall_64+0x82/0x160
[  663.848111]  ? __vm_munmap+0xb9/0x170
[  663.849155]  ? syscall_exit_to_user_mode_prepare+0x173/0x1b0
[  663.850608]  ? syscall_exit_to_user_mode+0x10/0x210
[  663.851758]  ? do_syscall_64+0x8e/0x160
[  663.852695]  ? __handle_mm_fault+0xb55/0xfd0
[  663.853712]  ? syscall_exit_to_user_mode+0x10/0x210
[  663.854862]  ? __count_memcg_events+0xc0/0x180
[  663.855878]  ? count_memcg_events.constprop.0+0x1a/0x30
[  663.857056]  ? handle_mm_fault+0x21b/0x330
[  663.858196]  ? do_user_addr_fault+0x55a/0x7b0
[  663.859222]  ? exc_page_fault+0x7e/0x180
[  663.860088]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[  663.861371] RIP: 0033:0x7f07ac6d269e

[  650.995898] Call Trace:
[  650.996516]  <TASK>
[  650.996973]  ? __die_body.cold+0x19/0x27
[  650.997888]  ? die_addr+0x3c/0x60
[  650.998662]  ? exc_general_protection+0x17d/0x400
[  650.999749]  ? asm_exc_general_protection+0x26/0x30
[  651.000898]  ? avc_lookup+0x47/0x70
[  651.001721]  avc_has_perm_noaudit+0x3d/0xf0
[  651.002667]  selinux_inode_permission+0x13a/0x1e0
[  651.003823]  security_inode_permission+0x42/0xf0
[  651.004897]  link_path_walk.part.0.constprop.0+0xad/0x390
[  651.006164]  path_lookupat+0x3e/0x1a0
[  651.007003]  filename_lookup+0xf2/0x200
[  651.007904]  user_path_at+0x56/0x90
[  651.008717]  inotify_find_inode+0x21/0x80
[  651.009647]  __x64_sys_inotify_add_watch+0xbe/0x140
[  651.010770]  do_syscall_64+0x82/0x160
[  651.011666]  ? syscall_exit_to_user_mode_prepare+0x173/0x1b0
[  651.012928]  ? syscall_exit_to_user_mode+0x10/0x210
[  651.014083]  ? do_syscall_64+0x8e/0x160
[  651.014987]  ? syscall_exit_to_user_mode+0x1d5/0x210
[  651.016122]  ? do_syscall_64+0x8e/0x160
[  651.017104]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[  651.018284] RIP: 0033:0x7fa0592366cb

Steven Rostedt suggests it looks like memory corruption since it crashes in different places each time.
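
As a side note on the "probably for non-canonical address" wording in the GPF messages: the faulting values are not valid x86-64 pointers at all, which fits the memory corruption theory. Here is a minimal Python sketch of the canonical-address rule, assuming 48-bit virtual addresses (4-level paging); the function name and the va_bits parameter are mine, and the addresses are taken from the logs above.

```python
def is_canonical(addr, va_bits=48):
    """True if bits 63 down to (va_bits - 1) of a 64-bit address are all equal.

    Assumes 4-level paging (48-bit virtual addresses) by default.
    """
    top = addr >> (va_bits - 1)               # bit 47 and everything above it
    all_ones = (1 << (64 - va_bits + 1)) - 1  # 17 one-bits for va_bits=48
    return top == 0 or top == all_ones


# RAX at the first GPF and the address reported by the second GPF.
print(is_canonical(0xcd3e3ac14fbb0ec6))
print(is_canonical(0x0f00002000fd8175))
# For comparison: a kernel pointer (R08) and a user pointer (FS base).
print(is_canonical(0xffff8c0483673000))
print(is_canonical(0x00007f9440a36740))
```

The first two values fail the check and the last two pass it, so the kernel was dereferencing garbage rather than a stale but plausible pointer.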

Comment 2 Adam Williamson 2024-12-02 16:43:39 UTC
Looks like this doesn't only affect live installs: https://openqa.fedoraproject.org/tests/3064100 is a non-live install to BIOS that also appears to have been affected. I really need to try and get the system/anaconda logs for a failure, though, so we can see exactly what's happening when the GPF hits. Will twiddle with that today.

Comment 3 Adam Williamson 2024-12-02 18:28:41 UTC
Created attachment 2060849
video of a likely-related issue in UEFI installs

Huh. There is actually a bug on UEFI installs as well, which may be the UEFI equivalent of this one, but it's even weirder.

What happens there is that, while the installer is at the "installing bootloader" stage, the system suddenly reboots, then gets stuck at the shim/grub stage (not surprisingly, I guess, as it doesn't look like the installer actually completed cleanly). You can see this happen around 0:56 of the attached video: the installer is running...and suddenly we're rebooting. This is not normal. Usually the installer would run until it showed a 'complete' screen, and the test system is waiting to see that screen at this point.

I'm guessing we actually hit something similar to the BIOS path on the UEFI path too, but somehow, rather than causing the system to hang, it causes it to *reboot*, and we then attempt to boot an incomplete install, which fails to boot properly.

Comment 4 Adam Williamson 2024-12-02 18:30:39 UTC
Oh, one thing that may be relevant here - we set the qemu CPU model to Nehalem. We do this intentionally to make sure Fedora isn't inadvertently progressing beyond its intended CPU baseline.

Comment 5 Adam Williamson 2024-12-02 18:49:58 UTC
Ah, yeah, I think that *is* relevant. I hadn't been able to reproduce this locally before, but I just set my local VM's CPU to Nehalem and reproduced the UEFI version of this on the first try. The VM did indeed seem to be stuck at the bootloader install stage: it was not responding to any kind of input, and I couldn't switch to a VT. After it sat like that for a few seconds or minutes (I don't know how long it had been stuck before I saw it), it spontaneously rebooted and then failed to boot properly from the disk, just like the openQA test in the video.

It's good news that I can reproduce it locally now: I can fiddle about with settings to try and get the anaconda logs to the serial console without having to 'fly by wire' with openQA.
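
For anyone else who wants to try reproducing this, here is a rough Python sketch that builds a qemu command line with the CPU model pinned to Nehalem and the serial port written to a file. This is not the exact configuration used by openQA or by my local VM: the memory size, CPU count, machine type default, file names, and the function name are all assumptions for illustration. The guest kernel also needs something like console=ttyS0 on its command line for kernel messages to reach the serial port.

```python
import shlex


def qemu_argv(iso_path, disk_path, serial_log, machine="pc"):
    """Build a qemu command line for a Nehalem-model reproducer VM.

    All sizes and paths are illustrative assumptions, not the real
    openQA worker settings.
    """
    return [
        "qemu-system-x86_64",
        "-accel", "kvm",
        "-machine", machine,          # "pc" (i440FX) or "q35"
        "-cpu", "Nehalem",            # the CPU model that triggers the bug
        "-smp", "2",
        "-m", "4096",
        "-cdrom", iso_path,
        "-drive", f"file={disk_path},if=virtio,format=qcow2",
        "-serial", f"file:{serial_log}",  # capture the guest serial console
    ]


argv = qemu_argv("Fedora-Rawhide-live.iso", "install-target.qcow2", "serial.log")
print(shlex.join(argv))
```

Swapping "Nehalem" for a newer model in the same command is a cheap way to check whether the CPU model is really the trigger.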

Comment 6 Adam Williamson 2024-12-02 23:57:51 UTC
OK, here we go, managed to get combined kernel and anaconda logs when reproducing this locally. This is the UEFI case. Here's what we get:

Dec 02 18:55:26 localhost-live org.fedoraproject.Anaconda.Modules.Storage[5364]: INFO:program:Running in chroot '/mnt/sysroot'... grub2-mkconfig -o /boot/grub2/grub.cfg
[ 1243.766337] SGI XFS with ACLs, security attributes, realtime, scrub, quota, no debug enabled
[ 1243.801236] JFS: nTxBlock = 8192, nTxLock = 65536
[ 1244.145085] list_del corruption. next->prev should be ffff9c39c4489298, but was 7b8d4840438d4840. (next=ffff9c3a0233d640)
[ 1244.145273] ------------[ cut here ]------------
[ 1244.145276] kernel BUG at lib/list_debug.c:65!
[ 1244.145340] Oops: invalid opcode: 0000 [#1] PREEMPT SMP PTI
[ 1244.145352] CPU: 0 UID: 0 PID: 92 Comm: kworker/u9:4 Not tainted 6.13.0-0.rc0.20241126git7eef7e306d3c.10.fc42.x86_64 #1
[ 1244.145360] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS edk2-20240813-2.fc42 08/13/2024
[ 1244.145371] Workqueue: btrfs-delayed-meta btrfs_work_helper
[ 1244.145423] RIP: 0010:__list_del_entry_valid_or_report.cold+0x23/0x6f
[ 1244.145446] Code: e8 a5 8a fb ff 0f 0b 48 89 fe 48 c7 c7 18 7a e3 87 e8 94 8a fb ff 0f 0b 48 89 d1 48 c7 c7 38 7b e3 87 48 89 c2 e8 80 8a fb ff <0f> 0b 48 89 f2 48 89 fe 48 c7 c7 e8 7a e3 87 e8 6c 8a fb ff 0f 0b
[ 1244.145449] RSP: 0018:ffffbbf68030fd90 EFLAGS: 00010246
[ 1244.145460] RAX: 000000000000006d RBX: ffff9c38cdabc7a8 RCX: 0000000000000000
[ 1244.145464] RDX: 0000000000000000 RSI: ffff9c3a02e21900 RDI: ffff9c3a02e21900
[ 1244.145465] RBP: ffff9c39d61b08c0 R08: 0000000000000000 R09: 0000000000000000
[ 1244.145467] R10: 6e28202e30343834 R11: 666666663d747865 R12: ffff9c393e54f800
[ 1244.145468] R13: ffff9c39cb28bcc0 R14: ffff9c39c4489298 R15: ffff9c393c9b4c78
[ 1244.145469] FS:  0000000000000000(0000) GS:ffff9c3a02e00000(0000) knlGS:0000000000000000
[ 1244.145471] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1244.145472] CR2: 0000562d47c53900 CR3: 000000010218e000 CR4: 00000000000006f0
[ 1244.145477] Call Trace:
[ 1244.145497]  <TASK>
[ 1244.145502]  ? __die_body.cold+0x19/0x27
[ 1244.145514]  ? die+0x2e/0x50
[ 1244.145540]  ? do_trap+0xca/0x110
[ 1244.145546]  ? do_error_trap+0x6a/0x90
[ 1244.145548]  ? __list_del_entry_valid_or_report.cold+0x23/0x6f
[ 1244.145551]  ? exc_invalid_op+0x50/0x70
[ 1244.145565]  ? __list_del_entry_valid_or_report.cold+0x23/0x6f
[ 1244.145567]  ? asm_exc_invalid_op+0x1a/0x20
[ 1244.145591]  ? __list_del_entry_valid_or_report.cold+0x23/0x6f
[ 1244.145593]  btrfs_async_run_delayed_root+0x93/0x2c0
[ 1244.145605]  btrfs_work_helper+0xe8/0x380
[ 1244.145611]  ? drm_atomic_state_default_clear+0x1c3/0x2e0
[ 1244.145634]  process_one_work+0x179/0x330
[ 1244.145653]  worker_thread+0x252/0x390
[ 1244.145656]  ? __pfx_worker_thread+0x10/0x10
[ 1244.145657]  kthread+0xd2/0x100
[ 1244.145664]  ? __pfx_kthread+0x10/0x10
[ 1244.145667]  ret_from_fork+0x34/0x50
[ 1244.145674]  ? __pfx_kthread+0x10/0x10
[ 1244.145676]  ret_from_fork_asm+0x1a/0x30
[ 1244.145694]  </TASK>
[ 1244.145697] Modules linked in: ufs hfsplus hfs minix msdos jfs nls_ucs2_utils xfs libfc scsi_transport_fc iscsi_ibft dm_crypt vfat fat dm_round_robin dm_multipath raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx raid1 raid0 nvme_tcp nvme_fabrics uinput snd_seq_dummy snd_hrtimer nf_conntrack_netbios_ns nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set rfkill nf_tables qrtr snd_hda_codec_generic snd_hda_intel joydev snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd soundcore pktcdvd iTCO_wdt i2c_i801 intel_pmc_bxt iTCO_vendor_support virtio_balloon i2c_smbus pcspkr lpc_ich binfmt_misc nfnetlink vsock_loopback vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vsock vmw_vmci overlay squashfs isofs crc32c_intel sha512_ssse3 sha256_ssse3 virtio_net sha1_ssse3 virtio_scsi
[ 1244.145764]  net_failover virtio_console virtio_blk failover virtio_gpu virtio_dma_buf nvme_keyring nvme_core nvme_auth serio_raw sunrpc be2iscsi bnx2i cnic uio cxgb4i cxgb4 tls cxgb3i cxgb3 mdio libcxgbi libcxgb qla4xxx iscsi_boot_sysfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi loop fuse qemu_fw_cfg [last unloaded: nvme_fabrics]
[ 1244.145798] ---[ end trace 0000000000000000 ]---
[ 1244.145802] RIP: 0010:__list_del_entry_valid_or_report.cold+0x23/0x6f
[ 1244.145804] Code: e8 a5 8a fb ff 0f 0b 48 89 fe 48 c7 c7 18 7a e3 87 e8 94 8a fb ff 0f 0b 48 89 d1 48 c7 c7 38 7b e3 87 48 89 c2 e8 80 8a fb ff <0f> 0b 48 89 f2 48 89 fe 48 c7 c7 e8 7a e3 87 e8 6c 8a fb ff 0f 0b
[ 1244.145806] RSP: 0018:ffffbbf68030fd90 EFLAGS: 00010246
[ 1244.145808] RAX: 000000000000006d RBX: ffff9c38cdabc7a8 RCX: 0000000000000000
[ 1244.145809] RDX: 0000000000000000 RSI: ffff9c3a02e21900 RDI: ffff9c3a02e21900
[ 1244.145810] RBP: ffff9c39d61b08c0 R08: 0000000000000000 R09: 0000000000000000
[ 1244.145811] R10: 6e28202e30343834 R11: 666666663d747865 R12: ffff9c393e54f800
[ 1244.145812] R13: ffff9c39cb28bcc0 R14: ffff9c39c4489298 R15: ffff9c393c9b4c78
[ 1244.145814] FS:  0000000000000000(0000) GS:ffff9c3a02e00000(0000) knlGS:0000000000000000
[ 1244.145815] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1244.145846] CR2: 0000562d47c53900 CR3: 000000010218e000 CR4: 00000000000006f0
[ 1244.145859] note: kworker/u9:4[92] exited with preempt_count 1
[ 1245.822413] show_signal_msg: 17 callbacks suppressed
[ 1245.822421] Isolated Web Co[5650]: segfault at 0 ip 00007fcb20e99fee sp 00007fff2b2f21e8 error 6 in libxul.so[51e9fee,7fcb1bcb0000+51f7000] likely on CPU 1 (core 0, socket 1)
[ 1245.822446] Code: ff e8 66 c7 e1 fa 66 0f ef ed 0f 28 f5 e9 a3 fb ff ff 66 2e 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 48 8b 05 ed c7 de 02 48 89 10 <c7> 04 25 00 00 00 00 00 00 00 00 0f 0b 0f 1f 44 00 00 f3 0f 1e fa

...so we were running grub2-mkconfig in a chroot of the installed system when it blew up. This message looks interesting:
list_del corruption. next->prev should be ffff9c39c4489298, but was 7b8d4840438d4840. (next=ffff9c3a0233d640)
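
For anyone unfamiliar with that message: it comes from the kernel's list debugging, which checks that an entry's neighbours still point back at it before unlinking it. Below is a minimal Python model of that check, not the kernel's C code. The `Node` class and the function names are hypothetical and only illustrate the invariant; in the log above, the "should be" value is the entry being removed and the "but was" value is whatever was actually found in the neighbour's `prev` field.

```python
class Node:
    """Hypothetical stand-in for a doubly linked list entry."""

    def __init__(self, name):
        self.name = name
        # A freshly initialised entry points at itself in both directions.
        self.prev = self
        self.next = self


def list_add(new, head):
    """Insert `new` directly after `head`."""
    new.prev = head
    new.next = head.next
    head.next.prev = new
    head.next = new


def del_entry_valid(entry):
    """Return None if `entry` can be unlinked safely, else a report string."""
    if entry.prev.next is not entry:
        return ("list_del corruption. prev->next should be "
                f"{entry.name}, but was {entry.prev.next.name}")
    if entry.next.prev is not entry:
        return ("list_del corruption. next->prev should be "
                f"{entry.name}, but was {entry.next.prev.name}")
    return None


# Build head <-> a <-> b <-> head, then simulate memory corruption by
# overwriting b's back pointer with something unrelated.
head, a, b = Node("head"), Node("a"), Node("b")
list_add(a, head)
list_add(b, a)
assert del_entry_valid(a) is None
b.prev = Node("garbage")
print(del_entry_valid(a))
```

In this model the final call reports that `next->prev` no longer matches the entry being removed, which is the same shape of failure as in the trace: something overwrote the neighbouring entry's back pointer before the delete ran.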

Comment 7 Adam Williamson 2024-12-02 23:58:52 UTC
Hmm, that's actually different from the others. Not sure if it's a UEFI vs. BIOS difference; this is the first time I've seen the actual death on UEFI. I'll reproduce this test process on BIOS and see what we get there.

Comment 8 Adam Williamson 2024-12-03 08:06:06 UTC
OK, here's what I got on a BIOS reproducer. Some other stuff happens before the GPF: two ftrace warnings and a static_call BUG, all in the modprobe process that is loading xfs.

Dec 03 01:16:49 localhost-live org.fedoraproject.Anaconda.Modules.Storage[3508]: INFO:program:Running in chroot '/mnt/sysroot'... grub2-mkconfig -o /boot/grub2/grub.cfg
[  588.823333] ------------[ cut here ]------------
[  588.823338] WARNING: CPU: 0 PID: 4325 at arch/x86/kernel/ftrace.c:100 ftrace_verify_code+0x4c/0x90
[  588.823359] Modules linked in: libfc scsi_transport_fc iscsi_ibft dm_crypt vfat fat dm_round_robin dm_multipath raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx raid1 raid0 uinput snd_seq_dummy snd_hrtimer nf_conntrack_netbios_ns nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set rfkill nf_tables qrtr snd_hda_codec_generic snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec joydev snd_hda_core snd_hwdep snd_seq snd_seq_device iTCO_wdt intel_pmc_bxt snd_pcm iTCO_vendor_support pktcdvd i2c_i801 lpc_ich snd_timer snd soundcore pcspkr virtio_balloon i2c_smbus binfmt_misc nfnetlink vsock_loopback vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vsock vmw_vmci overlay squashfs isofs crc32c_intel sha512_ssse3 sha256_ssse3 virtio_net sha1_ssse3 virtio_blk virtio_console virtio_scsi virtio_gpu net_failover failover virtio_dma_buf
[  588.823418]  nvme_tcp nvme_fabrics nvme_keyring serio_raw nvme_core nvme_auth sunrpc be2iscsi bnx2i cnic uio cxgb4i cxgb4 tls cxgb3i cxgb3 mdio libcxgbi libcxgb qla4xxx iscsi_boot_sysfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi loop fuse i2c_dev qemu_fw_cfg
[  588.823442] CPU: 0 UID: 0 PID: 4325 Comm: modprobe Not tainted 6.13.0-0.rc0.20241126git7eef7e306d3c.10.fc42.x86_64 #1
[  588.823445] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-3.fc41 04/01/2014
[  588.823450] RIP: 0010:ftrace_verify_code+0x4c/0x90
[  588.823453] Code: fe c6 44 24 07 00 48 89 c7 c7 44 24 03 00 00 00 00 e8 98 6e 30 00 48 85 c0 75 3e 8b 03 39 44 24 03 74 28 48 89 1d fc b3 7e 03 <0f> 0b b8 ea ff ff ff 48 8b 54 24 08 65 48 2b 14 25 28 00 00 00 75
[  588.823454] RSP: 0018:ffffbb83833ef860 EFLAGS: 00010206
[  588.823459] RAX: 00000000b898a7e8 RBX: ffffffffbd7c2e08 RCX: 0000000000000010
[  588.823460] RDX: ffffa07bd290a880 RSI: 0000000000000005 RDI: ffffffffc1514064
[  588.823461] RBP: ffffffffc1514064 R08: 0000000000000000 R09: ffffa07c3bc3ee20
[  588.823462] R10: ffffbb83833ef8a0 R11: dead000000000040 R12: ffffffffbb60b5ea
[  588.823463] R13: 0000000002000000 R14: ffffa07bcbe50ce0 R15: 0000000000000f3e
[  588.823465] FS:  00007f65a27c8740(0000) GS:ffffa07c3bc00000(0000) knlGS:0000000000000000
[  588.823466] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  588.823467] CR2: 00007f3ef9753f10 CR3: 00000000027a8000 CR4: 00000000000006f0
[  588.823471] Call Trace:
[  588.823478]  <TASK>
[  588.823479]  ? ftrace_verify_code+0x4c/0x90
[  588.823482]  ? __warn.cold+0x93/0xfa
[  588.823484]  ? ftrace_verify_code+0x4c/0x90
[  588.823493]  ? report_bug+0xff/0x140
[  588.823496]  ? handle_bug+0x58/0x90
[  588.823498]  ? exc_invalid_op+0x17/0x70
[  588.823500]  ? asm_exc_invalid_op+0x1a/0x20
[  588.823516]  ? ftrace_verify_code+0x4c/0x90
[  588.823518]  ftrace_modify_code_direct+0xf/0x70
[  588.823522]  ftrace_process_locs+0x313/0x5c0
[  588.823532]  ftrace_module_init+0x32/0x50
[  588.823537]  load_module+0x1b1a/0x23e0
[  588.823552]  ? __do_sys_init_module+0x17a/0x1b0
[  588.823554]  __do_sys_init_module+0x17a/0x1b0
[  588.823557]  do_syscall_64+0x82/0x160
[  588.823565]  ? __vm_munmap+0xb9/0x170
[  588.823580]  ? syscall_exit_to_user_mode+0x10/0x210
[  588.823583]  ? do_syscall_64+0x8e/0x160
[  588.823585]  ? __alloc_pages_noprof+0x184/0x330
[  588.823591]  ? __mod_memcg_lruvec_state+0xdf/0x220
[  588.823599]  ? __count_memcg_events+0xc0/0x180
[  588.823601]  ? __lruvec_stat_mod_folio+0x83/0xd0
[  588.823604]  ? set_ptes.isra.0+0x41/0x90
[  588.823606]  ? do_anonymous_page+0xfc/0x8f0
[  588.823609]  ? __pte_offset_map+0x1b/0x180
[  588.823613]  ? __handle_mm_fault+0xb55/0xfd0
[  588.823616]  ? __count_memcg_events+0xc0/0x180
[  588.823619]  ? count_memcg_events.constprop.0+0x1a/0x30
[  588.823621]  ? handle_mm_fault+0x21b/0x330
[  588.823624]  ? do_user_addr_fault+0x55a/0x7b0
[  588.823631]  ? exc_page_fault+0x7e/0x180
[  588.823633]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[  588.823637] RIP: 0033:0x7f65a210069e
[  588.823649] Code: 48 8b 0d 75 37 0f 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 49 89 ca b8 af 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 42 37 0f 00 f7 d8 64 89 01 48
[  588.823653] RSP: 002b:00007ffdb33ac7f8 EFLAGS: 00000246 ORIG_RAX: 00000000000000af
[  588.823655] RAX: ffffffffffffffda RBX: 0000557dfc420ca0 RCX: 00007f65a210069e
[  588.823656] RDX: 0000557dbeee3715 RSI: 0000000000751c7e RDI: 00007f65a0b79010
[  588.823657] RBP: 00007ffdb33ac8b0 R08: 0000557dfc420010 R09: 0000000000000007
[  588.823658] R10: 0000000000000001 R11: 0000000000000246 R12: 0000557dbeee3715
[  588.823659] R13: 0000000000040000 R14: 0000557dfc420db0 R15: 0000000000000000
[  588.823661]  </TASK>
[  588.823662] ---[ end trace 0000000000000000 ]---
[  588.823665] ------------[ ftrace bug ]------------
[  588.823666] ftrace failed to modify 
[  588.823667] [<ffffffffc1514064>] 0xffffffffc1514064
[  588.823671]  actual:   e8:a7:d8:bc:f8
[  588.823673]  expected: e8:a7:98:b8:f8
[  588.823675] Initializing ftrace call sites
[  588.823675] ftrace record flags: 2000000
[  588.823676]  (0)    
                expected tramp: ffffffffba09d940
[  588.823704] ------------[ cut here ]------------
[  588.823705] WARNING: CPU: 0 PID: 4325 at kernel/trace/ftrace.c:2234 ftrace_bug+0x23c/0x264
[  588.823711] Modules linked in: libfc scsi_transport_fc iscsi_ibft dm_crypt vfat fat dm_round_robin dm_multipath raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx raid1 raid0 uinput snd_seq_dummy snd_hrtimer nf_conntrack_netbios_ns nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set rfkill nf_tables qrtr snd_hda_codec_generic snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec joydev snd_hda_core snd_hwdep snd_seq snd_seq_device iTCO_wdt intel_pmc_bxt snd_pcm iTCO_vendor_support pktcdvd i2c_i801 lpc_ich snd_timer snd soundcore pcspkr virtio_balloon i2c_smbus binfmt_misc nfnetlink vsock_loopback vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vsock vmw_vmci overlay squashfs isofs crc32c_intel sha512_ssse3 sha256_ssse3 virtio_net sha1_ssse3 virtio_blk virtio_console virtio_scsi virtio_gpu net_failover failover virtio_dma_buf
[  588.823752]  nvme_tcp nvme_fabrics nvme_keyring serio_raw nvme_core nvme_auth sunrpc be2iscsi bnx2i cnic uio cxgb4i cxgb4 tls cxgb3i cxgb3 mdio libcxgbi libcxgb qla4xxx iscsi_boot_sysfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi loop fuse i2c_dev qemu_fw_cfg
[  588.823765] CPU: 0 UID: 0 PID: 4325 Comm: modprobe Tainted: G        W         -------  ---  6.13.0-0.rc0.20241126git7eef7e306d3c.10.fc42.x86_64 #1
[  588.823768] Tainted: [W]=WARN
[  588.823769] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-3.fc41 04/01/2014
[  588.823770] RIP: 0010:ftrace_bug+0x23c/0x264
[  588.823774] Code: ff 84 c0 74 d4 eb b8 48 c7 c7 99 27 ec bb e8 ab 71 ff ff 48 89 ef e8 03 e3 0e ff 48 c7 c7 aa 27 ec bb 48 89 c6 e8 94 71 ff ff <0f> 0b c7 05 d0 28 ad 01 01 00 00 00 5b 31 c0 5d 41 5c 89 05 d4 28
[  588.823776] RSP: 0018:ffffbb83833ef880 EFLAGS: 00010246
[  588.823777] RAX: 0000000000000022 RBX: ffffffffc1514064 RCX: 0000000000000027
[  588.823778] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffffa07c3bc21900
[  588.823779] RBP: ffffa07bc832f3e0 R08: 0000000000000000 R09: 0000000000000000
[  588.823780] R10: 203a706d61727420 R11: 6465746365707865 R12: 00000000ffffffea
[  588.823781] R13: 0000000002000000 R14: ffffa07bcbe50ce0 R15: 0000000000000f3e
[  588.823782] FS:  00007f65a27c8740(0000) GS:ffffa07c3bc00000(0000) knlGS:0000000000000000
[  588.823784] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  588.823785] CR2: 00007f3ef9753f10 CR3: 00000000027a8000 CR4: 00000000000006f0
[  588.823788] Call Trace:
[  588.823793]  <TASK>
[  588.823793]  ? ftrace_bug+0x23c/0x264
[  588.823796]  ? __warn.cold+0x93/0xfa
[  588.823798]  ? ftrace_bug+0x23c/0x264
[  588.823801]  ? report_bug+0xff/0x140
[  588.823804]  ? handle_bug+0x58/0x90
[  588.823805]  ? exc_invalid_op+0x17/0x70
[  588.823807]  ? asm_exc_invalid_op+0x1a/0x20
[  588.823810]  ? ftrace_bug+0x23c/0x264
[  588.823813]  ftrace_process_locs.cold+0x14/0xad
[  588.823817]  ftrace_module_init+0x32/0x50
[  588.823819]  load_module+0x1b1a/0x23e0
[  588.823823]  ? __do_sys_init_module+0x17a/0x1b0
[  588.823825]  __do_sys_init_module+0x17a/0x1b0
[  588.823828]  do_syscall_64+0x82/0x160
[  588.823831]  ? __vm_munmap+0xb9/0x170
[  588.823834]  ? syscall_exit_to_user_mode+0x10/0x210
[  588.823837]  ? do_syscall_64+0x8e/0x160
[  588.823839]  ? __alloc_pages_noprof+0x184/0x330
[  588.823842]  ? __mod_memcg_lruvec_state+0xdf/0x220
[  588.823844]  ? __count_memcg_events+0xc0/0x180
[  588.823847]  ? __lruvec_stat_mod_folio+0x83/0xd0
[  588.823849]  ? set_ptes.isra.0+0x41/0x90
[  588.823851]  ? do_anonymous_page+0xfc/0x8f0
[  588.823854]  ? __pte_offset_map+0x1b/0x180
[  588.823856]  ? __handle_mm_fault+0xb55/0xfd0
[  588.823859]  ? __count_memcg_events+0xc0/0x180
[  588.823862]  ? count_memcg_events.constprop.0+0x1a/0x30
[  588.823864]  ? handle_mm_fault+0x21b/0x330
[  588.823866]  ? do_user_addr_fault+0x55a/0x7b0
[  588.823869]  ? exc_page_fault+0x7e/0x180
[  588.823872]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[  588.823874] RIP: 0033:0x7f65a210069e
[  588.823877] Code: 48 8b 0d 75 37 0f 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 49 89 ca b8 af 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 42 37 0f 00 f7 d8 64 89 01 48
[  588.823880] RSP: 002b:00007ffdb33ac7f8 EFLAGS: 00000246 ORIG_RAX: 00000000000000af
[  588.823882] RAX: ffffffffffffffda RBX: 0000557dfc420ca0 RCX: 00007f65a210069e
[  588.823885] RDX: 0000557dbeee3715 RSI: 0000000000751c7e RDI: 00007f65a0b79010
[  588.823886] RBP: 00007ffdb33ac8b0 R08: 0000557dfc420010 R09: 0000000000000007
[  588.823887] R10: 0000000000000001 R11: 0000000000000246 R12: 0000557dbeee3715
[  588.823888] R13: 0000000000040000 R14: 0000557dfc420db0 R15: 0000000000000000
[  588.823890]  </TASK>
[  588.823890] ---[ end trace 0000000000000000 ]---
[  588.872305] unexpected static_call insn opcode 0x85 at xchk_da_btree+0x402/0x4e0 [xfs]
[  588.872498] ------------[ cut here ]------------
[  588.872499] kernel BUG at arch/x86/kernel/static_call.c:139!
[  588.872514] Oops: invalid opcode: 0000 [#1] PREEMPT SMP PTI
[  588.872519] CPU: 1 UID: 0 PID: 4325 Comm: modprobe Tainted: G        W         -------  ---  6.13.0-0.rc0.20241126git7eef7e306d3c.10.fc42.x86_64 #1
[  588.872523] Tainted: [W]=WARN
[  588.872524] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-3.fc41 04/01/2014
[  588.872526] RIP: 0010:__static_call_validate.cold+0x12/0x14
[  588.872531] Code: 48 c7 c7 d0 94 dc bb e8 a6 ea 00 00 48 8b 43 08 4c 8b 33 e9 6b f7 ee fe 48 89 fa 0f b6 f0 48 c7 c7 00 95 dc bb e8 88 ea 00 00 <0f> 0b 65 8b 35 bb 01 ed 44 48 c7 c7 48 96 dc bb e8 73 ea 00 00 fa
[  588.872533] RSP: 0000:ffffbb83833ef7d8 EFLAGS: 00010246
[  588.872536] RAX: 000000000000004a RBX: ffffffffbb2353e0 RCX: 0000000000000000
[  588.872538] RDX: 0000000000000000 RSI: ffffa07c3bd21900 RDI: ffffa07c3bd21900
[  588.872539] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[  588.872541] R10: 65657274625f6164 R11: 302f32303478302b R12: ffffffffc1517162
[  588.872542] R13: 0000000000000000 R14: ffffffffc1274368 R15: ffffffffbc878220
[  588.872543] FS:  00007f65a27c8740(0000) GS:ffffa07c3bd00000(0000) knlGS:0000000000000000
[  588.872545] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  588.872546] CR2: 00007f8418c0f000 CR3: 00000000027a8000 CR4: 00000000000006f0
[  588.872551] Call Trace:
[  588.872559]  <TASK>
[  588.872562]  ? __die_body.cold+0x19/0x27
[  588.872566]  ? die+0x2e/0x50
[  588.872578]  ? do_trap+0xca/0x110
[  588.872583]  ? do_error_trap+0x6a/0x90
[  588.872585]  ? __static_call_validate.cold+0x12/0x14
[  588.872587]  ? exc_invalid_op+0x50/0x70
[  588.872589]  ? __static_call_validate.cold+0x12/0x14
[  588.872591]  ? asm_exc_invalid_op+0x1a/0x20
[  588.872598]  ? xchk_da_btree+0x402/0x4e0 [xfs]
[  588.872735]  ? __pfx___cond_resched+0x10/0x10
[  588.872739]  ? __static_call_validate.cold+0x12/0x14
[  588.872741]  ? __static_call_validate.cold+0x12/0x14
[  588.872743]  arch_static_call_transform+0x6b/0xa0
[  588.872749]  ? xchk_da_btree+0x402/0x4e0 [xfs]
[  588.872862]  __static_call_init+0x276/0x300
[  588.872870]  ? __is_insn_slot_addr+0x45/0x70
[  588.872876]  ? __SCT__tp_func_ipi_exit+0x8/0x8
[  588.872880]  ? __SCT__tp_func_ipi_exit+0x8/0x8
[  588.872881]  static_call_module_notify+0x11f/0x150
[  588.872884]  notifier_call_chain+0x5d/0xd0
[  588.872897]  blocking_notifier_call_chain_robust+0x65/0xc0
[  588.872902]  load_module+0x1cce/0x23e0
[  588.872909]  ? __do_sys_init_module+0x17a/0x1b0
[  588.872911]  __do_sys_init_module+0x17a/0x1b0
[  588.872915]  do_syscall_64+0x82/0x160
[  588.872920]  ? __vm_munmap+0xb9/0x170
[  588.872924]  ? syscall_exit_to_user_mode+0x10/0x210
[  588.872928]  ? do_syscall_64+0x8e/0x160
[  588.872931]  ? __alloc_pages_noprof+0x184/0x330
[  588.872934]  ? __mod_memcg_lruvec_state+0xdf/0x220
[  588.872938]  ? __count_memcg_events+0xc0/0x180
[  588.872941]  ? __lruvec_stat_mod_folio+0x83/0xd0
[  588.872944]  ? set_ptes.isra.0+0x41/0x90
[  588.872946]  ? do_anonymous_page+0xfc/0x8f0
[  588.872949]  ? __pte_offset_map+0x1b/0x180
[  588.872952]  ? __handle_mm_fault+0xb55/0xfd0
[  588.872956]  ? __count_memcg_events+0xc0/0x180
[  588.872959]  ? count_memcg_events.constprop.0+0x1a/0x30
[  588.872961]  ? handle_mm_fault+0x21b/0x330
[  588.872967]  ? do_user_addr_fault+0x55a/0x7b0
[  588.872971]  ? exc_page_fault+0x7e/0x180
[  588.872974]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[  588.872978] RIP: 0033:0x7f65a210069e
[  588.872992] Code: 48 8b 0d 75 37 0f 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 49 89 ca b8 af 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 42 37 0f 00 f7 d8 64 89 01 48
[  588.872994] RSP: 002b:00007ffdb33ac7f8 EFLAGS: 00000246 ORIG_RAX: 00000000000000af
[  588.872997] RAX: ffffffffffffffda RBX: 0000557dfc420ca0 RCX: 00007f65a210069e
[  588.872998] RDX: 0000557dbeee3715 RSI: 0000000000751c7e RDI: 00007f65a0b79010
[  588.872999] RBP: 00007ffdb33ac8b0 R08: 0000557dfc420010 R09: 0000000000000007
[  588.873001] R10: 0000000000000001 R11: 0000000000000246 R12: 0000557dbeee3715
[  588.873002] R13: 0000000000040000 R14: 0000557dfc420db0 R15: 0000000000000000
[  588.873005]  </TASK>
[  588.873006] Modules linked in: xfs(+) libfc scsi_transport_fc iscsi_ibft dm_crypt vfat fat dm_round_robin dm_multipath raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx raid1 raid0 uinput snd_seq_dummy snd_hrtimer nf_conntrack_netbios_ns nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set rfkill nf_tables qrtr snd_hda_codec_generic snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec joydev snd_hda_core snd_hwdep snd_seq snd_seq_device iTCO_wdt intel_pmc_bxt snd_pcm iTCO_vendor_support pktcdvd i2c_i801 lpc_ich snd_timer snd soundcore pcspkr virtio_balloon i2c_smbus binfmt_misc nfnetlink vsock_loopback vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vsock vmw_vmci overlay squashfs isofs crc32c_intel sha512_ssse3 sha256_ssse3 virtio_net sha1_ssse3 virtio_blk virtio_console virtio_scsi virtio_gpu net_failover failover
[  588.873072]  virtio_dma_buf nvme_tcp nvme_fabrics nvme_keyring serio_raw nvme_core nvme_auth sunrpc be2iscsi bnx2i cnic uio cxgb4i cxgb4 tls cxgb3i cxgb3 mdio libcxgbi libcxgb qla4xxx iscsi_boot_sysfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi loop fuse i2c_dev qemu_fw_cfg
[  588.873108] ---[ end trace 0000000000000000 ]---
[  588.873110] RIP: 0010:__static_call_validate.cold+0x12/0x14
[  588.873112] Code: 48 c7 c7 d0 94 dc bb e8 a6 ea 00 00 48 8b 43 08 4c 8b 33 e9 6b f7 ee fe 48 89 fa 0f b6 f0 48 c7 c7 00 95 dc bb e8 88 ea 00 00 <0f> 0b 65 8b 35 bb 01 ed 44 48 c7 c7 48 96 dc bb e8 73 ea 00 00 fa
[  588.873114] RSP: 0000:ffffbb83833ef7d8 EFLAGS: 00010246
[  588.873116] RAX: 000000000000004a RBX: ffffffffbb2353e0 RCX: 0000000000000000
[  588.873117] RDX: 0000000000000000 RSI: ffffa07c3bd21900 RDI: ffffa07c3bd21900
[  588.873119] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[  588.873120] R10: 65657274625f6164 R11: 302f32303478302b R12: ffffffffc1517162
[  588.873121] R13: 0000000000000000 R14: ffffffffc1274368 R15: ffffffffbc878220
[  588.873122] FS:  00007f65a27c8740(0000) GS:ffffa07c3bd00000(0000) knlGS:0000000000000000
[  588.873124] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  588.873125] CR2: 00007f8418c0f000 CR3: 00000000027a8000 CR4: 00000000000006f0
[  588.891057] Oops: general protection fault, probably for non-canonical address 0x578a7de75fdfe55b: 0000 [#2] PREEMPT SMP PTI
[  588.891066] CPU: 0 UID: 1000 PID: 2146 Comm: kwin_wayland Tainted: G      D W         -------  ---  6.13.0-0.rc0.20241126git7eef7e306d3c.10.fc42.x86_64 #1
[  588.891081] Tainted: [D]=DIE, [W]=WARN
[  588.891082] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-3.fc41 04/01/2014
[  588.891084] RIP: 0010:kmem_cache_alloc_node_noprof+0x1cf/0x400
[  588.891094] Code: 84 12 ff ff ff 48 85 db 0f 84 09 ff ff ff 48 8b 03 48 c1 e8 36 41 39 c3 0f 85 f9 fe ff ff 41 8b 44 24 28 49 8b 34 24 48 01 f8 <48> 8b 18 48 89 c1 49 33 9c 24 b8 00 00 00 48 89 f8 48 0f c9 48 31
[  588.891096] RSP: 0018:ffffbb8381873690 EFLAGS: 00010202
[  588.891099] RAX: 578a7de75fdfe55b RBX: ffffee6f8418d740 RCX: 00000000ffffffff
[  588.891103] RDX: 00000000af4a6000 RSI: 0000000000040160 RDI: 578a7de75fdfe4eb
[  588.891106] RBP: ffffbb83818736e8 R08: 0000000000400cc0 R09: 0000000000000003
[  588.891108] R10: ffffffffbae8686a R11: 00000000ffffffff R12: ffffa07bc02af100
[  588.891109] R13: 0000000000400cc0 R14: 00000000ffffffff R15: 00000000000000e8
[  588.891111] FS:  00007f3f4367c280(0000) GS:ffffa07c3bc00000(0000) knlGS:0000000000000000
[  588.891113] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  588.891114] CR2: 00007f3ef9b60324 CR3: 000000000128c000 CR4: 00000000000006f0
[  588.891119] Call Trace:
[  588.891121]  <TASK>
[  588.891125]  ? __die_body.cold+0x19/0x27
[  588.891130]  ? die_addr+0x3c/0x60
[  588.891133]  ? exc_general_protection+0x17d/0x400
[  588.891137]  ? asm_exc_general_protection+0x26/0x30
[  588.891141]  ? __alloc_skb+0x14a/0x1a0
[  588.891151]  ? kmem_cache_alloc_node_noprof+0x1cf/0x400
[  588.891155]  ? __alloc_skb+0x14a/0x1a0
[  588.891156]  __alloc_skb+0x14a/0x1a0
[  588.891159]  alloc_skb_with_frags+0x67/0x2d0
[  588.891163]  sock_alloc_send_pskb+0x1f7/0x240
[  588.891172]  ? avc_has_perm+0x5d/0xe0
[  588.891185]  unix_stream_sendmsg+0x17d/0x680
[  588.891190]  ____sys_sendmsg+0x3a0/0x3d0
[  588.891195]  ___sys_sendmsg+0x9a/0xe0
[  588.891199]  __sys_sendmsg+0x87/0xe0
[  588.891202]  do_syscall_64+0x82/0x160
[  588.891206]  ? _copy_to_user+0x36/0x50
[  588.891217]  ? drm_ioctl+0x2b7/0x530
[  588.891231]  ? __pfx_drm_mode_atomic_ioctl+0x10/0x10
[  588.891242]  ? syscall_exit_to_user_mode+0x10/0x210
[  588.891246]  ? do_syscall_64+0x8e/0x160
[  588.891249]  ? syscall_exit_to_user_mode+0x10/0x210
[  588.891252]  ? do_syscall_64+0x8e/0x160
[  588.891255]  ? syscall_exit_to_user_mode+0x10/0x210
[  588.891257]  ? do_syscall_64+0x8e/0x160
[  588.891260]  ? drm_mode_addfb2+0xda/0xf0
[  588.891262]  ? __pfx_drm_mode_addfb2_ioctl+0x10/0x10
[  588.891264]  ? _copy_from_user+0x29/0x70
[  588.891265]  ? __check_object_size+0x58/0x230
[  588.891270]  ? sync_file_ioctl+0xac/0x5b0
[  588.891282]  ? syscall_exit_to_user_mode+0x10/0x210
[  588.891285]  ? do_syscall_64+0x8e/0x160
[  588.891290]  ? syscall_exit_to_user_mode+0x10/0x210
[  588.891293]  ? do_syscall_64+0x8e/0x160
[  588.891295]  ? do_syscall_64+0x8e/0x160
[  588.891298]  ? syscall_exit_to_user_mode+0x10/0x210
[  588.891300]  ? do_syscall_64+0x8e/0x160
[  588.891303]  ? exc_page_fault+0x7e/0x180
[  588.891306]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[  588.891309] RIP: 0033:0x7f3f49ceadd2
[  588.891321] Code: 08 0f 85 51 3d ff ff 49 89 fb 48 89 f0 48 89 d7 48 89 ce 4c 89 c2 4d 89 ca 4c 8b 44 24 08 4c 8b 4c 24 10 4c 89 5c 24 08 0f 05 <c3> 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 f3 0f 1e fa 55 bf 01 00
[  588.891323] RSP: 002b:00007ffc3f949158 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
[  588.891325] RAX: ffffffffffffffda RBX: 00007f3f4367c280 RCX: 00007f3f49ceadd2
[  588.891327] RDX: 0000000000004040 RSI: 00007ffc3f9491f0 RDI: 00000000000000d4
[  588.891328] RBP: 00007ffc3f949180 R08: 0000000000000000 R09: 0000000000000000
[  588.891329] R10: 0000000000000000 R11: 0000000000000246 R12: 00000000000003f4
[  588.891330] R13: 00007ffc3f9491f0 R14: 00005647557ee1c0 R15: ffffffffffffffff
[  588.891333]  </TASK>
[  588.891334] Modules linked in: xfs(+) libfc scsi_transport_fc iscsi_ibft dm_crypt vfat fat dm_round_robin dm_multipath raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx raid1 raid0 uinput snd_seq_dummy snd_hrtimer nf_conntrack_netbios_ns nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set rfkill nf_tables qrtr snd_hda_codec_generic snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec joydev snd_hda_core snd_hwdep snd_seq snd_seq_device iTCO_wdt intel_pmc_bxt snd_pcm iTCO_vendor_support pktcdvd i2c_i801 lpc_ich snd_timer snd soundcore pcspkr virtio_balloon i2c_smbus binfmt_misc nfnetlink vsock_loopback vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vsock vmw_vmci overlay squashfs isofs crc32c_intel sha512_ssse3 sha256_ssse3 virtio_net sha1_ssse3 virtio_blk virtio_console virtio_scsi virtio_gpu net_failover failover
[  588.891393]  virtio_dma_buf nvme_tcp nvme_fabrics nvme_keyring serio_raw nvme_core nvme_auth sunrpc be2iscsi bnx2i cnic uio cxgb4i cxgb4 tls cxgb3i cxgb3 mdio libcxgbi libcxgb qla4xxx iscsi_boot_sysfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi loop fuse i2c_dev qemu_fw_cfg
[  588.891416] ---[ end trace 0000000000000000 ]---
[  588.891418] RIP: 0010:__static_call_validate.cold+0x12/0x14
[  588.891421] Code: 48 c7 c7 d0 94 dc bb e8 a6 ea 00 00 48 8b 43 08 4c 8b 33 e9 6b f7 ee fe 48 89 fa 0f b6 f0 48 c7 c7 00 95 dc bb e8 88 ea 00 00 <0f> 0b 65 8b 35 bb 01 ed 44 48 c7 c7 48 96 dc bb e8 73 ea 00 00 fa
[  588.891422] RSP: 0000:ffffbb83833ef7d8 EFLAGS: 00010246
[  588.891424] RAX: 000000000000004a RBX: ffffffffbb2353e0 RCX: 0000000000000000
[  588.891425] RDX: 0000000000000000 RSI: ffffa07c3bd21900 RDI: ffffa07c3bd21900
[  588.891427] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[  588.891428] R10: 65657274625f6164 R11: 302f32303478302b R12: ffffffffc1517162
[  588.891429] R13: 0000000000000000 R14: ffffffffc1274368 R15: ffffffffbc878220
[  588.891430] FS:  00007f3f4367c280(0000) GS:ffffa07c3bc00000(0000) knlGS:0000000000000000
[  588.891432] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  588.891433] CR2: 00007f3ef9b60324 CR3: 000000000128c000 CR4: 00000000000006f0
[  589.059598] Oops: general protection fault, probably for non-canonical address 0x578a7de75fdfe55b: 0000 [#3] PREEMPT SMP PTI
[  589.059608] CPU: 0 UID: 0 PID: 3472 Comm: gdbus Tainted: G      D W         -------  ---  6.13.0-0.rc0.20241126git7eef7e306d3c.10.fc42.x86_64 #1
[  589.059613] Tainted: [D]=DIE, [W]=WARN
[  589.059615] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-3.fc41 04/01/2014
[  589.059617] RIP: 0010:kmem_cache_alloc_node_noprof+0x1cf/0x400
[  589.059624] Code: 84 12 ff ff ff 48 85 db 0f 84 09 ff ff ff 48 8b 03 48 c1 e8 36 41 39 c3 0f 85 f9 fe ff ff 41 8b 44 24 28 49 8b 34 24 48 01 f8 <48> 8b 18 48 89 c1 49 33 9c 24 b8 00 00 00 48 89 f8 48 0f c9 48 31
[  589.059626] RSP: 0018:ffffbb838286f990 EFLAGS: 00010202
[  589.059630] RAX: 578a7de75fdfe55b RBX: ffffee6f8418d740 RCX: 00000000ffffffff
[  589.059631] RDX: 00000000af4a6000 RSI: 0000000000040160 RDI: 578a7de75fdfe4eb
[  589.059633] RBP: ffffbb838286f9e8 R08: 0000000000400cc0 R09: 0000000000000003
[  589.059634] R10: ffffffffbae8686a R11: 00000000ffffffff R12: ffffa07bc02af100
[  589.059636] R13: 0000000000400cc0 R14: 00000000ffffffff R15: 00000000000000e8
[  589.059638] FS:  00007f842ebce6c0(0000) GS:ffffa07c3bc00000(0000) knlGS:0000000000000000
[  589.059640] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  589.059642] CR2: 000055a734042708 CR3: 000000010c3c2000 CR4: 00000000000006f0
[  589.059647] Call Trace:
[  589.059662]  <TASK>
[  589.059665]  ? __die_body.cold+0x19/0x27
[  589.059671]  ? die_addr+0x3c/0x60
[  589.059675]  ? exc_general_protection+0x17d/0x400
[  589.059682]  ? asm_exc_general_protection+0x26/0x30
[  589.059686]  ? __alloc_skb+0x14a/0x1a0
[  589.059693]  ? kmem_cache_alloc_node_noprof+0x1cf/0x400
[  589.059696]  ? __alloc_skb+0x14a/0x1a0
[  589.059699]  __alloc_skb+0x14a/0x1a0
[  589.059702]  alloc_skb_with_frags+0x67/0x2d0
[  589.059706]  sock_alloc_send_pskb+0x1f7/0x240
[  589.059710]  ? avc_has_perm+0x5d/0xe0
[  589.059713]  unix_stream_sendmsg+0x17d/0x680
[  589.059718]  ____sys_sendmsg+0x3a0/0x3d0
[  589.059721]  ___sys_sendmsg+0x9a/0xe0
[  589.059755]  __sys_sendmsg+0x87/0xe0
[  589.059761]  do_syscall_64+0x82/0x160
[  589.059767]  ? syscall_exit_to_user_mode+0x10/0x210
[  589.059772]  ? do_syscall_64+0x8e/0x160
[  589.059776]  ? syscall_exit_to_user_mode+0x10/0x210
[  589.059779]  ? do_syscall_64+0x8e/0x160
[  589.059782]  ? do_syscall_64+0x8e/0x160
[  589.059785]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[  589.059789] RIP: 0033:0x7f84400b0dd2
[  589.059802] Code: 08 0f 85 51 3d ff ff 49 89 fb 48 89 f0 48 89 d7 48 89 ce 4c 89 c2 4d 89 ca 4c 8b 44 24 08 4c 8b 4c 24 10 4c 89 5c 24 08 0f 05 <c3> 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 f3 0f 1e fa 55 bf 01 00
[  589.059804] RSP: 002b:00007f842ebcda18 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
[  589.059811] RAX: ffffffffffffffda RBX: 00007f842ebce6c0 RCX: 00007f84400b0dd2
[  589.059813] RDX: 0000000000004000 RSI: 00007f842ebcdae0 RDI: 0000000000000011
[  589.059814] RBP: 00007f842ebcda40 R08: 0000000000000000 R09: 0000000000000000
[  589.059816] R10: 0000000000000000 R11: 0000000000000246 R12: 00007f842ebcdae0
[  589.059817] R13: 0000000000004000 R14: 0000000000000000 R15: 00007f842ebcdb90
[  589.059820]  </TASK>
[  589.059822] Modules linked in: xfs(+) libfc scsi_transport_fc iscsi_ibft dm_crypt vfat fat dm_round_robin dm_multipath raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx raid1 raid0 uinput snd_seq_dummy snd_hrtimer nf_conntrack_netbios_ns nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set rfkill nf_tables qrtr snd_hda_codec_generic snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec joydev snd_hda_core snd_hwdep snd_seq snd_seq_device iTCO_wdt intel_pmc_bxt snd_pcm iTCO_vendor_support pktcdvd i2c_i801 lpc_ich snd_timer snd soundcore pcspkr virtio_balloon i2c_smbus binfmt_misc nfnetlink vsock_loopback vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vsock vmw_vmci overlay squashfs isofs crc32c_intel sha512_ssse3 sha256_ssse3 virtio_net sha1_ssse3 virtio_blk virtio_console virtio_scsi virtio_gpu net_failover failover
[  589.059892]  virtio_dma_buf nvme_tcp nvme_fabrics nvme_keyring serio_raw nvme_core nvme_auth sunrpc be2iscsi bnx2i cnic uio cxgb4i cxgb4 tls cxgb3i cxgb3 mdio libcxgbi libcxgb qla4xxx iscsi_boot_sysfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi loop fuse i2c_dev qemu_fw_cfg
[  589.059943] ---[ end trace 0000000000000000 ]---
[  589.059945] RIP: 0010:__static_call_validate.cold+0x12/0x14
[  589.059948] Code: 48 c7 c7 d0 94 dc bb e8 a6 ea 00 00 48 8b 43 08 4c 8b 33 e9 6b f7 ee fe 48 89 fa 0f b6 f0 48 c7 c7 00 95 dc bb e8 88 ea 00 00 <0f> 0b 65 8b 35 bb 01 ed 44 48 c7 c7 48 96 dc bb e8 73 ea 00 00 fa
[  589.059950] RSP: 0000:ffffbb83833ef7d8 EFLAGS: 00010246
[  589.059952] RAX: 000000000000004a RBX: ffffffffbb2353e0 RCX: 0000000000000000
[  589.059954] RDX: 0000000000000000 RSI: ffffa07c3bd21900 RDI: ffffa07c3bd21900
[  589.059955] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[  589.059956] R10: 65657274625f6164 R11: 302f32303478302b R12: ffffffffc1517162
[  589.059957] R13: 0000000000000000 R14: ffffffffc1274368 R15: ffffffffbc878220
[  589.059959] FS:  00007f842ebce6c0(0000) GS:ffffa07c3bc00000(0000) knlGS:0000000000000000
[  589.059960] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  589.059962] CR2: 000055a734042708 CR3: 000000010c3c2000 CR4: 00000000000006f0
Dec 03 01:16:50 localhost-live kernel: ------------[ cut here ]------------
Dec 03 01:16:50 localhost-live kernel: WARNING: CPU: 0 PID: 4325 at arch/x86/kernel/ftrace.c:100 ftrace_verify_code+0x4c/0x90
Dec 03 01:16:50 localhost-live kernel: Modules linked in: libfc scsi_transport_fc iscsi_ibft dm_crypt vfat fat dm_round_robin dm_multipath raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx raid1 raid0 uinput snd_seq_dummy snd_hrtimer nf_conntrack_netbios_ns nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set rfkill nf_tables qrtr snd_hda_codec_generic snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec joydev snd_hda_core snd_hwdep snd_seq snd_seq_device iTCO_wdt intel_pmc_bxt snd_pcm iTCO_vendor_support pktcdvd i2c_i801 lpc_ich snd_timer snd soundcore pcspkr virtio_balloon i2c_smbus binfmt_misc nfnetlink vsock_loopback vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vsock vmw_vmci overlay squashfs isofs crc32c_intel sha512_ssse3 sha256_ssse3 virtio_net sha1_ssse3 virtio_blk virtio_console virtio_scsi virtio_gpu net_failover failover virtio_dma_buf
Dec 03 01:16:50 localhost-live kernel:  nvme_tcp nvme_fabrics nvme_keyring serio_raw nvme_core nvme_auth sunrpc be2iscsi bnx2i cnic uio cxgb4i cxgb4 tls cxgb3i cxgb3 mdio libcxgbi libcxgb qla4xxx iscsi_boot_sysfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi loop fuse i2c_dev qemu_fw_cfg
Dec 03 01:16:50 localhost-live kernel: CPU: 0 UID: 0 PID: 4325 Comm: modprobe Not tainted 6.13.0-0.rc0.20241126git7eef7e306d3c.10.fc42.x86_64 #1
Dec 03 01:16:50 localhost-live kernel: Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-3.fc41 04/01/2014
Dec 03 01:16:50 localhost-live kernel: RIP: 0010:ftrace_verify_code+0x4c/0x90
Dec 03 01:16:50 localhost-live kernel: Code: fe c6 44 24 07 00 48 89 c7 c7 44 24 03 00 00 00 00 e8 98 6e 30 00 48 85 c0 75 3e 8b 03 39 44 24 03 74 28 48 89 1d fc b3 7e 03 <0f> 0b b8 ea ff ff ff 48 8b 54 24 08 65 48 2b 14 25 28 00 00 00 75
Dec 03 01:16:50 localhost-live kernel: RSP: 0018:ffffbb83833ef860 EFLAGS: 00010206
Dec 03 01:16:50 localhost-live kernel: RAX: 00000000b898a7e8 RBX: ffffffffbd7c2e08 RCX: 0000000000000010
Dec 03 01:16:50 localhost-live kernel: RDX: ffffa07bd290a880 RSI: 0000000000000005 RDI: ffffffffc1514064
Dec 03 01:16:50 localhost-live kernel: RBP: ffffffffc1514064 R08: 0000000000000000 R09: ffffa07c3bc3ee20
Dec 03 01:16:50 localhost-live kernel: R10: ffffbb83833ef8a0 R11: dead000000000040 R12: ffffffffbb60b5ea
Dec 03 01:16:50 localhost-live kernel: R13: 0000000002000000 R14: ffffa07bcbe50ce0 R15: 0000000000000f3e
Dec 03 01:16:51 localhost-live kernel: FS:  00007f65a27c8740(0000) GS:ffffa07c3bc00000(0000) knlGS:0000000000000000
Dec 03 01:16:51 localhost-live kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Dec 03 01:16:51 localhost-live kernel: CR2: 00007f3ef9753f10 CR3: 00000000027a8000 CR4: 00000000000006f0
Dec 03 01:16:51 localhost-live kernel: Call Trace:
Dec 03 01:16:51 localhost-live kernel:  <TASK>
Dec 03 01:16:51 localhost-live kernel:  ? ftrace_verify_code+0x4c/0x90
Dec 03 01:16:51 localhost-live kernel:  ? __warn.cold+0x93/0xfa
Dec 03 01:16:51 localhost-live kernel:  ? ftrace_verify_code+0x4c/0x90
Dec 03 01:16:51 localhost-live kernel:  ? report_bug+0xff/0x140
Dec 03 01:16:51 localhost-live kernel:  ? handle_bug+0x58/0x90
Dec 03 01:16:51 localhost-live kernel:  ? exc_invalid_op+0x17/0x70
Dec 03 01:16:51 localhost-live kernel:  ? asm_exc_invalid_op+0x1a/0x20
Dec 03 01:16:51 localhost-live kernel:  ? ftrace_verify_code+0x4c/0x90
Dec 03 01:16:51 localhost-live kernel:  ftrace_modify_code_direct+0xf/0x70
Dec 03 01:16:51 localhost-live kernel:  ftrace_process_locs+0x313/0x5c0
Dec 03 01:16:51 localhost-live kernel:  ftrace_module_init+0x32/0x50
Dec 03 01:16:51 localhost-live kernel:  load_module+0x1b1a/0x23e0
Dec 03 01:16:51 localhost-live kernel:  ? __do_sys_init_module+0x17a/0x1b0
Dec 03 01:16:51 localhost-live kernel:  __do_sys_init_module+0x17a/0x1b0
Dec 03 01:16:51 localhost-live kernel:  do_syscall_64+0x82/0x160
Dec 03 01:16:51 localhost-live kernel:  ? __vm_munmap+0xb9/0x170
Dec 03 01:16:51 localhost-live kernel:  ? syscall_exit_to_user_mode+0x10/0x210
Dec 03 01:16:51 localhost-live kernel:  ? do_syscall_64+0x8e/0x160
Dec 03 01:16:51 localhost-live kernel:  ? __alloc_pages_noprof+0x184/0x330
Dec 03 01:16:51 localhost-live kernel:  ? __mod_memcg_lruvec_state+0xdf/0x220
Dec 03 01:16:51 localhost-live kernel:  ? __count_memcg_events+0xc0/0x180
Dec 03 01:16:51 localhost-live kernel:  ? __lruvec_stat_mod_folio+0x83/0xd0
Dec 03 01:16:51 localhost-live kernel:  ? set_ptes.isra.0+0x41/0x90
Dec 03 01:16:51 localhost-live kernel:  ? do_anonymous_page+0xfc/0x8f0
Dec 03 01:16:51 localhost-live kernel:  ? __pte_offset_map+0x1b/0x180
Dec 03 01:16:51 localhost-live kernel:  ? __handle_mm_fault+0xb55/0xfd0
Dec 03 01:16:51 localhost-live kernel:  ? __count_memcg_events+0xc0/0x180
Dec 03 01:16:51 localhost-live kernel:  ? count_memcg_events.constprop.0+0x1a/0x30
Dec 03 01:16:51 localhost-live kernel:  ? handle_mm_fault+0x21b/0x330
Dec 03 01:16:51 localhost-live kernel:  ? do_user_addr_fault+0x55a/0x7b0
Dec 03 01:16:51 localhost-live kernel:  ? exc_page_fault+0x7e/0x180
Dec 03 01:16:51 localhost-live kernel:  entry_SYSCALL_64_after_hwframe+0x76/0x7e
Dec 03 01:16:51 localhost-live kernel: RIP: 0033:0x7f65a210069e
Dec 03 01:16:51 localhost-live kernel: Code: 48 8b 0d 75 37 0f 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 49 89 ca b8 af 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 42 37 0f 00 f7 d8 64 89 01 48
Dec 03 01:16:51 localhost-live kernel: RSP: 002b:00007ffdb33ac7f8 EFLAGS: 00000246 ORIG_RAX: 00000000000000af
Dec 03 01:16:51 localhost-live kernel: RAX: ffffffffffffffda RBX: 0000557dfc420ca0 RCX: 00007f65a210069e
Dec 03 01:16:51 localhost-live kernel: RDX: 0000557dbeee3715 RSI: 0000000000751c7e RDI: 00007f65a0b79010
Dec 03 01:16:51 localhost-live kernel: RBP: 00007ffdb33ac8b0 R08: 0000557dfc420010 R09: 0000000000000007
Dec 03 01:16:51 localhost-live kernel: R10: 0000000000000001 R11: 0000000000000246 R12: 0000557dbeee3715
Dec 03 01:16:51 localhost-live kernel: R13: 0000000000040000 R14: 0000557dfc420db0 R15: 0000000000000000
Dec 03 01:16:51 localhost-live kernel:  </TASK>
Dec 03 01:16:51 localhost-live kernel: ---[ end trace 0000000000000000 ]---
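
When reading a trace like the one above, it can help to separate the frames prefixed with `?` (addresses the unwinder found on the stack but could not confirm as part of the call chain) from the frames it considers reliable. Below is a minimal Python sketch, assuming the journal-style `kernel:` line prefix seen in this log; the name `split_frames` is hypothetical and not part of any existing tool.

```python
import re

# Matches a call-trace frame such as " ? report_bug+0xff/0x140" or
# " load_module+0x1b1a/0x23e0" following the "kernel:" journal prefix.
FRAME_RE = re.compile(
    r"kernel:\s+(?P<guess>\? )?(?P<sym>[\w.]+)"
    r"\+0x(?P<off>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)\s*$"
)

def split_frames(lines):
    """Return (reliable, unreliable) lists of symbol names from trace lines."""
    reliable, unreliable = [], []
    for line in lines:
        m = FRAME_RE.search(line)
        if not m:
            continue  # register dumps, markers, etc. are skipped
        (unreliable if m.group("guess") else reliable).append(m.group("sym"))
    return reliable, unreliable

# A few lines copied from the trace above.
sample = [
    "Dec 03 01:16:51 localhost-live kernel:  ? ftrace_verify_code+0x4c/0x90",
    "Dec 03 01:16:51 localhost-live kernel:  ftrace_modify_code_direct+0xf/0x70",
    "Dec 03 01:16:51 localhost-live kernel:  ftrace_process_locs+0x313/0x5c0",
    "Dec 03 01:16:51 localhost-live kernel:  ftrace_module_init+0x32/0x50",
    "Dec 03 01:16:51 localhost-live kernel:  load_module+0x1b1a/0x23e0",
]
reliable, unreliable = split_frames(sample)
print(" <- ".join(reliable))
```

For this trace, the reliable frames show the warning being reached from module loading (`load_module`) through ftrace's per-module initialisation.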

Comment 9 Adam Williamson 2024-12-19 22:14:14 UTC
Welp, as this has got no traction anywhere, I guess I'll have to try and bisect it.

I think I have a workable procedure for this, but it'll take about 2.5 hours per step, and it'll be a 12 step bisect, so it's probably going to take me till next week.
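
As a rough check on that estimate: a bisect roughly halves the candidate range at each step, so 12 steps covers a range of up to about 2^12 commits. The sketch below just works through that arithmetic in Python. The step count and per-step time are the figures quoted above; the actual number of commits in the range is not stated here, so 4096 is only the upper bound that 12 steps implies, and real `git bisect` step counts can differ slightly on non-linear history.

```python
import math

def bisect_steps(n_commits):
    """Approximate number of bisect steps to isolate one commit among n_commits."""
    return math.ceil(math.log2(n_commits))

hours_per_step = 2.5   # figure quoted in the comment above
steps = 12             # figure quoted in the comment above

print(steps * hours_per_step)   # total hours of testing
print(2 ** steps)               # upper bound on commits a 12-step bisect covers
print(bisect_steps(2 ** steps)) # round-trips back to 12
```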

Comment 10 Adam Williamson 2024-12-21 07:32:36 UTC
Current bisect progress:

[adamw@xps13a linux ((a8cd9d4ce35e...)|BISECTING)]$ git bisect visualize --oneline
82475d111de7 selftests/damon/_debugfs_common: hide expected error message from test_write_result()
e06a6b55ed3d selftests/damon/huge_count_read_write: remove unnecessary debugging message
45488345d4b6 selftests/damon/huge_count_read_write: provide sufficiently large buffer for DEPRECATED file read
2b1d55498b67 memcg: factor out mem_cgroup_stat_aggregate()
e8c1a296b806 mm/show_mem: use str_yes_no() helper in show_free_areas()
1bc542c6a0d1 mm/vmscan: wake up flushers conditionally to avoid cgroup OOM
33d7f15f916e mm: use page->private instead of page->index in percpu
544ec0ed3764 mm: remove references to page->index in huge_memory.c
0386aaa6e9c8 bootmem: stop using page->index
68158bfa3dbd mm: mass constification of folio/page pointers
713da0b33b3e mm: renovate page_address_in_vma()
7d3e93eca3ca mm: use page_pgoff() in more places
f7470591f8db mm: convert page_to_pgoff() to page_pgoff()
e664c2cd98cb mm/zsmalloc: use memcpy_from/to_page whereever possible
91d0ec834786 zsmalloc: replace kmap_atomic with kmap_local_page
b7fc16a16b08 mm/codetag: uninline and move pgalloc_tag_copy and pgalloc_tag_split
4835f747d3ed alloc_tag: support for page allocation tag compression
42895a861244 alloc_tag: introduce pgtag_ref_handle to abstract page tag references
0f9b685626da alloc_tag: populate memory for module tags as needed
0db6f8d7820a alloc_tag: load module tags into separate contiguous memory
3e09c500bb5b alloc_tag: introduce shutdown_mem_profiling helper function
7c8c76e446ca maple_tree: add mas_for_each_rev() helper
5185e7f9f3bd x86/module: enable ROX caches for module text on 64 bit
2e45474ab14f execmem: add support for cache of large ROX pages
9bfc4824fd48 x86/module: prepare module loading for ROX allocations of text
0c6378a71574 arch: introduce set_direct_map_valid_noflush()
0c133b1e78cd module: prepare to handle ROX allocations for text
0c3beacf681e asm-generic: introduce text-patching.h
c82be0be9576 mm: vmalloc: don't account for number of nodes for HUGE_VMAP allocations
beeb9220c730 mm: vmalloc: group declarations depending on CONFIG_MMU together
906c38ff52e9 memcg: workingset: remove folio_memcg_rcu usage
642c66d84cd4 mm/vma: the pgoff is correct if can_merge_right
5ac87a885aec mm: defer second attempt at merge on mmap()
5a689bac0bbc mm: remove unnecessary reset state logic on merge new VMA
0d11630cc50a mm: refactor __mmap_region()
52956b0d7fb9 mm: isolate mmap internal logic to mm/vma.c
c14f8046cd7c tools: testing: add additional vma_internal.h stubs
a29c0e4b2e86 memcg-v1: remove memcg move locking code
cf4a65539c13 memcg-v1: no need for memcg locking for MGLRU
568bcf414849 memcg-v1: no need for memcg locking for writeback tracking
a8cd9d4ce35e (HEAD) memcg-v1: no need for memcg locking for dirty tracking
6b611388b626 memcg-v1: remove charge move code
aa6b4fdf5940 memcg-v1: fully deprecate move_charge_at_immigrate
729881ffd390 mm: shmem: fallback to page size splice if large folio has poisoned pages
477327e10639 mm/damon/vaddr: add 'nr_piece == 1' check in damon_va_evenly_split_region()
f3c7a1ede435 mm/damon/vaddr: fix issue in damon_va_evenly_split_region()
ab505e8be024 mm/page_alloc: use str_off_on() helper in build_all_zonelists()
8717734fdcc8 mm/memcontrol: fix seq_buf size to save memory when PAGE_SIZE is large
628e1b8c4777 mm: add missing mmu_notifier_clear_young for !MMU_NOTIFIER
3f1f947a322d tools/mm: free the allocated memory
39ac99852fca mm/page-writeback: raise wb_thresh to prevent write blocking with strictlimit
722376934b6c mm/memory.c: simplify pfnmap_lockdep_assert
ed265529d39a mm/codetag: fix arg in pgalloc_tag_copy alloc_tag_sub
78c018e3942c maple_tree: fix outdated flag name in comment
a284cb8472ec mm: shmem: improve the tmpfs large folio read performance
f3650ef89b87 mm: shmem: update iocb->ki_pos directly to simplify tmpfs read logic
b7f058f82739 mm: remove unused has_isolate_pageblock
5bb6345cd2ed mm: remove redundant condition for THP folio
4b6b0a5188c2 mm/mremap: remove goto from mremap_to()
58f1069311db mm/mremap: cleanup vma_to_resize()
38dc8f495246 maple_tree: remove sanity check from mas_wr_slot_store()
61e9df7085cc maple_tree: calculate new_end when needed
0938b1614648 mm: don't set readahead flag on a folio when lookahead_size > nr_to_read
4a9a27fdf7bf mm: shmem: remove __shmem_huge_global_enabled()
9884efd795cc mm: huge_memory: move file_thp_enabled() into huge_memory.c
5a90c155defa tmpfs: don't enable large folios if not supported
7146de5ff504 tools: testing: fix phys_addr_t size on 64-bit systems
f1001f3d3b68 mm/mglru: reset page lru tier bits when activating
d3ea85c6c5f7 mm: swap: use str_true_false() helper function
4a7bba1df001 percpu: add a test case for the specific 64-bit value addition
6c2625e9c2ef x86/percpu: fix clang warning when dealing with unsigned types
e4137f08816b mm, kasan, kmsan: instrument copy_from/to_kernel_nofault
908378a30b09 maple_tree: simplify mas_push_node()
4223dd93bfc9 maple_tree: total is not changed for nomem_one case
e852cb1d00ce maple_tree: clear request_count for new allocated one
0cc8d68abe2f maple_tree: root node could be handled by !p_slot too
0f85eb3395c7 maple_tree: add some alloc node test case
5b2100f723bd maple_tree: fix alloc node fail issue
f69c2e4dc684 mm/vmstat: defer the refresh_zone_stat_thresholds after all CPUs bringup

Comment 11 Adam Williamson 2024-12-23 11:31:35 UTC
OK, this bisects to:

[adamw@xps13a linux ((5185e7f9f3bd...)|BISECTING)]$ git bisect bad
5185e7f9f3bd754ab60680814afd714e2673ef88 is the first bad commit
commit 5185e7f9f3bd754ab60680814afd714e2673ef88 (HEAD)
Author: Mike Rapoport (Microsoft) <rppt>
Date:   Wed Oct 23 19:27:11 2024 +0300

    x86/module: enable ROX caches for module text on 64 bit
    
    Enable execmem's cache of PMD_SIZE'ed pages mapped as ROX for module text
    allocations on 64 bit.
    
    Link: https://lkml.kernel.org/r/20241023162711.2579610-9-rppt@kernel.org
    Signed-off-by: Mike Rapoport (Microsoft) <rppt>
    Reviewed-by: Luis Chamberlain <mcgrof>
    Tested-by: kdevops <kdevops.dev>
    Cc: Andreas Larsson <andreas>
    Cc: Andy Lutomirski <luto>
    Cc: Ard Biesheuvel <ardb>
    Cc: Arnd Bergmann <arnd>
    Cc: Borislav Petkov (AMD) <bp>
    Cc: Brian Cain <bcain>
    Cc: Catalin Marinas <catalin.marinas>
    Cc: Christophe Leroy <christophe.leroy>
    Cc: Christoph Hellwig <hch>
    Cc: Dave Hansen <dave.hansen.com>
    Cc: Dinh Nguyen <dinguyen>
    Cc: Geert Uytterhoeven <geert>
    Cc: Guo Ren <guoren>
    Cc: Helge Deller <deller>
    Cc: Huacai Chen <chenhuacai>
    Cc: Ingo Molnar <mingo>
    Cc: Johannes Berg <johannes>
    Cc: John Paul Adrian Glaubitz <glaubitz.de>
    Cc: Kent Overstreet <kent.overstreet>
    Cc: Liam R. Howlett <Liam.Howlett>
    Cc: Mark Rutland <mark.rutland>
    Cc: Masami Hiramatsu (Google) <mhiramat>
    Cc: Matt Turner <mattst88>
    Cc: Max Filippov <jcmvbkbc>
    Cc: Michael Ellerman <mpe.au>
    Cc: Michal Simek <monstr>
    Cc: Oleg Nesterov <oleg>
    Cc: Palmer Dabbelt <palmer>
    Cc: Peter Zijlstra <peterz>
    Cc: Richard Weinberger <richard>
    Cc: Russell King <linux.uk>
    Cc: Song Liu <song>
    Cc: Stafford Horne <shorne>
    Cc: Steven Rostedt (Google) <rostedt>
    Cc: Suren Baghdasaryan <surenb>
    Cc: Thomas Bogendoerfer <tsbogend.de>
    Cc: Thomas Gleixner <tglx>
    Cc: Uladzislau Rezki (Sony) <urezki>
    Cc: Vineet Gupta <vgupta>
    Cc: Will Deacon <will>
    Signed-off-by: Andrew Morton <akpm>

 arch/x86/Kconfig   |  1 +
 arch/x86/mm/init.c | 37 ++++++++++++++++++++++++++++++++++++-
 2 files changed, 37 insertions(+), 1 deletion(-)

Comment 12 Kashyap Chamarthy 2025-01-13 11:37:12 UTC
(In reply to Adam Williamson from comment #5)

Hi,

I also responded on the kernel bug here: https://bugzilla.kernel.org/show_bug.cgi?id=219554#c7

> ah, yeah, I think that *is* relevant. I hadn't been able to reproduce this
> locally before, but I just set my local VM's CPU to Nehalem, and reproduced
> the UEFI version of this first try. I noticed the VM did indeed seem to be
> stuck at the bootloader install stage - it was not responding to any kind of
> input, couldn't switch to a VT - then after it sat like that for a few
> seconds or minutes (don't know how long it was sitting like that before I
> saw it), it spontaneously rebooted and then failed to boot properly from the
> disk, just like the openQA test in the video.

[...]


On "Nehalem" (copy/pasting from my kernel bug above):

The virtual CPU model "Nehalem" indeed does not look like the problem here.  I
think one reason "Nehalem" would be used as the *default* is if your
underlying host OS is CentOS 9 or RHEL 9.

As you might know, RHEL and CentOS 9 use "x86-64-v2" 
micro-architecture[1].  Thus, the virtual CPU model, "Nehalem", is the
only common CPU model that works across both Intel and AMD hosts on
"x86-64-v2".  AMD CPUs have the necessary features to run "Nehalem".  So, 
it is very handy in a mixed hardware (CI) setup that is running CentOS 9
or RHEL 9.  Upstream OpenStack Infrastructure team also uses "Nehalem" 
as default across its CI cluster for this reason[2].


In your openQA cluster with mixed CPUs, you may have to find a "lowest 
common denominator" CPU model that you can give to your guests.  More on 
it below; it's a bit involved.  An example here:

    https://kashyapc.fedorapeople.org/Calculate-CPU-hypervisor-baseline.html

[1] https://developers.redhat.com/blog/2021/01/05/building-red-hat-enterprise-linux-9-for-the-x86-64-v2-microarchitecture-level

[2] "Use Nehalem CPU model by default" — 
    https://review.opendev.org/c/openstack/devstack/+/815020
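
The "lowest common denominator" idea amounts to intersecting the feature sets of all hosts and checking that the result still covers what the guest needs. Below is a rough Python sketch of that idea, assuming flag names as Linux spells them in /proc/cpuinfo. The two host flag sets are made-up, heavily abbreviated illustrations rather than data from any real cluster, and this is not a description of how libvirt or QEMU compute a baseline.

```python
# Features the x86-64-v2 level adds on top of the x86-64 baseline,
# spelled as Linux shows them in /proc/cpuinfo ("pni" is SSE3).
X86_64_V2 = {"cx16", "lahf_lm", "popcnt", "pni", "sse4_1", "sse4_2", "ssse3"}

def common_flags(hosts):
    """Intersect the flag sets of all hosts: the 'lowest common denominator'."""
    flag_sets = [set(flags) for flags in hosts.values()]
    return set.intersection(*flag_sets) if flag_sets else set()

def missing_for_v2(flags):
    """Return the x86-64-v2 features absent from a flag set."""
    return X86_64_V2 - set(flags)

# Hypothetical flag sets for two hosts (illustration only).
hosts = {
    "intel-host": X86_64_V2 | {"avx", "avx2", "bmi2"},
    "amd-host": X86_64_V2 | {"avx", "sse4a"},
}
baseline = common_flags(hosts)
print(sorted(missing_for_v2(baseline)))  # features still missing for v2
print("avx2" in baseline)                # is avx2 common to every host?
```

Anything present on only some hosts (here `avx2`) drops out of the baseline, which is why a guest CPU model has to be chosen from what survives the intersection.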

Comment 13 Kashyap Chamarthy 2025-01-13 16:02:06 UTC
A slightly related note on "Haswell" model for ELN in openQA:

I see that your commit[1] uses the "Haswell" QEMU model for the ELN 
buildroot.  I guess you're using it because CentOS Stream 10 is based on
"x86_64-v3" (the CPU baseline arch), which requires something higher
than "Nehalem".

As you might know, the moment you add an AMD node to your cluster,
"Haswell" will not work.

The x86_64-v2 baseline got "lucky" in that QEMU's "Nehalem" model
works on both Intel *and* AMD.  For x86_64-v3 (CentOS Stream 10 
and above), there's no such option.

[1] https://pagure.io/fedora-qa/fedora_openqa/c/4a888ce — Force QEMUCPU
    to "Haswell" for ELN jobs

Comment 14 Adam Williamson 2025-01-13 17:18:59 UTC
No, Nehalem is not used as a default, I explicitly configure the openQA VMs to use Nehalem because we want to ensure Fedora doesn't inadvertently stop supporting it (this is explained above). There's been more than one case in the past where some package or other inadvertently started using features only available on newer CPUs, even though Nehalem is meant to be our baseline, so I do this as a check on that.

However, I have found that since setting openQA to use `-cpu host`, the bug is still happening sometimes, so that was a bit of a blind alley :/ I still couldn't reproduce it *locally* with `-cpu host`, though, I had to use `-cpu Nehalem` to reproduce it locally (on a Core i7-1250U). Haven't actually tried on my new laptop yet (which has a Ryzen AI 9 365).

"As you might know, the moment you add an AMD node to your cluster, "Haswell" will not work."

Yeah, we don't have any of those. It's all Intel.

Comment 15 Kashyap Chamarthy 2025-01-14 14:51:34 UTC
(In reply to Adam Williamson from comment #14)
> No, Nehalem is not used as a default, I explicitly configure the openQA VMs
> to use Nehalem because we want to ensure Fedora doesn't inadvertently stop
> supporting it (this is explained above). 

Sorry, my phrasing was poor; I did read your comments 4 & 5 above.  I 
just meant "default" in the sense of the default in *your* cluster, 
configured by you.  That's what I was implying with my link to the 
OpenStack Infra commit, where we also had to explicitly configure 
"Nehalem" because there was a mix of AMD and Intel hosts.  As you 
confirm, that's a non-problem for Fedora's openQA.

> There's been more than one case in
> the past where some package or other inadvertently started using features
> only available on newer CPUs, even though Nehalem is meant to be our
> baseline, so I do this as a check on that.

Yes, I can completely understand that.

> However, I have found that since setting openQA to use `-cpu host`, the bug
> is still happening sometimes, so that was a bit of a blind alley :/ I still
> couldn't reproduce it *locally* with `-cpu host`, though, I had to use `-cpu
> Nehalem` to reproduce it locally (on a Core i7-1250U). Haven't actually
> tried on my new laptop yet (which has a Ryzen AI 9 365).

Hmm, so many variables in the air for this bug.  Let's see if I got it 
right so far:

  - you've bisected the issue to the commit that you link to in 
    comment#11 above;

  - the patch that adds a test for PSE flag didn't help (as predicted on 
    the kernel bug):
    https://lore.kernel.org/lkml/20250103065631.26459-1-jgross@suse.com/T/#u

  - you ruled out on the kernel bug that the problem is specific to 
    any one named model:
    https://bugzilla.kernel.org/show_bug.cgi?id=219554#c5

  - in your *cluster*, it is happening "sometimes" —  I saw your comment
    on the upstream bug, where you list the hardware you got
    (https://bugzilla.kernel.org/show_bug.cgi?id=219554#c9)

  - you can still reproduce the issue *locally*, with `-cpu host`

  - live *and* unattended installs seem to be affected

  - BIOS *and* UEFI installs are affected, but in a slightly different 
    way (BIOS: hangs; UEFI: reboots the system)

    * * *

I'm now curious if the 'mm' patch from Mike Rapoport that you shared on
the upstream bug still works for you: 

    https://bugzilla.kernel.org/show_bug.cgi?id=219554#c9

> "As you might know, the moment you add an AMD node to your cluster,
> "Haswell" will not work."
> 
> Yeah, we don't have any of those. It's all Intel.

Great; that makes matters more bearable.

Comment 16 Adam Williamson 2025-01-14 17:26:37 UTC
>   - you've bisected the issue to the commit that you link to in 
>     comment#11 above;

Correct.

>  - the patch that adds a test for PSE flag didn't help (as predicted on 
>    the kernel bug):
>    https://lore.kernel.org/lkml/20250103065631.26459-1-jgross@suse.com/T/#u

Correct.

>  - you ruled out on the kernel bug that it's not a problem specific to 
>    any specific named model:
>    https://bugzilla.kernel.org/show_bug.cgi?id=219554#c5

Well, I ruled out the named models I tested. I didn't go higher, because I was testing on the cluster and had reached the top end of what the cluster hosts actually *are*. I suspect if I tested on a newer CPU and went higher, I'd probably find a cut-off at some point.

>  - in your *cluster*, it is happening "sometimes" —  I saw your comment
>    on the upstream bug, where you list the hardware you got
>    (https://bugzilla.kernel.org/show_bug.cgi?id=219554#c9)

It is happening "sometimes", yes, and interestingly *seems* to happen only on Xeon Gold 5218 and Xeon Gold 6130 hosts, from a sample I did yesterday I couldn't find a case where the bug happened on the Xeon E5-2683v4 or Xeon E5-2680 hosts. On those systems, it happens whether we use `-cpu host` or `-cpu Nehalem`.

>  - you can still reproduce the issue *locally*, with `-cpu host`

No. Locally - on my i7-1250U - I can only reproduce with `-cpu Nehalem`, I cannot reproduce with `-cpu host`.

>  - live *and* unattended installs seem to be affected

Live and traditional installer, yeah, but it does seem to happen *more* on lives, at least that's my impression.

>  - BIOS *and* UEFI installs are affected, but in a slightly different 
>    way (BIOS: hangs; UEFI: reboots the system)

Yeah, although even in the UEFI case the system hangs for a long time, then *eventually* reboots. I'm guessing that's maybe some kinda cutout in the edk2 firmware, or something, I hadn't looked into it. There's a kernel trace in both cases.

> I'm now curious if the 'mm' patch from Mike Rappaport that you shared on
> the upstream bug still works for you: 

So far, yes, it still seems to be working. I've run 50 installs with it in openQA, and none of them hung.
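
How much confidence 50 clean runs give depends on how often the unpatched kernel fails, which this thread does not quantify. As a rough Python sketch: if each install failed independently with probability p, the chance of seeing 50 passes in a row by luck alone would be (1 - p)^50. The failure rates below are assumed values for illustration only, not measurements from openQA, and independence between runs is itself an assumption.

```python
def chance_of_clean_streak(failure_rate, runs):
    """Probability that `runs` independent installs all pass,
    if each one fails with probability `failure_rate`."""
    return (1.0 - failure_rate) ** runs

runs = 50  # number of installs reported above
for assumed_rate in (0.05, 0.10, 0.25):  # assumed rates, not measured values
    p = chance_of_clean_streak(assumed_rate, runs)
    print(f"failure rate {assumed_rate:.0%}: {p:.4%} chance of {runs} clean runs")
```

The higher the real failure rate was before the patch, the less likely it is that a run of 50 clean installs happened by chance.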

Comment 17 Adam Williamson 2025-01-14 18:32:13 UTC
It looks like the whole execmem ROX thing is being disabled for final - assuming https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/commit/?id=a9bbe341333109465605e8733bab0b573cddcc8c is pulled - so this should stop being an issue whenever that lands, until it's re-enabled, I guess?

Comment 18 Kashyap Chamarthy 2025-01-20 10:52:44 UTC
(In reply to Adam Williamson from comment #16)

[...]

> >  - you ruled out on the kernel bug that it's not a problem specific to 
> >    any specific named model:
> >    https://bugzilla.kernel.org/show_bug.cgi?id=219554#c5
> 
> Well, I ruled out the named models I tested. I didn't go higher, because I
> was testing on the cluster and reaching the top end of what the cluster
> actually *are*. I suspect if I tested on a newer CPU and went higher, I'd
> probably find a cut-off at some point.

Yeah, it gets tedious to test these combinations.

> >  - in your *cluster*, it is happening "sometimes" —  I saw your comment
> >    on the upstream bug, where you list the hardware you got
> >    (https://bugzilla.kernel.org/show_bug.cgi?id=219554#c9)
> 
> It is happening "sometimes", yes, and interestingly *seems* to happen only
> on Xeon Gold 5218 and Xeon Gold 6130 hosts, from a sample I did yesterday I
> couldn't find a case where the bug happened on the Xeon E5-2683v4 or Xeon
> E5-2680 hosts. On those systems, it happens whether we use `-cpu host` or
> `-cpu Nehalem`.

I see; I don't know what to say about these Xeon Gold 5218 and 6130 hosts.

> >  - you can still reproduce the issue *locally*, with `-cpu host`
> 
> No. Locally - on my i7-1250U - I can only reproduce with `-cpu Nehalem`, I
> cannot reproduce with `-cpu host`.

Ah, thanks for correcting.

> >  - live *and* unattended installs seem to be affected
> 
> Live and traditional installer, yeah, but it does seem to happen *more* on
> lives, at least that's my impression.
> 
> >  - BIOS *and* UEFI installs are affected, but in a slightly different 
> >    way (BIOS: hangs; UEFI: reboots the system)
> 
> Yeah, although even in the UEFI case the system hangs for a long time, then
> *eventually* reboots. I'm guessing that's maybe some kinda cutout in the
> edk2 firmware, or something, I hadn't looked into it. There's a kernel trace
> in both cases.
> 
> > I'm now curious if the 'mm' patch from Mike Rappaport that you shared on
> > the upstream bug still works for you: 
> 
> So far, yes, it still seems to be working. I've run 50 installs with it in
> openQA, and none of them hung.

Nice; I hope the positive trend continues.

Overall, nice sleuthing.

Comment 19 Adam Williamson 2025-01-27 22:32:33 UTC
Since this got reverted upstream, I think let's close this for now. If the ROX stuff comes back without a fix for this, I can re-open.

