Bug 2292434 - Xorg crashes and makes the box unusable, only hard reset
Summary: Xorg crashes and makes the box unusable, only hard reset
Keywords:
Status: CLOSED EOL
Alias: None
Product: Fedora
Classification: Fedora
Component: mesa
Version: 40
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Assignee: Adam Jackson
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2024-06-14 19:20 UTC by Ronald Warsow
Modified: 2025-05-20 09:11 UTC
CC List: 13 users

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2025-05-20 09:11:09 UTC
Type: ---
Embargoed:


Attachments
journal (379.31 KB, text/plain)
2024-06-14 19:26 UTC, Ronald Warsow

Description Ronald Warsow 2024-06-14 19:20:14 UTC
[   71.637709] BUG: unable to handle page fault for address: fffffffffffffe20
[   71.637713] #PF: supervisor read access in kernel mode
[   71.637715] #PF: error_code(0x0000) - not-present page
[   71.637716] PGD 34642d067 P4D 34642d067 PUD 34642f067 PMD 0 
[   71.637718] Oops: 0000 [#1] PREEMPT SMP NOPTI
[   71.637720] CPU: 6 PID: 2217 Comm: Xorg Tainted: G     U             6.9.4-200.fc40.x86_64 #1
[   71.637721] Hardware name: ASUS System Product Name/ROG STRIX B560-G GAMING WIFI, BIOS 2203 02/06/2024
[   71.637722] RIP: 0010:drm_suballoc_free+0x1f/0x120 [drm_suballoc_helper]
[   71.637727] Code: 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 48 85 ff 0f 84 e1 00 00 00 41 57 41 56 41 55 41 54 55 48 89 fd 53 <4c> 8b 67 20 48 89 f3 4c 89 e7 e8 12 91 b4 f3 48 85 db 0f 84 93 00
[   71.637728] RSP: 0018:ffffb1f00420f5a0 EFLAGS: 00010286
[   71.637730] RAX: fffffffffffffe00 RBX: ffffa0fa46beb040 RCX: 00000000007a2006
[   71.637731] RDX: 00000000007a0006 RSI: 0000000000000000 RDI: fffffffffffffe00
[   71.637731] RBP: fffffffffffffe00 R08: fffffffffffffe00 R09: 0000000000000000
[   71.637732] R10: 0000000000000100 R11: 0000000000000000 R12: 00000000fffffe00
[   71.637733] R13: fffffffffffffe00 R14: ffffa0fa48448028 R15: ffffa0fa49754000
[   71.637734] FS:  00007ff835d9eb00(0000) GS:ffffa0fd8f700000(0000) knlGS:0000000000000000
[   71.637735] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   71.637736] CR2: fffffffffffffe20 CR3: 0000000109674004 CR4: 0000000000770ef0
[   71.637737] PKRU: 55555554
[   71.637738] Call Trace:
[   71.637740]  <TASK>
[   71.637741]  ? __die_body.cold+0x19/0x27
[   71.637745]  ? page_fault_oops+0x15a/0x2c0
[   71.637747]  ? search_module_extables+0x19/0x60
[   71.637749]  ? search_bpf_extables+0x5f/0x80
[   71.637751]  ? exc_page_fault+0x170/0x180
[   71.637753]  ? asm_exc_page_fault+0x26/0x30
[   71.637755]  ? drm_suballoc_free+0x1f/0x120 [drm_suballoc_helper]
[   71.637758]  xe_migrate_update_pgtables+0x6a1/0x9d0 [xe]
[   71.637823]  ? __xe_pt_bind_vma+0x438/0xc60 [xe]
[   71.637873]  __xe_pt_bind_vma+0x471/0xc60 [xe]
[   71.637923]  xe_vm_bind_vma+0xc3/0x450 [xe]
[   71.637979]  xe_vm_bind+0x103/0x210 [xe]
[   71.638032]  __xe_vma_op_execute+0x2aa/0x550 [xe]
[   71.638085]  ? new_vma+0x388/0x630 [xe]
[   71.638137]  xe_vm_bind_ioctl+0x1dd5/0x2090 [xe]
[   71.638188]  ? __pfx_xe_vm_bind_ioctl+0x10/0x10 [xe]
[   71.638242]  drm_ioctl_kernel+0xb0/0x100
[   71.638246]  drm_ioctl+0x28b/0x540
[   71.638263]  ? __pfx_xe_vm_bind_ioctl+0x10/0x10 [xe]
[   71.638314]  __x64_sys_ioctl+0x94/0xd0
[   71.638317]  do_syscall_64+0x82/0x160
[   71.638319]  ? __handle_mm_fault+0x867/0xe10
[   71.638322]  ? mt_find+0x21c/0x580
[   71.638324]  ? __count_memcg_events+0x69/0x100
[   71.638326]  ? count_memcg_events.constprop.0+0x1a/0x30
[   71.638327]  ? handle_mm_fault+0x1f0/0x300
[   71.638329]  ? do_user_addr_fault+0x17f/0x620
[   71.638331]  ? clear_bhb_loop+0x45/0xa0
[   71.638333]  ? clear_bhb_loop+0x45/0xa0
[   71.638334]  ? clear_bhb_loop+0x45/0xa0
[   71.638335]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[   71.638337] RIP: 0033:0x7ff8364a6d2d
[   71.638349] Code: 04 25 28 00 00 00 48 89 45 c8 31 c0 48 8d 45 10 c7 45 b0 10 00 00 00 48 89 45 b8 48 8d 45 d0 48 89 45 c0 b8 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff ff 77 1a 48 8b 45 c8 64 48 2b 04 25 28 00 00 00
[   71.638351] RSP: 002b:00007ffc48a85160 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[   71.638352] RAX: ffffffffffffffda RBX: 00007ffc48a85220 RCX: 00007ff8364a6d2d
[   71.638353] RDX: 00007ffc48a85220 RSI: 0000000040886445 RDI: 0000000000000010
[   71.638354] RBP: 00007ffc48a851b0 R08: 0000000000006000 R09: 0000000000000000
[   71.638355] R10: 000055a528762010 R11: 0000000000000246 R12: 0000000000000010
[   71.638356] R13: 0000000000000000 R14: 000055a5288180d0 R15: 000055a52a487160
[   71.638357]  </TASK>
[   71.638358] Modules linked in: snd_seq_dummy snd_hrtimer rfcomm nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables bnep sunrpc vfat fat snd_sof_pci_intel_tgl iwlmvm snd_sof_intel_hda_common soundwire_intel snd_sof_intel_hda_mlink soundwire_cadence snd_sof_intel_hda snd_sof_pci snd_sof_xtensa_dsp snd_hda_codec_hdmi snd_sof mac80211 snd_sof_utils snd_soc_hdac_hda snd_soc_acpi_intel_match soundwire_generic_allocation snd_soc_acpi soundwire_bus snd_hda_codec_realtek snd_soc_avs snd_hda_codec_generic snd_soc_hda_codec intel_rapl_msr snd_hda_scodec_component snd_hda_ext_core intel_rapl_common snd_soc_core libarc4 intel_uncore_frequency intel_uncore_frequency_common x86_pkg_temp_thermal snd_usb_audio intel_powerclamp snd_compress ac97_bus snd_pcm_dmaengine coretemp snd_hda_intel snd_intel_dspcfg snd_usbmidi_lib snd_intel_sdw_acpi kvm_intel snd_ump snd_hda_codec btusb snd_rawmidi
[   71.638387]  snd_hda_core mc btrtl iwlwifi snd_hwdep spi_nor btintel kvm ledtrig_netdev snd_seq iTCO_wdt btbcm btmtk intel_pmc_bxt snd_seq_device mtd mei_pxp mei_hdcp iTCO_vendor_support bluetooth ee1004 snd_pcm cfg80211 intel_cstate asus_nb_wmi asus_wmi snd_timer joydev mei_me i2c_i801 sparse_keymap spi_intel_pci snd igc platform_profile wmi_bmof intel_uncore i2c_smbus spi_intel soundcore mei idma64 rfkill intel_pmc_core intel_vsec pmt_telemetry pmt_class acpi_pad acpi_tad fuse loop nfnetlink zram xe drm_ttm_helper gpu_sched drm_suballoc_helper drm_gpuvm drm_exec i915 crct10dif_pclmul crc32_pclmul crc32c_intel i2c_algo_bit polyval_clmulni drm_buddy polyval_generic ttm nvme ghash_clmulni_intel drm_display_helper sha512_ssse3 nvme_core sha256_ssse3 sha1_ssse3 cec nvme_auth video wmi pinctrl_tigerlake ecryptfs
[   71.638421] CR2: fffffffffffffe20
[   71.638423] ---[ end trace 0000000000000000 ]---
[   71.638424] RIP: 0010:drm_suballoc_free+0x1f/0x120 [drm_suballoc_helper]
[   71.638427] Code: 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 48 85 ff 0f 84 e1 00 00 00 41 57 41 56 41 55 41 54 55 48 89 fd 53 <4c> 8b 67 20 48 89 f3 4c 89 e7 e8 12 91 b4 f3 48 85 db 0f 84 93 00
[   71.638428] RSP: 0018:ffffb1f00420f5a0 EFLAGS: 00010286
[   71.638429] RAX: fffffffffffffe00 RBX: ffffa0fa46beb040 RCX: 00000000007a2006
[   71.638430] RDX: 00000000007a0006 RSI: 0000000000000000 RDI: fffffffffffffe00
[   71.638431] RBP: fffffffffffffe00 R08: fffffffffffffe00 R09: 0000000000000000
[   71.638432] R10: 0000000000000100 R11: 0000000000000000 R12: 00000000fffffe00
[   71.638432] R13: fffffffffffffe00 R14: ffffa0fa48448028 R15: ffffa0fa49754000
[   71.638433] FS:  00007ff835d9eb00(0000) GS:ffffa0fd8f700000(0000) knlGS:0000000000000000
[   71.638434] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   71.638435] CR2: fffffffffffffe20 CR3: 0000000109674004 CR4: 0000000000770ef0
[   71.638436] PKRU: 55555554
[   71.638437] note: Xorg[2217] exited with irqs disabled
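
One possible reading of the register dump above, offered as an interpretation rather than something the report states: the faulting instruction bytes `<4c> 8b 67 20` decode to a load from `0x20(%rdi)`, and RDI holds `fffffffffffffe00`. Read as a signed 64-bit number, that value is -512, which falls inside the range the kernel reserves for error pointers (512 is the kernel-internal `ERESTARTSYS`). The preceding bytes `48 85 ff 0f 84` are only a NULL check, so an error pointer would pass it and then be dereferenced. A minimal Python sketch of that arithmetic, with helper names that are purely illustrative:

```python
# Illustrative arithmetic on values copied from the oops above.
MAX_ERRNO = 4095      # the kernel's IS_ERR() treats the top 4095 addresses as errors
ERESTARTSYS = 512     # kernel-internal errno value


def to_signed64(value: int) -> int:
    """Interpret a 64-bit register value as a signed integer."""
    return value - (1 << 64) if value & (1 << 63) else value


def looks_like_err_ptr(value: int) -> bool:
    """Mirror the range check of the kernel's IS_ERR() for a 64-bit value."""
    return value >= (1 << 64) - MAX_ERRNO


rdi = 0xFFFFFFFFFFFFFE00  # first argument of drm_suballoc_free (RDI)
cr2 = 0xFFFFFFFFFFFFFE20  # faulting address (CR2)

print(to_signed64(rdi))         # -512
print(looks_like_err_ptr(rdi))  # True
print(hex(cr2 - rdi))           # 0x20, the displacement in the faulting load
```

If this reading is right, the caller in `xe_migrate_update_pgtables` passed an error value where a valid suballocation was expected; the log alone does not confirm which code path produced it.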


Reproducible: Always

Steps to Reproduce:
1. Run an Xorg session on F40 with the 6.9.4 kernel and the Xe GPU driver enabled.
2. Rapidly click random cells in a LibreOffice Calc sheet.
Actual Results:  
The box is nearly dead: the mouse pointer still moves, but apps cannot be opened or closed and keyboard input is ignored.

After a hard reset, journalctl -b -1 shows a kernel crash.
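
When checking kernel messages from the previous boot (for example the output of `journalctl -k -b -1`), the headline facts of an oops like this one can be pulled out with a few regular expressions. The sketch below is a hypothetical helper, not part of any Fedora tooling, and the sample lines are copied from this report:

```python
import re


def summarize_oops(log_text: str) -> dict:
    """Extract the BUG line, the crashing command, and the kernel RIP symbol."""
    patterns = {
        "bug": r"BUG: (.+)",
        "comm": r"Comm: (\S+)",
        "rip": r"RIP: 0010:(\S+)",  # 0010 is the kernel code segment
    }
    summary = {}
    for key, pattern in patterns.items():
        match = re.search(pattern, log_text)
        if match:
            summary[key] = match.group(1).strip()
    return summary


sample = """\
[   71.637709] BUG: unable to handle page fault for address: fffffffffffffe20
[   71.637720] CPU: 6 PID: 2217 Comm: Xorg Tainted: G     U             6.9.4-200.fc40.x86_64 #1
[   71.637722] RIP: 0010:drm_suballoc_free+0x1f/0x120 [drm_suballoc_helper]
"""

print(summarize_oops(sample))
```

Text without an oops yields an empty dictionary, so the helper can be run over a whole boot log.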


To me it seems this bug was introduced with mesa-24.1.x.

Why?
I test (homebrewed) stable kernels and did not see the above bug anywhere in the 6.9.x stable series until mesa-24.1.x arrived.

After a downgrade to mesa-24.0.x I have not been able to trigger the bug.

Now that F40 has released a distro 6.9.x kernel, I see this bug with that kernel too.
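
The user-space half of the trace is compatible with a userspace change affecting how often the crash is hit, although it does not show where the fault lies: `ORIG_RAX` is `0x10` (the `ioctl` syscall on x86_64) and RSI carries the request number `0x40886445`. A minimal sketch that splits that number into the standard Linux ioctl fields; the function name is illustrative and the bit layout assumed is the generic one used on x86_64:

```python
def decode_ioctl(request: int) -> dict:
    """Split a Linux ioctl request number into its four fields."""
    return {
        "direction": (request >> 30) & 0x3,  # 1 = userspace passes data to the kernel
        "size": (request >> 16) & 0x3FFF,    # size of the argument struct in bytes
        "type": chr((request >> 8) & 0xFF),  # 'd' is the DRM ioctl namespace
        "nr": request & 0xFF,                # command number
    }


DRM_COMMAND_BASE = 0x40  # first driver-specific DRM command number

fields = decode_ioctl(0x40886445)  # RSI from the user-space register dump
print(fields)
print(fields["nr"] - DRM_COMMAND_BASE)  # index among the driver's own commands
```

The decoded request is a DRM write ioctl with a 136-byte argument and driver-specific command index 5, which the call trace resolves to `xe_vm_bind_ioctl`.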

Comment 1 Ronald Warsow 2024-06-14 19:26:16 UTC
Created attachment 2037357
journal

Comment 2 Ronald Warsow 2024-06-14 19:32:28 UTC
It seems the bug can be triggered just by clicking a random button in a random app; e.g. it occurred while I was trying to upload the attached journal.

Comment 3 Ronald Warsow 2024-06-19 20:54:55 UTC
Opened an issue at freedesktop.org that references this report:

https://gitlab.freedesktop.org/mesa/mesa/-/issues/11365

Comment 4 Ronald Warsow 2024-07-12 03:48:55 UTC
Seems to be fixed; see:
https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/2120

Comment 5 Aoife Moloney 2025-04-25 11:00:38 UTC
This message is a reminder that Fedora Linux 40 is nearing its end of life.
Fedora will stop maintaining and issuing updates for Fedora Linux 40 on 2025-05-13.
It is Fedora's policy to close all bug reports from releases that are no longer
maintained. At that time this bug will be closed as EOL if it remains open with a
'version' of '40'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, change the 'version' 
to a later Fedora Linux version. Note that the version field may be hidden.
Click the "Show advanced fields" button if you do not see it.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora Linux 40 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora Linux, you are encouraged to change the 'version' to a later version
prior to this bug being closed.

Comment 6 Aoife Moloney 2025-05-20 09:11:09 UTC
Fedora Linux 40 entered end-of-life (EOL) status on 2025-05-13.

Fedora Linux 40 is no longer maintained, which means that it
will not receive any further security or bug fix updates. As a result we
are closing this bug.

If you can reproduce this bug against a currently maintained version of Fedora Linux
please feel free to reopen this bug against that version. Note that the version
field may be hidden. Click the "Show advanced fields" button if you do not see
the version field.

If you are unable to reopen this bug, please file a new report against an
active release.

Thank you for reporting this bug and we are sorry it could not be fixed.

