Bug 2155067

Summary: Xorg crash on i915 gpu hang
Product: [Fedora] Fedora
Reporter: Adam Pribyl <covex>
Component: mesa
Assignee: Adam Jackson <ajax>
Status: CLOSED EOL
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 36
CC: ajax, bskeggs, igor.raits, jglisse, j, lyude, mail, mrmazda, ofourdan, rhughes, rstrode, sandmann, tstellar, xgl-maint
Hardware: x86_64   
OS: Linux   
Last Closed: 2023-05-25 15:26:12 UTC
Type: Bug
Attachments:
Description          Flags
sys gpu error        none
coredumpctl info     none

Description Adam Pribyl 2022-12-19 20:18:11 UTC
Description of problem:
Lately I have been experiencing occasional crashes of the whole Xorg server. The crash seems to happen because of a GPU hang (reported by the kernel) caused by the Firefox browser, so I am not sure which component this report should be filed against.

Dec 19 20:52:59 me kernel: i915 0000:00:02.0: [drm] Resetting rcs0 for preemption time out
Dec 19 20:52:59 me kernel: i915 0000:00:02.0: [drm] Xorg[1685265] context reset due to GPU hang
Dec 19 20:52:59 me kernel: i915 0000:00:02.0: [drm] GPU HANG: ecode 9:1:85dffffb, in Xorg [1685265]
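
For anyone triaging similar reports, the hang lines above can be pulled out of a saved journal dump mechanically. This is only a minimal Python sketch and not part of the original report: the regular expression is an assumption inferred from the three log lines quoted here (it is not a documented kernel format), the function name find_gpu_hangs is hypothetical, and the ecode is kept as an opaque string rather than interpreted.

```python
import re

# Pattern inferred from the log lines quoted in this report (an assumption,
# not a documented kernel log format).
GPU_HANG_RE = re.compile(
    r"GPU HANG: ecode (?P<ecode>[0-9A-Fa-f:]+), "
    r"in (?P<comm>\S+) \[(?P<pid>\d+)\]"
)

# The three lines from the description, used as sample input.
SAMPLE_LOG = """\
Dec 19 20:52:59 me kernel: i915 0000:00:02.0: [drm] Resetting rcs0 for preemption time out
Dec 19 20:52:59 me kernel: i915 0000:00:02.0: [drm] Xorg[1685265] context reset due to GPU hang
Dec 19 20:52:59 me kernel: i915 0000:00:02.0: [drm] GPU HANG: ecode 9:1:85dffffb, in Xorg [1685265]
"""


def find_gpu_hangs(log_text):
    """Return one dict per 'GPU HANG' line found in log_text."""
    hangs = []
    for line in log_text.splitlines():
        match = GPU_HANG_RE.search(line)
        if match:
            hangs.append({
                "ecode": match.group("ecode"),  # kept as an opaque string
                "comm": match.group("comm"),    # process name as logged
                "pid": int(match.group("pid")),
            })
    return hangs


if __name__ == "__main__":
    for hang in find_gpu_hangs(SAMPLE_LOG):
        print(hang)
```

Comparing the extracted ecode across crashes (it is the same value in both hangs quoted in this bug) may help decide whether two reports describe the same hang.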


Version-Release number of selected component (if applicable):
xorg-x11-drv-intel-2.99.917-52.20200205.fc36.x86_64

How reproducible:
Seldom

Steps to Reproduce:
Not sure.

Additional info:
Attaching /sys/class/drm/card0/error and coredump info.
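
The two attachments can be gathered with a short script. The following is a hedged Python sketch, not the reporter's actual procedure: the helper names read_gpu_error_state and coredump_info are hypothetical, the Xorg executable path is taken from the audit lines quoted in this bug, and reading the error-state file is assumed to need root on most systems.

```python
import shutil
import subprocess
from pathlib import Path


def read_gpu_error_state(path="/sys/class/drm/card0/error"):
    """Return the contents of the i915 error-state file, or None if unreadable."""
    try:
        return Path(path).read_text(errors="replace")
    except OSError:
        # Missing file or insufficient permissions (usually needs root).
        return None


def coredump_info(executable="/usr/libexec/Xorg"):
    """Return the output of `coredumpctl info` for the executable, or None.

    None is returned when coredumpctl is not installed.
    """
    if shutil.which("coredumpctl") is None:
        return None
    result = subprocess.run(
        ["coredumpctl", "--no-pager", "info", executable],
        capture_output=True,
        text=True,
        check=False,  # a non-zero exit just means no matching core dump
    )
    return result.stdout


if __name__ == "__main__":
    state = read_gpu_error_state()
    print(state if state is not None else "error state not readable")
    info = coredump_info()
    print(info if info else "no coredump info available")
```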

Comment 1 Adam Pribyl 2022-12-19 20:19:06 UTC
Created attachment 1933655 [details]
sys gpu error

Comment 2 Adam Pribyl 2022-12-19 20:20:03 UTC
Created attachment 1933656 [details]
coredumpctl info

Comment 3 Felix Miata 2022-12-30 06:22:09 UTC
Upstream, the xorg-x11-drv-intel package is an optional package. If you remove it, does any Xorg problem remain?

Comment 4 Adam Pribyl 2023-01-02 17:18:30 UTC
Removed it, but the kernel was also updated from 6.0.8 to 6.0.15 and xorg-x11-server-Xorg from 1.20.14-9 to 1.20.14-12. I'll see whether the crash happens again (it crashed today with the older package versions and xorg-x11-drv-intel still installed, which is why I did the updates).

Comment 5 Adam Pribyl 2023-01-08 18:38:39 UTC
Crashed again today:

Jan  8 19:22:26 me kernel: i915 0000:00:02.0: [drm] Resetting rcs0 for preemption time out
Jan  8 19:22:26 me kernel: i915 0000:00:02.0: [drm] Xorg[2595] context reset due to GPU hang
Jan  8 19:22:26 me kernel: i915 0000:00:02.0: [drm] GPU HANG: ecode 9:1:85dffffb, in Xorg [2595]
Jan  8 19:22:27 me audit[2595]: ANOM_ABEND auid=4294967295 uid=0 gid=0 ses=4294967295 subj=kernel pid=2595 comm="Xorg:gdrv0" exe="/usr/libexec/Xorg" sig=6 res=1
Jan  8 19:22:27 me kernel: audit: type=1701 audit(1673202147.325:709): auid=4294967295 uid=0 gid=0 ses=4294967295 subj=kernel pid=2595 comm="Xorg:gdrv0" exe="/usr/libexec/Xorg" sig=6 res=1
Jan  8 19:22:27 me audit: BPF prog-id=94 op=LOAD
Jan  8 19:22:27 me audit: BPF prog-id=95 op=LOAD

Xorg.0.log
[342543.604] (EE) Backtrace:
[342543.606] (EE) 0: /usr/libexec/Xorg (OsLookupColor+0x13d) [0x5626dfa4a09d]
[342543.607] (EE) 1: /lib64/libc.so.6 (__sigaction+0x50) [0x7f7c0cc3ea30]
[342543.608] (EE) 2: /lib64/libc.so.6 (__pthread_kill_implementation+0x11c) [0x7f7c0cc8ec0c]
[342543.609] (EE) 3: /lib64/libc.so.6 (raise+0x16) [0x7f7c0cc3e986]
[342543.609] (EE) 4: /lib64/libc.so.6 (abort+0xcf) [0x7f7c0cc287f4]
[342543.612] (EE) unw_get_proc_name failed: no unwind info found [-10]
[342543.612] (EE) 5: /usr/lib64/dri/iris_dri.so (?+0x0) [0x7f7c0a6a4967]
[342543.612] (EE) 6: /usr/lib64/dri/iris_dri.so (nouveau_drm_screen_create+0x6e41ad) [0x7f7c0b613e4d]
[342543.613] (EE) 7: /usr/lib64/dri/iris_dri.so (__driDriverGetExtensions_zink+0x55a8b6) [0x7f7c0ac0a7d6]
[342543.614] (EE) 8: /usr/lib64/dri/iris_dri.so (__driDriverGetExtensions_zink+0x55a499) [0x7f7c0ac0a3b9]
[342543.614] (EE) 9: /usr/lib64/dri/iris_dri.so (__driDriverGetExtensions_zink+0x12de4) [0x7f7c0a6c2d04]
[342543.614] (EE) 10: /usr/lib64/dri/iris_dri.so (__driDriverGetExtensions_zink+0x1296b) [0x7f7c0a6c288b]
[342543.615] (EE) 11: /lib64/libc.so.6 (start_thread+0x2cd) [0x7f7c0cc8cded]
[342543.616] (EE) 12: /lib64/libc.so.6 (__clone3+0x30) [0x7f7c0cd12370]
[342543.616] (EE)
[342543.616] (EE)
Fatal server error:
[342543.616] (EE) Caught signal 6 (Aborted). Server aborting

The xorg-x11-drv-intel package is no longer installed. Should this be reassigned to Xorg?

Comment 6 Adam Pribyl 2023-03-05 16:48:37 UTC
This looks like it could be a Mesa problem:
https://gitlab.freedesktop.org/drm/intel/-/issues/6916
Is Fedora going to update mesa to 22.3.x?

Comment 7 Ben Cotton 2023-04-25 18:15:43 UTC
This message is a reminder that Fedora Linux 36 is nearing its end of life.
Fedora will stop maintaining and issuing updates for Fedora Linux 36 on 2023-05-16.
It is Fedora's policy to close all bug reports from releases that are no longer
maintained. At that time this bug will be closed as EOL if it remains open with a
'version' of '36'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, change the 'version' 
to a later Fedora Linux version. Note that the version field may be hidden.
Click the "Show advanced fields" button if you do not see it.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora Linux 36 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora Linux, you are encouraged to change the 'version' to a later version
prior to this bug being closed.

Comment 8 Ludek Smid 2023-05-25 15:26:12 UTC
Fedora Linux 36 entered end-of-life (EOL) status on 2023-05-16.

Fedora Linux 36 is no longer maintained, which means that it
will not receive any further security or bug fix updates. As a result we
are closing this bug.

If you can reproduce this bug against a currently maintained version of Fedora Linux
please feel free to reopen this bug against that version. Note that the version
field may be hidden. Click the "Show advanced fields" button if you do not see
the version field.

If you are unable to reopen this bug, please file a new report against an
active release.

Thank you for reporting this bug and we are sorry it could not be fixed.