Bug 1547612
Summary: | gpu hang | ||
---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Robert Story <rs> |
Component: | xorg-x11-drv-intel | Assignee: | Adam Jackson <ajax> |
Status: | CLOSED EOL | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
Severity: | unspecified | Docs Contact: | |
Priority: | unspecified | ||
Version: | 27 | CC: | airlied, ajax, bskeggs, bugzillaredhat-56f0, ewk, hdegoede, ichavero, itamar, jarodwilson, jglisse, john.j5live, jonathan, josef, kernel-maint, labbott, linux, linville, martineau, mchehab, mjg59, steved, williams, xgl-maint |
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | If docs needed, set a value | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2018-11-30 23:44:06 UTC | Type: | Bug |
Description
Robert Story
2018-02-21 15:55:01 UTC
So I spent some more time trying to reproduce this, and came up with the sequence of events that causes the GPU hang every time:

1) Start emacs.
2) Open a file with at least 4 'pages' of data, where a 'page' is the number of lines that your current emacs window displays. For testing I created a file of 500 80-character lines and tested with emacs window sizes of 33 and 96.
3) Press Page Down once.
4) Press Ctrl-Space to start a selection.
5) Press Page Down twice to select 2 'pages'.
6) Press Ctrl-W to 'cut' the selection.
7) Press the up arrow to scroll up one line.

At this point my screen hangs and /var/log/messages will report

Feb 21 11:24:09 titan kernel: i915 0000:00:02.0: Resetting rcs0 after gpu hang
Feb 21 11:24:17 titan kernel: i915 0000:00:02.0: Resetting rcs0 after gpu hang
Feb 21 11:24:25 titan kernel: i915 0000:00:02.0: Resetting rcs0 after gpu hang
...
Feb 21 11:24:41 titan kernel: i915 0000:00:02.0: Resetting rcs0 after gpu hang
Feb 21 11:24:41 titan at-spi-bus-launcher[812]: XIO: fatal IO error 11 (Resource temporarily unavailable) on X server ":0"
Feb 21 11:24:41 titan at-spi-bus-launcher[812]: after 54 requests (54 known processed) with 0 events remaining.

and X crashes.

Moving to the graphics team for tracking.

I have also experienced this bug, initially with the 4.15.3-300.fc27.x86_64 kernel. Using Robert's steps, I can also reproduce the bug using 4.14.18-300.fc27.x86_64.

I'm using i915 graphics on this CPU:

processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 94
model name : Intel(R) Core(TM) i7-6770HQ CPU @ 2.60GHz
stepping : 3
microcode : 0xc2

I'm hitting this on an i7-6600U. It seems to hit me when I run lynx in an xterm and scroll rapidly (my test case is not quite as reliable as the one above).

I have tried updating to the latest packages from F27 updates-testing and still have the same issue.

Updating to mesa 17.3.5-1.fc27 did not fix the problem, but reverting to mesa 17.2.4-3.fc27 did.
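For anyone repeating the reproduction steps above, the test file from step 2 (500 lines of 80 characters each) could be generated with something like the following. This is only a sketch: the function names, the file name and the line contents are assumptions made for illustration, since the report does not say what the original file contained.

```python
from pathlib import Path


def build_lines(lines=500, width=80):
    """Return `lines` strings, each exactly `width` characters long.

    Every line starts with its own number so that it is easy to see
    which 'page' emacs is currently showing. The filler text is arbitrary.
    """
    rows = []
    for i in range(lines):
        prefix = f"line {i + 1:04d} "
        # Pad with 'x' and cut to the requested width.
        rows.append((prefix + "x" * width)[:width])
    return rows


def write_test_file(path, lines=500, width=80):
    """Write the generated lines to `path`, one per line."""
    Path(path).write_text("\n".join(build_lines(lines, width)) + "\n")
    return path
```

Calling `write_test_file("gpu-hang-test.txt")` (a hypothetical file name) and opening the result in emacs gives a buffer large enough for steps 3 to 7 at either window size mentioned above.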
Upstream is asking if someone could try Mesa 18.0.0.rc4. I can't, but if anyone else can, please post results upstream (or here and I'll share upstream).

I grabbed mesa-18.0.0-0.1.rc4.fc28 from koji and rebuilt it for F27 in mock. It does appear to have fixed the lockups for me.

*** Bug 1550679 has been marked as a duplicate of this bug. ***

Started happening to me on F27 after upgrading from kernel-4.18.9-100.fc27.x86_64 to kernel-4.18.16-100.fc27.x86_64 while keeping mesa at mesa-dri-drivers-17.3.9-1.fc27. This was quite easily triggered by running an ancient executable of dubious quality (apparently it takes some kind of X locks while doing disk or network operations). In any case, crashing the X server only started to happen with the new kernel. After reverting to 4.18.9 the problem has not happened again (let's hope it will not...).

Fully updated F27 on a Lenovo P50.

00:02.0 VGA compatible controller: Intel Corporation HD Graphics 530 (rev 06)

Nov 5 09:14:19 localhost kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
Nov 5 09:14:27 localhost kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
Nov 5 09:14:35 localhost kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
Nov 5 09:14:43 localhost kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
Nov 5 09:14:46 localhost kernel: asynchronous wait on fence i915:kwin_x11[168968]/1:ac1b timed out
Nov 5 09:14:51 localhost kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
Nov 5 09:14:51 localhost kdeinit5[171604]: The X11 connection broke (error 1). Did the X11 server die?

After more investigation, it doesn't depend on kernel versions. I have an ancient closed-source X11 app that is doing some CPU-intensive work in the wrong place (taking locks or something similar). This can apparently reliably trigger the GPU hang detection and crash the X session. I'm now running this shameful app inside a nested Xephyr and I get no crashes anymore.
Of course, even if the app is bad, the X11 server should not crash so easily.

This message is a reminder that Fedora 27 is nearing its end of life. On 2018-Nov-30 Fedora will stop maintaining and issuing updates for Fedora 27. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as EOL if it remains open with a Fedora 'version' of '27'.

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora 27 reached end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to change the 'version' to a later Fedora version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.

Fedora 27 changed to end-of-life (EOL) status on 2018-11-30. Fedora 27 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of Fedora, please feel free to reopen this bug against that version. If you are unable to reopen this bug, please file a new report against the current release. If you experience problems, please add a comment to this bug.

Thank you for reporting this bug and we are sorry it could not be fixed.
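The kernel messages quoted in this report show the i915 driver resetting rcs0 repeatedly before the X session dies. To check a log for the same pattern, a small script along these lines might help. It is a rough sketch rather than a diagnostic tool: the regular expression only covers the two message wordings seen in this bug ("after gpu hang" and "for hang on rcs0"), the function names are made up for illustration, the year has to be supplied because syslog timestamps omit it, and the month parsing assumes an English locale.

```python
import re
from datetime import datetime

# Matches the two "Resetting <engine>" wordings quoted in this report.
RESET_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+\S+\s+kernel:\s+"
    r"i915\s+\S+:\s+Resetting\s+(?P<engine>\w+)\s+"
    r"(?:after gpu hang|for hang on \w+)"
)


def find_resets(lines, year=2018):
    """Return (timestamp, engine) for every i915 reset message in `lines`."""
    events = []
    for line in lines:
        match = RESET_RE.match(line.strip())
        if match:
            # Collapse runs of spaces ("Nov  5") before parsing.
            stamp = " ".join(match.group("stamp").split())
            when = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
            events.append((when, match.group("engine")))
    return events


def reset_gaps(events):
    """Seconds between consecutive reset messages."""
    return [(b[0] - a[0]).total_seconds() for a, b in zip(events, events[1:])]
```

Feeding it the lines of /var/log/messages, for example `find_resets(open("/var/log/messages"))`, returns the reset events, and `reset_gaps` shows how closely spaced they are.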