Description of problem:
System hangs for a few seconds.

uname:
5.1.15-300.fc30.x86_64 #1 SMP Tue Jun 25 14:07:22 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

Additional info:

dmesg dump:
[ 6933.833583] perf: interrupt took too long (5219 > 5210), lowering kernel.perf_event_max_sample_rate to 38000
[10417.191008] i915 0000:00:02.0: GPU HANG: ecode 7:1:0xfffffffe, in Xwayland [2033], hang on rcs0
[10417.191010] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[10417.191011] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[10417.191011] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[10417.191012] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[10417.191013] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[10417.191065] i915 0000:00:02.0: Resetting chip for hang on rcs0

hwinfo:
H/W path                 Device      Class          Description
===============================================================
                                     system         Aspire S7-191 (Aspire S7-191_0746_2.09)
/0                                   bus            Helium
/0/0                                 memory         128KiB BIOS
/0/4                                 processor      Intel(R) Core(TM) i5-3317U CPU @ 1.70GHz
/0/4/6                               memory         32KiB L1 cache
/0/4/7                               memory         256KiB L2 cache
/0/4/8                               memory         3MiB L3 cache
/0/5                                 memory         32KiB L1 cache
/0/e                                 memory         4GiB System Memory
/0/e/0                               memory         2GiB SODIMM DDR3 Synchronous 1333 MHz (0.8 ns)
/0/e/1                               memory         2GiB SODIMM DDR3 Synchronous 1333 MHz (0.8 ns)
/0/100                               bridge         3rd Gen Core processor DRAM Controller
/0/100/2                             display        3rd Gen Core processor Graphics Controller
/0/100/14                            bus            7 Series/C210 Series Chipset Family USB xHCI Host Controller
/0/100/14/0              usb2        bus            xHCI Host Controller
/0/100/14/0/1            scsi6       storage        DataTraveler 2.0
/0/100/14/0/1/0.0.0      /dev/sdc    disk           8074MB DataTraveler 2.0
/0/100/14/0/1/0.0.0/0    /dev/sdc    disk           8074MB
/0/100/14/0/1/0.0.0/0/2              volume         15EiB Windows FAT volume
/0/100/14/0/1/0.0.0/0/3              volume         20MiB Empty partition
/0/100/14/0/2                        bus            USB2.0 Hub
/0/100/14/0/2/3                      input          USB Keyboard
/0/100/14/0/2/4                      input          USB Receiver
/0/100/14/0/4                        input          Touchscreen
/0/100/14/1              usb3        bus            xHCI Host Controller
/0/100/16                            communication  7 Series/C216 Chipset Family MEI Controller #1
/0/100/1b                            multimedia     7 Series/C216 Chipset Family High Definition Audio Controller
/0/100/1c                            bridge         7 Series/C216 Chipset Family PCI Express Root Port 1
/0/100/1c.3                          bridge         7 Series/C216 Chipset Family PCI Express Root Port 4
/0/100/1c.3/0            wlp2s0      network        AR9462 Wireless Network Adapter
/0/100/1d                            bus            7 Series/C216 Chipset Family USB Enhanced Host Controller #1
/0/100/1d/1              usb1        bus            EHCI Host Controller
/0/100/1d/1/1                        bus            Integrated Rate Matching Hub
/0/100/1d/1/1/6                      communication  Bluetooth wireless interface
/0/100/1d/1/1/7                      multimedia     HD WebCam
/0/100/1f                            bridge         HM77 Express Chipset LPC Controller
/0/100/1f.2              scsi0       storage        82801 Mobile SATA Controller [RAID mode]
/0/100/1f.2/0            /dev/sda    disk           64GB LITEONIT CMT-64L
/0/100/1f.2/0/1                      volume         199MiB System partition
/0/100/1f.2/0/2                      volume         1023MiB EFI partition
/0/100/1f.2/0/3                      volume         118GiB LVM Physical Volume
/0/100/1f.2/1            /dev/sdb    disk           64GB LITEONIT CMT-64L
/0/100/1f.3                          bus            7 Series/C216 Chipset Family SMBus Controller
/0/1                                 generic        PnP device ETD0504
/0/2                                 system         PnP device PNP0c02
/0/3                                 system         PnP device PNP0b00
/0/6                                 generic        PnP device INT3f0d
/0/7                                 input          PnP device PNP0303
/0/8                                 system         PnP device PNP0c02
/0/9                                 system         PnP device PNP0c01
/1                       virbr0-nic  network        Ethernet interface
/2                       virbr0      network        Ethernet interface
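For anyone comparing logs across the reports in this thread, here is a minimal sketch that pulls the error code, process, and engine out of "GPU HANG" lines. The regular expression is only modelled on the messages quoted in this bug (an assumption: other kernel versions may word the line differently), and the names `find_gpu_hangs` and `GpuHang` are hypothetical helpers, not part of any tool.

```python
import re
from typing import List, NamedTuple, Optional

# Modelled on the "GPU HANG" lines quoted in this bug; the "in <process> [pid]"
# part is treated as optional because some reports omit it (assumption).
HANG_RE = re.compile(
    r"GPU HANG: ecode (?P<ecode>\S+?),"
    r"(?: in (?P<process>\S+) \[(?P<pid>\d+)\],)?"
    r" hang on (?P<engine>\w+)"
)


class GpuHang(NamedTuple):
    ecode: str
    process: Optional[str]
    pid: Optional[int]
    engine: str


def find_gpu_hangs(dmesg_text: str) -> List[GpuHang]:
    """Return one GpuHang per dmesg line that reports a GPU hang."""
    hangs = []
    for line in dmesg_text.splitlines():
        match = HANG_RE.search(line)
        if match is None:
            continue
        pid = match.group("pid")
        hangs.append(
            GpuHang(
                ecode=match.group("ecode"),
                process=match.group("process"),
                pid=int(pid) if pid else None,
                engine=match.group("engine"),
            )
        )
    return hangs
```

Feeding it the output of `dmesg` as a string should give one entry per hang, which makes it easier to see whether the same engine (here `rcs0`) is involved each time.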
More information:
Happens frequently when the CPU is under heavy load (I was using pigz to compress a large number of files).
From bug 1780800 (opened in December 2019):

> The patch was submitted to stable and rejected because it doesn't apply to 5.4.
[...]
> Upstream issue reports that backporting the fix from 5.5 to 5.4 is non-trivial. And now there are a few attempts at reverting the change that introduced the problem, so even the revert is apparently not straightforward.

Skylake and Kabylake CPUs are affected, but I'm not sure if it's all or a subset of those.
This also occurs in 5.4.12-200.fc31.x86_64
(In reply to Anthony Messina from comment #3)
> This also occurs in 5.4.12-200.fc31.x86_64

Since that's a Fedora 31 kernel, you may want to follow the Fedora 31 Bugzilla ticket for it:

BZ 1794064 - i915 0000:00:02.0: GPU HANG: ecode 9:1:0x00000000, hang on rcs0
<https://bugzilla.redhat.com/show_bug.cgi?id=1794064>
Same problem on Fedora 31 (5.5.15-200.fc31.x86_64).

I get the problem when I move huge files over the network (scp/sftp/rsync+ssh).

The only entries in the logs are:

```
kernel: Asynchronous wait on fence i915:xfwm4[1494]:33ad0 timed out (hint:intel_atomic_commit_ready+0x0/0x50 [i915])
kernel: Asynchronous wait on fence i915:xfwm4[1494]:33ad0 timed out (hint:intel_atomic_commit_ready+0x0/0x50 [i915])
```

As a result, the computer freezes completely, although it still responds over SSH. If I run a reboot command, I have to wait about 20 minutes before the system unblocks and the reboot goes through.
Created attachment 1679859 [details]
dmesg for gpu hang

I am getting something similar on Rawhide with the 5.7 kernel series. I kept 5.6.0-0.rc7.git1.1.fc33.x86_64 around, which works fine.

I can log into Openbox, but then it hangs. Sometimes I can log into Sway, but it quickly hangs. The last time I tried Sway, all I got was alternating black and grey screens.

The attached is "dmesg | grep i915", which shows both Sway and X (Openbox) failing.

Thanks.
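In case it helps others gather the same two attachments, below is a rough sketch in Python. `i915_lines` mimics the `dmesg | grep i915` filter mentioned above, and `save_error_state` copies the GPU error state that the kernel message points to. Both function names are hypothetical, the path `/sys/class/drm/card0/error` is taken from the dmesg output in this bug (the card index may differ on other machines), and reading it normally requires root.

```python
from pathlib import Path
from typing import List

# Location named in the "GPU crash dump saved to ..." kernel message quoted in
# this bug; the card number is an assumption and may differ on other systems.
DEFAULT_ERROR_STATE = Path("/sys/class/drm/card0/error")


def i915_lines(dmesg_text: str) -> List[str]:
    """Keep only the lines mentioning i915, like `dmesg | grep i915`."""
    return [line for line in dmesg_text.splitlines() if "i915" in line]


def save_error_state(dest: Path, source: Path = DEFAULT_ERROR_STATE) -> Path:
    """Copy the GPU error state to `dest` so it can be attached to a report."""
    dest.write_bytes(source.read_bytes())
    return dest
```

A typical use would be to pass the text of `dmesg` to `i915_lines` and write the result to one file, then call `save_error_state(Path("gpu-error.txt"))` right after a hang, before the machine is rebooted.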
Created attachment 1680923 [details]
dmesg for gpu hang 20200422

I am still getting this with kernel-5.7.0-0.rc2.1.fc33.x86_64.
Still present with kernel 5.5.17-200.fc31.x86_64.

I also encountered the problem with file transfers to storage devices as well as network traffic.
(In reply to bitchecker from comment #8)
> Still present with kernel 5.5.17-200.fc31.x86_64
>
> I also encountered the problem with file transfers to storage devices as
> well as network traffic.

The system always freezes completely, and it takes a long time to reboot (about 20 to 30 minutes).
Created attachment 1682555 [details]
dmesg for gpu hang April 28,2020

Still getting this with kernel-5.7.0-0.rc3.1.fc33.x86_64.
Created attachment 1682556 [details] gpu error to accompany the dmesg April28,2020
This message is a reminder that Fedora 30 is nearing its end of life. Fedora will stop maintaining and issuing updates for Fedora 30 on 2020-05-26. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as EOL if it remains open with a Fedora 'version' of '30'.

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora 30 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to change the 'version' to a later Fedora version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.
I'm seeing this with Rawhide (would be Fedora 33), so can we change the version for the bug, or should I create a new bug report?
The problem also happens with Fedora 31.
I recently upgraded the system from 31 to 32 and tested with a network file transfer: it is still present.

Please change the version.
The problem is still present with a local copy of a large file.
I also tried switching from LightDM to GDM, but that does not change anything.

Please give this bug some priority: having to hard-kill and restart the machine again and again just to be able to work is not sustainable.
Fedora 30 changed to end-of-life (EOL) status on 2020-05-26. Fedora 30 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. If you are unable to reopen this bug, please file a new report against the current release. If you experience problems, please add a comment to this bug. Thank you for reporting this bug and we are sorry it could not be fixed.
I filed a new bug as the problem still persists. https://bugzilla.redhat.com/show_bug.cgi?id=1843274