Description of problem:
After I upgraded from mesa-10.1.5-1.20140607.fc20 to mesa-10.3.3-1.20141110.fc20 yesterday, my system *very* often freezes when playing videos in firefox (flash or html5) - 3 times in the last 2 hours. I see this in the journal:

Nov 24 08:50:21 titan kernel: radeon 0000:01:00.0: GPU lockup (waiting for 0x00000000000224a9 last fence id 0x00000000000224a6 on ring 0)
Nov 24 08:50:21 titan kernel: radeon 0000:01:00.0: failed to get a new IB (-35)
Nov 24 08:50:21 titan kernel: [drm:radeon_cs_ib_fill] *ERROR* Failed to get ib !
Nov 24 08:50:22 titan kernel: radeon 0000:01:00.0: Saved 12107 dwords of commands on ring 0.
Nov 24 08:50:22 titan kernel: radeon 0000:01:00.0: GPU softreset: 0x0000006C
Nov 24 08:50:22 titan kernel: radeon 0000:01:00.0: GRBM_STATUS = 0xA0003028
Nov 24 08:50:22 titan kernel: radeon 0000:01:00.0: GRBM_STATUS_SE0 = 0x00000006
Nov 24 08:50:22 titan kernel: radeon 0000:01:00.0: GRBM_STATUS_SE1 = 0x00000006
Nov 24 08:50:22 titan kernel: radeon 0000:01:00.0: SRBM_STATUS = 0x200000C0
Nov 24 08:50:22 titan kernel: radeon 0000:01:00.0: SRBM_STATUS2 = 0x00000000
Nov 24 08:50:22 titan kernel: radeon 0000:01:00.0: R_008674_CP_STALLED_STAT1 = 0x00000000
Nov 24 08:50:22 titan kernel: radeon 0000:01:00.0: R_008678_CP_STALLED_STAT2 = 0x00010000
Nov 24 08:50:22 titan kernel: radeon 0000:01:00.0: R_00867C_CP_BUSY_STAT = 0x00400002
Nov 24 08:50:22 titan kernel: radeon 0000:01:00.0: R_008680_CP_STAT = 0x84010243
Nov 24 08:50:22 titan kernel: radeon 0000:01:00.0: R_00D034_DMA_STATUS_REG = 0x44C83146
Nov 24 08:50:22 titan kernel: radeon 0000:01:00.0: R_00D834_DMA_STATUS_REG = 0x44E84266
Nov 24 08:50:22 titan kernel: radeon 0000:01:00.0: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x00000000
Nov 24 08:50:22 titan kernel: radeon 0000:01:00.0: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x00000000
Nov 24 08:50:22 titan kernel: radeon 0000:01:00.0: GRBM_SOFT_RESET=0x0000DDFF
Nov 24 08:50:22 titan kernel: radeon 0000:01:00.0: SRBM_SOFT_RESET=0x00100140
Nov 24 08:50:22 titan kernel: radeon 0000:01:00.0: GRBM_STATUS = 0x00003028
Nov 24 08:50:22 titan kernel: radeon 0000:01:00.0: GRBM_STATUS_SE0 = 0x00000006
Nov 24 08:50:22 titan kernel: radeon 0000:01:00.0: GRBM_STATUS_SE1 = 0x00000006
Nov 24 08:50:22 titan kernel: radeon 0000:01:00.0: SRBM_STATUS = 0x200000C0
Nov 24 08:50:22 titan kernel: radeon 0000:01:00.0: SRBM_STATUS2 = 0x00000000
Nov 24 08:50:22 titan kernel: radeon 0000:01:00.0: R_008674_CP_STALLED_STAT1 = 0x00000000
Nov 24 08:50:22 titan kernel: radeon 0000:01:00.0: R_008678_CP_STALLED_STAT2 = 0x00000000
Nov 24 08:50:22 titan kernel: radeon 0000:01:00.0: R_00867C_CP_BUSY_STAT = 0x00000000
Nov 24 08:50:22 titan kernel: radeon 0000:01:00.0: R_008680_CP_STAT = 0x00000000
Nov 24 08:50:22 titan kernel: radeon 0000:01:00.0: R_00D034_DMA_STATUS_REG = 0x44C83D57
Nov 24 08:50:22 titan kernel: radeon 0000:01:00.0: R_00D834_DMA_STATUS_REG = 0x44C83D57
Nov 24 08:50:22 titan kernel: radeon 0000:01:00.0: GPU reset succeeded, trying to resume
Nov 24 08:50:22 titan kernel: [drm] probing gen 2 caps for device 8086:c01 = 261ad03/e
Nov 24 08:50:22 titan kernel: [drm] PCIE gen 3 link speeds already enabled
Nov 24 08:50:22 titan kernel: [drm] PCIE GART of 1024M enabled (table at 0x0000000000276000).
Nov 24 08:50:22 titan kernel: radeon 0000:01:00.0: WB enabled
Nov 24 08:50:22 titan kernel: radeon 0000:01:00.0: fence driver on ring 0 use gpu addr 0x0000000080000c00 and cpu addr 0xffff88022fef8c00
Nov 24 08:50:22 titan kernel: radeon 0000:01:00.0: fence driver on ring 1 use gpu addr 0x0000000080000c04 and cpu addr 0xffff88022fef8c04
Nov 24 08:50:22 titan kernel: radeon 0000:01:00.0: fence driver on ring 2 use gpu addr 0x0000000080000c08 and cpu addr 0xffff88022fef8c08
Nov 24 08:50:22 titan kernel: radeon 0000:01:00.0: fence driver on ring 3 use gpu addr 0x0000000080000c0c and cpu addr 0xffff88022fef8c0c
Nov 24 08:50:22 titan kernel: radeon 0000:01:00.0: fence driver on ring 4 use gpu addr 0x0000000080000c10 and cpu addr 0xffff88022fef8c10
Nov 24 08:50:22 titan kernel: radeon 0000:01:00.0: fence driver on ring 5 use gpu addr 0x0000000000075a18 and cpu addr 0xffffc900055b5a18
Nov 24 08:50:22 titan kernel: [drm] ring test on 0 succeeded in 4 usecs
Nov 24 08:50:22 titan kernel: [drm] ring test on 1 succeeded in 1 usecs
Nov 24 08:50:22 titan kernel: [drm] ring test on 2 succeeded in 1 usecs
Nov 24 08:50:22 titan kernel: [drm] ring test on 3 succeeded in 6 usecs
Nov 24 08:50:22 titan kernel: [drm] ring test on 4 succeeded in 5 usecs
Nov 24 08:50:23 titan kernel: [drm] ring test on 5 succeeded in 1 usecs
Nov 24 08:50:23 titan kernel: [drm] UVD initialized successfully.
Nov 24 08:50:33 titan kernel: radeon 0000:01:00.0: ring 0 stalled for more than 10000msec
Nov 24 08:50:33 titan kernel: radeon 0000:01:00.0: GPU lockup (waiting for 0x000000000002258d last fence id 0x00000000000224a6 on ring 0)
Nov 24 08:50:33 titan kernel: [drm:r600_ib_test] *ERROR* radeon: fence wait failed (-35).
Nov 24 08:50:33 titan kernel: [drm:radeon_ib_ring_tests] *ERROR* radeon: failed testing IB on GFX ring (-35).
Nov 24 08:50:33 titan kernel: radeon 0000:01:00.0: ib ring test failed (-35).
Nov 24 08:50:33 titan kernel: radeon 0000:01:00.0: GPU softreset: 0x00000048
Nov 24 08:50:33 titan kernel: radeon 0000:01:00.0: GRBM_STATUS = 0xA0003028
Nov 24 08:50:33 titan kernel: radeon 0000:01:00.0: GRBM_STATUS_SE0 = 0x00000006
Nov 24 08:50:33 titan kernel: radeon 0000:01:00.0: GRBM_STATUS_SE1 = 0x00000006
Nov 24 08:50:33 titan kernel: radeon 0000:01:00.0: SRBM_STATUS = 0x200000C0
Nov 24 08:50:33 titan kernel: radeon 0000:01:00.0: SRBM_STATUS2 = 0x00000000
Nov 24 08:50:33 titan kernel: radeon 0000:01:00.0: R_008674_CP_STALLED_STAT1 = 0x00000000
Nov 24 08:50:33 titan kernel: radeon 0000:01:00.0: R_008678_CP_STALLED_STAT2 = 0x00010000
Nov 24 08:50:33 titan kernel: radeon 0000:01:00.0: R_00867C_CP_BUSY_STAT = 0x00400002
Nov 24 08:50:33 titan kernel: radeon 0000:01:00.0: R_008680_CP_STAT = 0x84010243
Nov 24 08:50:33 titan kernel: radeon 0000:01:00.0: R_00D034_DMA_STATUS_REG = 0x44C83D57
Nov 24 08:50:33 titan kernel: radeon 0000:01:00.0: R_00D834_DMA_STATUS_REG = 0x44C83D57
Nov 24 08:50:33 titan kernel: radeon 0000:01:00.0: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x00000000
Nov 24 08:50:33 titan kernel: radeon 0000:01:00.0: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x00000000
Nov 24 08:50:33 titan kernel: radeon 0000:01:00.0: GRBM_SOFT_RESET=0x0000DDFF
Nov 24 08:50:33 titan kernel: radeon 0000:01:00.0: SRBM_SOFT_RESET=0x00000100
Nov 24 08:50:33 titan kernel: radeon 0000:01:00.0: GRBM_STATUS = 0x00003028
Nov 24 08:50:33 titan kernel: radeon 0000:01:00.0: GRBM_STATUS_SE0 = 0x00000006
Nov 24 08:50:33 titan kernel: radeon 0000:01:00.0: GRBM_STATUS_SE1 = 0x00000006
Nov 24 08:50:33 titan kernel: radeon 0000:01:00.0: SRBM_STATUS = 0x200000C0
Nov 24 08:50:33 titan kernel: radeon 0000:01:00.0: SRBM_STATUS2 = 0x00000000
Nov 24 08:50:33 titan kernel: radeon 0000:01:00.0: R_008674_CP_STALLED_STAT1 = 0x00000000
Nov 24 08:50:33 titan kernel: radeon 0000:01:00.0: R_008678_CP_STALLED_STAT2 = 0x00000000
Nov 24 08:50:33 titan kernel: radeon 0000:01:00.0: R_00867C_CP_BUSY_STAT = 0x00000000
Nov 24 08:50:33 titan kernel: radeon 0000:01:00.0: R_008680_CP_STAT = 0x00000000
Nov 24 08:50:33 titan kernel: radeon 0000:01:00.0: R_00D034_DMA_STATUS_REG = 0x44C83D57
Nov 24 08:50:33 titan kernel: radeon 0000:01:00.0: R_00D834_DMA_STATUS_REG = 0x44C83D57
Nov 24 08:50:33 titan kernel: radeon 0000:01:00.0: GPU reset succeeded, trying to resume
Nov 24 08:50:34 titan kernel: [drm] probing gen 2 caps for device 8086:c01 = 261ad03/e
Nov 24 08:50:34 titan kernel: [drm] PCIE gen 3 link speeds already enabled
Nov 24 08:50:34 titan kernel: [drm] PCIE GART of 1024M enabled (table at 0x0000000000276000).
Nov 24 08:50:34 titan kernel: radeon 0000:01:00.0: WB enabled
Nov 24 08:50:34 titan kernel: radeon 0000:01:00.0: fence driver on ring 0 use gpu addr 0x0000000080000c00 and cpu addr 0xffff88022fef8c00
Nov 24 08:50:34 titan kernel: radeon 0000:01:00.0: fence driver on ring 1 use gpu addr 0x0000000080000c04 and cpu addr 0xffff88022fef8c04
Nov 24 08:50:34 titan kernel: radeon 0000:01:00.0: fence driver on ring 2 use gpu addr 0x0000000080000c08 and cpu addr 0xffff88022fef8c08
Nov 24 08:50:34 titan kernel: radeon 0000:01:00.0: fence driver on ring 3 use gpu addr 0x0000000080000c0c and cpu addr 0xffff88022fef8c0c
Nov 24 08:50:34 titan kernel: radeon 0000:01:00.0: fence driver on ring 4 use gpu addr 0x0000000080000c10 and cpu addr 0xffff88022fef8c10
Nov 24 08:50:34 titan kernel: radeon 0000:01:00.0: fence driver on ring 5 use gpu addr 0x0000000000075a18 and cpu addr 0xffffc900055b5a18
Nov 24 08:50:34 titan kernel: [drm] ring test on 0 succeeded in 4 usecs
Nov 24 08:50:34 titan kernel: [drm] ring test on 1 succeeded in 1 usecs
Nov 24 08:50:34 titan kernel: [drm] ring test on 2 succeeded in 1 usecs
Nov 24 08:50:34 titan kernel: [drm] ring test on 3 succeeded in 5 usecs
Nov 24 08:50:34 titan kernel: [drm] ring test on 4 succeeded in 5 usecs
Nov 24 08:50:34 titan kernel: [drm] ring test on 5 succeeded in 1 usecs
Nov 24 08:50:34 titan kernel: [drm] UVD initialized successfully.
Nov 24 08:50:34 titan kernel: [drm] ib test on ring 0 succeeded in 0 usecs
Nov 24 08:50:34 titan kernel: [drm] ib test on ring 1 succeeded in 0 usecs
Nov 24 08:50:34 titan kernel: [drm] ib test on ring 2 succeeded in 0 usecs
Nov 24 08:50:34 titan kernel: [drm] ib test on ring 3 succeeded in 0 usecs
Nov 24 08:50:34 titan kernel: [drm] ib test on ring 4 succeeded in 1 usecs
Nov 24 08:50:44 titan kernel: radeon 0000:01:00.0: ring 5 stalled for more than 10000msec
Nov 24 08:50:44 titan kernel: radeon 0000:01:00.0: GPU lockup (waiting for 0x0000000000000004 last fence id 0x0000000000000002 on ring 5)
Nov 24 08:50:44 titan kernel: [drm:uvd_v1_0_ib_test] *ERROR* radeon: fence wait failed (-35).
Nov 24 08:50:44 titan kernel: [drm:radeon_ib_ring_tests] *ERROR* radeon: failed testing IB on ring 5 (-35).
Nov 24 08:50:44 titan kernel: switching from power state:
Nov 24 08:50:44 titan kernel: ui class: none
Nov 24 08:50:44 titan kernel: internal class: boot
Nov 24 08:50:44 titan kernel: caps:
Nov 24 08:50:44 titan kernel: uvd vclk: 0 dclk: 0
Nov 24 08:50:44 titan kernel: power level 0 sclk: 15000 mclk: 15000 vddc: 900 vddci: 950 pcie gen: 3
Nov 24 08:50:44 titan kernel: status: c b
Nov 24 08:50:44 titan kernel: switching to power state:
Nov 24 08:50:44 titan kernel: ui class: performance
Nov 24 08:50:44 titan kernel: internal class: none
Nov 24 08:50:44 titan kernel: caps:
Nov 24 08:50:44 titan kernel: uvd vclk: 0 dclk: 0
Nov 24 08:50:44 titan kernel: power level 0 sclk: 30000 mclk: 15000 vddc: 875 vddci: 850 pcie gen: 3
Nov 24 08:50:44 titan kernel: power level 1 sclk: 45000 mclk: 140000 vddc: 950 vddci: 1000 pcie gen: 3
Nov 24 08:50:44 titan kernel: power level 2 sclk: 90000 mclk: 140000 vddc: 1150 vddci: 1000 pcie gen: 3
Nov 24 08:50:44 titan kernel: power level 3 sclk: 95500 mclk: 140000 vddc: 1188 vddci: 1000 pcie gen: 3
Nov 24 08:50:44 titan kernel: status: r
Nov 24 08:50:44 titan kernel: radeon 0000:01:00.0: GPU fault detected: 146 0x0fc24404
Nov 24 08:50:44 titan kernel: radeon 0000:01:00.0: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x000152FE
Nov 24 08:50:44 titan kernel: radeon 0000:01:00.0: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x02044004
Nov 24 08:50:44 titan kernel: VM fault (0x04, vmid 1) at page 86782, read from TC (68)
Nov 24 08:50:46 titan kernel: radeon 0000:01:00.0: ring 5 stalled for more than 11940msec
Nov 24 08:50:46 titan kernel: radeon 0000:01:00.0: GPU lockup (waiting for 0x0000000000000003 last fence id 0x0000000000000002 on ring 5)
Nov 24 08:50:46 titan kernel: radeon 0000:01:00.0: failed to get a new IB (-35)
...

The display switches off and powers back on at regular intervals, only to show a black screen and power off again. I'm not able to recover from it, although sometimes I'm able to use the sysrq kill signal to get a working VT, and then reboot safely.

Downgrading back to mesa-10.1.5-1.20140607.fc20.x86_64 seems to fix the problem.

I'm using:
01:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Curacao PRO [Radeon R9 270] [1002:6811] (prog-if 00 [VGA controller])
    Subsystem: Micro-Star International Co., Ltd. [MSI] Device [1462:3050]
    Flags: bus master, fast devsel, latency 0, IRQ 28
    Memory at e0000000 (64-bit, prefetchable) [size=256M]
    Memory at f0000000 (64-bit, non-prefetchable) [size=256K]
    I/O ports at e000 [size=256]
    Expansion ROM at f0040000 [disabled] [size=128K]
    Capabilities: [48] Vendor Specific Information: Len=08 <?>
    Capabilities: [50] Power Management version 3
    Capabilities: [58] Express Legacy Endpoint, MSI 00
    Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+
    Capabilities: [100] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
    Capabilities: [150] Advanced Error Reporting
    Capabilities: [270] #19
    Capabilities: [2b0] Address Translation Service (ATS)
    Capabilities: [2c0] #13
    Capabilities: [2d0] #1b
    Kernel driver in use: radeon
    Kernel modules: radeon

Version-Release number of selected component (if applicable):
mesa-10.3.3-1.20141110.fc20
kernel-3.17.3-200.fc20.x86_64

How reproducible:
extremely often - many times per day

Steps to Reproduce:
1. open firefox and play some videos on youtube / other video sites
2. see computer hang and display power off
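
For anyone triaging a similar journal, the lockup lines can be pulled out with a short script. This is only a minimal sketch, assuming the message format matches the "GPU lockup (waiting for ... last fence id ... on ring N)" lines quoted in this report; the names `find_lockups` and `lockups_per_ring` are hypothetical helpers, not part of any radeon or systemd tooling.

```python
import re
from collections import Counter

# Assumed format, taken from the "GPU lockup" lines quoted in this report.
LOCKUP_RE = re.compile(
    r"GPU lockup \(waiting for (0x[0-9a-fA-F]+) "
    r"last fence id (0x[0-9a-fA-F]+) on ring (\d+)\)"
)


def find_lockups(lines):
    """Return a list of (ring, waited_fence, last_fence) for each lockup line."""
    hits = []
    for line in lines:
        match = LOCKUP_RE.search(line)
        if match:
            waited, last, ring = match.groups()
            hits.append((int(ring), int(waited, 16), int(last, 16)))
    return hits


def lockups_per_ring(lines):
    """Count lockup messages per ring number."""
    return Counter(ring for ring, _, _ in find_lockups(lines))


# Three lines copied from the journal excerpt in this report.
SAMPLE = [
    "Nov 24 08:50:21 titan kernel: radeon 0000:01:00.0: GPU lockup "
    "(waiting for 0x00000000000224a9 last fence id 0x00000000000224a6 on ring 0)",
    "Nov 24 08:50:21 titan kernel: radeon 0000:01:00.0: failed to get a new IB (-35)",
    "Nov 24 08:50:44 titan kernel: radeon 0000:01:00.0: GPU lockup "
    "(waiting for 0x0000000000000004 last fence id 0x0000000000000002 on ring 5)",
]

if __name__ == "__main__":
    for ring, waited, last in find_lockups(SAMPLE):
        # The difference is how far the awaited fence is ahead of the last signalled one.
        print(f"ring {ring}: waiting for fence {waited:#x}, last signalled {last:#x}")
```

In this report, ring 0 is the one the kernel calls the GFX ring and ring 5 is the one whose IB test is run by `uvd_v1_0_ib_test`, so grouping by ring separates the graphics lockups from the UVD ones.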
Is there any hope of this being fixed in the foreseeable future? At the moment, I have to avoid pulling mesa updates from the F20 updates repo.
I'll check tomorrow. Sorry for the late response.
10.3.4 should fix this issue. I'll prepare a new version.
http://cgit.freedesktop.org/mesa/mesa/commit/?h=10.3&id=f02f0559c69daae6ca73e72d32dc329fcb2fd316
Please test this build:
http://koji.fedoraproject.org/koji/buildinfo?buildID=596516
Hi, Igor, 10.3.4 is looking good! :-) In an hour, I haven't seen a single system freeze. I'll continue testing a bit more and report back. Thanks for the update.
mesa-10.3.4-1.20141202.fc20 has been submitted as an update for Fedora 20. https://admin.fedoraproject.org/updates/mesa-10.3.4-1.20141202.fc20
mesa-10.3.4-1.20141202.fc21 has been submitted as an update for Fedora 21. https://admin.fedoraproject.org/updates/mesa-10.3.4-1.20141202.fc21
(In reply to Kamil Páral from comment #5)
> Hi, Igor, 10.3.4 is looking good! :-) In an hour, I haven't seen a single
> system freeze. I'll continue testing a bit more and report back. Thanks for
> the update.

Cool! Would be good if we will ship new mesa with f21 release.
(In reply to Igor Gnatenko from comment #8)
> Cool! Would be good if we will ship new mesa with f21 release.

Unfortunately that's not really likely. We're frozen now and hopefully today's RC2 is the last release candidate for f21. The update would end up in 0-day updates, though, if it earns enough karma.
One more thing, could you please set a higher limit on karma auto-push for those bodhi updates? We don't want to repeat the last experience, when the update was accepted in a day, and therefore not tested properly. Ideally, I would completely turn off karma auto-push and manually inspect the updates in a week and push them if the feedback is good, but I understand that requires more time from you as package maintainers. So just speaking with my QA hat on. But at least increased auto-push limits would be nice, since this is a core system component. Thanks.
Package mesa-10.3.4-1.20141202.fc21:
* should fix your issue,
* was pushed to the Fedora 21 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing mesa-10.3.4-1.20141202.fc21'
as soon as you are able to, then reboot.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2014-16183/mesa-10.3.4-1.20141202.fc21
then log in and leave karma (feedback).
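
To check whether a given build already carries the fix, the version can be read out of a package string like the ones quoted in this report. The following is a rough sketch only: it assumes the usual name-version-release layout and purely numeric dotted versions, and it is not RPM's own version comparison algorithm. The helper names `parse_nvr`, `version_tuple` and `has_fix` are hypothetical.

```python
def parse_nvr(nvr):
    """Split 'name-version-release' from the right, so names with dashes still work."""
    name, version, release = nvr.rsplit("-", 2)
    return name, version, release


def version_tuple(version):
    """Turn a dotted numeric version such as '10.3.4' into (10, 3, 4)."""
    return tuple(int(part) for part in version.split("."))


def has_fix(nvr, fixed_version="10.3.4"):
    """Return True if the version in nvr is at least fixed_version.

    Simplification: only the version field is compared; the release
    field and any epoch are ignored.
    """
    _, version, _ = parse_nvr(nvr)
    return version_tuple(version) >= version_tuple(fixed_version)


if __name__ == "__main__":
    for nvr in (
        "mesa-10.1.5-1.20140607.fc20",
        "mesa-10.3.3-1.20141110.fc20",
        "mesa-10.3.4-1.20141202.fc21",
    ):
        print(nvr, "has fix:", has_fix(nvr))
```

Note that 10.1.5 is reported as lacking the fix simply because it is older than 10.3.4; per this report it did not show the freezes in the first place, so the check is only meaningful for builds from the 10.3 series onward.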
After a few more days, I can confirm I see no freezing while playing videos. Thanks.
mesa-10.3.5-1.20141207.fc21 has been pushed to the Fedora 21 stable repository. If problems still persist, please make note of it in this bug report.