Created attachment 1009330 [details] Kernel log of recursive error in radeon device driver module
Description of problem:
Recursive error in radeon device driver module after resume from hibernation. See attached kernel log for details.
Version-Release number of selected component (if applicable):
Linux 3.18.9-200.fc21.x86_64 #1 SMP Mon Mar 9 15:10:50 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux (and later)
Steps to reproduce:
1. Run X.Org X Server 1.16.3 with radeon X.Org driver 7.5.0 (plus GNOME 3.14 or whatever else you like) on an AMD A10-7800 Radeon R7.
2. Suspend to disk, e.g. with "systemctl hibernate".
3. Resume from hibernation.
Actual results:
The current login session is killed and the X.Org server falls back to AIGLX software rendering.
Expected results:
No error in radeon device driver module.
Additional info:
See attached cpuinfo and meminfo. Booting with default radeon kernel boot parameters.
Created attachment 1009331 [details] lspci
Created attachment 1009332 [details] cpuinfo
Created attachment 1009333 [details] meminfo
Created attachment 1009352 [details] iomem
Created attachment 1009353 [details] ioports
Why is this bug report linked to FreeDesktop.org bug 89829? As far as I can tell, these bugs have nothing in common.
Not sure why I linked this specific fdo bug report as it is clearly not the same bug (as you noticed as well). I think I saw a very similar bug report upstream but can't find it now. Anyhow: From my experience it's best if you file your bug report upstream if you are willing to bisect/test new patches from the AMD developers. This is very likely a genuine upstream bug and Fedora developers are usually busy fixing all the integration stuff.
Linked to bug report upstream.
This bug is obviously also linked to Linux kernel bugs 77181 and 60827. Mantas Mikulėnas has determined that git commit 4474f3a91f95 was the last known good commit. archiesix has determined that this bug has persisted ever since kernel version 3.9.11.
It's a pity that actually *users* have to do the digging for this kind of information. It's all there but kernel developers are obviously too tired or too lazy to do actual work after they have spent countless hours bragging about how genius they are in delivering fucked up work. If you can't do it, don't touch it.
Oh, and another "secret" has been revealed: The bug is caused by ring test failures. Wow! Who could have thought of that!?
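For anyone who wants to check a saved kernel log for the ring test failures mentioned above, here is a minimal sketch. It assumes the failure messages contain text of the form "ring N test failed"; that pattern, the function names, and the file name in the usage note are illustrative assumptions and are not taken from the attached log.

```python
import re
from typing import Iterable, List

# Assumption: failure lines contain text like "ring 0 test failed".
# Adjust the pattern to whatever the kernel log actually shows.
RING_TEST_FAILURE = re.compile(r"ring \d+ test failed", re.IGNORECASE)


def find_ring_test_failures(lines: Iterable[str]) -> List[str]:
    """Return the lines that match the assumed ring test failure pattern."""
    return [
        line.rstrip("\n")
        for line in lines
        if RING_TEST_FAILURE.search(line)
    ]


def scan_log_file(path: str) -> List[str]:
    """Read a saved kernel log from `path` and return the matching lines."""
    # errors="replace" keeps the scan going if the log has odd bytes in it.
    with open(path, encoding="utf-8", errors="replace") as log:
        return find_ring_test_failures(log)
```

Calling scan_log_file("kernel.log") (hypothetical file name) on a log saved after resume returns the matching lines, which can then be compared between a working and a failing kernel.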
Jakob: I think it's best not to assign this to Bugzilla's "kernel" component. At least the Intel bugs get assigned to the xorg-x11-... components because there are just too many "kernel" bugs in Fedora. I assume the AMD guys employ the same strategy.
> It's a pity that actually *users* have to do the digging for this kind of
> information. Its all there but kernel developers are obviously too tired or
> too lazy to do actual work after they have spent countless hours bragging
> about how genius they are in delivering fucked up work. If you can't do it,
> don't touch it.
I can understand your frustration, but I think you should take some extra time before writing such statements. https://getfedora.org/code-of-conduct
(In reply to Jacob Wisor from comment #9)
> This bug is obviously also linked to Linux kernel bugs 77181 and 60827.
> Mantas Mikulėnas has determined that git commit 4474f3a91f95 was the last
> known good to work. archiesix has determined that this bug persists even
> since kernel version 3.9.11.
>
> It's a pity that actually *users* have to do the digging for this kind of
> information. Its all there but kernel developers are obviously too tired or
> too lazy to do actual work after they have spent countless hours bragging
> about how genius they are in delivering fucked up work. If you can't do it,
> don't touch it.
> Oh and another "secret" has been revealed: The bug is caused by ring test
> failures. Wow! Who could have thought of that!?
You obviously assume that if you are hitting this bug, so must everyone else with the same hardware. That is a wrong assumption. Even within the same GPU family, each OEM (Asus, Sapphire, ...) customizes the video BIOS and selects different components for its boards, notably the memory chips. Add different motherboards, system memory, system BIOSes, system PCIe chipsets, ... to the mix and you end up with vastly different configurations, in which each element might trigger a bug that only happens with that specific configuration.
So if you are hitting a bug like this, it is likely because none of the devs are hitting it on their hardware. You might be unlucky or the devs might be lucky. But when it comes to hardware, you should not assume that a bug affecting one GPU of a family affects all GPUs of that family. It is a lot more complex.
(In reply to Felix Schwarz from comment #10)
> Jakob: I think it's best not to assign this to bugzilla's "kernel"
> component. At least the intel bugs get assigned to the xorg-x11-... because
> there are just too many "kernel" bugs in Fedora. I assume the AMD guys
> employ the same strategy.
I see, thank you for the info. I was compelled to move this bug back to the kernel component because it is a kernel-space bug, not a user-space bug. The xorg-x11-drv-ati component seems to be for the AMD driver module of the X11 server only, which of course runs in user space. But if, in this context, xorg-x11-drv-ati covers both the radeon kernel device driver module and the X11 server's AMD device driver module, then the bug should probably have stayed in xorg-x11-drv-ati. Do you want me to change it back?
> > It's a pity that actually *users* have to do the digging for this kind of
> > information. Its all there but kernel developers are obviously too tired or
> > too lazy to do actual work after they have spent countless hours bragging
> > about how genius they are in delivering fucked up work. If you can't do it,
> > don't touch it.
>
> I can understand your frustration but I think you should take some extra
> time before writing such statements. https://getfedora.org/code-of-conduct
Okay, fair enough. However, I have no means to verify or to know that my problem is being taken seriously. There are plenty of bug reports in both Bugzilla systems that have lingered for a long time. And to be honest, I do not believe they are all taken care of. I have worked with bug trackers myself and fixed a lot of bugs in the course of my work, even the hard interdependent ones. Many people said it was impossible or too much work, but it can be done. You just need to be persistent and work with the reporters, otherwise you have little chance of actually fixing anything. So please work with me. Give me some modified kernel package, whatever, but work with me.
If you do not have the exact hardware to replicate the error, then this situation calls even more for working with the reporter by testing code. For now, I have the feeling that we have not been doing much more than juggling a bug report around in Bugzilla.
(In reply to Jerome Glisse from comment #11)
> (In reply to Jacob Wisor from comment #9)
> > This bug is obviously also linked to Linux kernel bugs 77181 and 60827.
> > Mantas Mikulėnas has determined that git commit 4474f3a91f95 was the last
> > known good to work. archiesix has determined that this bug persists even
> > since kernel version 3.9.11.
> >
> > It's a pity that actually *users* have to do the digging for this kind of
> > information. Its all there but kernel developers are obviously too tired or
> > too lazy to do actual work after they have spent countless hours bragging
> > about how genius they are in delivering fucked up work. If you can't do it,
> > don't touch it.
> > Oh and another "secret" has been revealed: The bug is caused by ring test
> > failures. Wow! Who could have thought of that!?
>
> You obviously assume that if you are hitting this bug so must all other
> people with same hardware. Well that's a wrong assumption, even for a same
> family of GPU each of the OEM (Asus, Saphire, ...) customize the video bios
> and select different components for their board notably memory chip. Add
> different motherboard, system memory, system bios, system PCIE chipset, ...
> to the mix and you end up with vastly different configurations in which each
> elements might trigger a bug that only happen with this specific
> configurations.
I am not running on any OEM graphics hardware: there is no dedicated GPU, just the A10's integrated GPU. The only component that I can think of that could be adding any "magic sauce" here is the system BIOS. And since the system BIOS is usually involved in transitioning to the hibernation or suspend states, it may indeed be a problem. However, I doubt that very much, because hibernation and suspend work just fine on Windows (running the AMD graphics device driver) *and* when running in VESA mode on Linux.
So it is pretty easy to rule out a system BIOS bug, since that would require all of the aforementioned software to work around it equally. Hence, this bug can clearly only be attributed to the radeon kernel device driver module or to the dynamically loaded Kaveri GPU firmware, which comes with the kernel.
> So if you are hitting a bug such like this, it is likely because none of the
> dev are hitting it on their hardware. You might be unlucky or the dev might
> be lucky. But you should not assume that when it comes to the hardware, a
> bug affecting a GPU family affects all the GPU of that family. It is a lot
> more complex.
First of all, I neither assumed nor stated that this bug affects the entire family of GPUs. All I have said is that the A10's integrated GPU should have quite notable market penetration by now, so it should be relatively easy to find hardware for testing. Secondly, if this is a bug specific to a certain configuration only (which, again, I highly doubt), then why not work together with the person who is experiencing the bug instead of just shrugging one's shoulders and dispensing platonic pity about the fact that the person is affected? I am sorry, but so far I have seen no serious attempt to work with me.
This message is a reminder that Fedora 21 is nearing its end of life. Approximately 4 (four) weeks from now Fedora will stop maintaining and issuing updates for Fedora 21. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as EOL if it remains open with a Fedora 'version' of '21'.
Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version.
Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora 21 reached end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to change the 'version' to a later Fedora version before this bug is closed as described in the policy above.
Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.
Fedora 21 changed to end-of-life (EOL) status on 2015-12-01. Fedora 21 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. If you are unable to reopen this bug, please file a new report against the current release. If you experience problems, please add a comment to this bug. Thank you for reporting this bug and we are sorry it could not be fixed.