Bug 1941883
Summary: | Idle Radeon RX 550 has a very fast fan and high temperatures under 5.11.7-200.fc33.x86_64 | |
---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Chris Siebenmann <cks-rhbugzilla>
Component: | kernel | Assignee: | Kernel Maintainer List <kernel-maint>
Status: | CLOSED EOL | QA Contact: | Fedora Extras Quality Assurance <extras-qa>
Severity: | unspecified | Docs Contact: |
Priority: | unspecified | |
Version: | 34 | CC: | acaringi, adscvr, airlied, alciregi, bskeggs, hdegoede, jarodwilson, jeremy, jglisse, jonathan, josef, kernel-maint, lgoncalv, linville, masami256, mchehab, ngompa13, ptalbert, steved
Target Milestone: | --- | |
Target Release: | --- | |
Hardware: | Unspecified | |
OS: | Unspecified | |
Whiteboard: | | |
Fixed In Version: | | Doc Type: | If docs needed, set a value
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2022-06-07 22:28:31 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Attachments: | | |

Description
Chris Siebenmann
2021-03-23 02:21:48 UTC
I had an opportunity to inspect the physical machine today and it turns out that the fans are not running at all, despite what appears in hwmon/hwmon2/fan1_input (and is reported by 'sensors' from lm_sensors). hwmon/hwmon2/fan1_enable is 0, but setting it to '1' does nothing.

It appears that the fan doesn't turn on at all, even under high load and high temperatures. I ran a GPU benchmark that raised GPU temperatures to over 80 C and the fans were still not active. On 5.10.23, fan RPMs rise to a reported 1500 RPM by the time the GPU hits 64 C (and the listed GPU power consumption is only slightly under what lm_sensors lists as the cap).

This issue is still present in the just-released 5.11.8-200.fc33.x86_64 kernel.

This issue is still present in the just-released 5.11.9-200.fc33.x86_64 kernel.

This issue is still present in the just-released 5.11.10-200.fc33.x86_64 kernel.

This issue is still present in the just-released 5.11.11-200.fc33.x86_64 kernel.

This issue is still present in the just-released 5.11.12-200.fc33.x86_64 kernel.

This issue is still present in the just-released 5.11.20-200.fc33.x86_64 (and has been present in the few intermediate kernels I also checked).

This issue is still present in the just-released 5.12.6-200.fc33.x86_64 kernel.
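
For anyone trying to reproduce the mismatch between the reported fan speed and what the physical fan is doing, the hwmon attributes named above can be read with a short script. This is only a minimal sketch: the function names are made up for illustration, and temp1_input (the standard hwmon temperature attribute, in millidegrees Celsius) is an assumption, since this report only quotes temperatures as shown by 'sensors'.

```python
from pathlib import Path


def read_hwmon_int(hwmon_dir, name):
    """Return one hwmon attribute as an int, or None if it is missing or unreadable."""
    try:
        return int((Path(hwmon_dir) / name).read_text().strip())
    except (OSError, ValueError):
        return None


def fan_snapshot(hwmon_dir):
    """Collect the fan readings discussed in this report from one hwmon directory."""
    # Assumption: the GPU temperature is exposed as temp1_input, in millidegrees C.
    temp_milli_c = read_hwmon_int(hwmon_dir, "temp1_input")
    return {
        "fan1_input": read_hwmon_int(hwmon_dir, "fan1_input"),    # reported RPM
        "fan1_enable": read_hwmon_int(hwmon_dir, "fan1_enable"),  # 0 on the affected machine
        "temp_c": None if temp_milli_c is None else temp_milli_c / 1000.0,
    }
```

Pointing fan_snapshot() at the card's hwmon directory shows what the kernel reports; as described above, that reported RPM did not match the physically stopped fan on the affected kernels, so a physical check is still needed.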
Examining boot time messages between 5.10 (working) and 5.11 and 5.12 (not), the 5.11 and 5.12 kernels report:

amdgpu 0000:0a:00.0: amdgpu: Using BACO for runtime pm

The 5.10 kernel(s) also report values for clocks from DM PPLIB, while 5.12 and 5.11 don't:

hawkwind.cs kernel: [drm] DM_PPLIB: values for Engine clock
hawkwind.cs kernel: [drm] DM_PPLIB: 214000
hawkwind.cs kernel: [drm] DM_PPLIB: 551000
hawkwind.cs kernel: [drm] DM_PPLIB: 734000
hawkwind.cs kernel: [drm] DM_PPLIB: 980000
hawkwind.cs kernel: [drm] DM_PPLIB: 1046000
hawkwind.cs kernel: [drm] DM_PPLIB: 1098000
hawkwind.cs kernel: [drm] DM_PPLIB: 1124000
hawkwind.cs kernel: [drm] DM_PPLIB: 1206000
hawkwind.cs kernel: [drm] DM_PPLIB: Validation clocks:
hawkwind.cs kernel: [drm] DM_PPLIB: engine_max_clock: 120600
hawkwind.cs kernel: [drm] DM_PPLIB: memory_max_clock: 175000
hawkwind.cs kernel: [drm] DM_PPLIB: level : 8
hawkwind.cs kernel: [drm] DM_PPLIB: values for Memory clock
hawkwind.cs kernel: [drm] DM_PPLIB: 300000
hawkwind.cs kernel: [drm] DM_PPLIB: 625000
hawkwind.cs kernel: [drm] DM_PPLIB: 1750000
hawkwind.cs kernel: [drm] DM_PPLIB: Validation clocks:
hawkwind.cs kernel: [drm] DM_PPLIB: engine_max_clock: 120600
hawkwind.cs kernel: [drm] DM_PPLIB: memory_max_clock: 175000
hawkwind.cs kernel: [drm] DM_PPLIB: level : 8

5.12 and 5.10 report different DRM display core initialization versions:

[drm] Display Core initialized with v3.2.122!

5.10 reports v3.2.104.

More poking in /sys and some (remote) experiments have revealed that setting pwm1_enable to 1 and writing a suitable non-zero value to pwm1 in /sys/devices/pci0000:00/0000:00:03.1/0000:0a:00.0/hwmon/hwmon2 will cause the fan to apparently spin up and the card to cool down. On 5.12, pwm1_enable's normal value is 2, but pwm1 itself sticks at zero, instead of the '81' that it normally is on 5.10.
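
The manual workaround just described (pwm1_enable set to 1, then a non-zero pwm1) could be scripted roughly as follows. This is a hedged sketch rather than a tested fix: the function names are hypothetical, the 0 to 255 range is the generic hwmon PWM convention rather than something stated in this report, and writing to these files on a real system requires root.

```python
from pathlib import Path


def set_manual_fan(hwmon_dir, pwm_value):
    """Switch the fan to manual control and apply a fixed PWM duty value."""
    # Assumption: pwm1 uses the usual hwmon 0-255 scale.
    if not 0 <= pwm_value <= 255:
        raise ValueError("pwm_value must be between 0 and 255")
    hwmon = Path(hwmon_dir)
    (hwmon / "pwm1_enable").write_text("1\n")   # 1 = manual control
    (hwmon / "pwm1").write_text(f"{pwm_value}\n")


def restore_auto_fan(hwmon_dir):
    """Hand fan control back to the driver (pwm1_enable's normal value of 2)."""
    (Path(hwmon_dir) / "pwm1_enable").write_text("2\n")
```

On the machine in this report, hwmon_dir would be /sys/devices/pci0000:00/0000:00:03.1/0000:0a:00.0/hwmon/hwmon2, and 81 (the pwm1 value seen on 5.10) is one plausible starting value. The hwmon number can change between boots, so the path should be checked before writing.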
Changing pwm1_enable back to 2 after it was set to 1 (and pwm1 set to something) on 5.12.6 causes pwm1 to shift rapidly around in a range between 94 and 127 (so far) and the reported GPU temperature to hold steady around 30 C (which is somewhat cooler than 5.10 was holding the card; at the moment that was about 32 C, up from 28 C presumably due to summer heat arriving here and the ambient office temperature going up).

> 5. Does this problem occur with the latest Rawhide kernel? To install the
> Rawhide kernel, run ``sudo dnf install fedora-repos-rawhide`` followed by
> ``sudo dnf update --enablerepo=rawhide kernel``:
>
> Have not tested. Sorry, I'm not running Rawhide kernels on a machine I need to work.

This makes things a bit more difficult, because right now, the 5.13 rc kernels are in Rawhide and knowing whether this is still broken in an RC kernel is valuable so that it can be looked at to be fixed during this kernel cycle and backported to stable kernels. And if it's fixed in 5.13, then at least there's that as an option too.

I tried to quickly test a Rawhide kernel, but discovered that OpenZFS isn't compatible with 5.13-rc at this point (its work for even 5.12 is still somewhat in progress in git tip). Since much of my data storage is in ZFS pools, I cannot even start to reboot my office machine remotely without ZFS available (at the moment and for the likely future we are not in the office).

This message is a reminder that Fedora 33 is nearing its end of life. Fedora will stop maintaining and issuing updates for Fedora 33 on 2021-11-30. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as EOL if it remains open with a Fedora 'version' of '33'.

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora 33 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to change the 'version' to a later Fedora version before this bug is closed, as described in the policy above.

Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.

This continues to be the case on Fedora 34 with kernels up to 5.14.15-200.fc34.x86_64. I've updated this to be a Fedora 34 bug.

This message is a reminder that Fedora Linux 34 is nearing its end of life. Fedora will stop maintaining and issuing updates for Fedora Linux 34 on 2022-06-07. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as EOL if it remains open with a 'version' of '34'.

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, change the 'version' to a later Fedora Linux version.

Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora Linux 34 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora Linux, you are encouraged to change the 'version' to a later version prior to this bug being closed.

Fedora Linux 34 entered end-of-life (EOL) status on 2022-06-07.
Fedora Linux 34 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. If you are unable to reopen this bug, please file a new report against the current release.

Thank you for reporting this bug and we are sorry it could not be fixed.