Description of problem:
/etc/fancontrol directly references the hwmon devices in /sys/class/hwmon. This doesn't work because these names are not persistent. For example, on this boot I have:

lrwxrwxrwx 1 root root 0 May 30 21:38 hwmon0 -> ../../devices/pci0000:00/0000:00:02.0/0000:01:00.0/hwmon/hwmon0
lrwxrwxrwx 1 root root 0 May 30 21:38 hwmon1 -> ../../devices/pci0000:00/0000:00:18.4/hwmon/hwmon1
lrwxrwxrwx 1 root root 0 May 30 21:38 hwmon2 -> ../../devices/pci0000:00/0000:00:18.3/hwmon/hwmon2
lrwxrwxrwx 1 root root 0 May 30 21:38 hwmon3 -> ../../devices/platform/f71882fg.1152/hwmon/hwmon3

Version-Release number of selected component (if applicable):

As you can see, hwmon1 is PCI 18.4 and hwmon2 is PCI 18.3. This is the wrong order, and after another boot they swap places randomly. This makes the fancontrol service fail _silently_, burning your HW.

How reproducible:
easily

Steps to Reproduce:
1. See the order of hwmons
2. Reboot a couple of times
3. See that the order has changed
4. Your fancontrol didn't start (or started using the wrong hwmon)

Actual results:
fancontrol fails to start (or uses the wrong hwmon)

Expected results:
fancontrol should _always_ work right, as it is an absolutely critical service.

Additional info:
I am not sure if it is an lm_sensors bug, or a kernel bug, or there are just no well-defined interfaces between the two, but this is quite a big problem, considering bugs like this one:
https://bugzilla.kernel.org/show_bug.cgi?id=119211
My HW is really burning constantly just because of things like that.
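To make step 1 of the reproduction steps easier to compare across boots, here is a minimal sketch that records which device each hwmonN entry currently points at. The function name `hwmon_map` and its `class_dir` parameter are hypothetical and not part of lm_sensors or fancontrol; the sketch only assumes that each entry in /sys/class/hwmon is a symlink ending in <device>/hwmon/hwmonN, as in the listing above.

```python
import os


def hwmon_map(class_dir="/sys/class/hwmon"):
    """Map each hwmonN entry to the device directory its symlink resolves to.

    Hypothetical helper for illustration only.
    """
    mapping = {}
    for entry in sorted(os.listdir(class_dir)):
        # Resolve the symlink, e.g. .../0000:00:18.4/hwmon/hwmon1
        target = os.path.realpath(os.path.join(class_dir, entry))
        # Drop the trailing "hwmon/hwmonN" to get the parent device path.
        mapping[entry] = os.path.dirname(os.path.dirname(target))
    return mapping


if __name__ == "__main__":
    for name, device in hwmon_map().items():
        print(name, "->", device)
```

Saving this output on two different boots and diffing it shows whether the numbering moved.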
Hello Stas. I believe this is a kernel bug. Changing the component.
(In reply to Jaromír Cápík from comment #1)
> Hello Stas.
>
> I believe this is a kernel bug. Changing the component.

There are 2 issues here:

1) fancontrol should never be a critical service. It may be used to make machines quieter in certain cases, but the kernel / firmware should always ensure that there is adequate cooling unless overridden by e.g. fancontrol. I agree that fancontrol being necessary at all is a kernel / amdgpu bug.

2) hwmon devices may get a different number between different boots. This is no different from how e.g. /dev/sda and /dev/sdb may get swapped every other boot. With modern hotpluggable / dynamically discovered buses there simply is no fixed order. The fancontrol script really ought to be fixed to handle this, and to be able to use e.g. some by-path way of specifying the hwmon device. Note that this may already be possible by specifying the device as /sys/bus/pci/devices/foo/hwmon instead of as /sys/class/hwmon/hwmon#; I'm not sure if fancontrol allows specifying a full sysfs path like this ...
OK Hans, I filed bug #1355881 for the kernel then. I'll try the full hwmon names. I believe I already tried, but I can't remember the reason why this didn't succeed.
Is there any way to tell systemd not to boot the system when one particular service has failed? I'm tired of constantly burning the HW.
(In reply to Hans de Goede from comment #2)
> 2) hwmon devices may get a different number between different boots, this is
> no different from how e.g.
> /dev/sda and /dev/sdb may get swapped every other boot, etc. With modern
> hotpluguable / dynamic discovery busses there simply is no fixed order, the
> fancontrol script really ought to be fixed to handle this, and to be able to
> use e.g some by-path way of specifying the hwmon device, note this may
> already be possible by specifying the device as
> /sys/bus/pci/devices/foo/hwmon instead of as /sys/class/hwmon/hwmon#, I'm

This doesn't work for the following reason:
/sys/bus/pci/devices/0000:00:18.4/hwmon/hwmon1
or:
/sys/bus/pci/devices/0000:00:18.4/hwmon/hwmon2

So while the bus numbers are persistent, the hwmon names still change. I think it is a bug; I don't see the need for hwmon/hwmonX naming.
(In reply to Stas Sergeev from comment #5)
> (In reply to Hans de Goede from comment #2)
> > 2) hwmon devices may get a different number between different boots, this is
> > no different from how e.g.
> > /dev/sda and /dev/sdb may get swapped every other boot, etc. With modern
> > hotpluguable / dynamic discovery busses there simply is no fixed order, the
> > fancontrol script really ought to be fixed to handle this, and to be able to
> > use e.g some by-path way of specifying the hwmon device, note this may
> > already be possible by specifying the device as
> > /sys/bus/pci/devices/foo/hwmon instead of as /sys/class/hwmon/hwmon#, I'm
> This doesn't work for the following reason:
> /sys/bus/pci/devices/0000:00:18.4/hwmon/hwmon1
> or:
> /sys/bus/pci/devices/0000:00:18.4/hwmon/hwmon2

Ah right, I was already afraid of that. I just double-checked and this is how all kernel classes work: you always get /sys/bus/pci/devices/0000:00:18.4/foo/foo# directory entries.

So it looks like fancontrol needs some work here, e.g. it could be modified to accept a wildcard in the sysfs path, or to take an incomplete path and append hwmon0, hwmon1, etc. itself until the first hit.

Re-assigning this to lm_sensors, as this part really is a fancontrol issue / shortcoming.
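For illustration, a minimal sketch of the "incomplete path" idea from the comment above: given a persistent device directory, look under its hwmon/ subdirectory for the hwmonN entry. This is not how fancontrol works today (fancontrol is a shell script); the function name `resolve_hwmon` is hypothetical. Unlike the "first hit" suggestion, this sketch assumes it is safer to refuse when there is not exactly one match than to guess.

```python
import os
import re


def resolve_hwmon(device_dir):
    """Return the single hwmonN directory under device_dir/hwmon.

    device_dir is a persistent sysfs device path, for example
    /sys/bus/pci/devices/0000:00:18.4 (the path quoted in this thread).
    Hypothetical helper; raises LookupError unless exactly one match exists.
    """
    hwmon_root = os.path.join(device_dir, "hwmon")
    try:
        entries = os.listdir(hwmon_root)
    except FileNotFoundError:
        entries = []
    # Keep only hwmon0, hwmon1, ... and ignore anything else in the directory.
    matches = sorted(e for e in entries if re.fullmatch(r"hwmon\d+", e))
    if len(matches) != 1:
        raise LookupError(
            f"expected exactly one hwmon under {device_dir}, found {len(matches)}"
        )
    return os.path.join(hwmon_root, matches[0])
```

A caller would pass the device path and then read or write the attribute files (pwm, temperature inputs) inside the returned directory, whatever number the kernel assigned on this boot.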
> So it looks like fancontrol needs some work here, e.g. it could be modified
> to
> accept a wildcard in the sysfs path, or to take an incomplete path and append
> hwmon0 hwmon1, etc. itself until the first hit.

Oh come on, isn't this too messy, difficult and unreliable? Why wouldn't the kernel just provide convenient interfaces? Like, for instance, symlinks:
/sys/bus/pci/devices/0000:00:18.4/hwmon/k10temp -> ./hwmon1

And in the meantime, I am sure band-aids are needed, like telling systemd not to boot the system if fancontrol failed, or the like. Burning the user's HW is completely unacceptable.
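In the absence of such symlinks, a user-space lookup by chip name can be sketched as follows. It relies on the `name` attribute file that hwmon class devices expose in sysfs; the function `find_hwmon_by_name` and its parameters are hypothetical and not part of lm_sensors. Note that a name is not guaranteed to be unique (several devices may use the same driver), so the sketch returns every match rather than picking one.

```python
import os
import re


def find_hwmon_by_name(chip_name, class_dir="/sys/class/hwmon"):
    """Return paths of all hwmonN entries whose `name` file equals chip_name.

    Hypothetical helper for illustration only.
    """
    found = []
    for entry in sorted(os.listdir(class_dir)):
        if not re.fullmatch(r"hwmon\d+", entry):
            continue
        try:
            with open(os.path.join(class_dir, entry, "name")) as fh:
                name = fh.read().strip()
        except OSError:
            # No readable name attribute; skip this entry.
            continue
        if name == chip_name:
            found.append(os.path.join(class_dir, entry))
    return found
```

If more than one path comes back, the result still has to be narrowed down by the device path, so this complements a by-path lookup rather than replacing it.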
(In reply to Stas Sergeev from comment #7)
> > So it looks like fancontrol needs some work here, e.g. it could be modified
> > to
> > accept a wildcard in the sysfs path, or to take an incomplete path and append
> > hwmon0 hwmon1, etc. itself until the first hit.
>
> Oh cmon, isn't this too messy, difficult and unreliable?
> Why not would the kernel just provide the convenient interfaces?
> Like, for instance, the symlinks:
> /sys/bus/pci/devices/0000:00:18.4/hwmon/k10temp -> ./hwmon1
>
> And in a mean time, I am sure the band-aids are needed,
> like telling systemd to not boot the system if fancontrol
> failed, or the like. Burning the user's HW is completely
> unacceptable.

Agreed that burning the hardware is unacceptable, but that really is a kernel bug. fancontrol is for power users who want fine-grained control over their fans; things should "just work" without fancontrol.

You will want to file a bug with the upstream amdgpu developers to get the underlying kernel problem fixed.
> Agreed that burning the hardware is unacceptable, but that really is
> a kernel bug,

There are multiple bugs here, as well as missing interfaces.

> fancontrol is for power users who want fine-grained control over
> their fans, things should "just work" without fancontrol.

So if the user didn't enable fancontrol, fine. But if he, for whatever reason, did, then it should become a critical service, at least for the time being. I simply can't think of any other band-aid to at least stop burning the HW right now. This can't wait forever, or can it?

> You want to file a bug with the upstream amdgpu developers to get the
> underlying kernel problem fixed.

But I already did, the URL is in comment #1
Hi,

(In reply to Stas Sergeev from comment #9)
> > You want to file a bug with the upstream amdgpu developers to get the
> > underlying kernel problem fixed.
>
> But I already did, the URL is in comment #1

The proper place to file kernel GPU driver bugs is (*):
https://bugs.freedesktop.org/enter_bug.cgi?product=DRI

Then choose amdgpu as the component. That way the right people will see it, and I'm sure they will see that this is a serious issue and work towards a proper fix.

Regards,

Hans

*) I know this is confusing, but that is the way it is.
> The proper place to file kernel gpu driver bugs is (*):

OK, done:
https://bugs.freedesktop.org/show_bug.cgi?id=96956
This message is a reminder that Fedora 24 is nearing its end of life. Approximately 2 (two) weeks from now Fedora will stop maintaining and issuing updates for Fedora 24. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as EOL if it remains open with a Fedora 'version' of '24'.

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora 24 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to change the 'version' to a later Fedora version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.
This bug appears to have been reported against 'rawhide' during the Fedora 35 development cycle. Changing version to 35.
This message is a reminder that Fedora Linux 35 is nearing its end of life. Fedora will stop maintaining and issuing updates for Fedora Linux 35 on 2022-12-13. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as EOL if it remains open with a 'version' of '35'. Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, change the 'version' to a later Fedora Linux version. Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora Linux 35 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora Linux, you are encouraged to change the 'version' to a later version prior to this bug being closed.
Fedora Linux 35 entered end-of-life (EOL) status on 2022-12-13. Fedora Linux 35 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora Linux please feel free to reopen this bug against that version. Note that the version field may be hidden. Click the "Show advanced fields" button if you do not see the version field. If you are unable to reopen this bug, please file a new report against an active release. Thank you for reporting this bug and we are sorry it could not be fixed.