Bug 1340949 - fancontrol unusable because of non-persistent hwmon
Status: CLOSED EOL
Product: Fedora
Classification: Fedora
Component: lm_sensors
Version: 35
Assignee: aegorenk
QA Contact: Fedora Extras Quality Assurance
Reported: 2016-05-30 20:45 UTC by Stas Sergeev
Modified: 2022-12-13 15:11 UTC (History)
Last Closed: 2022-12-13 15:11:46 UTC
Type: Bug

Description Stas Sergeev 2016-05-30 20:45:41 UTC
Description of problem:
/etc/fancontrol directly references the hwmons in
/sys/class/hwmon.
This doesn't work because these names are not persistent.
For example, on this boot I have:

lrwxrwxrwx 1 root root 0 May 30 21:38 hwmon0 -> ../../devices/pci0000:00/0000:00:02.0/0000:01:00.0/hwmon/hwmon0
lrwxrwxrwx 1 root root 0 May 30 21:38 hwmon1 -> ../../devices/pci0000:00/0000:00:18.4/hwmon/hwmon1
lrwxrwxrwx 1 root root 0 May 30 21:38 hwmon2 -> ../../devices/pci0000:00/0000:00:18.3/hwmon/hwmon2
lrwxrwxrwx 1 root root 0 May 30 21:38 hwmon3 -> ../../devices/platform/f71882fg.1152/hwmon/hwmon3

As you can see, hwmon1 is PCI 18.4 and hwmon2 is PCI 18.3.
This is the wrong order, and on another boot they swap
places at random. This makes the fancontrol service fail
_silently_, burning your hardware.

Version-Release number of selected component (if applicable):

How reproducible:
easily

Steps to Reproduce:
1. Note the order of the hwmons
2. Reboot a couple of times
3. See that the order has changed
4. fancontrol fails to start (or starts using the wrong hwmon)

Actual results:
fancontrol fails to start (or uses the wrong hwmon)

Expected results:
fancontrol should _always_ work right, as it is an
absolutely critical service.

Additional info:
I am not sure whether this is an lm_sensors bug or a kernel
bug, or whether there are just no well-defined interfaces between
the two, but it is quite a big problem, considering
bugs like this one:
https://bugzilla.kernel.org/show_bug.cgi?id=119211
My hardware really is overheating constantly just because
of things like that.

Comment 1 Jaromír Cápík 2016-07-11 19:20:01 UTC
Hello Stas.

I believe this is a kernel bug. Changing the component.

Comment 2 Hans de Goede 2016-07-11 21:28:46 UTC
(In reply to Jaromír Cápík from comment #1)
> Hello Stas.
> 
> I believe this is a kernel bug. Changing the component.

There are 2 issues here:

1) fancontrol should never be a critical service. It may be used to make machines quieter in certain cases, but the kernel / firmware should always ensure adequate cooling unless overridden by e.g. fancontrol. I agree that fancontrol being necessary at all is a kernel / amdgpu bug.

2) hwmon devices may get a different number on different boots. This is no different from how e.g. /dev/sda and /dev/sdb may get swapped every other boot. With modern hotpluggable / dynamically discovered buses there simply is no fixed order. The fancontrol script really ought to be fixed to handle this and to support e.g. some by-path way of specifying the hwmon device. Note that this may already be possible by specifying the device as /sys/bus/pci/devices/foo/hwmon instead of /sys/class/hwmon/hwmon#; I'm not sure whether fancontrol allows specifying a full sysfs path like this ...
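
To illustrate the by-path idea, here is a minimal Python sketch that maps each hwmon entry to the device it belongs to, so that a configuration could be keyed on the device path rather than on the hwmon number. It assumes the `device` symlink that sysfs normally provides inside each hwmonN directory; the function name `hwmon_by_device` is hypothetical and nothing here is part of fancontrol.

```python
import os


def hwmon_by_device(class_dir="/sys/class/hwmon"):
    """Map each hwmon entry's parent device path to its current hwmonN name.

    Hypothetical helper for illustration only; not part of lm_sensors.
    """
    mapping = {}
    if not os.path.isdir(class_dir):
        return mapping  # no hwmon class directory on this system
    for entry in sorted(os.listdir(class_dir)):
        device_link = os.path.join(class_dir, entry, "device")
        if not os.path.exists(device_link):
            continue  # some hwmon entries have no parent device link
        # realpath() follows the symlinks down to the stable device path
        mapping[os.path.realpath(device_link)] = entry
    return mapping


if __name__ == "__main__":
    for device, name in hwmon_by_device().items():
        print(f"{name}\t{device}")
```

The device paths printed on the left-hand side of this mapping stay the same across boots, while the hwmonN values on the right-hand side are the part that may change.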

Comment 3 Stas Sergeev 2016-07-12 19:40:10 UTC
OK Hans, I filed bug #1355881 for the kernel then.
I'll try the full hwmon names. I believe I already
tried, but I can't remember why it
didn't succeed.

Comment 4 Stas Sergeev 2016-07-13 21:03:29 UTC
Is there any way to tell systemd not to boot
the system when one particular service has failed?
I am tired of constantly burning the HW.

Comment 5 Stas Sergeev 2016-07-13 21:16:00 UTC
(In reply to Hans de Goede from comment #2)
> 2) hwmon devices may get a different number between different boots, this is
> no different from how e.g.
> /dev/sda and /dev/sdb may get swapped every other boot, etc. With modern
> hotpluguable / dynamic discovery busses there simply is no fixed order, the
> fancontrol script really ought to be fixed to handle this, and to be able to
> use e.g some by-path way of specifying the hwmon device, note this may
> already be possible by specifying the device as
> /sys/bus/pci/devices/foo/hwmon instead of as /sys/class/hwmon/hwmon#, I'm
This doesn't work, because the same device shows up on one boot as:
/sys/bus/pci/devices/0000:00:18.4/hwmon/hwmon1
and on another boot as:
/sys/bus/pci/devices/0000:00:18.4/hwmon/hwmon2

So while the bus numbers are persistent, the hwmon
names still change. I think this is a bug; I don't
see the need for hwmon/hwmonX naming.

Comment 6 Hans de Goede 2016-07-14 10:22:09 UTC
(In reply to Stas Sergeev from comment #5)
> (In reply to Hans de Goede from comment #2)
> > 2) hwmon devices may get a different number between different boots, this is
> > no different from how e.g.
> > /dev/sda and /dev/sdb may get swapped every other boot, etc. With modern
> > hotpluguable / dynamic discovery busses there simply is no fixed order, the
> > fancontrol script really ought to be fixed to handle this, and to be able to
> > use e.g some by-path way of specifying the hwmon device, note this may
> > already be possible by specifying the device as
> > /sys/bus/pci/devices/foo/hwmon instead of as /sys/class/hwmon/hwmon#, I'm
> This doesn't work for the following reason:
> /sys/bus/pci/devices/0000:00:18.4/hwmon/hwmon1
> or:
> /sys/bus/pci/devices/0000:00:18.4/hwmon/hwmon2

Ah right, I was afraid of that. I just double-checked, and this is how all kernel classes work:
you always get /sys/bus/pci/devices/0000:00:18.4/foo/foo# directory entries.

So it looks like fancontrol needs some work here, e.g. it could be modified to
accept a wildcard in the sysfs path, or to take an incomplete path and append
hwmon0, hwmon1, etc. itself until the first hit.
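
The wildcard variant could look roughly like the following Python sketch. It is an illustration under assumptions, not fancontrol code: the helper name `resolve_hwmon` is hypothetical, and instead of taking the first hit it refuses to guess when zero or several hwmon directories match.

```python
import glob
import os


def resolve_hwmon(device_dir):
    """Return the single hwmonN directory below a device, whatever N is.

    Hypothetical helper for illustration; raises LookupError when the
    match is missing or ambiguous rather than silently picking one.
    """
    # glob.escape() keeps any special characters in the device path literal
    pattern = os.path.join(glob.escape(device_dir), "hwmon", "hwmon[0-9]*")
    matches = sorted(glob.glob(pattern))
    if len(matches) != 1:
        raise LookupError(
            f"expected exactly one hwmon under {device_dir}, found {len(matches)}"
        )
    return matches[0]
```

With this, a configuration would only need to store the persistent part, e.g. /sys/bus/pci/devices/0000:00:18.4, and the hwmonN suffix would be looked up at startup. Failing loudly on an ambiguous match is a deliberate choice here, since a silently wrong sensor is the problem this bug is about.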

Reassigning this to lm_sensors, as this part really is a fancontrol issue / shortcoming.

Comment 7 Stas Sergeev 2016-07-14 20:17:49 UTC
> So it looks like fancontrol needs some work here, e.g. it could be modified
> to
> accept a wildcard in the sysfs path, or to take an incomplete path and append
> hwmon0 hwmon1, etc. itself until the first hit.

Oh come on, isn't this too messy, difficult and unreliable?
Why wouldn't the kernel just provide convenient interfaces?
Like, for instance, the symlinks:
/sys/bus/pci/devices/0000:00:18.4/hwmon/k10temp -> ./hwmon1
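
In the absence of such a symlink, a lookup by chip name can be approximated in user space. The Python sketch below assumes the `name` attribute file that hwmon drivers normally expose in each hwmonN directory; the function `find_hwmon_by_name` is hypothetical. It returns every match, because several devices may report the same chip name.

```python
import os


def find_hwmon_by_name(chip_name, class_dir="/sys/class/hwmon"):
    """Return all hwmon directories whose 'name' attribute equals chip_name.

    Hypothetical helper for illustration only.
    """
    matches = []
    if not os.path.isdir(class_dir):
        return matches
    for entry in sorted(os.listdir(class_dir)):
        name_file = os.path.join(class_dir, entry, "name")
        try:
            with open(name_file) as fh:
                if fh.read().strip() == chip_name:
                    matches.append(os.path.join(class_dir, entry))
        except OSError:
            continue  # entry without a readable name attribute
    return matches
```

For example, find_hwmon_by_name("k10temp") would return the current hwmonN directory of the k10temp sensor regardless of its number on this boot. A caller still has to decide what to do when the list is empty or has more than one element.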

And in the meantime, I am sure band-aids are needed,
like telling systemd not to boot the system if fancontrol
failed, or the like. Burning the user's HW is completely
unacceptable.

Comment 8 Hans de Goede 2016-07-15 09:39:27 UTC
(In reply to Stas Sergeev from comment #7)
> > So it looks like fancontrol needs some work here, e.g. it could be modified
> > to
> > accept a wildcard in the sysfs path, or to take an incomplete path and append
> > hwmon0 hwmon1, etc. itself until the first hit.
> 
> Oh cmon, isn't this too messy, difficult and unreliable?
> Why not would the kernel just provide the convenient interfaces?
> Like, for instance, the symlinks:
> /sys/bus/pci/devices/0000:00:18.4/hwmon/k10temp -> ./hwmon1
> 
> And in a mean time, I am sure the band-aids are needed,
> like telling systemd to not boot the system if fancontrol
> failed, or the like. Burning the user's HW is completely
> unacceptable.

Agreed that burning the hardware is unacceptable, but that really is a kernel bug. fancontrol is for power users who want fine-grained control over their fans; things should "just work" without fancontrol.

You should file a bug with the upstream amdgpu developers to get the underlying kernel problem fixed.

Comment 9 Stas Sergeev 2016-07-15 21:12:32 UTC
> Agreed that burning the hardware is unacceptable, but that really is
> a kernel bug,

There are multiple bugs here, as well as the missing interfaces.

> fancontrol is for power users who want fine-grained control over
> their fans, things should "just work" without fancontrol.

If the user didn't enable fancontrol, fine.
But if he did, for whatever reason, it should become a
critical service, at least for the time being. I simply can't
think of any other band-aid to at least stop burning the HW
right now. This can't wait forever, or can it?

> You want to file a bug with the upstream amdgpu developers to get the
> underlying kernel problem fixed.

But I already did, the URL is in the description (comment #0)

Comment 10 Hans de Goede 2016-07-16 10:35:42 UTC
Hi,

(In reply to Stas Sergeev from comment #9)
> > You want to file a bug with the upstream amdgpu developers to get the
> > underlying kernel problem fixed.
> 
> But I already did, the URL is in comment #1

The proper place to file kernel gpu driver bugs is (*):

https://bugs.freedesktop.org/enter_bug.cgi?product=DRI

And then choose amdgpu as the component. That way the right people will see it, and I'm sure they will see that this is a serious issue and work towards a proper fix.

Regards,

Hans


*) I know this is confusing, but that is the way it is.

Comment 11 Stas Sergeev 2016-07-16 13:28:05 UTC
> The proper place to file kernel gpu driver bugs is (*):

OK, done:
https://bugs.freedesktop.org/show_bug.cgi?id=96956

Comment 12 Fedora End Of Life 2017-07-25 20:56:55 UTC
This message is a reminder that Fedora 24 is nearing its end of life.
Approximately 2 (two) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 24. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as EOL if it remains open with a Fedora 'version'
of '24'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version'
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not
able to fix it before Fedora 24 is end of life. If you would still like
to see this bug fixed and are able to reproduce it against a later version
of Fedora, you are encouraged to change the 'version' to a later Fedora
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's
lifetime, sometimes those efforts are overtaken by events. Often a
more recent Fedora release includes newer upstream software that fixes
bugs or makes them obsolete.

Comment 13 Fedora End Of Life 2017-11-16 19:54:12 UTC
This message is a reminder that Fedora 25 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 25. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as EOL if it remains open with a Fedora 'version'
of '25'.

Comment 14 Fedora End Of Life 2018-05-03 08:40:39 UTC
This message is a reminder that Fedora 26 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 26. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as EOL if it remains open with a Fedora 'version'
of '26'.

Comment 15 Ben Cotton 2018-11-27 15:53:36 UTC
This message is a reminder that Fedora 27 is nearing its end of life.
On 2018-Nov-30 Fedora will stop maintaining and issuing updates for
Fedora 27. It is Fedora's policy to close all bug reports from releases
that are no longer maintained. At that time this bug will be closed as
EOL if it remains open with a Fedora 'version' of '27'.

Comment 16 Ben Cotton 2019-10-31 20:26:08 UTC
This message is a reminder that Fedora 29 is nearing its end of life.
Fedora will stop maintaining and issuing updates for Fedora 29 on 2019-11-26.
It is Fedora's policy to close all bug reports from releases that are no longer
maintained. At that time this bug will be closed as EOL if it remains open with a
Fedora 'version' of '29'.

Comment 17 Ben Cotton 2020-11-03 14:57:56 UTC
This message is a reminder that Fedora 31 is nearing its end of life.
Fedora will stop maintaining and issuing updates for Fedora 31 on 2020-11-24.
It is Fedora's policy to close all bug reports from releases that are no longer
maintained. At that time this bug will be closed as EOL if it remains open with a
Fedora 'version' of '31'.

Comment 18 Fedora Program Management 2021-04-29 15:52:32 UTC
This message is a reminder that Fedora 32 is nearing its end of life.
Fedora will stop maintaining and issuing updates for Fedora 32 on 2021-05-25.
It is Fedora's policy to close all bug reports from releases that are no longer
maintained. At that time this bug will be closed as EOL if it remains open with a
Fedora 'version' of '32'.

Comment 19 Ben Cotton 2021-08-10 12:44:49 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 35 development cycle.
Changing version to 35.

Comment 20 Ben Cotton 2022-11-29 16:44:39 UTC
This message is a reminder that Fedora Linux 35 is nearing its end of life.
Fedora will stop maintaining and issuing updates for Fedora Linux 35 on 2022-12-13.
It is Fedora's policy to close all bug reports from releases that are no longer
maintained. At that time this bug will be closed as EOL if it remains open with a
'version' of '35'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, change the 'version' 
to a later Fedora Linux version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora Linux 35 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora Linux, you are encouraged to change the 'version' to a later version
prior to this bug being closed.

Comment 21 Ben Cotton 2022-12-13 15:11:46 UTC
Fedora Linux 35 entered end-of-life (EOL) status on 2022-12-13.

Fedora Linux 35 is no longer maintained, which means that it
will not receive any further security or bug fix updates. As a result we
are closing this bug.

If you can reproduce this bug against a currently maintained version of Fedora Linux
please feel free to reopen this bug against that version. Note that the version
field may be hidden. Click the "Show advanced fields" button if you do not see
the version field.

If you are unable to reopen this bug, please file a new report against an
active release.

Thank you for reporting this bug and we are sorry it could not be fixed.

