Bug 1599642 - Reported temperature of nvidia card with nouveau driver is wrong
Summary: Reported temperature of nvidia card with nouveau driver is wrong
Keywords:
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: Fedora
Classification: Fedora
Component: lm_sensors
Version: 29
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Ondřej Lysoněk
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2018-07-10 09:36 UTC by Jirka Novak
Modified: 2019-10-23 07:20 UTC
CC List: 4 users

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2019-10-23 07:20:55 UTC
Type: Bug
Embargoed:


Description Jirka Novak 2018-07-10 09:36:01 UTC
Description of problem:
When I use the sensors tool or any other tool that shows the temperature of system components, I see an obviously wrong temperature for my NVIDIA card, which is driven by the nouveau driver:

nouveau-pci-0100
Adapter: PCI adapter
temp1:       +511.0°C  (high = +95.0°C, hyst =  +3.0°C)
                       (crit = +105.0°C, hyst =  +5.0°C)
                       (emerg = +135.0°C, hyst =  +5.0°C)

My guess is that the temperature is multiplied by 10.

Version-Release number of selected component (if applicable):
Linux p3530 4.17.3-200.fc28.x86_64 #1 SMP Tue Jun 26 14:17:07 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
lm_sensors-3.4.0-13.fc28.x86_64
lm_sensors-libs-3.4.0-13.fc28.x86_64

How reproducible:
During system operation.

Steps to Reproduce:
1. Run the system

Actual results:
nouveau-pci-0100
Adapter: PCI adapter
temp1:       +511.0°C  (high = +95.0°C, hyst =  +3.0°C)
                       (crit = +105.0°C, hyst =  +5.0°C)
                       (emerg = +135.0°C, hyst =  +5.0°C)

Expected results:
Probably +51.0°C or +51.1°C in place of +511.0°C
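
For context, the Linux hwmon sysfs interface exposes temperatures as integers in millidegrees Celsius (for example in temp1_input), and tools such as sensors divide by 1000 for display. A minimal Python sketch of that arithmetic, applied to the reading above, shows which raw value would produce the displayed number and what the factor-of-10 guess would imply. The function names are illustrative only, and the factor of 10 is the reporter's hypothesis, not a confirmed cause.

```python
def millidegrees_to_celsius(raw: int) -> float:
    """Convert a hwmon temp*_input value (millidegrees Celsius) to degrees."""
    return raw / 1000.0


def raw_from_displayed(celsius: float) -> int:
    """Raw millidegree value that would be displayed as `celsius`."""
    return round(celsius * 1000)


displayed = 511.0                      # value shown by sensors in this report
raw = raw_from_displayed(displayed)    # what temp1_input would have to contain
print(f"implied raw temp1_input: {raw}")

# Hypothesis from the report (unconfirmed): the raw value is 10x too large.
corrected = millidegrees_to_celsius(raw // 10)
print(f"reading under the x10 hypothesis: {corrected:+.1f}°C")
```

For the reported +511.0°C this gives an implied raw value of 511000 and +51.1°C under the hypothesis. If temp1_input really contains 511000, sensors is displaying what the kernel reports, and the wrong value would originate below lm_sensors.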

Additional info:
My HW is:
Product Name:           Precision 3530
Vendor:                 Dell Inc.
BIOS Version:           1.2.5
System ID:              0x0820
Service Tag:            7Z694Q2

Comment 1 Ben Cotton 2019-05-02 21:20:24 UTC
This message is a reminder that Fedora 28 is nearing its end of life.
On 2019-May-28 Fedora will stop maintaining and issuing updates for
Fedora 28. It is Fedora's policy to close all bug reports from releases
that are no longer maintained. At that time this bug will be closed as
EOL if it remains open with a Fedora 'version' of '28'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 28 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 2 Ondřej Lysoněk 2019-05-03 10:11:48 UTC
Can you still reproduce it? If so, let's determine if this is a problem on the lm_sensors side or the kernel side. What is the output of the following, when the reported temperature is wrong?
grep . $(dirname $(grep -l nouveau /sys/class/hwmon/*/name))/temp*
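
The one-liner above works as written when exactly one hwmon device is named nouveau. As a rough equivalent that tolerates zero or several matches, the following Python sketch walks /sys/class/hwmon, picks the devices whose name file reads exactly "nouveau", and prints each temp* attribute both raw and converted from millidegrees Celsius. The function name and the hwmon_root parameter (present only so the code can be pointed at a test directory) are illustrative and not part of lm_sensors.

```python
from pathlib import Path


def read_nouveau_temps(hwmon_root="/sys/class/hwmon"):
    """Return {attribute path: raw integer} for every nouveau temp* file."""
    readings = {}
    for name_file in sorted(Path(hwmon_root).glob("*/name")):
        try:
            if name_file.read_text().strip() != "nouveau":
                continue
        except OSError:
            continue  # device vanished or is unreadable; skip it
        for attr in sorted(name_file.parent.glob("temp*")):
            try:
                readings[str(attr)] = int(attr.read_text().strip())
            except (OSError, ValueError):
                continue  # non-numeric or unreadable attribute
    return readings


if __name__ == "__main__":
    for path, raw in read_nouveau_temps().items():
        # hwmon temperatures are expressed in millidegrees Celsius
        print(f"{path}: raw={raw} -> {raw / 1000:+.1f}°C")
```

If temp1_input itself reads 511000 while the displayed temperature is wrong, the value is already wrong in what the kernel exposes, which would point away from lm_sensors.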

Comment 3 Jirka Novak 2019-10-23 05:22:29 UTC
Sorry, I missed your comment.
In the meantime I disabled nouveau entirely because it could not suspend/resume on the latest kernels in FC30. Before I did that (around the middle of this year), I checked the temperature, and it was wrong after resume on 5.1.11-300.fc30.x86_64.
I have now switched to the Intel GPU, so I can't validate it again.

Best regards,

Jirka Novak

Comment 4 Ondřej Lysoněk 2019-10-23 07:20:55 UTC
Ok. Let's close the bug for now. Note however that this is most likely a problem on the driver side.

