Bug 1873694 - 1-wire ds2490 and w1_therm stopped working properly
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 33
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-08-29 08:11 UTC by Paweł
Modified: 2021-05-20 13:32 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-05-20 13:32:59 UTC
Type: Bug
Embargoed:



Description Paweł 2020-08-29 08:11:52 UTC
1. Please describe the problem:
After updating the kernel to version 5.8.4-200.fc32.x86_64, the 1-wire subsystem stopped working properly. I cannot read the temperature from the ds1820 sensors, and only one sensor is detected.


2. What is the Version-Release number of the kernel:
5.8.4-200.fc32.x86_64


3. Did it work previously in Fedora?
Yes! The last working kernel is 5.7.17-200.fc32.x86_64


4. Can you reproduce this issue? If so, please provide the steps to reproduce
   the issue below:

The working kernel:

ls /sys/devices/w1_bus_master1/:
w1_bus_master1/
├── 28-0000025850bf
│   ├── driver -> ../../../bus/w1/drivers/w1_slave_driver
│   ├── hwmon
│   │   └── hwmon1
│   │       ├── device -> ../../../28-0000025850bf
│   │       ├── name
│   │       ├── power
│   │       │   ├── autosuspend_delay_ms
│   │       │   ├── control
│   │       │   ├── runtime_active_time
│   │       │   ├── runtime_status
│   │       │   └── runtime_suspended_time
│   │       ├── subsystem -> ../../../../../class/hwmon
│   │       ├── temp1_input
│   │       └── uevent
│   ├── id
│   ├── name
│   ├── power
│   │   ├── autosuspend_delay_ms
│   │   ├── control
│   │   ├── runtime_active_time
│   │   ├── runtime_status
│   │   └── runtime_suspended_time
│   ├── subsystem -> ../../../bus/w1
│   ├── uevent
│   └── w1_slave
├── 28-0000025893a3
│   ├── driver -> ../../../bus/w1/drivers/w1_slave_driver
│   ├── hwmon
│   │   └── hwmon0
│   │       ├── device -> ../../../28-0000025893a3
│   │       ├── name
│   │       ├── power
│   │       │   ├── autosuspend_delay_ms
│   │       │   ├── control
│   │       │   ├── runtime_active_time
│   │       │   ├── runtime_status
│   │       │   └── runtime_suspended_time
│   │       ├── subsystem -> ../../../../../class/hwmon
│   │       ├── temp1_input
│   │       └── uevent
│   ├── id
│   ├── name
│   ├── power
│   │   ├── autosuspend_delay_ms
│   │   ├── control
│   │   ├── runtime_active_time
│   │   ├── runtime_status
│   │   └── runtime_suspended_time
│   ├── subsystem -> ../../../bus/w1
│   ├── uevent
│   └── w1_slave
├── driver -> ../../bus/w1/drivers/w1_master_driver
├── power
│   ├── autosuspend_delay_ms
│   ├── control
│   ├── runtime_active_time
│   ├── runtime_status
│   └── runtime_suspended_time
├── subsystem -> ../../bus/w1
├── uevent
├── w1_master_add
├── w1_master_attempts
├── w1_master_max_slave_count
├── w1_master_name
├── w1_master_pointer
├── w1_master_pullup
├── w1_master_remove
├── w1_master_search
├── w1_master_slave_count
├── w1_master_slaves
├── w1_master_timeout
└── w1_master_timeout_us

dmesg:
[   57.899178] Driver for 1-wire Dallas network protocol.
[   57.902832] usbcore: registered new interface driver DS9490R
[   57.917470] w1_master_driver w1_bus_master1: Attaching one wire slave 28.0000025893a3 crc 69
[   57.931462] w1_master_driver w1_bus_master1: Attaching one wire slave 28.0000025850bf crc 23

debugfs:
drivers/w1/w1.c:1028 [wire]w1_search =_ "Abort w1_search\012"
drivers/w1/w1.c:983 [wire]w1_search =_ "No devices present on the wire.\012"
drivers/w1/w1.c:908 [wire]w1_reconnect_slaves =_ "Reconnecting slaves in device %s has been finished.\012"
drivers/w1/w1.c:883 [wire]w1_reconnect_slaves =_ "Reconnecting slaves in device %s for family %02x.\012"
drivers/w1/w1.c:793 [wire]w1_unref_slave =_ "%s: detaching %s [%p].\012"
drivers/w1/w1.c:692 [wire]__w1_attach_slave_device =_ "%s: registering %s as %p.\012"
drivers/w1/w1.c:598 [wire]w1_uevent =_ "Hotplug event for %s %s, bus_id=%s.\012"
drivers/w1/w1.c:594 [wire]w1_uevent =_ "Unknown event.\012"
drivers/w1/w1.c:82 [wire]w1_slave_release =_ "%s: Releasing %s [%p]\012"
drivers/w1/w1.c:73 [wire]w1_master_release =_ "%s: Releasing %s.\012"
drivers/w1/w1_netlink.c:390 [wire]w1_process_command_slave =_ "%s: %02x.%012llx.%02x: cmd=%02x, len=%u.\012"

###############################

The non-working kernel:

ls /sys/devices/w1_bus_master1/:
w1_bus_master1/
├── 28-0000025893a3
│   ├── driver -> ../../../bus/w1/drivers/w1_slave_driver
│   ├── id
│   ├── name
│   ├── power
│   │   ├── autosuspend_delay_ms
│   │   ├── control
│   │   ├── runtime_active_time
│   │   ├── runtime_status
│   │   └── runtime_suspended_time
│   ├── subsystem -> ../../../bus/w1
│   └── uevent
├── driver -> ../../bus/w1/drivers/w1_master_driver
├── power
│   ├── autosuspend_delay_ms
│   ├── control
│   ├── runtime_active_time
│   ├── runtime_status
│   └── runtime_suspended_time
├── subsystem -> ../../bus/w1
├── therm_bulk_read
├── uevent
├── w1_master_add
├── w1_master_attempts
├── w1_master_max_slave_count
├── w1_master_name
├── w1_master_pointer
├── w1_master_pullup
├── w1_master_remove
├── w1_master_search
├── w1_master_slave_count
├── w1_master_slaves
├── w1_master_timeout
└── w1_master_timeout_us

dmesg:
[  111.986322] Driver for 1-wire Dallas network protocol.
[  111.999334] usbcore: registered new interface driver DS9490R
[  112.013690] w1_master_driver w1_bus_master1: Attaching one wire slave 28.0000025893a3 crc 69

debugfs:
drivers/w1/w1.c:1028 [wire]w1_search =_ "Abort w1_search\012"
drivers/w1/w1.c:983 [wire]w1_search =_ "No devices present on the wire.\012"
drivers/w1/w1.c:908 [wire]w1_reconnect_slaves =_ "Reconnecting slaves in device %s has been finished.\012"
drivers/w1/w1.c:883 [wire]w1_reconnect_slaves =_ "Reconnecting slaves in device %s for family %02x.\012"
drivers/w1/w1.c:793 [wire]w1_unref_slave =_ "%s: detaching %s [%p].\012"
drivers/w1/w1.c:692 [wire]__w1_attach_slave_device =_ "%s: registering %s as %p.\012"
drivers/w1/w1.c:598 [wire]w1_uevent =_ "Hotplug event for %s %s, bus_id=%s.\012"
drivers/w1/w1.c:594 [wire]w1_uevent =_ "Unknown event.\012"
drivers/w1/w1.c:82 [wire]w1_slave_release =_ "%s: Releasing %s [%p]\012"
drivers/w1/w1.c:73 [wire]w1_master_release =_ "%s: Releasing %s.\012"
drivers/w1/w1_netlink.c:390 [wire]w1_process_command_slave =_ "%s: %02x.%012llx.%02x: cmd=%02x, len=%u.\012"
drivers/w1/slaves/w1_therm.c:1431 [w1_therm]resolution_show =_ "%s: Resolution may be corrupted. err=%d\012"
drivers/w1/slaves/w1_therm.c:1410 [w1_therm]ext_power_show =_ "%s: Power_mode may be corrupted. err=%d\012"
drivers/w1/slaves/w1_therm.c:1386 [w1_therm]temperature_show =_ "%s: Temperature data may be corrupted. err=%d\012"
drivers/w1/slaves/w1_therm.c:1372 [w1_therm]temperature_show =_ "%s: Conversion in progress, retry later\012"
drivers/w1/slaves/w1_therm.c:1292 [w1_therm]w1_slave_show =_ "%s: Temperature data may be corrupted. err=%d\012"
drivers/w1/slaves/w1_therm.c:1278 [w1_therm]w1_slave_show =_ "%s: Conversion in progress, retry later\012"
drivers/w1/slaves/w1_therm.c:585 [w1_therm]w1_DS18S20_convert_temp =_ "%s: Invalid argument for conversion\012"

In the new kernel I cannot read (cat) these files:
/sys/devices/w1_bus_master1/w1_master_slaves
/sys/devices/w1_bus_master1/w1_master_slave_count
The read just hangs: Ctrl-C does not work and the process cannot be killed.
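
Because a plain cat on these files can block the terminal, a probe that does the read in a child process and gives up after a timeout may be a safer way to check whether a given kernel is affected. The following is only a minimal userspace sketch: the function name probe_read, the poll interval, and the return codes are assumptions made for illustration and are not part of any w1 tooling.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical poll interval for this sketch. */
#define PROBE_POLL_MS 10

/*
 * Read `path` once in a child process so that a blocked read cannot hang
 * the caller. Returns 0 if the read completed, 1 if it was still blocked
 * after `timeout_ms`, and -1 on any error (fork, open, or read failure).
 */
static int probe_read(const char *path, int timeout_ms)
{
    assert(path != NULL);

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: do the potentially blocking read and report via exit code. */
        char buf[256];
        int fd = open(path, O_RDONLY);
        if (fd < 0)
            _exit(2);
        _exit(read(fd, buf, sizeof buf) < 0 ? 2 : 0);
    }

    /* Parent: poll for the child's exit instead of blocking in waitpid(). */
    for (int waited = 0; waited <= timeout_ms; waited += PROBE_POLL_MS) {
        int status;
        pid_t r = waitpid(pid, &status, WNOHANG);
        if (r == pid)
            return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
        if (r < 0)
            return -1;
        struct timespec ts = { .tv_sec = 0, .tv_nsec = PROBE_POLL_MS * 1000000L };
        nanosleep(&ts, NULL);
    }

    /*
     * Still blocked. SIGKILL is sent, but a child stuck in uninterruptible
     * sleep will not act on it until the kernel wakes it, so the parent
     * deliberately does not wait for the child here.
     */
    kill(pid, SIGKILL);
    return 1;
}
```

A call such as probe_read("/sys/devices/w1_bus_master1/w1_master_slave_count", 2000) would then return 1 on an affected kernel instead of hanging the shell, assuming the hang behaves as described above.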

All of these problems are probably due to the recent changes to w1_therm:
https://kernelnewbies.org/Linux_5.8#A1-Wire_.28W1.29

Comment 1 Paweł 2020-08-29 13:35:14 UTC
I've checked kernel-5.8.5-200.fc32.x86_64 and nothing has changed.

Comment 2 Paweł 2020-09-05 06:41:55 UTC
kernel-5.8.6-201.fc32.x86_64 didn't change anything.

Comment 3 Paweł 2020-09-08 17:26:43 UTC
kernel-5.8.7-200.fc32.x86_64 didn't change anything.

Comment 4 Paweł 2020-09-11 15:33:03 UTC
kernel-5.8.8-200.fc32.x86_64 no change.

Comment 5 Paweł 2020-09-13 10:00:54 UTC
I reverted all changes to the file w1_therm.c, compiled the new kernel, and everything works fine.
So the problem is definitely caused by these changes: https://kernelnewbies.org/Linux_5.8#A1-Wire_.28W1.29

Comment 6 Paweł 2020-09-19 11:08:34 UTC
I've done a little debugging and found this:

[  126.273347] w1_master_driver w1_bus_master1: Attaching one wire slave 28.0000025893a3 crc 69
[  126.280946] DEBUG: Passed w1_therm_init 1816
[  126.281357] DEBUG: Passed w1_therm_add_slave 798
[  126.281359] DEBUG: Passed device_family 670
[  126.281360] DEBUG: Passed bulk_read_support 720
[  126.281361] DEBUG: Passed read_powermode 1161
[  126.281362] DEBUG: Passed bus_mutex_lock 694

It gets stuck in bus_mutex_lock(), which is called from read_powermode() in w1_therm.c:

... 
static inline bool bus_mutex_lock(struct mutex *lock)
{
    int max_trying = W1_THERM_MAX_TRY;

    /* try to acquire the mutex, if not, sleep retry_delay before retry) */
    while (mutex_lock_interruptible(lock) != 0 && max_trying > 0) {      <--- gets stuck here
        unsigned long sleep_rem;
...

It gets stuck because the mutex is already locked, so we have a deadlock.
I think there is some strange interaction between ds2490.c and the new w1_therm.c.
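
One detail of the excerpt above is worth spelling out: mutex_lock_interruptible() sleeps until the mutex becomes free and returns non-zero only when a signal interrupts the wait, so max_trying does not limit how long the call blocks when the holder never releases the lock. The userspace sketch below shows what a genuinely bounded attempt looks like, using pthread_mutex_trylock() as a stand-in. It is an illustration only: the sketch_* names and both constants are hypothetical, it is not kernel code, and it is not the fix that was later applied.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical values for this sketch; not the kernel's W1_THERM_* constants. */
#define SKETCH_MAX_TRY        5
#define SKETCH_RETRY_DELAY_MS 20

/* Sleep for the given number of milliseconds. */
static void sketch_sleep_ms(long ms)
{
    struct timespec ts = { .tv_sec = ms / 1000, .tv_nsec = (ms % 1000) * 1000000L };
    nanosleep(&ts, NULL);
}

/*
 * Try to take the lock without ever blocking indefinitely.
 * Returns true once the lock is acquired, or false after
 * SKETCH_MAX_TRY failed attempts.
 */
static bool sketch_bus_lock(pthread_mutex_t *lock)
{
    assert(lock != NULL);

    for (int attempt = 0; attempt < SKETCH_MAX_TRY; attempt++) {
        /* trylock returns immediately, even if the mutex is already held. */
        if (pthread_mutex_trylock(lock) == 0)
            return true;
        sketch_sleep_ms(SKETCH_RETRY_DELAY_MS);
    }
    return false;
}
```

With this shape, a caller that already holds the mutex gets false back after roughly SKETCH_MAX_TRY * SKETCH_RETRY_DELAY_MS milliseconds instead of hanging, which makes the failure visible rather than silent.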

Comment 7 Paweł 2020-12-28 19:12:06 UTC
We have a breakthrough.
Hans-Frieder Vogt has solved the problem. https://lkml.org/lkml/2020/12/28/2289
I've tested that patch and everything works!

Comment 8 Chris Murphy 2021-01-12 20:49:34 UTC
I suggest following up with upstream. I don't see that fix in 5.11-rc3. It is worth asking whether it can go into 5.11 and be submitted to stable as a fix.

Comment 10 Paweł 2021-05-20 13:32:59 UTC
Fixed in kernel 5.12.x

