Created attachment 1839601 [details] kernel log of USB hang Created attachment 1839601 [details] kernel log of USB hang We are coming from libfprint issue https://gitlab.freedesktop.org/libfprint/libfprint/-/issues/426, that bug is long. TL/DR: * bluetooth device on bus usb5 * we are resuming from suspend (S3) * bluetooth device is removed (probably thinkpad_acpi rfkill switch, a bit odd, but perfectly valid to happen) * btusb firmware load times out (command timeout after 2s, firmware load timeout after 10s) * fprintd/libusb enumerates devices during that time, is stuck querying root hub information for usb5 until the btusb timeout happens Some of the log messages (skipping plenty of debug messages, please download the full log): [Mi Nov 3 11:55:12 2021] xhci_hcd 0000:07:00.4: enable port 4 USB2 hardware LPM ... [Mi Nov 3 11:55:12 2021] usb 5-4: udev 33, busnum 5, minor = 544 [Mi Nov 3 11:55:12 2021] usb 5-4: New USB device found, idVendor=8087, idProduct=0032, bcdDevice= 0.00 [Mi Nov 3 11:55:12 2021] usb 5-4: New USB device strings: Mfr=0, Product=0, SerialNumber=0 [Mi Nov 3 11:55:12 2021] usb 5-4: usb_probe_device [Mi Nov 3 11:55:12 2021] usb 5-4: configuration #1 chosen from 1 choice ... [Mi Nov 3 11:55:12 2021] usb 5-4: adding 5-4:1.0 (config #1, interface 0) [Mi Nov 3 11:55:12 2021] btusb 5-4:1.0: usb_probe_interface [Mi Nov 3 11:55:12 2021] btusb 5-4:1.0: usb_probe_interface - got id [Mi Nov 3 11:55:12 2021] usb 5-4: adding 5-4:1.1 (config #1, interface 1) [Mi Nov 3 11:55:12 2021] Generic FE-GE Realtek PHY r8169-0-200:00: attached PHY driver (mii_bus:phy_addr=r8169-0-200:00, irq=MAC) [Mi Nov 3 11:55:12 2021] xhci_hcd 0000:07:00.4: ep 0x81 - asked for 64 bytes, 32 bytes untransferred [Mi Nov 3 11:55:12 2021] Bluetooth: hci0: Device revision is 0 [Mi Nov 3 11:55:12 2021] Bluetooth: hci0: Secure boot is enabled [Mi Nov 3 11:55:12 2021] Bluetooth: hci0: OTP lock is enabled [Mi Nov 3 11:55:12 2021] Bluetooth: hci0: API lock is enabled [Mi Nov 3 11:55:12 2021] Bluetooth: hci0: Debug lock is disabled [Mi Nov 3 11:55:12 2021] Bluetooth: hci0: Minimum firmware build 1 week 10 2014 [Mi Nov 3 11:55:12 2021] Bluetooth: hci0: Bootloader timestamp 2019.40 buildtype 1 build 38 [Mi Nov 3 11:55:12 2021] psmouse serio1: synaptics: queried max coordinates: x [..5678], y [..4694] [Mi Nov 3 11:55:12 2021] Bluetooth: hci0: Found device firmware: intel/ibt-0041-0041.sfi [Mi Nov 3 11:55:12 2021] xhci_hcd 0000:07:00.4: ep 0x82 - asked for 1028 bytes, 1022 bytes untransferred ... [Mi Nov 3 11:55:12 2021] xhci_hcd 0000:07:00.4: Port change event, 5-4, id 4, portsc: 0x202a0 [Mi Nov 3 11:55:12 2021] xhci_hcd 0000:07:00.4: handle_port_status: starting port polling. [Mi Nov 3 11:55:12 2021] xhci_hcd 0000:07:00.4: Transfer error for slot 2 ep 3 on endpoint [Mi Nov 3 11:55:12 2021] xhci_hcd 0000:07:00.4: // Ding dong! [Mi Nov 3 11:55:12 2021] xhci_hcd 0000:07:00.4: Transfer error for slot 2 ep 4 on endpoint [Mi Nov 3 11:55:12 2021] xhci_hcd 0000:07:00.4: // Ding dong! ... [Mi Nov 3 11:55:12 2021] usb 5-4: USB disconnect, device number 33 [Mi Nov 3 11:55:12 2021] xhci_hcd 0000:07:00.4: Ignoring reset ep completion code of 1 ... [Mi Nov 3 11:55:12 2021] usb 5-4: unregistering device [Mi Nov 3 11:55:12 2021] usb 5-4: unregistering interface 5-4:1.0 [Mi Nov 3 11:55:12 2021] xhci_hcd 0000:07:00.4: Transfer error for slot 2 ep 3 on endpoint [Mi Nov 3 11:55:12 2021] xhci_hcd 0000:07:00.4: // Ding dong! [Mi Nov 3 11:55:12 2021] xhci_hcd 0000:07:00.4: Transfer error for slot 2 ep 4 on endpoint [Mi Nov 3 11:55:12 2021] xhci_hcd 0000:07:00.4: // Ding dong! [Mi Nov 3 11:55:12 2021] xhci_hcd 0000:07:00.4: Ignoring reset ep completion code of 1 [Mi Nov 3 11:55:12 2021] xhci_hcd 0000:07:00.4: Ignoring reset ep completion code of 1 ... [Mi Nov 3 11:55:14 2021] Bluetooth: hci0: command 0xfc09 tx timeout ... [Mi Nov 3 11:55:23 2021] Bluetooth: hci0: Failed to send firmware data (-110) [Mi Nov 3 11:55:23 2021] Bluetooth: hci0: sending frame failed (-19) [Mi Nov 3 11:55:23 2021] Bluetooth: hci0: Intel reset sent to retry FW download [Mi Nov 3 11:55:23 2021] usb 5-4: unregistering interface 5-4:1.1 [Mi Nov 3 11:55:23 2021] xhci_hcd 0000:07:00.4: disable port 4 USB2 hardware LPM [Mi Nov 3 11:55:23 2021] xhci_hcd 0000:07:00.4: Set up evaluate context for LPM MEL change. [Mi Nov 3 11:55:23 2021] xhci_hcd 0000:07:00.4: // Ding dong! [Mi Nov 3 11:55:23 2021] xhci_hcd 0000:07:00.4: Successful evaluate context command [Mi Nov 3 11:55:23 2021] usb 5-4: usb_disable_device nuking all URBs So, what we can see is that: * At :12 the USB device disconnects: "usb 5-4: USB disconnect, device number 33" * At :14 there is a tx timeout on hci0 * At :23 the firmware load fails * And only then usb_disable_device appears to finish * At around this time, libusb/fprintd gets the usb5 HUB information and continues its enumeration
Hi Benjamin, I did not dive into the details here, but this sounds like something which should be discussed on some upstream kernel list, rather then in Red Hat's bugzilla ? Regards, Hans
Good point, I send a mail to the USB and bluetooth mailing lists. Lets hope someone has a fix for it.
Hrm, seems like the btusb error handling solely relies on timeouts. Which wouldn't be too bad, if it was not for the fact that btusb holds a lock on the USB device while sending the firmware. i.e. both hci_send_frame and btusb_tx_complete just swallow the error rather than returning/forwarding it for further handling. And even if hci_send_frame was returning it, none of the callers have any error handling.
If you want, you can try https://koji.fedoraproject.org/koji/taskinfo?taskID=78581875 (once build). It contains the patches I submitted at https://patchwork.kernel.org/project/bluetooth/list/?series=577565 (well, almost identical, but should be close enough in practice).
Perfect, thanks a lot! I will hopefully be able try it out tomorrow. Btw, in the meantime, I also found that uninstalling tlp also makes the problem go away. I did not look into the details, but that probably triggers the thinkpad_acpi rfkill switch.
I was able to test the kernel this morning and can confirm that it fixes the issue. Thanks a lot Benjamin!
*** Bug 2028872 has been marked as a duplicate of this bug. ***
For completeness sake, https://patchwork.kernel.org/project/bluetooth/list/?series=589973&state=* should fix the issue and has been merged AFAIK. This newer series only applies to bt-next.
*** Bug 2048819 has been marked as a duplicate of this bug. ***
Hi, I ran into a similar problem with Fedora 37 and a Lenovo ThinkPad T14s Gen 1, model 20UJS00K00. After resuming the gnome-shell hangs because of fprintd and I was able to track this down to usb device enumeration hanging on reading the "descriptors" sysfs file. Calls to lsusb also hang on the same file. I'm using the latest kernel: > Linux jbu-laptop 5.19.16-301.fc37.x86_64 #1 SMP PREEMPT_DYNAMIC Fri Oct 21 15:55:37 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux Here is the backtrace from the hanging usb read: > (gdb) bt > #0 0x00007ffff7e77fac in read () from /lib64/libc.so.6 > #1 0x00007ffff7f90885 in read (__nbytes=256, __buf=<optimized out>, __fd=8) at /usr/include/bits/unistd.h:38 > #2 initialize_device (dev=dev@entry=0x5555555a87c0, busnum=busnum@entry=4 '\004', devaddr=devaddr@entry=1 '\001', sysfs_dir=sysfs_dir@entry=0x5555555a87a0 "usb4", wrapped_fd=wrapped_fd@entry=-1) at os/linux_usbfs.c:967 > #3 0x00007ffff7f92989 in linux_enumerate_device (ctx=0x55555559f610, busnum=<optimized out>, devaddr=<optimized out>, sysfs_dir=0x5555555a87a0 "usb4") at os/linux_usbfs.c:1117 > #4 0x00007ffff7f93115 in linux_udev_scan_devices (ctx=0x55555559f610) at os/linux_udev.c:299 > #5 linux_scan_devices (ctx=0x55555559f610) at os/linux_usbfs.c:458 > #6 op_init (ctx=ctx@entry=0x55555559f610) at os/linux_usbfs.c:410 > #7 0x00007ffff7f93e79 in libusb_init (ctx=ctx@entry=0x7fffffffe138) at /usr/src/debug/libusb1-1.0.25-9.fc37.x86_64/libusb/core.c:2353 > #8 0x0000555555565e23 in main (argc=<optimized out>, argv=0x7fffffffe398) at /usr/src/debug/usbutils-014-3.fc37.x86_64/lsusb.c:3895 USB4 seems to be the integrated card reader: > [root@jbu-laptop ~]# lsusb > Bus 007 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub > Bus 006 Device 003: ID 8087:0029 Intel Corp. AX200 Bluetooth > Bus 006 Device 002: ID 06cb:00bd Synaptics, Inc. Prometheus MIS Touch Fingerprint Reader > Bus 006 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub > Bus 005 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub > Bus 004 Device 002: ID 058f:9540 Alcor Micro Corp. AU9540 Smartcard Reader > Bus 004 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub > Bus 003 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub > Bus 002 Device 002: ID 13d3:5405 IMC Networks Integrated Camera > Bus 002 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub > Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub The repoduction is not that easy and I'm not able to reliably reproduce the issue mostly it works if I do the following: 1. Connect to docking station with Monitors, Headset and webcam attached 2. Suspend while connected to the docking station 3. Disconnect from the docking station 4. Resume while disconnected from the docking station Please let me know if you need any further logs I can gather them if the next hang occures.
This message is a reminder that Fedora Linux 35 is nearing its end of life. Fedora will stop maintaining and issuing updates for Fedora Linux 35 on 2022-12-13. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as EOL if it remains open with a 'version' of '35'. Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, change the 'version' to a later Fedora Linux version. Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora Linux 35 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora Linux, you are encouraged to change the 'version' to a later version prior to this bug being closed.
Fedora Linux 35 entered end-of-life (EOL) status on 2022-12-13. Fedora Linux 35 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora Linux please feel free to reopen this bug against that version. Note that the version field may be hidden. Click the "Show advanced fields" button if you do not see the version field. If you are unable to reopen this bug, please file a new report against an active release. Thank you for reporting this bug and we are sorry it could not be fixed.