Bug 2175534 - Kernel 6.2.2-300 does not provide USB devices on Renesas USB3 card
Summary: Kernel 6.2.2-300 does not provide USB devices on Renesas USB3 card
Keywords:
Status: CLOSED EOL
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 37
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2023-03-05 16:38 UTC by Brian Morrison
Modified: 2024-01-12 22:58 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2024-01-12 22:58:50 UTC
Type: Bug
Embargoed:


Attachments
dmesg from kernel-6.2.2 (90.99 KB, text/plain)
2023-03-05 16:38 UTC, Brian Morrison
dmesg from kernel-6.1.15-200 (90.33 KB, text/plain)
2023-03-05 16:39 UTC, Brian Morrison
Output from sudo lsusb for kernel-6.2.2 (117.72 KB, text/plain)
2023-03-05 16:41 UTC, Brian Morrison
Output from sudo lsusb for kernel-6.1.15 (212.80 KB, text/plain)
2023-03-05 16:44 UTC, Brian Morrison
Output from lsmod for kernel-6.2.2 (7.86 KB, text/plain)
2023-03-05 16:45 UTC, Brian Morrison
Output from lsmod for kernel-6.1.15 (7.92 KB, text/plain)
2023-03-05 16:46 UTC, Brian Morrison
dmesg from kernel-6.2.2 with usbcore.dyndbg=+p xhci_hcd.dyndbg=+p added to command line (265.60 KB, text/plain)
2023-03-08 16:25 UTC, Brian Morrison
dmesg from kernel-6.2.5 with usbcore.dyndbg=+p xhci_hcd.dyndbg=+p added to command line (265.62 KB, text/plain)
2023-03-11 23:56 UTC, Brian Morrison

Description Brian Morrison 2023-03-05 16:38:45 UTC
Created attachment 1948179 [details]
dmesg from kernel-6.2.2

1. Please describe the problem:

Using the F37 6.2 test kernel, none of the USB devices connected to my Renesas-based USB3 card are visible to the OS.

2. What is the Version-Release number of the kernel:

kernel-6.2.2-300.fc37.x86_64

3. Did it work previously in Fedora? If so, what kernel version did the issue
   *first* appear?  Old kernels are available for download at
   https://koji.fedoraproject.org/koji/packageinfo?packageID=8 :

This is the first 6.x kernel showing the problem; all kernels up to and including kernel-6.1.15-200.fc37.x86_64 work correctly.

4. Can you reproduce this issue? If so, please provide the steps to reproduce
   the issue below:

Install kernel-6.2.2-300.fc37.x86_64; this requires the new kernel-modules-core package in addition to the usual kernel packages.

Boot system.

Open a terminal and run sudo lsusb. Several of the usual devices are no longer found, compared with the previous kernels tested.

5. Does this problem occur with the latest Rawhide kernel? To install the
   Rawhide kernel, run ``sudo dnf install fedora-repos-rawhide`` followed by
   ``sudo dnf update --enablerepo=rawhide kernel``:


6. Are you running any modules that are not shipped directly with Fedora's kernel?:

No

7. Please attach the kernel logs. You can get the complete kernel log
   for a boot with ``journalctl --no-hostname -k > dmesg.txt``. If the
   issue occurred on a previous boot, use the journalctl ``-b`` flag.

Various files attached to the bug showing differences between 6.1.x and 6.2.2 kernels

Comment 1 Brian Morrison 2023-03-05 16:39:45 UTC
Created attachment 1948180 [details]
dmesg from kernel-6.1.15-200

Comment 2 Brian Morrison 2023-03-05 16:41:08 UTC
Created attachment 1948181 [details]
Output from sudo lsusb for kernel-6.2.2

Comment 3 Brian Morrison 2023-03-05 16:44:26 UTC
Created attachment 1948182 [details]
Output from sudo lsusb for kernel-6.1.15

Note that these devices are found:

Silicon Labs CP210x UART Bridge     (x3)

Texas Instruments PCM2903C Audio CODEC

Texas Instruments PCM2901 Audio Codec

but are missing with kernel-6.2.2
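
A comparison like this can be scripted rather than read off by eye. The following is a minimal sketch, assuming both attachments contain standard lsusb output (plain or -v) in which each device starts with a "Bus NNN Device NNN: ID vvvv:pppp description" line. The names devices and missing_devices are made up for this illustration.

```python
import re
from collections import Counter

# Header line that lsusb prints for each device, in the form
# "Bus NNN Device NNN: ID vvvv:pppp description".
LSUSB_LINE = re.compile(
    r"^Bus \d{3} Device \d{3}: ID ([0-9a-f]{4}:[0-9a-f]{4})\s*(.*)$"
)


def devices(lsusb_text):
    """Count devices by (vendor:product ID, description), ignoring bus and device numbers."""
    found = Counter()
    for line in lsusb_text.splitlines():
        match = LSUSB_LINE.match(line.strip())
        if match:
            found[(match.group(1), match.group(2).strip())] += 1
    return found


def missing_devices(working_text, broken_text):
    """Devices seen under the working kernel that are absent, or fewer, under the broken one."""
    # Counter subtraction keeps only positive counts, so devices that are
    # unchanged or newly present drop out of the result.
    return devices(working_text) - devices(broken_text)
```

Passing the text of the kernel-6.1.15 attachment as working_text and the kernel-6.2.2 attachment as broken_text should return the devices listed above with their counts, provided the files have the assumed format. This has not been run against the actual attachments.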

Comment 4 Brian Morrison 2023-03-05 16:45:29 UTC
Created attachment 1948183 [details]
Output from lsmod for kernel-6.2.2

Comment 5 Brian Morrison 2023-03-05 16:46:08 UTC
Created attachment 1948184 [details]
Output from lsmod for kernel-6.1.15

Comment 6 Brian Morrison 2023-03-05 16:49:04 UTC
It appears that this could be caused by udev rules failing to identify hardware correctly.

I tried updating to the latest systemd packages (251.13-4) from updates-testing but this made no difference to this problem.

All other packages are as updated as I can make them at today's date.

Comment 7 Brian Morrison 2023-03-06 16:26:09 UTC
I have also checked that usbguard is not blocking access to these USB devices.

While the output of usbguard list-rules is the same size for both kernels, usbguard list-devices omits the devices that don't work under kernel-6.2.2, so there is about 1800 bytes less output:

[bdm@deangelis ~]$ ll usbguard_*
-rw-r--r--. 1 bdm bdm 5236 Mar  6 16:13 usbguard_devices_6.1.15
-rw-r--r--. 1 bdm bdm 3433 Mar  6 16:09 usbguard_devices_6.2.2
-rw-r--r--. 1 bdm bdm 5415 Mar  6 16:13 usbguard_rules_6.1.15
-rw-r--r--. 1 bdm bdm 5415 Mar  6 16:10 usbguard_rules_6.2.2

Comment 8 Hans de Goede 2023-03-07 09:49:42 UTC
I just checked the x86_64 config for 6.2.2-400. and it has:

CONFIG_USB_XHCI_PCI_RENESAS=y

set again (this was set and then unset before). Setting this option adds support for loading Renesas firmware into the XHCI controller at boot, for devices where the XHCI controller does not have a ROM to load the firmware from.

The problem with this option is that it breaks controllers where the firmware is actually present in some form of ROM and on x86 the firmware pretty much always is present in some form of ROM.

So we need to disable this option again on x86_64 to fix this. Until this is done you can stick with 6.1.x kernels as a workaround.
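
One way to check how a given kernel build sets this option is to read its build configuration, which Fedora typically installs as /boot/config-<kernel release>. The following is a minimal sketch, assuming the usual kconfig text format ("OPTION=value" or "# OPTION is not set"). config_value is a hypothetical helper, not part of any existing tool.

```python
def config_value(config_text, option):
    """Return the value of a kconfig option ('y', 'm', ...), or None if unset or absent."""
    for line in config_text.splitlines():
        line = line.strip()
        if line == f"# {option} is not set":
            return None
        # Match "OPTION=" exactly so that CONFIG_USB_XHCI_PCI does not
        # accidentally match CONFIG_USB_XHCI_PCI_RENESAS.
        if line.startswith(option + "="):
            return line.split("=", 1)[1]
    return None
```

For example, reading the config file of each installed kernel and calling config_value(text, "CONFIG_USB_XHCI_PCI_RENESAS") would show whether the setting differs between the working and non-working builds.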

Comment 9 Hans de Goede 2023-03-07 10:06:10 UTC
(In reply to Hans de Goede from comment #8)
> I just checked the x86_64 config for 6.2.2-400. and it has:
> 
> CONFIG_USB_XHCI_PCI_RENESAS=y
> 
> set again (this was set and then unset before). Setting this option adds
> support for loading Renesas firmware into the XHCI controller at boot for
> devices where the XHCI controller does not have a ROM to load the firmware
> from.
> 
> The problem with this option is that it breaks controllers where the
> firmware is actually present in some form of ROM and on x86 the firmware
> pretty much always is present in some form of ROM.
> 
> So we need to disable this option again on x86_64 to fix this. Until this is
> done you can stick with 6.1.x kernels as a workaround.

Correction: it seems that setting CONFIG_USB_XHCI_PCI_RENESAS=y does work fine nowadays, and this has been enabled for a while now. Quoting from your 6.1 dmesg:

[    1.076316] xhci_hcd 0000:04:00.0: failed to load firmware renesas_usb_fw.mem, fallback to ROM

And likewise your 6.2 dmesg has:

[    1.094512] xhci_hcd 0000:04:00.0: failed to load firmware renesas_usb_fw.mem, fallback to ROM

Looking at your lsusb output I also see that some devices on the USB-2 root of the Renesas XHCI are still being found, but fewer devices than before.

So this seems more of a USB enumeration problem than a problem with the XHCI driver, sorry for the noise.
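
The check behind this correction, that both kernels log the same firmware fallback message, can be repeated on any pair of dmesg files. The following is a minimal sketch, assuming dmesg-style lines with a leading "[ seconds]" timestamp as in the quotes above. firmware_messages is a hypothetical helper name.

```python
import re

# Leading dmesg timestamp such as "[    1.076316] ".
TIMESTAMP = re.compile(r"^\[\s*\d+\.\d+\]\s*")


def firmware_messages(dmesg_text, firmware="renesas_usb_fw.mem"):
    """Return kernel log lines that mention the firmware file, with timestamps removed."""
    return [
        TIMESTAMP.sub("", line)
        for line in dmesg_text.splitlines()
        if firmware in line
    ]
```

If firmware_messages() returns the same list for the working and the non-working kernel, the firmware loading path behaved identically in both boots, which is the reasoning used above.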

I think it is best if you report this directly upstream by sending an email to: Mathias Nyman <mathias.nyman> with linux-usb.org in the Cc. You can point to this bug in the email with a note that it has dmesg and lsusb -v output for both working and non-working kernels. In the email itself I would include only the lsusb -t output from the two kernels, to clearly demonstrate which devices have gone away from the USB tree beneath the Renesas 2.0 root port.

Comment 10 Brian Morrison 2023-03-07 11:33:50 UTC
Thanks Hans, I recall the original problem with ROM-based controllers and agree that the dmesg differences don't indicate that this problem is back.

I will follow your advice and notify Mathias. I will happily confess that I don't know how to debug this further myself, but a brief look at the kernel 6.2 changelog brings up thousands of references to 'usb', so clearly a lot has changed.

Comment 11 Brian Morrison 2023-03-08 16:25:17 UTC
Created attachment 1949112 [details]
dmesg from kernel-6.2.2 with usbcore.dyndbg=+p xhci_hcd.dyndbg=+p added to command line

Extra debugging added to try and identify why the /dev/ttyUSB* devices are disconnected.

Seeing this:

[bdm@deangelis ~]$ grep usb_disable_device dmesg_6.2.2_debug.txt
[   18.349015] usb 2-1.1: usb_disable_device nuking all URBs
[   18.587034] usb 2-1.4: usb_disable_device nuking all URBs
[   18.589675] usb 2-1: usb_disable_device nuking non-ep0 URBs
[   19.280599] usb 2-2: usb_disable_device nuking non-ep0 URBs
[   19.288312] usb 2-4.1: usb_disable_device nuking all URBs
[   19.298113] usb 2-4.2: usb_disable_device nuking all URBs
[   19.386494] usb 2-4.4: usb_disable_device nuking all URBs
[   19.390100] usb 2-4: usb_disable_device nuking non-ep0 URBs

which seems to be the start of each disconnection.
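
To see which ports these teardown messages belong to, the lines can be grouped by root port. This is a sketch only, assuming dmesg-style "[ seconds] usb BUS-PORT[.PORT...]: ..." lines as in the grep output above, where the first number after the dash is the root hub port. disconnects_by_root_port is a hypothetical helper name.

```python
import re
from collections import defaultdict

# Matches lines such as
# "[   18.349015] usb 2-1.1: usb_disable_device nuking all URBs"
DISABLE_LINE = re.compile(
    r"^\[\s*(\d+\.\d+)\] usb (\d+)-([\d.]+): usb_disable_device nuking (?:all|non-ep0) URBs"
)


def disconnects_by_root_port(dmesg_text):
    """Map 'bus-rootport' to the (timestamp, device path) teardown events beneath it."""
    grouped = defaultdict(list)
    for line in dmesg_text.splitlines():
        match = DISABLE_LINE.match(line)
        if match:
            timestamp = float(match.group(1))
            bus, ports = match.group(2), match.group(3)
            root_port = ports.split(".")[0]
            grouped[f"{bus}-{root_port}"].append((timestamp, f"{bus}-{ports}"))
    return dict(grouped)
```

Applied to the eight lines above, this groups the events under 2-1, 2-2 and 2-4, with each downstream device listed before the hub or device at the root port itself.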

Comment 12 Brian Morrison 2023-03-11 23:56:57 UTC
Created attachment 1949926 [details]
dmesg from kernel-6.2.5 with usbcore.dyndbg=+p xhci_hcd.dyndbg=+p added to command line

Updated F37 to kernels 6.1.18-200 and 6.2.5-300; the former works normally, as did previous 6.1 kernels.

With kernel 6.2.5, a USB3/USB2 4-port hub (Realtek chipset) connected via a USB root hub on the motherboard is enumerated, and a USB headset connected to that hub is also enumerated and appears in the lsusb listing.

This suggests that the problem is related to the xhci_hcd driver for the Renesas PCI USB card.

Comment 13 Brian Morrison 2023-03-14 14:43:56 UTC
The simple fix for this is to revert the following kernel commit:

4c2604a9a689 usb: xhci-pci: Set PROBE_PREFER_ASYNCHRONOUS

as the asynchronous probe causes the USB buses to enumerate in a different order and this breaks the function of the USB2 host in the Renesas PCI USB3 card.
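
Whether the bus numbering really differs between two boots can be checked from their dmesg files. The sketch below assumes that USB core logs a line of the form "<driver> <controller>: new USB bus registered, assigned bus number N" for each host controller; that wording is believed to be the usual message but is not quoted anywhere in this bug. bus_assignments is a hypothetical helper name.

```python
import re

# Assumed message format, for example:
# "xhci_hcd <PCI address>: new USB bus registered, assigned bus number 2"
BUS_LINE = re.compile(
    r"(\S+) (\S+): new USB bus registered, assigned bus number (\d+)"
)


def bus_assignments(dmesg_text):
    """Return {bus number: (driver, controller address)} as logged during boot."""
    buses = {}
    for line in dmesg_text.splitlines():
        match = BUS_LINE.search(line)
        if match:
            buses[int(match.group(3))] = (match.group(1), match.group(2))
    return buses
```

Comparing bus_assignments() for a 6.1 boot and a 6.2 boot would show whether the same controller ends up with different bus numbers, which is the reordering described above.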

It's been reported to linux-usb so there should be an official fix coming fairly soon.

Comment 14 Brian Morrison 2023-03-25 17:54:05 UTC
The latest 6.2 kernel-6.2.8-200 has a revert of the commit that caused this problem, not sure if it came via upstream or is a Fedora-specific change.

From what I have seen this also affected VL805/806 xhci host controllers but the fix is the same.

Comment 15 Hans de Goede 2023-03-26 11:25:51 UTC
(In reply to Brian Morrison from comment #14)
> The latest 6.2 kernel-6.2.8-200 has a revert of the commit that caused this
> problem, not sure if it came via upstream or is a Fedora-specific change.

Kate Hsuan has submitted a revert for this as a Fedora downstream patch for now (thank you Kate).

> From what I have seen this also affected VL805/806 xhci host controllers but
> the fix is the same.

Has this ("also affected VL805/806 xhci host controllers") also been reported to the upstream developers looking into this? I have read parts of the upstream thread but I don't remember reading this. If the upstream xhci maintainer does not know about this yet, can you please inform him?

Comment 16 Brian Morrison 2023-03-26 13:55:11 UTC
(In reply to Hans de Goede from comment #15)
> (In reply to Brian Morrison from comment #14)
>
> > From what I have seen this also affected VL805/806 xhci host controllers but
> > the fix is the same.
> 
> Has this ("also affected VL805/806 xhci host controllers") also been told to
> the upstream developers looking into this ?  I have read parts of the
> upstream thread but I don't remember reading this.  If the upstream xhci
> maintainer does not know about this yet, can you please inform him about
> this ?

From the linux-usb list a response from Mathias Nyman suggests that the same incorrect ordering of buses may be due to the same commit:

https://marc.info/?l=linux-usb&m=167930436629423&w=4

So it seems that the developers are aware of it, although the reporter states that he can't verify the fix quickly. The symptoms are slightly different too, but it's on a different motherboard.

Comment 17 Brian Morrison 2023-03-31 19:27:07 UTC
That seems not to be the same problem; it has something to do with AMD boards and IOMMU issues.

My problem is resolved in Fedora/Red Hat; I'm not sure when the change will be in the upstream kernel source.

Comment 18 Aoife Moloney 2023-11-23 01:23:43 UTC
This message is a reminder that Fedora Linux 37 is nearing its end of life.
Fedora will stop maintaining and issuing updates for Fedora Linux 37 on 2023-12-05.
It is Fedora's policy to close all bug reports from releases that are no longer
maintained. At that time this bug will be closed as EOL if it remains open with a
'version' of '37'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, change the 'version' 
to a later Fedora Linux version. Note that the version field may be hidden.
Click the "Show advanced fields" button if you do not see it.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora Linux 37 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora Linux, you are encouraged to change the 'version' to a later version
prior to this bug being closed.

Comment 19 Aoife Moloney 2024-01-12 22:58:50 UTC
Fedora Linux 37 entered end-of-life (EOL) status on 2023-12-05.

Fedora Linux 37 is no longer maintained, which means that it
will not receive any further security or bug fix updates. As a result we
are closing this bug.

If you can reproduce this bug against a currently maintained version of Fedora Linux
please feel free to reopen this bug against that version. Note that the version
field may be hidden. Click the "Show advanced fields" button if you do not see
the version field.

If you are unable to reopen this bug, please file a new report against an
active release.

Thank you for reporting this bug and we are sorry it could not be fixed.

