Bug 2352642

Summary: USB killed from time to time by xhci_hcd xHCI host controller not responding, assume dead on a NUC14RVB
Product: [Fedora] Fedora
Reporter: Peter Bieringer <pb>
Component: kernel
Assignee: Kernel Maintainer List <kernel-maint>
Status: CLOSED DUPLICATE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium
Priority: unspecified
Version: 41
CC: acaringi, adscvr, airlied, bskeggs, hdegoede, hpa, josef, kernel-maint, linville, masami256, mchehab, ptalbert, sly.midnight, steved, suraj.ghimire7, wisp3rwind
Hardware: x86_64   
OS: Linux   
Last Closed: 2025-03-20 05:11:18 UTC
Type: Bug
Attachments:
cron job as workaround to re-enable the script (flags: none)

Description Peter Bieringer 2025-03-14 20:54:52 UTC
1. Please describe the problem:
USB devices are suddenly disconnected from time to time, like this:

[222464.476609] xhci_hcd 0000:00:14.0: xHCI host not responding to stop endpoint command
[222464.476624] xhci_hcd 0000:00:14.0: xHCI host controller not responding, assume dead
[222464.476647] xhci_hcd 0000:00:14.0: HC died; cleaning up
[222464.476677] xhci_hcd 0000:00:14.0: Timeout while waiting for stop endpoint command
[222464.476679] usb 3-1: USB disconnect, device number 2
[222464.476682] usb 3-1.1: USB disconnect, device number 51
[222464.476686] usb 3-1.1.3: USB disconnect, device number 52
[222464.570277] usb 3-1.1.4: USB disconnect, device number 53
[222464.607082] usb 3-3: USB disconnect, device number 60
[222464.607285] usb 3-10: USB disconnect, device number 5


2. What is the Version-Release number of the kernel:
Happens with kernels >= 6.13.5-200.fc41.x86_64


3. Did it work previously in Fedora? 
Not seen with kernels <= 6.13.4-200.fc41.x86_64


4. Can you reproduce this issue?
Not on demand, but it has already happened 8 times.


5. Does this problem occur with the latest Rawhide kernel?
Not tested so far


6. Are you running any modules that are not shipped directly with Fedora's kernel?:
kmod-VirtualBox via akmods


7. Please attach the kernel logs.
See above

The BIOS has been updated in the meantime, but that did not help:
	Vendor: ASUSTeK COMPUTER INC.
	Version: RVMTL357.0047.2025.0108.1408
	Manufacturer: ASUSTeK COMPUTER INC.
	Product Name: NUC14RVB
	Version: 60AS0080-MB0A02

It looks like this issue has been reported in the past, but mostly not recently.

As the system becomes unresponsive to keyboard input, I have now created a cron job that checks for that message every minute and, if it is found, runs an unbind+bind:

usb=0000:00:14.0
echo "$usb" >/sys/bus/pci/drivers/xhci_hcd/unbind
echo "$usb" >/sys/bus/pci/drivers/xhci_hcd/bind

Hopefully this workaround will help for now.
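
For readers who want to try something similar, the check-and-rebind idea described above could be sketched roughly as follows. This is not the attached script. The function names, the --run switch, the overridable USB_DEV and DRIVER_DIR variables and the use of journalctl with a one-minute window are all assumptions made for illustration. Only the PCI address, the log message and the unbind/bind paths come from this report.

```shell
#!/bin/sh
# Sketch only: NOT the attached script. Names and the --run switch are assumptions.

# PCI address of the xHCI controller, as seen in the kernel log above.
USB_DEV="${USB_DEV:-0000:00:14.0}"
# sysfs directory of the xhci_hcd PCI driver (overridable for dry runs).
DRIVER_DIR="${DRIVER_DIR:-/sys/bus/pci/drivers/xhci_hcd}"

# Succeeds if the kernel log text on stdin reports the controller as dead.
controller_died() {
    grep -qF "xhci_hcd $USB_DEV: xHCI host controller not responding, assume dead"
}

# Detach the driver from the controller and attach it again (needs root).
rebind_controller() {
    echo "$USB_DEV" > "$DRIVER_DIR/unbind"
    echo "$USB_DEV" > "$DRIVER_DIR/bind"
}

# Only act when called with --run, so the functions can be tried out safely.
if [ "${1:-}" = "--run" ]; then
    # Look only at the last minute to avoid reacting to old messages again.
    if journalctl -k --since "1 minute ago" | controller_died; then
        rebind_controller
    fi
fi
```

Such a script would be called with --run from a root cron entry that fires every minute. Limiting the search to the last minute is a design choice: without it, a single old message would trigger a rebind on every run.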

Comment 1 Reilly Hall 2025-03-17 01:55:55 UTC
I would like to add that I am also experiencing this on a Dell OptiPlex 7000 with the latest 6.13.5 and 6.13.6 kernels. It does not happen with the 6.12.15 kernel, which is the last one I have on which this bug does not manifest.

I too get the following messages:

Mar 16 12:15:20 kernel: xhci_hcd 0000:00:14.0: xHCI host not responding to stop endpoint command
Mar 16 12:15:20 kernel: xhci_hcd 0000:00:14.0: xHCI host controller not responding, assume dead
Mar 16 12:15:20 kernel: xhci_hcd 0000:00:14.0: HC died; cleaning up
Mar 16 12:15:20 kernel: usb 1-5: USB disconnect, device number 4
Mar 16 12:15:21 kernel: usb 1-5.2: USB disconnect, device number 5
Mar 16 12:15:21 kernel: xhci_hcd 0000:00:14.0: Timeout while waiting for stop endpoint command
Mar 16 12:15:21 kernel: usb 2-3: USB disconnect, device number 2
Mar 16 12:15:21 kernel: usb 2-3.4: USB disconnect, device number 3
Mar 16 12:15:21 kernel: usb 1-5.3: USB disconnect, device number 6
Mar 16 12:15:21 kernel: usb 1-5.4: USB disconnect, device number 7
Mar 16 12:15:21 kernel: usb 1-5.4.4: USB disconnect, device number 8
Mar 16 12:15:21 kernel: usb 1-14: USB disconnect, device number 3

This is a very annoying regression because, just as for the original poster, my keyboard and mouse die, so I can no longer control the computer directly. Unplugging and replugging the devices does nothing either. I have to log in remotely via ssh from another computer and trigger a shutdown or reboot to regain control.

The main thing I've noticed is that it tends to happen shortly after resuming from s2idle sleep, which I use often.

Please let me know if there is anything I can do to help debug newer kernels. Currently my only solution is to boot the older 6.12.15 kernel and uninstall each 6.13.x kernel after testing it to see whether this issue has been resolved.
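
One way to check the suspected link to s2idle resume would be to view the suspend/resume markers and the xHCI failures side by side in the kernel log. The filter below is only a sketch. The function name is made up, and it assumes the kernel logs "PM: suspend entry" and "PM: suspend exit" lines around each sleep cycle, which may differ between kernel versions and configurations.

```shell
#!/bin/sh
# Sketch only: the "PM: suspend entry/exit" marker text is an assumption.

# Keep only suspend/resume markers and xHCI failure lines from the
# kernel log text on stdin, preserving their original order.
filter_suspend_and_xhci() {
    grep -E 'PM: suspend (entry|exit)|xhci_hcd .*(not responding|HC died)'
}
```

It could be fed with the kernel messages of the current boot, for example "journalctl -k -b | filter_suspend_and_xhci", to see how soon after a "suspend exit" line the controller is reported dead.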

Comment 2 Peter Bieringer 2025-03-17 06:18:19 UTC
Created attachment 2080495 [details]
cron job as workaround to re-enable the script

workaround script

Comment 3 Peter Bieringer 2025-03-17 06:22:47 UTC
Could there be another issue with this kernel series? After wakeup, the network link sometimes dies; to recover, I remove and re-insert the "igc" module:

rmmod igc
modprobe igc

Comment 4 wisp3rwind 2025-03-17 17:18:05 UTC
cf. https://bugzilla.redhat.com/show_bug.cgi?id=2349926

Comment 5 Peter Bieringer 2025-03-20 05:11:18 UTC
The issue looks fixed as of 6.13.7-200.fc41.x86_64.
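
To check whether a given machine runs a kernel in the range reported as affected in this bug (seen from 6.13.5, apparently fixed in 6.13.7), a small version comparison like the one below may help. It is a sketch only. The helper names are hypothetical, it relies on GNU "sort -V", and the bounds reflect only what the commenters here observed, not a confirmed upstream range.

```shell
#!/bin/sh
# Sketch only: the version bounds are taken from the comments in this report.

# version_ge A B: succeeds if version A >= version B (uses GNU sort -V).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# is_affected RELEASE: succeeds if RELEASE is >= 6.13.5 and < 6.13.7.
is_affected() {
    ver="${1%%-*}"   # "6.13.5-200.fc41.x86_64" -> "6.13.5"
    version_ge "$ver" 6.13.5 && ! version_ge "$ver" 6.13.7
}
```

For the running kernel this could be used as: is_affected "$(uname -r)" && echo "kernel is in the reported range"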

*** This bug has been marked as a duplicate of bug 2349926 ***