Bug 221331

Summary: uhci_hcd & ehci_hcd loaded too early on Dell D620, hangs lsusb
Product: [Fedora] Fedora Reporter: Stijn Hoop <stijn>
Component: kernel Assignee: Pete Zaitcev <zaitcev>
Status: CLOSED WORKSFORME QA Contact: Brian Brock <bbrock>
Severity: medium Docs Contact:
Priority: medium    
Version: 6 CC: davej, wtogami
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Last Closed: 2007-02-14 12:09:29 UTC Type: ---
Attachments:
Description                                               Flags
Output of lsusb -v, after correction of module load time  none
lspci -v output for this D620                             none

Description Stijn Hoop 2007-01-03 21:12:11 UTC
Description of problem:

Running /sbin/lsusb on a Dell Latitude D620 results in an unkillable process.

Version-Release number of selected component (if applicable):

FC6 stock kernel with all current (2007/01/03) updates, 2.6.18-1.2869.fc6,
usbutils-0.71-2.1 (although I think it's a kernel bug).

How reproducible:

Always.

Steps to Reproduce:
1. Install FC6 x86_64 on a Dell Latitude D620
2. Boot it
3. Enter 'lsusb'
  
Actual results:

It hangs and the process is unkillable.

Expected results:

It should display the USB devices present in the system.

Additional info:

I encountered this while debugging S3 (suspend to RAM), which did not work out
of the box; however, I think that is just a symptom.

Before lsusb would work, I had to:
- remove the modules from /boot/initrd-2.6.18-1.2869.fc6.img (insmod now
complains during early boot, of course)
- rename the modules in /lib/modules/2.6.18-1.2869.fc6/kernel/drivers/usb/host
to 'uhci-hcd.ko-DISABLED' and 'ehci-hcd.ko-DISABLED' respectively
- boot the machine into X
- log in and open a terminal
- run 'insmod /lib/modules/2.6.18-1.2869.fc6/kernel/drivers/usb/host/uhci-hcd.ko-DISABLED'
by hand, and likewise for 'ehci-hcd.ko-DISABLED'

lsusb then started showing output, as attached to the bug report.

I created an /etc/rc.d/init.d script that loads and unloads the modules, with a
'chkconfig: 345 97 03' header (runlevels 3, 4 and 5; start priority 97, stop
priority 03). That fixes the problem, which confirms that something about the
load time of the modules confuses the system.
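
The init script mentioned above could look roughly like the sketch below. This
is not the reporter's actual script, which is not attached to the bug; the
script name usb-late, the helper function usb_late, and the overridable INSMOD,
RMMOD and MODDIR variables are all assumptions made for illustration. The
chkconfig header requests runlevels 3, 4 and 5 with start priority 97 and stop
priority 03, and the module file names follow the renamed '-DISABLED' files
described above.

```shell
#!/bin/sh
# chkconfig: 345 97 03
# description: Load the USB host controller drivers late in boot.
#
# Sketch only: the script name (usb-late) and the helper names below are
# hypothetical. The module file names follow the renamed -DISABLED files.

MODDIR="${MODDIR:-/lib/modules/$(uname -r)/kernel/drivers/usb/host}"
INSMOD="${INSMOD:-/sbin/insmod}"
RMMOD="${RMMOD:-/sbin/rmmod}"

usb_late() {
    case "$1" in
        start)
            # Same order as the manual workaround: uhci-hcd, then ehci-hcd.
            "$INSMOD" "$MODDIR/uhci-hcd.ko-DISABLED" &&
            "$INSMOD" "$MODDIR/ehci-hcd.ko-DISABLED"
            ;;
        stop)
            # Unload in reverse order; rmmod takes module names, not paths.
            "$RMMOD" ehci_hcd &&
            "$RMMOD" uhci_hcd
            ;;
        *)
            echo "Usage: $0 {start|stop}" >&2
            return 2
            ;;
    esac
}

# Dispatch only when called with an argument, so the file can be sourced.
if [ $# -gt 0 ]; then
    usb_late "$1"
fi
```

Installed as /etc/rc.d/init.d/usb-late and registered with
'chkconfig --add usb-late', a script like this would load the drivers near the
end of boot and unload them early at shutdown.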

I have no idea whether it's flaky hardware that just needs some more time before
initialization, or some strange sort of dependency on a driver for other
hardware that gets loaded later in the boot. I will also attach lspci output;
maybe that provides a clue.

Comment 1 Stijn Hoop 2007-01-03 21:12:11 UTC
Created attachment 144742 [details]
Output of lsusb -v, after correction of module load time

Comment 2 Stijn Hoop 2007-01-03 21:14:26 UTC
Oops, I was unclear -- it's not the whole system that hangs, just the 'lsusb'
command. And some more information: trying to remove the uhci_hcd or ehci_hcd
modules when they were loaded at boot also results in a stuck rmmod process. The
rest of the system continues to work.
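
An unkillable process like this is usually sitting in uninterruptible sleep,
which ps shows as state D. As a quick check, assuming the procps ps shipped
with Fedora, something like the following sketch lists such processes. The
function name d_state is made up for this example.

```shell
# Sketch: print PID and command name of processes in uninterruptible
# sleep (state D). Expects "pid stat comm" columns with a header line.
d_state() {
    awk 'NR > 1 && $2 ~ /^D/ { print $1, $3 }'
}

# Usage: ps -eo pid,stat,comm | d_state
```

If lsusb or rmmod shows up in that list, the process is blocked inside the
kernel, which would explain why signals have no effect on it.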

Comment 3 Stijn Hoop 2007-01-03 21:15:18 UTC
Created attachment 144743 [details]
lspci -v output for this D620

Comment 4 Stijn Hoop 2007-01-03 21:22:26 UTC
Actually, now that I know where to look, I do see this when I boot from the
original initrd.img (transcribed by hand but I think it's accurate):

Red Hat nash version 5.1.19 starting
  Reading all physical volumes.  This may take a while...
  Found volume group "VolGroup00" using metadata type lvm2
usb 1-2: device not accepting address 2, error -71
  2 logical volume(s) in volume group "VolGroup00" now active
                Welcome to Fedora Core

Notice the usb 1-2 line. Hope this helps!

Comment 5 Pete Zaitcev 2007-01-12 20:40:16 UTC
I'm taking this but I really can't pay this problem the attention it needs,
sorry. Maybe after the LCA. I'm leaving this in NEW state for now.

A sysrq-t dump would help (taken with most other processes killed). It would
show where exactly lsusb got stuck, and also what khubd was doing.
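
For reference, a task dump can also be requested without the keyboard by
writing 't' to /proc/sysrq-trigger as root. A minimal sketch follows; the
function name dump_tasks, the default output file name, and the overridable
SYSRQ_TRIGGER and DMESG variables are assumptions for illustration only.

```shell
# Sketch: request a sysrq-t task dump and save the kernel log.
# SYSRQ_TRIGGER and DMESG are overridable only to allow a dry run.
SYSRQ_TRIGGER="${SYSRQ_TRIGGER:-/proc/sysrq-trigger}"
DMESG="${DMESG:-dmesg}"

dump_tasks() {
    out="${1:-sysrq-t.txt}"
    # Writing "t" has the same effect as Alt+SysRq+T; needs root.
    echo t > "$SYSRQ_TRIGGER" || return 1
    # The task dump lands in the kernel ring buffer; save it to a file.
    "$DMESG" > "$out"
}

# Usage (as root, while lsusb is hung): dump_tasks /tmp/sysrq-t.txt
```

The kernel log buffer is limited in size, so a dump taken with few processes
running is more likely to survive intact.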

Comment 6 Stijn Hoop 2007-02-14 12:09:29 UTC
I *finally* found some time today to retest, and of course this problem has gone
away again using the latest kernel (2.6.19-1.2911.fc6). I'll reopen this if I
can confirm it on a new kernel again.

Comment 7 Pete Zaitcev 2007-02-14 17:39:42 UTC
OK. I still think something is fishy with the port hand-over between companion
controllers. Unfortunately, firmware/BIOS is heavily involved...

Next time this happens, I'll need the output of Sysrq-T to see where rmmod
hangs.