Bug 241002 - USB2.0 hub disappears after overcurrent on another port
Summary: USB2.0 hub disappears after overcurrent on another port
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel
Version: 4.5
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: John Feeney
QA Contact: Martin Jenner
URL:
Whiteboard:
Depends On:
Blocks: 246028
Reported: 2007-05-23 16:22 UTC by Stuart Hayes
Modified: 2007-11-17 01:14 UTC

Fixed In Version: RHBA-2007-0791
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2007-11-15 16:27:32 UTC
Target Upstream Version:
Embargoed:


Attachments
patch for rhel4.5 (2.6.9-55.EL) (4.36 KB, patch)
2007-05-23 16:22 UTC, Stuart Hayes


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHBA-2007:0791 | 0 | normal | SHIPPED_LIVE | Updated kernel packages available for Red Hat Enterprise Linux 4 Update 6 | 2007-11-14 18:25:55 UTC

Description Stuart Hayes 2007-05-23 16:22:01 UTC
Description of problem:

This problem is very similar to BZ231226; it was discovered while testing 
the fix for that bug.

When an empty USB port on the rear of a system has an overcurrent (I use a 
paper clip to briefly connect the power pin to the chassis), the front ports 
on this system quit working.  The front ports on this system are connected 
through an internal Cypress USB2.0 hub.

The problem is that the overcurrent causes the EHCI controller to get an error 
when it tries to talk to the hub.  hub_events() then sees hub->error and tries 
to reset the hub by calling hub_reset().  The hub_reset() function calls 
__usb_reset_device(), which refuses to reset a hub (see the FIXME in 
__usb_reset_device()), so hub_reset() fails, and the hub is gone.
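
The failure path described above can be sketched as a small standalone C
model.  This is only an illustration under stated assumptions: the toy_*
types and functions are hypothetical stand-ins invented for this sketch and
are not the definitions used in the 2.6.9-55.EL kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: hypothetical stand-ins, not the kernel's structures. */
struct toy_device {
    bool is_hub;   /* true if the device is a USB hub */
    bool present;  /* false once the device has been dropped */
    int  error;    /* nonzero models hub->error being set */
};

/* Models the pre-patch __usb_reset_device(): a hub is refused. */
static int toy_reset_device(struct toy_device *dev)
{
    assert(dev != NULL);
    if (dev->is_hub)
        return -1;          /* resetting a hub is not supported here */
    dev->error = 0;
    return 0;
}

/* Models the pre-patch hub_reset(): a failed reset loses the hub. */
static int toy_hub_reset(struct toy_device *hub)
{
    int rc = toy_reset_device(hub);

    if (rc != 0)
        hub->present = false;   /* the hub is gone */
    return rc;
}

/* Models the hub_events() step that reacts to a pending hub error. */
static void toy_hub_events(struct toy_device *hub)
{
    assert(hub != NULL);
    if (hub->present && hub->error)
        toy_hub_reset(hub);
}
```

In this model, calling toy_hub_events() on a hub whose error field is set
always ends with present == false, which corresponds to the hub vanishing
from the "lsusb" output.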

This was fixed some time ago upstream, with this patch:
http://marc.info/?l=linux-usb-devel&m=109511190511780&w=2

It was trivial to port this patch to RHEL4.5 (2.6.9-55.EL).  I'll attach the 
patch for RHEL4.5.



Version-Release number of selected component (if applicable):
2.6.9-55.EL

How reproducible:
every time

Steps to Reproduce:
1. Get a system with an internal Cypress USB hub (many Dell servers have one)
2. Short the power pin to the chassis on an unused rear USB port
3. Observe that the hub is no longer listed in "lsusb" output
  
Actual results:
hub disappears

Expected results:
hub should reappear after overcurrent error handling

Additional info:

Comment 1 Stuart Hayes 2007-05-23 16:22:01 UTC
Created attachment 155272 [details]
patch for rhel4.5 (2.6.9-55.EL)

Comment 2 RHEL Program Management 2007-06-22 22:04:41 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 3 John Feeney 2007-07-02 19:13:39 UTC
Just for the record, I found the patch above did not compile cleanly in
RHEL-4 because usb_disconnect_nolock() was no longer called. I removed
this function, built rpms with Brew, and asked Stuart to test the Brew-built 
release, which he did successfully (thanks again). I could not find
usb_disconnect_nolock() upstream; it was added to RHEL-4 by another 
patch. 

Submitted to rhkernel-list so setting state to post.

Comment 4 Pete Zaitcev 2007-07-18 00:52:57 UTC
The usb_disconnect_nolock was added to fix bug 171220.

Comment 5 Stuart Hayes 2007-07-18 16:22:44 UTC
It looks like the code that calls usb_disconnect_nolock() is 
hub_start_disconnect() in the -55 kernel (before the patch in comment #1 is 
applied).  After the patch from comment #1 is applied, that code is in 
hub_pre_reset().

So, to apply the patch from comment #1 AND keep the fix for bug 171220, I 
believe all you'd need to do is apply the patch from comment #1 and then 
modify hub_pre_reset() to call usb_disconnect_nolock() instead of 
usb_disconnect().


Comment 6 Pete Zaitcev 2007-07-18 22:16:02 UTC
I disagree with Stuart about the usb_disconnect_nolock. The code to disconnect
the hub itself (if reset fails) is removed completely by the patch in question.
The code to disconnect hub's children was moved from hub_reset to hub_pre_reset.
The deadlock in bug 171220 was caused by hub_start_disconnect, and thus cannot
happen if hub_start_disconnect is removed. I may be wrong, but it looks this way.

John, please always attach the patch to the bug as posted for review.
Don't make us guess later.

Comment 7 John Feeney 2007-07-18 22:37:20 UTC
Pete,
Sorry if there was confusion about the patch. I did attach the patch that
I created from Stuart's when I posted it to rhkernel-list. I wrote you an 
email recently where I tried to explain the code and it crossed my mind that I 
should include the new patch as well, but I didn't. Perhaps that is where I went
wrong.

 John 

Comment 8 Stuart Hayes 2007-07-19 13:55:48 UTC
I would assume Pete is correct in comment #6.  I didn't spend too much time 
looking at the code when I posted comment #5.  I'll look at it again today to 
convince myself.

Comment 9 Stuart Hayes 2007-07-19 15:22:02 UTC
OK, yeah, I agree with Pete.  With the patch from comment #1 applied to the 
-55 kernel, the code in hub_pre_reset is disconnecting the hub's children.  The 
code that caused the deadlock in bug 171220 was trying to disconnect the hub 
itself.  With the patch from comment #1 applied to the -55 kernel, it doesn't 
look like the hub itself gets disconnected when there's a hub error--it will 
disconnect the children, and try to reset the hub, and spew an error message 
if it can't reset the hub, but I don't see it actually disconnecting the hub 
itself.
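
The post-patch behaviour described in this comment can be sketched as a
standalone C model.  It is a minimal illustration, not the patched kernel
code: the toy_* names, the fixed child count, and the reset_works flag are
all hypothetical and exist only for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

#define TOY_MAX_CHILDREN 4   /* hypothetical port count for the sketch */

/* Toy model only: hypothetical stand-in, not the kernel's struct usb_hub. */
struct toy_hub {
    bool present;                          /* hub itself still registered */
    bool reset_works;                      /* whether the reset succeeds  */
    int  error;                            /* nonzero models hub->error   */
    bool child_present[TOY_MAX_CHILDREN];  /* devices below the hub       */
};

/* Models hub_pre_reset(): only the hub's children are disconnected. */
static void toy_hub_pre_reset(struct toy_hub *hub)
{
    for (size_t i = 0; i < TOY_MAX_CHILDREN; i++)
        hub->child_present[i] = false;
}

/* Models the reset attempt; clears the error only on success. */
static int toy_hub_try_reset(struct toy_hub *hub)
{
    if (!hub->reset_works)
        return -1;
    hub->error = 0;
    return 0;
}

/* Models the error handling: children go, the hub itself stays. */
static void toy_handle_hub_error(struct toy_hub *hub)
{
    assert(hub != NULL);
    if (!hub->error)
        return;
    toy_hub_pre_reset(hub);
    if (toy_hub_try_reset(hub) != 0)
        fprintf(stderr, "toy: hub reset failed\n");
    /* hub->present is deliberately left untouched on either path. */
}
```

The point of the model is that no path clears present, so the code that
disconnected the hub itself (the source of the deadlock in bug 171220) has
no counterpart here.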

I'm sorry about the confusion I caused with comment #5.


Comment 10 Jason Baron 2007-07-25 14:29:56 UTC
Committed in stream U6 build 55.22. A test kernel with this patch is available
from http://people.redhat.com/~jbaron/rhel4/


Comment 14 errata-xmlrpc 2007-11-15 16:27:32 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2007-0791.html


