Bug 129265

Summary: kernel panic when repeatedly accessing /proc/bus/usb/devices and hot-swapping usb device
Product: Red Hat Enterprise Linux 3
Component: kernel
Version: 3.0
Hardware: i686
OS: Linux
Status: CLOSED ERRATA
Severity: medium
Priority: medium
Reporter: Dan Mechanic <danmechanic>
Assignee: Pete Zaitcev <zaitcev>
QA Contact: Brian Brock <bbrock>
CC: mbrandsma, petrides, redhat-bugzilla, riel
Fixed In Version: RHSA-2006-0144
Doc Type: Bug Fix
Last Closed: 2006-03-15 15:37:08 UTC
Bug Blocks: 168424    
Attachments:
Candidate #1 (flags: none)

Description Dan Mechanic 2004-08-05 18:08:00 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6)
Gecko/20040207 Firefox/0.8

Description of problem:

# while true; do cat /proc/bus/usb/devices ; done

Now insert and remove a USB device (I'm using a HASP4 dongle).
It may take a couple of iterations, but you will eventually get:

Kernel panic: Fatal exception



Version-Release number of selected component (if applicable):
kernel-2.4.21-15.EL

How reproducible:
Always

Steps to Reproduce:
1. Run: while true; do cat /proc/bus/usb/devices ; done
2. Insert and remove a USB device
3. Repeat step 2
    

Actual Results:  Kernel Panic

Additional info:

Comment 1 Ernie Petrides 2004-08-05 23:58:09 UTC
Please provide the console "oops" output.  Thanks.  -ernie


Comment 2 Robert Scheck 2004-08-06 21:08:10 UTC
Okay, I'm able to reproduce this using the latest kernel. I
started the cat loop, then inserted and removed my memory stick
a few times, and there it is: Oops. I was able to reproduce
the same with my very, very bad USB camera, but I'm simply too slow
at copying the oops message before my laptop screen automatically
goes into standby :-(

Maybe the following helps you anyway:

--- snipp ---
Unable to handle kernel paging request at virtual address 6f732e80
 printing eip:
d0b4007e
*pde = 00000000
Oops: 0000
sd_mod usb-storage e100 ide-cd cdrom sg scsi_mod keybdev mousedev hid input usb-uhci usbcore ext3 jbd
CPU:    0
EIP:    0060:[<d0b4007e>]       Not tainted
EFLAGS: 00010246

EIP is at usb_dump_config [usbcore] 0x6e (2.4.21-15.0.4.EL/i686)
eax: cf120c80   ebx: 6f732e78   ecx: 00000002   edx: 0000002f
esi: 00000000   edi: ce52a0cd   ebp: ce52bf00   esp: ce319e68
ds: 0068   es: 0068   ss: 0068
Process cat (pid: 3286, stackpage=ce319000)
Stack: ce52a0a5 ce52bf00 cf120c80 00000001 00000002 00000000 00000000 cf27be00
       ce52bf00 00000018 00000001 d0b403d6 00000002 ce52a0a5 ce52bf00 cf120c80
       00000001 ce52bf00 ce52a042 cf27be00 ce52a000 d0b40517 ce52a042 ce52bf00
Call Trace:     [<d0b403d6>] usb_dump_desc [usbcore] 0xb6 (0xce319e94)
[<d0b40517>] usb_device_dump [usbcore] 0x127 (0xce319ebc)
[<d0b44165>] .rodata.str1.1 [usbcore] 0x613 (0xce319ee0)

[ HERE MY LAPTOP SCREEN TURNED ITSELF OFF / STANDBY *gnarf* :-( ]
--- snapp ---
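
A note on the register dump above: the faulting address 6f732e80 is exactly ebx + 8, and every byte of ebx falls in the printable ASCII range. The short Python sketch below (the helper name pointer_as_ascii is made up for illustration) decodes the value in little-endian byte order, as stored in memory on i686. A pointer that reads as text is consistent with a structure that was freed and then reused for string data, although the oops alone does not prove that.

```python
import struct


def pointer_as_ascii(value):
    """Return the 4 bytes of a 32-bit little-endian value as text.

    Non-printable bytes are shown as '.', as in a hex dump.
    """
    raw = struct.pack("<I", value)
    return "".join(chr(b) if 32 <= b < 127 else "." for b in raw)


ebx = 0x6F732E78         # ebx from the register dump
fault_addr = 0x6F732E80  # "virtual address" from the first oops line

print(hex(fault_addr - ebx))   # offset of the faulting access from ebx
print(pointer_as_ascii(ebx))   # bytes of ebx in memory order
```

For comparison, eax (cf120c80) decodes to four non-printable bytes, which is what a genuine kernel pointer normally looks like.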

Comment 3 Pete Zaitcev 2004-10-05 05:43:15 UTC
Oooh, I understand now. I forgot that reading from /proc/bus/usb/devices
fetches all descriptors from the device live. Not surprising it causes
oopses on disconnect...

This may be a little difficult to fix, unfortunately. 2.4 is somewhat
weak in the refcounting department.


Comment 5 Pete Zaitcev 2004-10-05 21:33:57 UTC
Created attachment 104809
Candidate #1

This seems too simple to be the solution, but it appears to work.
All it does is add refcounting, which prevents the oopses;
it does not interlock with disconnects.

The 2.6 kernel takes a different path: it tries to fracture
big-scope locks and then re-take them as needed, while keeping
semaphores around.
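
To illustrate the idea only (this is not the patch in attachment 104809, whose contents are not reproduced in this report), here is a user-space Python model of how a reference count keeps a device structure alive while a reader is still dumping it. All names (Device, get, put, disconnect) are hypothetical. As the comment above says, refcounting of this kind prevents the use-after-free but does not interlock with disconnect, so the reader can still see a device that is already unplugged.

```python
class Device:
    """Toy stand-in for a kernel device structure (hypothetical names)."""

    def __init__(self, name):
        self.name = name
        self.refcount = 1        # reference held by the bus / hub code
        self.connected = True
        self.freed = False

    def get(self):
        """Take a reference before using the device."""
        assert not self.freed, "get() on freed device"
        self.refcount += 1
        return self

    def put(self):
        """Drop a reference; the last put releases the structure."""
        self.refcount -= 1
        if self.refcount == 0:
            self.freed = True    # models freeing the structure

    def read_descriptors(self):
        if self.freed:
            # In the kernel this would be a use-after-free, not an exception.
            raise RuntimeError("use after free: " + self.name)
        return "descriptors of %s (connected=%s)" % (self.name, self.connected)


def disconnect(dev):
    """Model a hot-unplug: mark the device gone and drop the bus reference."""
    dev.connected = False
    dev.put()


# The reader pins the device, then the unplug happens in the middle of the dump.
dev = Device("dongle")
pinned = dev.get()
disconnect(dev)
result = pinned.read_descriptors()   # still safe: the reader holds a reference
pinned.put()                         # last reference dropped, structure released
print(result)
print(dev.freed)
```

Without the get() call, the same sequence would reach read_descriptors() after the structure was released, which is the situation the model reports as a RuntimeError.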

Comment 6 Ernie Petrides 2005-09-22 00:34:59 UTC
A fix for this problem has just been committed to the RHEL3 U7
patch pool this evening (in kernel version 2.4.21-37.3.EL).


Comment 9 Red Hat Bugzilla 2006-03-15 15:37:08 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2006-0144.html


Comment 10 Mitchell Brandsma 2006-05-15 08:13:00 UTC
In the last few days I have had this occur with the 2.4.21-40.ELsmp kernel on
HS20 blades.  It has only been noticed because a hardware fault causes three
USB devices to be automatically assigned/deassigned every few seconds,
typically for about 10 minutes before the problem occurs.  So while the bug may
have been largely fixed, there might still be a less likely cause floating
around in the kernel source.

I'm not expecting a quick fix for this, just documenting that the problem has 
been seen to still occur, albeit requiring a lot more effort to cause it.  Is 
this a new bug, or should this be reopened?

Comment 11 Mitchell Brandsma 2006-05-15 08:44:15 UTC
Further info for our situation: these are IBM model 8843 blades, and we are
using the usb-handoff "workaround" to prevent lockup during boot.  Today one
node experienced two separate spurts of 143 and 262 disconnects within
minutes.  The second sequence caused a panic.  On another node, a sequence of
271 USB connects/disconnects caused a panic.

In all cases where a panic occurs the last thing logged is the group of 
connects.  I can only assume this means that the disconnects are the final 
straw.  Unfortunately, as we are having the hardware issue fixed this is going 
to be difficult to replicate without risking the sanity of our production 
environment...

Comment 12 Mitchell Brandsma 2006-05-17 01:50:51 UTC
I managed to capture the last bit of a kernel panic this morning... here 'tis:

Process modprobe (pid: 29354, stackpage=e1677000)
Stack: e429d200 f7502200 00000001 f5d6f000 f898db08 f0d62800 00000000 f89a0708
       00000002 f898d150 f5d6f000 f8c24180 f898d0e6 f8999ce4 f8c20ec2 c0387f84
       f8c162a9 f8c24180 c012afc6 c04824a4 00000001 f8c12000 0000004d f8c2170c
Call Trace:   [<f898db08>] usb_check_support [usbcore] 0x68 (0xe1677ed4)
[<f89a0708>] usb_bus_list [usbcore] 0x0 (0xe1677ee0)
[<f898d150>] usb_scan_devices_Rsmp_ca4f6301 [usbcore] 0x40 (0xe1677ee8)
[<f8c24180>] usb_storage_driver [usb-storage] 0x0 (0xe1677ef0)
[<f898d0e6>] usb_register_Rsmp_6654b6ff [usbcore] 0x86 (0xe1677ef4)
[<f8999ce4>] .rodata.str1.4 [usbcore] 0x0 (0xe1677ef8)
[<f8c20ec2>] .rodata.str1.1 [usb-storage] 0x723 (0xe1677efc)
[<f8c162a9>] usb_stor_init [usb-storage] 0x59 (0xe1677f04)
[<f8c24180>] usb_storage_driver [usb-storage] 0x0 (0xe1677f08)
[<c012afc6>] sys_init_module [kernel] 0x5b6 (0xe1677f0c)
[<f8c2170c>] .kmodtab [usb-storage] 0x0 (0xe1677f20)
[<f8c12060>] host_info [usb-storage] 0x0 (0xe1677f2c)
[<f8c21724>] __ksymtab [usb-storage] 0x0 (0xe1677f30)
[<f8c12060>] host_info [usb-storage] 0x0 (0xe1677f58)
[<c02af06f>] no_timing [kernel] 0x7 (0xe1677fc0)

Code: 80 78 04 00 75 07 83 c4 08 5b 5e c3 90 89 5c 24 04 43 89 34

Kernel panic: Fatal exception
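
A note for anyone reading the trace above: the entries are stack words that resolve to a kernel or module symbol, so not all of them are return addresses. The Python sketch below applies a simple heuristic of my own (not kernel logic) to a subset of the entries: it drops anything at offset 0x0 and anything named after a section such as .rodata.str1.1. What remains suggests a path from sys_init_module through usb_stor_init and usb_register into usb_scan_devices, which would be a different code path from the /proc/bus/usb/devices read in the original report. That reading is an inference from a partial trace, not a confirmed diagnosis.

```python
import re

# Matches entries such as:
#   [<f898db08>] usb_check_support [usbcore] 0x68 (0xe1677ed4)
ENTRY = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\] (?P<symbol>\S+) \[(?P<module>[^\]]+)\] "
    r"(?P<offset>0x[0-9a-f]+) \((?P<stack>0x[0-9a-f]+)\)"
)


def likely_code_frames(trace_text):
    """Return (symbol, module, offset) for entries that may be return addresses.

    Heuristic (an assumption, not kernel logic): skip entries at offset 0,
    because a return address always lies past the start of a function, and
    skip section names (leading '.'), which here mark string data.
    """
    frames = []
    for match in ENTRY.finditer(trace_text):
        offset = int(match.group("offset"), 16)
        symbol = match.group("symbol")
        if offset == 0 or symbol.startswith("."):
            continue
        frames.append((symbol, match.group("module"), offset))
    return frames


# A subset of the call trace entries from this comment.
trace = """\
Call Trace:   [<f898db08>] usb_check_support [usbcore] 0x68 (0xe1677ed4)
[<f89a0708>] usb_bus_list [usbcore] 0x0 (0xe1677ee0)
[<f898d150>] usb_scan_devices_Rsmp_ca4f6301 [usbcore] 0x40 (0xe1677ee8)
[<f8c24180>] usb_storage_driver [usb-storage] 0x0 (0xe1677ef0)
[<f898d0e6>] usb_register_Rsmp_6654b6ff [usbcore] 0x86 (0xe1677ef4)
[<f8c20ec2>] .rodata.str1.1 [usb-storage] 0x723 (0xe1677efc)
[<f8c162a9>] usb_stor_init [usb-storage] 0x59 (0xe1677f04)
[<c012afc6>] sys_init_module [kernel] 0x5b6 (0xe1677f0c)
"""

for symbol, module, offset in likely_code_frames(trace):
    print("%s [%s] +%#x" % (symbol, module, offset))
```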

And this was in the messages just before the panic occurred...
May 16 22:41:56 hn /etc/hotplug/usb.agent: Setup usbcore for USB product 4b4/5204/1
May 16 22:41:56 hn /etc/hotplug/usb.agent: Setup usbcore for USB product 4b4/5204/1
May 16 22:41:56 hn devlabel: devlabel's temporary ignore list /etc/sysconfig/devlabel.d/ignore_list has been emptied due to a change in device configuration.
May 16 22:41:56 hn devlabel: devlabel service started/restarted
May 16 22:41:56 hn /etc/hotplug/usb.agent: Setup hid usb-storage for USB product 4b3/4004/1
May 16 22:41:56 hn last message repeated 2 times
May 16 22:41:56 hn kernel: Initializing USB Mass Storage driver...
May 16 22:41:56 hn kernel: usb.c: registered new driver usb-storage
May 16 22:41:56 hn kernel: scsi3 : SCSI emulation for USB Mass Storage devices
May 16 22:41:57 hn /etc/hotplug/usb.agent: Setup hid for USB product 4b3/4004/1
May 16 22:41:57 hn /etc/hotplug/usb.agent: Setup hid for USB product 4b3/4004/1
May 16 22:41:57 hn /etc/hotplug/usb.agent: Setup keybdev mousedev for USB product 4b3/4004/1
May 16 22:41:57 hn /etc/hotplug/usb.agent: Setup keybdev mousedev for USB product 4b3/4004/1
May 16 22:41:57 hn devlabel: devlabel's temporary ignore list /etc/sysconfig/devlabel.d/ignore_list has been emptied due to a change in device configuration.
May 16 22:41:57 hn devlabel: devlabel's temporary ignore list /etc/sysconfig/devlabel.d/ignore_list has been emptied due to a change in device configuration.
May 16 22:41:57 hn devlabel: devlabel service started/restarted
May 16 22:41:57 hn devlabel: devlabel service started/restarted
May 16 22:42:12 hn kernel: usb-storage: Refusing to reset a multi-interface device
May 16 22:42:16 hn kernel: usb.c: USB disconnect on device 00:1d.0-2 address 8
May 16 22:42:16 hn kernel: usb.c: USB disconnect on device 00:1d.0-2.1 address 9
May 16 22:42:16 hn kernel: usb.c: USB disconnect on device 00:1d.0-2.3 address 10
May 16 22:42:16 hn kernel: inserting floppy driver for 2.4.21-40.ELsmp
May 16 22:42:16 hn kernel: Floppy drive(s): fd0 is 1.44M
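
For anyone trying to quantify the connect/disconnect bursts described in the comments above, a small Python sketch along these lines could tally the "USB disconnect" kernel messages per syslog timestamp. The regular expression is based only on the message format visible in this excerpt, and the function name is made up for illustration. It assumes each log entry is on a single line.

```python
import re
from collections import Counter

# Format taken from the excerpt, e.g.:
#   May 16 22:42:16 hn kernel: usb.c: USB disconnect on device 00:1d.0-2 address 8
DISCONNECT = re.compile(
    r"^(?P<stamp>\w{3} +\d+ \d\d:\d\d:\d\d) \S+ kernel: usb\.c: "
    r"USB disconnect on device (?P<device>\S+) address (?P<address>\d+)"
)


def disconnects_per_second(lines):
    """Count 'USB disconnect' kernel messages per syslog timestamp."""
    counts = Counter()
    for line in lines:
        match = DISCONNECT.match(line)
        if match:
            counts[match.group("stamp")] += 1
    return counts


# Lines copied from the log excerpt in this comment.
sample = [
    "May 16 22:42:12 hn kernel: usb-storage: Refusing to reset a multi-interface device",
    "May 16 22:42:16 hn kernel: usb.c: USB disconnect on device 00:1d.0-2 address 8",
    "May 16 22:42:16 hn kernel: usb.c: USB disconnect on device 00:1d.0-2.1 address 9",
    "May 16 22:42:16 hn kernel: usb.c: USB disconnect on device 00:1d.0-2.3 address 10",
]

for stamp, count in sorted(disconnects_per_second(sample).items()):
    print(stamp, count)
```

Note that syslog collapses identical consecutive messages into "last message repeated N times", so a tally like this can undercount during a burst.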