Bug 471560 - [4.7.z] Unable to Unload "ohci-hcd " And to Reboot
[4.7.z] Unable to Unload "ohci-hcd " And to Reboot
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel (Show other bugs)
4.7.z
All Linux
medium Severity medium
: rc
: ---
Assigned To: Pete Zaitcev
Martin Jenner
:
Depends On:
Blocks:
  Show dependency treegraph
 
Reported: 2008-11-14 04:21 EST by CAI Qian
Modified: 2009-05-18 15:34 EDT (History)
2 users (show)

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2009-05-18 15:34:48 EDT
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)
some SysRq logs captured during the hang. (58.99 KB, text/plain)
2008-11-14 04:30 EST, CAI Qian
no flags Details

  None (edit)
Description CAI Qian 2008-11-14 04:21:09 EST
Description of problem:
When loading/unloading ohci-hcd in a loop, the following messages were printed on the serial console after the while,

...
usb 3-1: device not accepting address 2, error -108
hub 3-0:1.0: cannot disable port 1 (err = -19)
hub 3-0:1.0: couldn't allocate port 1 usb_device
hub 3-0:1.0: cannot disable port 1 (err = -19)
hub 3-0:1.0: hub_port_status failed (err = -19)
hub 3-0:1.0: connect-debounce failed, port 1 disabled
hub 3-0:1.0: cannot disable port 1 (err = -19)
hub 3-0:1.0: hub_port_status failed (err = -19)
hub 3-0:1.0: cannot reset port 1 (err = -19)
hub 3-0:1.0: cannot reset port 1 (err = -19)
hub 3-0:1.0: cannot reset port 1 (err = -19)
hub 3-0:1.0: cannot reset port 1 (err = -19)
hub 3-0:1.0: Cannot enable port 1.  Maybe the USB cable is bad?
hub 3-0:1.0: cannot disable port 1 (err = -19)
hub 3-0:1.0: couldn't allocate port 1 usb_device
hub 3-0:1.0: cannot disable port 1 (err = -19)
usb 3-1: control timeout on ep0out

As the result, ohci-hcd was not able to remove anymore, and the system could not reboot.

# rmmod ohci-hcd
ERROR: Removing 'ohci_hcd': Device or resource busy

# rmmod -f -v ohci-hcd
rmmod ohci_hcd, wait=no force
ERROR: Removing 'ohci_hcd': Device or resource busy

# lsmod
Module                  Size  Used by
ohci_hcd               24657  0 
parport_pc             29569  0 
lp                     15345  0 
parport                44621  2 parport_pc,lp
netconsole              8541  0 
netdump                15552  0 
md5                     5953  1 
ipv6                  285729  14 
autofs4                28105  0 
sunrpc                177337  1 
ds                     21961  0 
yenta_socket           23105  0 
pcmcia_core            70097  2 ds,yenta_socket
powernow_k8            14881  0 
cpufreq_powersave       3265  0 
loop                   18769  0 
joydev                 12225  0 
button                  9313  0 
battery                11465  0 
ac                      6985  0 
ehci_hcd               35273  0 
k8_edac                19561  0 
edac_mc                34305  1 k8_edac
snd_hda_intel         325417  0 
snd_pcm_oss            60025  0 
snd_mixer_oss          21825  1 snd_pcm_oss
snd_pcm               110537  2 snd_hda_intel,snd_pcm_oss
snd_timer              29129  1 snd_pcm
snd                    64265  5 snd_hda_intel,snd_pcm_oss,snd_mixer_oss,snd_pcm,snd_timer
soundcore              12769  1 snd
snd_page_alloc         14041  2 snd_hda_intel,snd_pcm
tg3                   120133  0 
floppy                 66065  0 
sr_mod                 20965  0 
dm_snapshot            19201  0 
dm_zero                 3649  0 
dm_mirror              33609  0 
ext3                  138961  2 
jbd                    69489  1 ext3
dm_mod                 76969  8 dm_snapshot,dm_zero,dm_mirror
ahci                   28101  2 
libata                124713  1 ahci
sd_mod                 19393  3 
scsi_mod              145425  4 sr_mod,ahci,libata,sd_mod

# reboot
INIT: Sending pStopping HAL daemon: [FAILED]
Stopping system message bus: [  OK  ]
Stopping atd: [  OK  ]
Stopping cups: [  OK  ]
Shutting down xfs: [  OK  ]
Shutting down console mouse services: [  OK  ]
Stopping sshd:[  OK  ]
Shutting down sm-client: [  OK  ]
Shutting down sendmail: [  OK  ]
Shutting down smartd: [  OK  ]
disabling netconsole [  OK  ]
disabling netdump [  OK  ]
Stopping xinetd: [  OK  ]
Stopping acpi daemon: [  OK  ]
Stopping crond: [  OK  ]
Stopping autofs:  [  OK  ]
Shutting down ntpd: [  OK  ]
Stopping NFS statd: [  OK  ]
Stopping irqbalance: [  OK  ]
Stopping portmap: [  OK  ]
Shutting down kernel logger: [  OK  ]
Shutting down system logger: [  OK  ]
Shutting down interface eth0:  [  OK  ]
Shutting down loopback interface:  [  OK  ]
Stopping pcmcia:  
<hung here ...>

I'll attach some SysRq logs captured during the hang.

Version-Release number of selected component (if applicable):
kernel-2.6.9-78.0.8.EL

How reproducible:
always

Steps to Reproduce:
1. reserve hp-xw4550-01.rhts.bos.redhat.com
2. while :; do rmmod ohci-hcd; modprobe ohci-hcd; done
3. monitor the serial console and wait for the following message coming up.

   usb 3-1: control timeout on ep0out

4. rmmod ohci-hcd
5. reboot
  
Actual results:
In step 3, the module cannot be removed due to "ERROR: Removing 'ohci_hcd': Device or resource busy".
In step 4, the machine cannot reboot.

Expected results:
The module should be able to remove; the machine should be able to reboot.
Comment 1 CAI Qian 2008-11-14 04:28:36 EST
By the way, it is usually reproducible within 30 minutes.
Comment 2 CAI Qian 2008-11-14 04:30:43 EST
Created attachment 323558 [details]
some SysRq logs captured during the hang.
Comment 3 CAI Qian 2008-11-14 06:17:09 EST
The following from the above looks interesting to me.

rmmod         D 0000000000503010     0  3904   6952                     (NOTLB)
000001001fc87d38 0000000000000002 000001001fc87cc8 ffffffff00000073 
       0000000000000001 000000008013475a 0000010001196b00 00000000803f68a0 
       00000100361687f0 0000000000005eb5 
Call Trace:<ffffffff801347b0>{__wake_up+54} <ffffffff803178b7>{__down+147} 
       <ffffffff80134709>{default_wake_function+0} <ffffffff803193d5>{__down_failed+53} 
       <ffffffff802947a9>{.text.lock.hub+55} <ffffffff802992f5>{usb_hcd_pci_remove+137} 
       <ffffffff801f8ff0>{pci_device_remove+34} <ffffffff8024f8e8>{device_release_driver+83} 
       <ffffffff8024fc96>{bus_remove_driver+153} <ffffffff8024ffbc>{driver_unregister+17} 
       <ffffffff801f8e1f>{pci_unregister_driver+15} <ffffffff8014f5df>{sys_delete_module+487} 
       <ffffffff801ef400>{__down_read_trylock+41} <ffffffff8016f570>{sys_munmap+94} 
       <ffffffff801102f6>{system_call+126}
Comment 4 Pete Zaitcev 2008-11-20 22:30:55 EST
The symptom is different from the oops in bug 455843, but while working
on that one I found that since there is a whole bunch of hanging pointers,
rmmod should not be tried without it fixed. Results may be quite wild,
and necesserily just the oops. Cai, plase try the kernel from this URL:
 http://people.redhat.com/zaitcev/ftp/455843/

There's also a hang specifically in rmmod ohci_hcd (does not happen
with other HCDs). It occurs when unlinks cross with rmmod: rmmod changes
the controller's state and so unlinks cannot return any callbacks, so
something hangs. The following may be helpful:

--- linux-2.6.9-78.12.EL/drivers/usb/host/ohci-q.c	2004-10-18 14:55:24.000000000 -0700
+++ linux-2.6.9-78.12.EL-u1/drivers/usb/host/ohci-q.c	2008-11-20 19:26:24.000000000 -0800
@@ -170,9 +170,6 @@ static int ed_schedule (struct ohci_hcd 
 {	 
 	int	branch;
 
-	if (ohci->hcd.state == USB_STATE_QUIESCING)
-		return -EAGAIN;
-
 	ed->state = ED_OPER;
 	ed->ed_prev = NULL;
 	ed->ed_next = NULL;

But we don't know if this is what this bug is. Let's reproduce the
problem on the test kernel for bug 455843, then work from there.
Comment 5 CAI Qian 2008-11-21 02:01:33 EST
The new Kernel gave an oops.

hub 3-0:1.0: connect-debounce failed, port 1 disabled
hub 3-0:1.0: cannot disable port 1 (err = -19)
bus 3: replacing with dummies
bus 4: replacing with dummies
bus 5: replacing with dummies
bus 6: replacing with dummies
bus 2: replacing with dummies
hub 3-0:1.0: hub_port_status failed (err = -19)
hub 3-0:1.0: cannot reset port 1 (err = -19)
hub 3-0:1.0: cannot reset port 1 (err = -19)
hub 3-0:1.0: cannot reset port 1 (err = -19)
hub 3-0:1.0: cannot reset port 1 (err = -19)
hub 3-0:1.0: Cannot enable port 1.  Maybe the USB cable is bad?
hub 3-0:1.0: cannot disable port 1 (err = -19)
hub 3-0:1.0: couldn't allocate port 1 usb_device
hub 3-0:1.0: cannot disable port 1 (err = -19)
bus 3: replacing with dummies
bus 4: replacing with dummies
bus 5: replacing with dummies
bus 6: replacing with dummies
bus 2: replacing with dummies
hub 3-0:1.0: hub_port_status failed (err = -19)
hub 3-0:1.0: cannot reset port 1 (err = -19)
hub 3-0:1.0: cannot reset port 1 (err = -19)
hub 3-0:1.0: cannot reset port 1 (err = -19)
hub 3-0:1.0: cannot reset port 1 (err = -19)
hub 3-0:1.0: Cannot enable port 1.  Maybe the USB cable is bad?
hub 3-0:1.0: cannot disable port 1 (err = -19)
hub 3-0:1.0: couldn't allocate port 1 usb_device
hub 3-0:1.0: cannot disable port 1 (err = -19)
bus 3: replacing with dummies
bus 4: replacing with dummies
bus 5: replacing with dummies
bus 6: replacing with dummies
bus 2: replacing with dummies
Unable to handle kernel NULL pointer dereference at 00000000000001b8 RIP: 
<ffffffffa02b8568>{:ohci_hcd:ohci_hub_resume+23}
PML4 2fa21067 PGD 30378067 PMD 0 
Oops: 0000 [1] SMP 
CPU 1 
Modules linked in: ohci_hcd parport_pc lp parport md5 ipv6 autofs4 sunrpc ds yenta_socket pcmcia_core powernow_k8 cpufreq_powersave loop joydev button battery ac ehci_hcd k8_edac edac_mc snd_hda_intel snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd_page_alloc snd_hwdep snd soundcore tg3 floppy sr_mod dm_snapshot dm_zero dm_mirror ext3 jbd dm_mod ahci libata sd_mod scsi_mod
Pid: 7, comm: events/1 Not tainted 2.6.9-78.18.EL.bz455843.1smp
RIP: 0010:[<ffffffffa02b8568>] <ffffffffa02b8568>{:ohci_hcd:ohci_hub_resume+23}
RSP: 0018:0000010037b8de28  EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000010034a95800 RCX: 0000000000000297
RDX: 0000000000000000 RSI: 000000000000006c RDI: 0000010034a95800
RBP: 0000010034a95800 R08: 0000010037b8c000 R09: 0000000300000000
R10: 0000000300000000 R11: 00000000013ddb5c R12: 00000000ffffff8d
R13: 0000000000000000 R14: 0000010034a95800 R15: ffffffffa02b8af3
FS:  0000002a958a5b00(0000) GS:ffffffff8050cf80(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 00000000000001b8 CR3: 0000000037c28000 CR4: 00000000000006e0
Process events/1 (pid: 7, threadinfo 0000010037b8c000, task 0000010037b65030)
Stack: 0000000300000000 0000010034a95800 0000010034a95e70 0000010036fcdc30 
       0000010036fcdc00 ffffffffa02b8b2c 0000010034a95e68 0000010037cb9440 
       0000000000000297 ffffffff8014894e 
Call Trace:<ffffffffa02b8b2c>{:ohci_hcd:ohci_rh_resume+57} <ffffffff8014894e>{worker_thread+419} 
       <ffffffff80134837>{default_wake_function+0} <ffffffff80134888>{__wake_up_common+67} 
       <ffffffff80134837>{default_wake_function+0} <ffffffff801487ab>{worker_thread+0} 
       <ffffffff8014c6a7>{kthread+200} <ffffffff80110fd3>{child_rip+8} 
       <ffffffff8014c5df>{kthread+0} <ffffffff80110fcb>{child_rip+0} 
       

Code: 41 83 bd b8 01 00 00 00 0f 84 74 05 00 00 48 8b 05 43 6c 25 
RIP <ffffffffa02b8568>{:ohci_hcd:ohci_hub_resume+23} RSP <0000010037b8de28>
CR2: 00000000000001b8
 <0>Kernel panic - not syncing: Oops
Comment 6 CAI Qian 2008-11-21 02:51:48 EST
Even with the new Linux kernel, I can still reproduce the original problem sometimes.

bus 3: replacing with dummies
bus 4: replacing with dummies
bus 5: replacing with dummies
bus 6: replacing with dummies
bus 2: replacing with dummies
hub 3-0:1.0: hub_port_status failed (err = -19)
hub 3-0:1.0: cannot reset port 1 (err = -19)
hub 3-0:1.0: cannot reset port 1 (err = -19)
hub 3-0:1.0: cannot reset port 1 (err = -19)
hub 3-0:1.0: cannot reset port 1 (err = -19)
hub 3-0:1.0: Cannot enable port 1.  Maybe the USB cable is bad?
hub 3-0:1.0: cannot disable port 1 (err = -19)
hub 3-0:1.0: couldn't allocate port 1 usb_device
hub 3-0:1.0: cannot disable port 1 (err = -19)
bus 3: replacing with dummies
bus 4: replacing with dummies
bus 5: replacing with dummies
bus 6: replacing with dummies
bus 2: replacing with dummies
usb 3-1: control timeout on ep0in
Comment 7 CAI Qian 2008-11-21 04:36:29 EST
I have tried 6 times on this one. 3 times with this bug; 3 times with the oops.
Comment 9 Pete Zaitcev 2008-11-26 12:17:08 EST
I was able to reproduce the hang @hp-xw4550-01, although not the oops, sadly.
Working on the hang for now. My patchlet for ed_schedule does not fix it.

The key is khubd being stuck with a lost unlink callback:

khubd         D 0000000000001388     0    37      1            66     7 (L-TLB)
000001003770bbf8 0000000000000046 0000000000000246 0000010000000073
       0000000232a679c0 0000000000000000 0000010001195aa0 0000000000000000
       00000100376d87f0 000000000000031b
Call Trace:<ffffffff8029552d>{hcd_submit_urb+1463} <ffffffff80318708>{wait_for_completion+167}
       <ffffffff80134837>{default_wake_function+0} <ffffffff80134837>{default_wake_function+0}
       <ffffffff80296022>{usb_start_wait_urb+185} <ffffffff80295ef4>{timeout_kill+0}  
       <ffffffff802960d9>{usb_internal_control_msg+122} <ffffffff8011f63c>{flat_send_IPI_mask+0}
       <ffffffff80296168>{usb_control_msg+123} <ffffffff802968b3>{usb_get_descriptor+103}
       <ffffffff80296bc0>{usb_get_device_descriptor+74} <ffffffff8029337d>{hub_port_init+1193}
       <ffffffff80293e79>{hub_thread+1399} <ffffffff801361f8>{autoremove_wake_function+0}
       <ffffffff801361f8>{autoremove_wake_function+0} <ffffffff80110fd3>{child_rip+8}
       <ffffffff801db498>{selinux_d_instantiate+0} <ffffffff80293902>{hub_thread+0}
       <ffffffff80110fcb>{child_rip+0}

I think rmmod is secondary to the above.
Comment 10 Pete Zaitcev 2008-12-05 00:57:22 EST
Created attachment 325802 [details]
WIP - survives for a day

Looks like I found the oops problem: there's a window when
usb_disconnect(&hcd->self.root_hub) set the root_hub to NULL,
but the HC is still running and can produce an RD event.
In response we schedule the work, and it oopses with NULL root_hub.

However, the hang has become elusive now. Although maybe it was
caused by RD-initiated schedule poking at HC while we were stopping it.
This is work in progress.

The patch includes the whole of the fix for bug 455843, or else rmmod
loop will inevitably oops for other reasons.
Comment 11 Pete Zaitcev 2008-12-05 19:35:19 EST
I have a test kernel 2.6.9-78.18.EL.bz455843.4 which incorporates the
necessary fixes from bz455843, a fix for a hang in ed_schedule, a
fix for the crash due to NULL root_hub (like comment #5).

I am not exactly sure what has caused the hang in comment #6 (we know
that it's a missing unlink callback, but we don't know why it occured).
I am unable to reproduce it anymore on hp-xw4550-01.

Please test, same location:
 http://people.redhat.com/zaitcev/ftp/455843/
Comment 12 CAI Qian 2008-12-10 06:04:21 EST
The machine has survived without hang or panic after 8 hours.
Comment 13 RHEL Product and Program Management 2008-12-17 15:19:19 EST
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.
Comment 14 Jan Tluka 2009-01-06 11:19:11 EST
Reproducer available and HW in RHTS.
Comment 15 Vivek Goyal 2009-01-15 09:04:49 EST
Committed in 78.29.EL . RPMS are available at http://people.redhat.com/vgoyal/rhel4/
Comment 17 CAI Qian 2009-04-16 12:31:00 EDT
Pete, I have seen an Oops on another machine while testing if a RHEL4.7 EUS kernel,

...
hub 2-0:1.0: hub_port_status failed (err = -19)
hub 2-0:1.0: connect-debounce failed, port 1 disabled
hub 2-0:1.0: cannot disable port 1 (err = -19)
hub 2-0:1.0: hub_port_status failed (err = -19)
hub 2-0:1.0: hub_port_status failed (err = -19)
hub 2-0:1.0: connect-debounce failed, port 1 disabled
hub 2-0:1.0: cannot disable port 1 (err = -19)
hub 2-0:1.0: hub_port_status failed (err = -19)
hub 2-0:1.0: hub_port_status failed (err = -19)
hub 2-0:1.0: connect-debounce failed, port 2 disabled
hub 2-0:1.0: cannot disable port 2 (err = -19)
Unable to handle kernel NULL pointer dereference at virtual address 00000158
 printing eip:
f893e2cd
*pde = 00000000
Oops: 0000 [#1]
Modules linked in: ohci_hcd nfs nfsd exportfs lockd nfs_acl md5 ipv6 parport_pc lp parport autofs4 sunrpc cpufreq_powersave loop joydev button battery ac ehci_hcd k8_edac edac_mc tg3 sg dm_snapshot dm_zero dm_mirror ext3 jbd dm_mod sata_svw libata mptscsih mptsas mptspi mptscsi mptbase sd_mod scsi_mod
CPU:    0
EIP:    0060:[<f893e2cd>]    Not tainted VLI
EFLAGS: 00010246   (2.6.9-78.0.19.EL) 
EIP is at ohci_hub_resume+0x17/0x3db [ohci_hcd]
eax: 00000000   ebx: f3308d40   ecx: f2127024   edx: f7fef000
esi: f3308fe4   edi: ffffff8d   ebp: 00000000   esp: f7feff4c
ds: 007b   es: 007b   ss: 0068
Process events/0 (pid: 3, threadinfo=f7fef000 task=f7e50bd0)
Stack: f2127024 f3308fe4 f2127024 f2127000 f7e95180 f893e6c4 f3308f94 00000283 
       f3308fe4 c0138904 f893e691 ffffffff ffffffff 00000001 00000000 c012007b 
       00010000 00000000 c0325289 f7feffcc 00000046 00000000 f7e50bd0 c012007b 
Call Trace:
 [<f893e6c4>] ohci_rh_resume+0x33/0x46 [ohci_hcd]
 [<c0138904>] worker_thread+0x1eb/0x2f0
 [<f893e691>] ohci_rh_resume+0x0/0x46 [ohci_hcd]
 [<c012007b>] default_wake_function+0x0/0xc
 [<c0325289>] schedule+0x44d/0x606
 [<c012007b>] default_wake_function+0x0/0xc
 [<c0138719>] worker_thread+0x0/0x2f0
 [<c013d7e1>] kthread+0x69/0x91
 [<c013d778>] kthread+0x0/0x91
 [<c01041dd>] kernel_thread_helper+0x5/0xb
Code: 7e c7 c7 43 04 00 00 00 00 fb 89 e9 5a 89 c8 5b 5e 5f 5d c3 55 57 bf 8d ff ff ff 56 53 8d 98 5c fd ff ff 56 8b 68 28 89 c6 31 c0 <83> bd 58 01 00 00 00 0f 84 b1 03 00 00 a1 20 84 37 c0 3b 83 4c 
 <0>Fatal exception: panic in 5 seconds
Kernel panic - not syncing: Fatal exception

Do you think it worth trying to reproduce it for RHEL4.8 kernels?

More information,

https://rhts.redhat.com/cgi-bin/rhts/test_log.cgi?id=7702158

********** System Information **********
Hostname                = ibm-e326m.rhts.bos.redhat.com
Kernel Version          = 2.6.9-78.0.19.EL
Machine Hardware Name   = i686
Processor Type          = athlon
uname -a output         = Linux ibm-e326m.rhts.bos.redhat.com 2.6.9-78.0.19.EL #1 Mon Apr 6 07:39:02 EDT 2009 i686 athlon i386 GNU/Linux
Swap Size               = 4031 MB
Mem Size                = 2026 MB
Number of Processors    = 1
System Release          = Red Hat Enterprise Linux AS release 4 (Nahant Update 7)
Command Line            = ro root=/dev/VolGroup00/LogVol00 console=tty0 console=ttyS0,115200
System NMI Interrupts   = NMI:          0 
********** LSPCI **********
00:01.0 PCI bridge: Broadcom BCM5785 [HT1000] PCI/PCI-X Bridge
00:02.0 Host bridge: Broadcom BCM5785 [HT1000] Legacy South Bridge
00:02.1 IDE interface: Broadcom BCM5785 [HT1000] IDE
00:02.2 ISA bridge: Broadcom BCM5785 [HT1000] LPC
00:03.0 USB Controller: Broadcom BCM5785 [HT1000] USB (rev 01)
00:03.1 USB Controller: Broadcom BCM5785 [HT1000] USB (rev 01)
00:03.2 USB Controller: Broadcom BCM5785 [HT1000] USB (rev 01)
00:04.0 VGA compatible controller: ATI Technologies Inc ES1000 (rev 01)
00:06.0 PCI bridge: Broadcom BCM5780 [HT2000] PCI-X bridge (rev a3)
00:07.0 PCI bridge: Broadcom BCM5780 [HT2000] PCI-X bridge (rev a3)
00:08.0 PCI bridge: Broadcom BCM5780 [HT2000] PCI-Express Bridge (rev a3)
00:09.0 PCI bridge: Broadcom BCM5780 [HT2000] PCI-Express Bridge (rev a3)
00:0a.0 PCI bridge: Broadcom BCM5780 [HT2000] PCI-Express Bridge (rev a3)
00:0b.0 PCI bridge: Broadcom BCM5780 [HT2000] PCI-Express Bridge (rev a3)
00:18.0 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration
00:18.1 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map
00:18.2 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller
00:18.3 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control
01:0d.0 PCI bridge: Broadcom BCM5785 [HT1000] PCI/PCI-X Bridge (rev b2)
01:0e.0 IDE interface: Broadcom BCM5785 [HT1000] SATA (Native SATA Mode)
01:0e.1 IDE interface: Broadcom BCM5785 [HT1000] SATA (Native SATA Mode)
02:03.0 SCSI storage controller: LSI Logic / Symbios Logic 53c1030 PCI-X Fusion-MPT Dual Ultra320 SCSI (rev 08)
04:04.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5780 Gigabit Ethernet (rev 03)
04:04.1 Ethernet controller: Broadcom Corporation NetXtreme BCM5780 Gigabit Ethernet (rev 03)
********** Modprob **********
alias eth0 tg3
alias eth1 tg3
alias scsi_hostadapter mptbase
alias scsi_hostadapter1 mptscsi
alias scsi_hostadapter2 mptspi
alias scsi_hostadapter3 mptsas
alias scsi_hostadapter4 mptscsih
alias scsi_hostadapter5 sata_svw
alias usb-controller ehci-hcd
alias usb-controller1 ohci-hcd
Comment 18 CAI Qian 2009-04-23 00:22:53 EDT
FYI. I have not been able to this on the latest RHEL4.8 kernel -- 2.6.9-88.EL in the above RHTS machine by running the reproducer for 8 hours.
Comment 20 errata-xmlrpc 2009-05-18 15:34:48 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on therefore solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2009-1024.html

Note You need to log in before you can comment on or make changes to this bug.