Bug 199163 - System crashes with USB hard drive
Product: Fedora
Classification: Fedora
Component: kernel
Version: 5
Hardware: i586
OS: Linux
Assignee: Pete Zaitcev
QA Contact: Brian Brock
Reported: 2006-07-17 17:01 UTC by Chris Brand
Modified: 2008-03-12 06:11 UTC (History)
Last Closed: 2008-03-12 06:11:30 UTC

Attachments
/proc/bus/usb/devices from the system (1.50 KB, text/plain)
2006-07-25 20:57 UTC, Chris Brand

Description Chris Brand 2006-07-17 17:01:34 UTC
Description of problem:
I get infrequent (maybe once a week) crashes with a USB hard drive attached.
This is roughly what appears on the console (multiple times) when the machine dies:
filp_close+0x52/0x59 common_interrupt+0xfa/0x20
BUG: warning at kernel/panic.c:138/panic() (Not tainted)
panic+0x180/0x193 die+0x26c/0x2a0
do_invalid_op+0x0/0xab do_invalid_op+0xa2/0xab
uhci_unlink_isochronous_tds+0x37/0xe7 [uhci_hcd] complete+0x2b/0x3d
sg_complete+0x198/0x1a8 ide_dma_exec_cmd+0x1f/0x22
uhci_scan_schedule+0x55e/0x5f0 [uhci_hcd] error_code+0x4f/0x54
uhci_scan_schedule+0x55e/0x5f0 [uhci_hcd] uhci_unlink_isochronous_tds+0x37/0xe7
freed_request+0x1d/0x37 uhci_irq+0x134/0x14a
usb_hcd_irq+0x23/0x4f handle_IRQ_event+0x23/0x4c
__do_IRQ+0x78/0xd1 do_IRQ+0xb3/0x80

Version-Release number of selected component (if applicable):
This is a regularly updated FC5 machine. The kernel running at the time was, I
believe, 2.6.17-1.2145_FC5.

How reproducible:

Steps to Reproduce:
1. It's always happened at the weekend with the USB harddrive mounted.
Actual results:
Machine crashes and has to be rebooted.

Expected results:
Machine stays alive for months on end, happily using the USB harddrive.

Additional info:

Comment 1 Pete Zaitcev 2006-07-17 19:24:06 UTC
It would help if the console output were captured with netconsole or
serial console.

Comment 2 Chris Brand 2006-07-17 19:33:09 UTC
That would be much easier for me, too. Unfortunately, the machine is completely
dead at this point. I have two things I can do:
1. reboot it.
2. leave it as it is.

Comment 3 Pete Zaitcev 2006-07-17 21:36:48 UTC
Either netconsole or a serial console would catch preceding messages,
the precise stack trace of the first traceback if several happen, and the
precise kernel version - even if the box ends up completely dead.
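
As a rough illustration of the netconsole route: netconsole sends kernel log
lines as UDP datagrams to another machine, so a second box on the same network
only needs something listening on the target port. The sketch below is one
assumed way to do that in Python. The names open_listener and capture, and the
log file name in the usage note, are made up for this example, and port 6666 is
netconsole's default target port, which may differ on a given setup.

```python
import socket

# netconsole's default target port; adjust to match the sending machine.
NETCONSOLE_PORT = 6666


def open_listener(bind_addr="0.0.0.0", port=NETCONSOLE_PORT):
    """Create a UDP socket bound to the address netconsole will send to."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((bind_addr, port))
    return sock


def capture(sock, log, max_packets=None):
    """Append each received datagram to `log` (a binary file-like object).

    Flushes after every packet so the last messages before a crash are
    already stored on the receiving machine. Runs until interrupted unless
    max_packets is given. Returns the number of datagrams written.
    """
    received = 0
    while max_packets is None or received < max_packets:
        data, _sender = sock.recvfrom(65535)
        log.write(data)
        log.flush()
        received += 1
    return received
```

On the receiving machine this could be run as
`with open("netconsole.log", "ab") as log: capture(open_listener(), log)`.
The crashing machine still has to be configured to send to that address, as
described in the kernel's netconsole documentation.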

Comment 4 Chris Brand 2006-07-17 23:29:39 UTC
I'm quite happy to run any tools that might help diagnose this problem. I do
have other machines on the same network.

Is netconsole provided by the netdump and netdump-server packages?

I tried to run netdump on that machine, but it fails to find the netdump module.
Am I right in thinking that there is no kernel-kdump package for the most recent
2 kernels?

Comment 5 Chris Brand 2006-07-18 16:32:54 UTC
It crashed again last night. I don't know whether it's the same crash, of
course, but I do have different (earlier) information:
CPU : 0
EIP: 0060 [<ef83ac02>] Not tainted VLI
EFLAGS: 00010046 (2.6.17-1-FC5 #1)
EIP is at rtl8139_poll+0x40b/0x49a [8139too]
eax: c07ade00 ebx: eca0f0c0 ecx: eca0f000 edx: ef82003c
esi: 00000001 edi: 000005ee ebp: ef820037 esp: c075bf5c
ds: 007b es: 007b ss: 0068
Process gzip (pid: 11115, thread_info=c075b000 task=dda14550)
Stack: 000005ee 00000001 ef820037 c075bf7c eca0f0c0 ef82003c eca0f000 c07ade00
       c0769420 d1859000 0001e000 00000001 3b9aca00 c075bfd8 eca0f000 eca0f2c0
       ef820000 00000040 00000001 f4145284 ec1b0000 ef820000 000005ee eca0f000
Call Trace:
<c05ad8e4> net_rx_action+0x7d/0x151 <c0420703> __do_softirq+0x35/0x7f
<c040508a> do_softirq+0x38/0x42
<c0405047> do_IRQ+0x75/0x80 <c04036f2> common_interrupt+0x1a/0x20
Code: <lots of bytes that I'm going to omit, but can provide if you really need them>
EIP: [<ef83ac02>] rtl8139_poll+0x40b/0x49a [8139too] SS:ESP 0068:c075bf5c
<0> Kernel panic - not syncing: Fatal exception in interrupt

Comment 6 Pete Zaitcev 2006-07-18 19:53:49 UTC
This does not seem the same, unless it's a common issue with RAM or power supply.
Please do not retype, it's tiring and not reliable. If serial console is not
easy to set up, the best workaround is to capture the screen with a camera.

Comment 7 Pete Zaitcev 2006-07-25 20:37:51 UTC
Chris, one more thing, while you're working on real console captures...
I looked at the original trace and it seems inconceivable for a disk
enclosure to use ISO transfers. Something is fishy here, I'm missing
an important detail. Please attach the complete /proc/bus/usb/devices.
Maybe it has a hint.
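
One way to look for that hint: each endpoint in /proc/bus/usb/devices appears
on an "E:" line whose "Atr=" field carries the endpoint attributes, and the low
two bits of that value encode the transfer type (01 is isochronous). The
following sketch is an illustration, not something used in this report. It
lists endpoints of that type, and the function name
find_isochronous_endpoints is hypothetical.

```python
import re

# Matches endpoint lines such as: E:  Ad=81(I) Atr=03(Int.) MxPS=   2 Ivl=255ms
ENDPOINT_RE = re.compile(
    r"^E:\s+Ad=(?P<addr>[0-9a-fA-F]{2})\(.\)\s+Atr=(?P<atr>[0-9a-fA-F]{2})\("
)
# Matches the bus and device numbers on a topology ("T:") line.
TOPOLOGY_RE = re.compile(r"Bus=\s*(\d+).*Dev#=\s*(\d+)")

ISOCHRONOUS = 0x01  # transfer type in the low two bits of the attributes


def find_isochronous_endpoints(text):
    """Return [((bus, dev), product, endpoint_address), ...] for Isoc endpoints."""
    results = []
    bus_dev = None
    product = None
    for line in text.splitlines():
        if line.startswith("T:"):
            # A "T:" line starts the record for a new device.
            match = TOPOLOGY_RE.search(line)
            bus_dev = (int(match.group(1)), int(match.group(2))) if match else None
            product = None
        elif line.startswith("S:") and "Product=" in line:
            product = line.split("Product=", 1)[1].strip()
        elif line.startswith("E:"):
            match = ENDPOINT_RE.match(line)
            if match and (int(match.group("atr"), 16) & 0x03) == ISOCHRONOUS:
                results.append((bus_dev, product, match.group("addr")))
    return results
```

Calling `find_isochronous_endpoints(open("/proc/bus/usb/devices").read())` and
getting an empty list would mean no attached device advertises isochronous
endpoints.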

Comment 8 Chris Brand 2006-07-25 20:57:59 UTC
Created attachment 133023
/proc/bus/usb/devices from the system

Comment 9 Pete Zaitcev 2006-07-25 22:58:26 UTC
OK, thanks. I expected to find some other device, like a webcam. Those do
use ISOs, but disks do not.

I suppose the most important thing to do now is to capture the first oops
in the series somehow. Unfortunately, I do not remember the details of setting
up netconsole; I only use serial around here...

BTW, rtl8139 is a PCI-based device. You might want to file a separate bug
and get Jeff Garzik to look at it.

Comment 10 Chris Brand 2006-10-10 17:44:33 UTC
After setting up serial console, it didn't crash until last night. Here's the
output:
BUG: unable to handle kernel paging request at virtual address 01000049
 printing eip:
*pde = 00000000
Oops: 0002 [#1]
last sysfs file: /block/hda/hda1/size
Modules linked in: nfs nfsd exportfs lockd nfs_acl ipv6 autofs4 sunrpc sd_mod sg
dm_mirror dm_mod lp parport_pc parport usb_storage scsi_mod i2c_prosavage
i2c_algo_bit snd_via82xx gameport snd_ac97_codec snd_ac97_bus floppy
snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_pcm_oss snd_mixer_oss
snd_pcm uhci_hcd snd_timer snd_page_alloc snd_mpu401_uart snd_rawmidi 8139cp
i2c_viapro snd_seq_device 8139too i2c_core snd via_ircc soundcore mii irda
crc_ccitt ext3 jbd
CPU:    0
EIP:    0060:[<ef8cd5a8>]    Not tainted VLI
EFLAGS: 00010096   (2.6.17-1.2157_FC5 #1)
EIP is at uhci_unlink_isochronous_tds+0x3f/0xe7 [uhci_hcd]
eax: 01000049   ebx: 00000246   ecx: c17fef64   edx: ed6e03c0
esi: c17fef23   edi: c17feed0   ebp: c17fee00   esp: c075cf4c
ds: 007b   es: 007b   ss: 0068
Process gzip (pid: 9093, threadinfo=c075c000 task=e06ae550)
Stack: ef8ce05f ce3a6fbc c04c97f9 e06ae550 00000282 eedf6cd4 ce3a6fbc 00000000
       ed6e03c0 00000000 ed6e03dc 00000000 00000000 00000246 00000246 c17fef20
       c17feed0 c17fee00 ef8cf086 4542c30a 00002d79 e06ae550 ce3a6fbc c075cfa4
Call Trace:
 <ef8ce05f> uhci_scan_schedule+0x55e/0x5f0 [uhci_hcd]  <c04c97f9>
 <ef8cf086> uhci_irq+0x134/0x14a [uhci_hcd]  <c0576624> usb_hcd_irq+0x23/0x4f
 <c043db62> handle_IRQ_event+0x23/0x4c  <c043dc07> __do_IRQ+0x7c/0xd1
 <c0405035> do_IRQ+0x63/0x80
 <c04036f2> common_interrupt+0x1a/0x20
Code: 8b 43 24 83 f8 ff 75 40 8d 43 28 39 43 28 0f 84 9c 00 00 00 e8 b1 e7 b4 d0
89 44 24 10 c7 44 24 0c 49 fe 8c ef c7 44 24 08 73 00 <00> 00 c7 44 24 04 bb 00
8d ef c7 04 24 d5 00 8d ef e8 fd f3 b4
EIP: [<ef8cd5a8>] uhci_unlink_isochronous_tds+0x3f/0xe7 [uhci_hcd] SS:ESP
 <0>Kernel panic - not syncing: Fatal exception in interrupt
 BUG: warning at kernel/panic.c:137/panic() (Not tainted)
 <c041c0c4> panic+0x17b/0x18b  <c04042b1> die+0x26c/0x2a0
 <c06030db> do_page_fault+0x0/0x5ad  <c060351e> do_page_fault+0x443/0x5ad
 <c06030db> do_page_fault+0x0/0x5ad  <c04037df> error_code+0x4f/0x54
 <c057007b> ti12xx_tie_interrupts+0xd/0x66  <ef8cd5a8> uhci_unlink_isochronous_tds+0x3f/0xe7 [uhci_hcd]
 <ef8ce05f> uhci_scan_schedule+0x55e/0x5f0 [uhci_hcd]  <c04c97f9>
 <ef8cf086> uhci_irq+0x134/0x14a [uhci_hcd]  <c0576624> usb_hcd_irq+0x23/0x4f
 <c043db62> handle_IRQ_event+0x23/0x4c  <c043dc07> __do_IRQ+0x7c/0xd1
 <c0405035> do_IRQ+0x63/0x80
 <c04036f2> common_interrupt+0x1a/0x20
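
With a capture like this saved to a file, the two details asked for earlier
(the precise kernel version and the faulting location of the first oops) can be
pulled out mechanically. A small sketch, assuming the i386 oops layout shown
above; the function name summarize_first_oops is invented for this example.

```python
import re

# "EIP is at symbol+0xOFFSET/0xSIZE [module]"; the module part is optional.
EIP_RE = re.compile(
    r"EIP is at (?P<symbol>[\w.]+)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?:\s+\[(?P<module>\w+)\])?"
)
# "EFLAGS: 00010096   (2.6.17-1.2157_FC5 #1)"; the version is the first token
# inside the parentheses.
EFLAGS_RE = re.compile(r"EFLAGS:\s+[0-9a-f]+\s+\((?P<version>\S+)")


def summarize_first_oops(console_text):
    """Return kernel version and faulting location of the first oops, or None."""
    eip = EIP_RE.search(console_text)  # search() returns the earliest match
    if eip is None:
        return None
    flags = EFLAGS_RE.search(console_text)
    return {
        "kernel": flags.group("version") if flags else None,
        "symbol": eip.group("symbol"),
        "offset": int(eip.group("offset"), 16),
        "size": int(eip.group("size"), 16),
        "module": eip.group("module"),
    }
```

Running this over captures from separate crashes makes it easy to see whether
they fault in the same function and on the same kernel build.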

Comment 11 Dave Jones 2006-10-16 21:22:00 UTC
A new kernel update has been released (Version: 2.6.18-1.2200.fc5)
based upon a new upstream kernel release.

Please retest against this new kernel, as a large number of patches
go into each upstream release, possibly including changes that
may address this problem.

This bug has been placed in NEEDINFO state.
Due to the large volume of inactive bugs in bugzilla, if this bug is
still in this state in two weeks time, it will be closed.

Should this bug still be relevant after this period, the reporter
can reopen the bug at any time. Any other users on the Cc: list
of this bug can request that the bug be reopened by adding a
comment to the bug.

In the last few updates, some users upgrading from FC4->FC5
have reported that installing a kernel update has left their
systems unbootable. If you have been affected by this problem
please check you only have one version of device-mapper & lvm2
installed.  See bug 207474 for further details.

If this bug is a problem preventing you from installing the
release this version is filed against, please see bug 169613.

If this bug has been fixed, but you are now experiencing a different
problem, please file a separate bug for the new problem.

Thank you.

Comment 12 Chris Brand 2006-10-20 18:40:30 UTC
I have updated the kernel as specified.
I haven't seen a crash yet, but unfortunately it's not very consistent in its
appearance. If it hasn't recurred within the next month, it's probably safe to
say that it's fixed.

Comment 13 Chris Brand 2006-10-23 19:23:47 UTC
It crashed again with the latest kernel. No idea whether this is the same or not:
invalid opcode: 0000 [#1]
last sysfs file: /block/sda/sda1/size
Modules linked in: nls_utf8 cifs nfs fscache nfsd exportfs lockd nfs_acl autofs4
ipv6 sunrpc sd_mod sg hfsplus dm_mirror dm_mod lp parport_pc parport usb_storage
scsi_mod snd_via82xx gameport snd_ac97_codec snd_ac97_bus snd_seq_dummy
snd_seq_oss snd_seq_midi_event snd_seq snd_pcm_oss uhci_hcd snd_mixer_oss
snd_pcm floppy snd_timer serio_raw snd_page_alloc i2c_prosavage snd_mpu401_uart
i2c_algo_bit snd_rawmidi snd_seq_device i2c_viapro snd i2c_core via_ircc 8139cp
irda 8139too soundcore mii crc_ccitt ide_cd cdrom pcspkr ext3 jbd
CPU:    0
EIP:    0060:[<ef987c1c>]    Not tainted VLI
EFLAGS: 00010003   (2.6.18-1.2200.fc5 #1)
EIP is at uhci_scan_schedule+0x49e/0x787 [uhci_hcd]
eax: ffffffff   ebx: c8ac6404   ecx: c8ac6404   edx: c8ac63f0
esi: ed80f600   edi: ed80f600   ebp: eeb86cd0   esp: c075bf38
ds: 007b   es: 007b   ss: 0068
Process swapper (pid: 0, ti=c075b000 task=c065fc80 task.ti=c071d000)
Stack: 00000000 c042467b c04c95c5 0000000f eea2d080 c8ac63f4 c071df8c 0000000c
       00000000 ed80f618 006ddffd 00000082 c8ac63f0 cafaf3c0 c8ac6404 c065fc80
       00000000 c8ac63f0 00000246 eeb86d20 eeb86cd0 eeb86c00 ef9896f8 0001aebb
Call Trace:
 [<ef9896f8>] uhci_irq+0x129/0x13f [uhci_hcd]
 [<c056fa17>] usb_hcd_irq+0x23/0x4f
 [<c0440c82>] handle_IRQ_event+0x23/0x49
 [<c0440d2a>] __do_IRQ+0x82/0xde
 [<c040536d>] do_IRQ+0x9a/0xb8
Code: 14 8b 48 04 89 4b 04 89 19 89 40 04 89 42 14 00 30 ed 2d 00 00 00 00 69 7f
e0 ff 00 00 00 00 00 30 ed 2d 14 30 ed ed 14 30 ed ed <ff> ff ff ff 89 e8 e8 3e
f8 ff ff 3b 7c 24 38 75 d9 83 7c 24 20
EIP: [<ef987c1c>] uhci_scan_schedule+0x49e/0x787 [uhci_hcd] SS:ESP 0068:c075bf38
 <0>Kernel panic - not syncing: Fatal exception in interrupt
 BUG: warning at kernel/panic.c:137/panic() (Not tainted)
 [<c0403f10>] dump_trace+0x69/0x1af
 [<c040406e>] show_trace_log_lvl+0x18/0x2c
 [<c04045e9>] show_trace+0xf/0x11
 [<c0404673>] dump_stack+0x15/0x17
 [<c041bedf>] panic+0x17b/0x18c
 [<c040457b>] die+0x25c/0x290
 [<c0404c2a>] do_invalid_op+0xa2/0xab
 [<c04038a1>] error_code+0x39/0x40
DWARF2 unwinder stuck at error_code+0x39/0x40
Leftover inexact backtrace:
 [<ef987c1c>] uhci_scan_schedule+0x49e/0x787 [uhci_hcd]
 [<c042467b>] do_timer+0x7a0/0x8d8
 [<c04c95c5>] end_that_request_last+0x7c/0x8d
 [<ef9896f8>] uhci_irq+0x129/0x13f [uhci_hcd]
 [<c056fa17>] usb_hcd_irq+0x23/0x4f
 [<c0440c82>] handle_IRQ_event+0x23/0x49
 [<c0440d2a>] __do_IRQ+0x82/0xde
 [<c040536d>] do_IRQ+0x9a/0xb8
 [<c0411596>] apm_bios_call_simple+0x78/0xc3
 [<c04037ca>] common_interrupt+0x1a/0x20
 [<c0401bb6>] default_idle+0x31/0x59
 [<c0412427>] apm_cpu_idle+0x19e/0x1f4
 [<c0401c17>] cpu_idle+0x39/0x4e
 [<c071e74d>] start_kernel+0x2ff/0x303
 [<c071e24a>] unknown_bootoption+0x0/0x204

Comment 14 Pete Zaitcev 2007-08-28 03:35:05 UTC
Chris, how is it doing now? It was a unique and odd report, what with
the ISO processing when no ISO URBs should be present... So I suspected
hardware. I am wondering if the box has cooked itself by now. Also, FC-7
is available at this time.

Comment 15 petrosyan 2008-03-12 06:11:30 UTC
The information we've requested above is required in order
to review this problem report further and diagnose/fix the
issue if it is still present.  Since there have been no
updates to the report for thirty (30) days or more since we
requested additional information, we're assuming the problem
is either no longer present in the current Fedora release, or
that there is no longer any interest in tracking the problem.

Setting status to "INSUFFICIENT_DATA".  If you still
experience this problem after updating to our latest Fedora
release and can provide the information previously requested, 
please feel free to reopen the bug report.

Thank you in advance.
