139665 – External USB DVD-RW causes a kernel OOPS

Keywords:
Status:	CLOSED CANTFIX
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	kernel
Sub Component:
Version:	3
Hardware:	x86_64
OS:	Linux
Priority:	medium
Severity:	high
Target Milestone:	---
Assignee:	Dave Jones
QA Contact:
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2004-11-17 10:39 UTC by Srihari Vijayaraghavan
Modified:	2016-03-27 14:27 UTC
CC List:	3 users
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2005-10-03 01:18:27 UTC
Type:	---
Embargoed:
Dependent Products:

Description Srihari Vijayaraghavan 2004-11-17 10:39:27 UTC

From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.7.3)
Gecko/20040921

Description of problem:
Turning off an external USB DVD-RW drive causes a kernel OOPS.

Version-Release number of selected component (if applicable):
kernel-2.6.9-1.667

How reproducible:
Always

Steps to Reproduce:
1. Turn on (plug) the external DVD-RW.
2. Wait a few seconds (say, 30 seconds).
3. Turn off (unplug) the external DVD-RW


Actual Results:  FC3 kernel oopses.

Expected Results:  The FC3 kernel should not crash. (Neither FC2's
2.6.8-1.521 nor kernel.org's 2.6.10-rc2 on FC2 crashes.)

Additional info:

Here is the entire kernel log during this event:

##### Starts here #####
usb 1-2: new high speed USB device using address 2
Initializing USB Mass Storage driver...
scsi2 : SCSI emulation for USB Mass Storage devices
  Vendor: PIONEER   Model: DVD-RW  DVR-107D  Rev: 1.13
  Type:   CD-ROM                             ANSI SCSI revision: 02
USB Mass Storage device found at 2
usbcore: registered new driver usb-storage
USB Mass Storage support registered.
sr0: scsi3-mmc drive: 40x/40x writer cd/rw xa/form2 cdda tray
Attached scsi CD-ROM sr0 at scsi2, channel 0, id 0, lun 0
usb 1-2: USB disconnect, address 2
scsi: Device offlined - not ready after error recovery: host 2 channel 0 id 0 lun 0
sr 2:0:0:0: Illegal state transition cancel->offline
Badness in scsi_device_set_state at drivers/scsi/scsi_lib.c:1688

Call Trace:<ffffffffa0008716>{:scsi_mod:scsi_device_set_state+231}
       <ffffffffa0005c8b>{:scsi_mod:scsi_error_handler+3219}
       <ffffffff8011124f>{child_rip+8}
<ffffffffa0004ff8>{:scsi_mod:scsi_error_handler+0}
       <ffffffff80111247>{child_rip+0}
Unable to handle kernel NULL pointer dereference at 0000000000000010 RIP:
<ffffffff8028aa42>{cfq_insert_request+96}
PML4 35b7d067 PGD 359da067 PMD 36010067 PTE 0
Oops: 0000 [1]
CPU 0
Modules linked in: sr_mod usb_storage radeon parport_pc lp parport
autofs4 i2c_dev i2c_core sunrpc ds yenta_socket pcmcia_core ipt_REJECT
ipt_state ip_conntrack iptable_filter ip_tables dm_mod button battery
ac md5 ipv6 ohci1394 ieee1394 uhci_hcd ehci_hcd snd_via82xx
snd_ac97_codec snd_pcm_oss snd_mixer_oss snd_pcm snd_timer
snd_page_alloc gameport snd_mpu401_uart snd_rawmidi snd_seq_device snd
soundcore via_rhine mii r8169 floppy ext3 jbd sata_via libata sd_mod
scsi_mod
Pid: 3656, comm: scsi_eh_2 Not tainted 2.6.9-1.667
RIP: 0010:[<ffffffff8028aa42>] <ffffffff8028aa42>{cfq_insert_request+96}
RSP: 0018:000001002d16bdc8  EFLAGS: 00010046
RAX: 0000000000000002 RBX: 000001003e25f030 RCX: 0000000000000000
RDX: 0000000000000001 RSI: 0000010030828cd8 RDI: 000001003e25f030
RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000
R10: 0000ffff80443420 R11: 0000000000000001 R12: 0000010030828cd8
R13: 0000000000000000 R14: 000001003e25f030 R15: 000001002d16bea8
FS:  0000002a9589db00(0000) GS:ffffffff80503480(0000)
knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000010 CR3: 0000000000101000 CR4: 00000000000006e0
Process scsi_eh_2 (pid: 3656, threadinfo 000001002d16a000, task
000001002d10eae0)
Stack: 000001003e25f030 0000000000000001 0000010030828cd8 0000000000000001
       000001002d16be98 ffffffff80280300 0000000000000001 000001003e25f030
       0000010030828cd8 ffffffff80282a6f
Call Trace:<ffffffff80280300>{__elv_add_request+65}
<ffffffff80282a6f>{blk_insert_request+221}
       <ffffffffa000676d>{:scsi_mod:scsi_queue_insert+162}
       <ffffffffa0005d30>{:scsi_mod:scsi_error_handler+3384}
       <ffffffff8011124f>{child_rip+8}
<ffffffffa0004ff8>{:scsi_mod:scsi_error_handler+0}
       <ffffffff80111247>{child_rip+0}

Code: 48 8b 45 10 48 8b 10 48 89 72 08 48 89 16 48 89 46 08 48 89
RIP <ffffffff8028aa42>{cfq_insert_request+96} RSP <000001002d16bdc8>
CR2: 0000000000000010
##### Ends Here #####

I have the kernel log messages from FC2, which I can provide on request.
Also, as I mentioned above, turning this device on and off does not
crash FC2's kernel 2.6.8-1.521 or 2.6.10-rc2 on FC2.

I have not figured out a .config for vanilla 2.6.10-rc2 on FC3 that
would boot; perhaps when I do, I shall verify if I see the same crash
there as well.

Thank you.
Hari.

PS: Once this bug is fixed, then I should verify if this device works
on cdrecord, dvdrecord, k3b etc..
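
The first oops can be cross-checked by hand. The leading bytes of the Code: line, 48 8b 45 10, are the x86_64 encoding of `mov 0x10(%rbp),%rax`, and with RBP = 0 that load touches address 0x10, which matches CR2. The following is a minimal Python sketch of that arithmetic. It assumes the Code: line starts at the faulting instruction (the register values are consistent with that), and the helper name `decode_mov_rbp_disp8` is made up for illustration.

```python
# Values copied from the first oops in this report (2.6.9-1.667, x86_64).
CODE_BYTES = bytes.fromhex("488b4510")  # first four bytes of the "Code:" line
RBP = 0x0000000000000000                # RBP from the register dump
CR2 = 0x0000000000000010                # faulting address reported in the oops


def decode_mov_rbp_disp8(code: bytes) -> int:
    """Return the displacement of `mov disp8(%rbp), %rax`.

    Handles only this one encoding: REX.W prefix (0x48), opcode 0x8b,
    ModRM 0x45 (mod=01, reg=rax, rm=rbp), then a signed 8-bit displacement.
    """
    if code[:3] != bytes([0x48, 0x8B, 0x45]):
        raise ValueError("not the mov disp8(%rbp), %rax encoding")
    return int.from_bytes(code[3:4], "little", signed=True)


disp = decode_mov_rbp_disp8(CODE_BYTES)
effective_address = (RBP + disp) & 0xFFFFFFFFFFFFFFFF

if __name__ == "__main__":
    print(f"displacement      = {disp:#x}")
    print(f"effective address = {effective_address:#x}")
    print(f"matches CR2       = {effective_address == CR2}")
```

If that assumption holds, a NULL pointer held in RBP is being dereferenced at offset 0x10 inside cfq_insert_request. Which structure that pointer was meant to reference cannot be told from the log alone.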

Comment 1 John Rosauer 2004-11-18 23:32:50 UTC

Same thing happens to me with an external USB CD writer.

Comment 2 Srihari Vijayaraghavan 2004-11-19 09:57:37 UTC

With vanilla 2.6.10-rc2 on FC3, the kernel does not oops when this
device is turned on and off. I also did a test DVD burn (the FC3 DVD
image itself), and it works great too.

Thank you.
Hari.

Comment 3 Srihari Vijayaraghavan 2004-11-19 12:11:27 UTC

I think I spoke a bit too soon. Although kernel.org's 2.6.10-rc2 did
not crash during on/off, burning a DVD etc., it did crash once, which
I am unable to reproduce despite my sincere efforts. Here is that oops
message:

##### Starts Here #####
usb 1-2: USB disconnect, address 4
 target4:0:0: Illegal state transition <NULL>->cancel
Badness in scsi_device_set_state at drivers/scsi/scsi_lib.c:1717

Call Trace:<ffffffffa0006828>{:scsi_mod:scsi_device_set_state+264}
       <ffffffffa00010d9>{:scsi_mod:scsi_device_cancel+41}
       <ffffffff8018ae27>{simple_rmdir+55}
<ffffffffa00011f0>{:scsi_mod:scsi_device_cancel_cb+0}
       <ffffffff80214aa1>{device_for_each_child+81}
<ffffffffa000122f>{:scsi_mod:scsi_host_cancel+47}
       <ffffffff80214a09>{device_del+105}
<ffffffffa0008690>{:scsi_mod:scsi_remove_device+160}
       <ffffffffa00012f3>{:scsi_mod:scsi_remove_host+19}
<ffffffffa024da84>{:usb_storage:storage_disconnect+116}
       <ffffffff80244c72>{usb_unbind_interface+82}
<ffffffff802158c7>{device_release_driver+119}
       <ffffffff80215ab9>{bus_remove_device+153}
<ffffffff802149f8>{device_del+88}
       <ffffffff8024b71b>{usb_disable_device+123}
<ffffffff80246d05>{usb_disconnect+197}
       <ffffffff802480c7>{hub_thread+759}
<ffffffff80144100>{autoremove_wake_function+0}
       <ffffffff80144100>{autoremove_wake_function+0}
<ffffffff80133563>{do_exit+2819}
       <ffffffff8010ebe3>{child_rip+8} <ffffffff80247dd0>{hub_thread+0}
       <ffffffff8010ebdb>{child_rip+0}
Unable to handle kernel NULL pointer dereference at 0000000000000d68 RIP:
<ffffffffa00010e7>{:scsi_mod:scsi_device_cancel+55}
PML4 30aef067 PGD 2c7a8067 PMD 0
Oops: 0000 [1]
CPU 0
Modules linked in: reiserfs sr_mod usb_storage radeon ipt_LOG
ipt_limit ipt_MASQUERADE ipt_multiport ipt_conntrack ip_nat_ftp
ip_conntrack_ftp iptable_nat nfsd exportfs lockd autofs4 sunrpc
ipt_REJECT ipt_state ip_conntrack iptable_filter ip_tables dm_mod
video button ohci1394 ieee1394 uhci_hcd ehci_hcd snd_via82xx
snd_ac97_codec snd_pcm_oss snd_mixer_oss snd_pcm snd_timer
snd_page_alloc gameport snd_mpu401_uart snd_rawmidi snd_seq_device snd
soundcore via_rhine mii r8169 floppy ext3 mbcache jbd sata_via libata
sd_mod scsi_mod
Pid: 102, comm: khubd Not tainted 2.6.10-rc2
RIP: 0010:[<ffffffffa00010e7>]
<ffffffffa00010e7>{:scsi_mod:scsi_device_cancel+55}
RSP: 0018:000001003fd6fc58  EFLAGS: 00010016
RAX: 00000000ffffffea RBX: 000001002ea3d228 RCX: 0000000000020000
RDX: 0000000000000d68 RSI: 00000000000106e6 RDI: ffffffff803260a0
RBP: 0000000000000d48 R08: 00000000fffffffa R09: 0000000000000000
R10: 00000000ffffffff R11: 0000000000000000 R12: 000001003fd6fc68
R13: 0000000000000000 R14: 000001003fd6fce4 R15: 000001003f5a7c00
FS:  0000002a95d90020(0000) GS:ffffffff803d5c80(0000)
knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000d68 CR3: 0000000000101000 CR4: 00000000000006e0
Process khubd (pid: 102, threadinfo 000001003fd6e000, task
000001003fed2070)
Stack: 00000100342b7638 0000000000000216 000001003fd6fc68 000001003fd6fc68
       0000010030fb0940 0000010039577158 000001002ea3d408 0000000000000000
       ffffffffa00011f0 ffffffff80214aa1
Call Trace:<ffffffffa00011f0>{:scsi_mod:scsi_device_cancel_cb+0}
       <ffffffff80214aa1>{device_for_each_child+81}
<ffffffffa000122f>{:scsi_mod:scsi_host_cancel+47}
       <ffffffff80214a09>{device_del+105}
<ffffffffa0008690>{:scsi_mod:scsi_remove_device+160}
       <ffffffffa00012f3>{:scsi_mod:scsi_remove_host+19}
<ffffffffa024da84>{:usb_storage:storage_disconnect+116}
       <ffffffff80244c72>{usb_unbind_interface+82}
<ffffffff802158c7>{device_release_driver+119}
       <ffffffff80215ab9>{bus_remove_device+153}
<ffffffff802149f8>{device_del+88}
       <ffffffff8024b71b>{usb_disable_device+123}
<ffffffff80246d05>{usb_disconnect+197}
       <ffffffff802480c7>{hub_thread+759}
<ffffffff80144100>{autoremove_wake_function+0}
       <ffffffff80144100>{autoremove_wake_function+0}
<ffffffff80133563>{do_exit+2819}
       <ffffffff8010ebe3>{child_rip+8} <ffffffff80247dd0>{hub_thread+0}
       <ffffffff8010ebdb>{child_rip+0}

Code: 48 8b 45 20 0f 18 08 48 83 c3 38 48 39 da 74 4a 48 8b 85 10
RIP <ffffffffa00010e7>{:scsi_mod:scsi_device_cancel+55} RSP
<000001003fd6fc58>
CR2: 0000000000000d68
##### Ends Here #####

It closely resembles the oops from FC3's kernel, I think. Should I
escalate that to LKML?

Would it be unfair of me to expect FC guys to look at FC3 kernel's
issue when kernel.org's kernel exhibits the same oops, albeit under
different circumstances (which I do not completely understand yet, as
it is not as easy to trigger as it is in FC3) ?

Thank you.
Hari.

PS: It seems another gentleman has already reported this (or very
similar) problem to LKML today:
http://marc.theaimsgroup.com/?l=linux-kernel&m=110081002103288&w=2
I think my oops message looks very similar.
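
One way to compare the two x86_64 logs above is to reduce each to a short signature: the illegal state transition, the faulting address, the faulting function, the thread, and the kernel version. The following Python sketch does that with regular expressions. The function name `oops_signature` and the field names are made up for illustration, and the patterns are only assumed to fit the log format shown in this report.

```python
import re

# Short excerpts copied from the two x86_64 logs in this report.
FC3_LOG = """\
sr 2:0:0:0: Illegal state transition cancel->offline
Unable to handle kernel NULL pointer dereference at 0000000000000010 RIP:
<ffffffff8028aa42>{cfq_insert_request+96}
Pid: 3656, comm: scsi_eh_2 Not tainted 2.6.9-1.667
"""

VANILLA_LOG = """\
 target4:0:0: Illegal state transition <NULL>->cancel
Unable to handle kernel NULL pointer dereference at 0000000000000d68 RIP:
<ffffffffa00010e7>{:scsi_mod:scsi_device_cancel+55}
Pid: 102, comm: khubd Not tainted 2.6.10-rc2
"""


def _first(pattern, text):
    """Return the first capture group of the first match, or None."""
    match = re.search(pattern, text)
    return match.group(1) if match else None


def oops_signature(log):
    """Reduce a kernel log excerpt to the fields useful for comparing reports."""
    return {
        "transition": _first(r"Illegal state transition (\S+->\S+)", log),
        "fault_addr": _first(r"NULL pointer dereference at ([0-9a-f]+)", log),
        # The symbol is printed as {sym+off} or {:module:sym+off} after "RIP:".
        "function": _first(r"RIP:\s*<[0-9a-f]+>\{(?::\w+:)?(\w+)\+\d+\}", log),
        "thread": _first(r"comm: (\S+)", log),
        "kernel": _first(r"Not tainted (\S+)", log),
    }


if __name__ == "__main__":
    for name, log in (("FC3", FC3_LOG), ("vanilla", VANILLA_LOG)):
        print(name, oops_signature(log))
```

Run on these excerpts, both logs show an illegal state transition warning, but the NULL dereference happens in a different function and thread: cfq_insert_request in scsi_eh_2 on the FC3 kernel, scsi_device_cancel in khubd on 2.6.10-rc2.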

Comment 4 Emily Brantley 2004-11-19 14:05:27 UTC

Just for your information, the bug persists in 2.6.9-1.678_FC3
when I remove my own CD-RW. The following is my output from dmesg:


usb 2-1.3: USB disconnect, address 4
scsi: Device offlined - not ready after error recovery: host 0 channel 0 id 0 lun 0
sr 0:0:0:0: Illegal state transition cancel->offline
Badness in scsi_device_set_state at drivers/scsi/scsi_lib.c:1688
 [<161fb645>] scsi_device_set_state+0xc8/0xd3 [scsi_mod]
 [<161f8b8b>] scsi_eh_offline_sdevs+0x49/0x5e [scsi_mod]
 [<161f9146>] scsi_unjam_host+0x22d/0x23e [scsi_mod]
 [<161f9291>] scsi_error_handler+0x13a/0x191 [scsi_mod]
 [<0211b3d5>] schedule_tail+0xc/0x37
 [<161f9157>] scsi_error_handler+0x0/0x191 [scsi_mod]
 [<021041d9>] kernel_thread_helper+0x5/0xb
Unable to handle kernel NULL pointer dereference at virtual address
00000008
 printing eip:
02250207
*pde = 00000000
Oops: 0000 [#1]
Modules linked in: nls_utf8 vfat fat i915 md5 ipv6 parport_pc lp
parport i8k ipt_REJECT ipt_state ip_conntrack iptable_filter ip_tables
dm_mod sd_mod sr_mod usb_storage scsi_mod button battery ac joydev
yenta_socket uhci_hcd hw_random snd_intel8x0m snd_intel8x0
snd_ac97_codec snd_pcm_oss snd_mixer_oss snd_pcm snd_timer
snd_page_alloc gameport snd_mpu401_uart snd_rawmidi snd_seq_device snd
soundcore orinoco_cs ds pcmcia_core orinoco hermes 3c59x floppy ext3 jbd
CPU:    0
EIP:    0060:[<02250207>]    Not tainted VLI
EFLAGS: 00010046   (2.6.9-1.678_FC3)
EIP is at cfq_insert_request+0x45/0xdf
eax: 15569290   ebx: 1304f6b0   ecx: 00000001   edx: 1304f6b0
esi: 00000001   edi: 00000000   ebp: 00000000   esp: 12735efc
ds: 007b   es: 007b   ss: 0068
Process scsi_eh_0 (pid: 1890, threadinfo=12735000 task=120c81f0)
Stack: 15569290 15569290 00000001 1304f6b0 00000202 022469e3 15569290
00000001
       1304f6b0 022469a5 00000000 02248b52 12465c40 13247000 12770000
00001057
       161f9576 12465c40 00000001 12465c40 12735f74 12735f74 12735f7c
161f8ec8
Call Trace:
 [<022469e3>] __elv_add_request+0x3c/0x71
 [<022469a5>] elv_requeue_request+0x29/0x2b
 [<02248b52>] blk_insert_request+0xba/0x18b
 [<161f9576>] scsi_queue_insert+0x84/0x8d [scsi_mod]
 [<161f8ec8>] scsi_eh_flush_done_q+0x7d/0xce [scsi_mod]
 [<161f914f>] scsi_unjam_host+0x236/0x23e [scsi_mod]
 [<161f9291>] scsi_error_handler+0x13a/0x191 [scsi_mod]
 [<0211b3d5>] schedule_tail+0xc/0x37
 [<161f9157>] scsi_error_handler+0x0/0x191 [scsi_mod]
 [<021041d9>] kernel_thread_helper+0x5/0xb
Code: 74 29 eb 51 83 f9 03 74 33 eb 4a 8b 04 24 89 fa e8 f8 fa ff ff
85 c0 75 f2 8b 47 08 8b 50 04 89 03 89 58 04 89 1a 89 53 04 eb 3f <8b>
47 08 8b 10 89 5a 04 89 13 89 43 04 89 18 eb 2e f6 42 08 10

Comment 5 Emily Brantley 2004-11-19 14:09:53 UTC

i should add that
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=138755 seems to
be a duplicate of this one... ?

Comment 6 Emily Brantley 2004-11-19 18:24:55 UTC

One more thing: I can't seem to reproduce this with 2.6.10-rc2-mm1, but
I haven't tried any of Linus's kernels yet. I don't know whether the bug
was in vanilla 2.6.9 and, if so, whether the patch that fixes it is in
Linus's or Morton's tree.

Comment 7 John Rosauer 2004-11-23 00:52:43 UTC

just tried kernel-2.6.9-1.681_FC3 and it does the same thing

Comment 8 petrosyan 2004-12-22 06:49:14 UTC

this bug has been fixed in kernel-2.6.9-1.715_FC3
you can grab it from
http://download.fedora.redhat.com/pub/fedora/linux/core/updates/testing/3/i386/

Comment 9 vvs 2004-12-22 09:21:53 UTC

Unfortunately, kernel-2.6.9-1.715_FC3 is too buggy. I get oopses and system
freezes when I try to exit from the X server. And kernel-2.6.9-1.1047_FC4
(which has this bug fixed according to its changelog) is even worse: it
crashed on boot.

Comment 10 Srihari Vijayaraghavan 2004-12-22 09:42:22 UTC

Kernel-2.6.9-1.715_FC3 does fix this problem, but unfortunately it has
introduced a new one, processes stuck in D state:
[root@desktop ~]# ps -eo state,pid,cmd,wchan|egrep '^[D]' 
D    29 [khubd]          scsi_wait_req 
D  2582 hald             usb_device_read 
D  6774 [scsi_eh_16]     - 
 
Thank you. 
Hari 
PS: While I was turning the external USB DVD-RW on and off to reproduce the
kernel bug, I came across D state processes involving USB/SCSI.
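
The same check can be scripted so that it can be rerun after each plug/unplug cycle. The following is a minimal Python sketch that filters the text output of `ps -eo state,pid,cmd,wchan` for processes in state D. The function name `uninterruptible` is made up for illustration, and the parser assumes the wait channel is the last whitespace-separated column, as in the listing above.

```python
# Sample copied from the ps listing in this comment.
SAMPLE = """\
D    29 [khubd]          scsi_wait_req
D  2582 hald             usb_device_read
D  6774 [scsi_eh_16]     -
"""


def uninterruptible(ps_output):
    """Return (pid, command, wchan) for every line whose state column is D.

    Expects the text printed by `ps -eo state,pid,cmd,wchan`. Lines that do
    not have at least four columns, or whose state is not exactly "D", are
    skipped (this also skips the header line).
    """
    rows = []
    for line in ps_output.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[0] != "D":
            continue
        pid = int(fields[1])
        wchan = fields[-1]               # assumed to be the last column
        cmd = " ".join(fields[2:-1])     # command may contain spaces
        rows.append((pid, cmd, wchan))
    return rows


if __name__ == "__main__":
    for pid, cmd, wchan in uninterruptible(SAMPLE):
        print(f"{pid:>6}  {cmd:<16} {wchan}")
```

On a live system the input could come from `subprocess.run(["ps", "-eo", "state,pid,cmd,wchan"], capture_output=True, text=True).stdout`. A process that stays in this list across several runs is the one worth reporting.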

Comment 11 vvs 2004-12-23 12:26:21 UTC

It seems that in kernel-2.6.9-1.1049_FC4 these bugs and the agpgart bugs have
been fixed for good! You can get it from http://cvs.fedora.redhat.com/ or wait
for it to appear in rawhide.

Comment 12 Dave Jones 2005-07-15 20:42:57 UTC

An update has been released for Fedora Core 3 (kernel-2.6.12-1.1372_FC3) which
may contain a fix for your problem.   Please update to this new kernel, and
report whether or not it fixes your problem.

If you have updated to Fedora Core 4 since this bug was opened, and the problem
still occurs with the latest updates for that release, please change the version
field of this bug to 'fc4'.

Thank you.

Comment 13 Dave Jones 2005-10-03 01:18:27 UTC

This bug has been automatically closed as part of a mass update.
It had been in NEEDINFO state since July 2005.
If this bug still exists in current errata kernels, please reopen this bug.

There are a large number of inactive bugs in the database, and this is the only
way to purge them.

Thank you.
