Bug 134453 - Processes hang in uninterruptible state when trying to access CD/DVD drive
Status: CLOSED CANTFIX
Product: Fedora
Classification: Fedora
Component: kernel
Version: 4
Hardware: i686 Linux
Priority: medium  Severity: high
Assigned To: Dave Jones
QA Contact: Brian Brock
 
Reported: 2004-10-02 18:13 EDT by Matt Dainty
Modified: 2015-01-04 17:10 EST
Doc Type: Bug Fix
Last Closed: 2005-12-27 21:10:25 EST

Attachments
Kernel boot log (20.92 KB, text/plain)
2004-10-02 18:15 EDT, Matt Dainty
Output of "lspci -v" (4.98 KB, text/plain)
2004-10-02 18:16 EDT, Matt Dainty

Description Matt Dainty 2004-10-02 18:13:38 EDT
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.3)
Gecko/20040922

Description of problem:
I've noticed processes hang in the uninterruptible state 'D' when
trying to access my CD/DVD drive, which is a Pioneer DVD-120S on the
secondary channel of the AMD IDE controller on a Gigabyte GA-7DPXDW-P
(which also has a Promise controller, for reference).

The first process to hang is nearly always magicdev, as it polls the
various removable media devices periodically.

Once one process gets stuck, any further commands that try to access
the same device also get stuck. While a reboot will clear this out
every time, the processes are unkillable, so I can never unmount my
/usr and /home partitions cleanly, nor even exit my X session cleanly,
as I suspect it's waiting for magicdev to exit.

The only output I get from the kernel is usually the following:

hdc: request sense failure: status=0x51 { DriveReady SeekComplete Error }
hdc: request sense failure: error=0x04Aborted Command

And I'm fairly sure it's consistently that exact output every time,
which I'm guessing coincides with the first time magicdev polls the drive.
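
For anyone reading these messages, the flag names in braces are just a decoding of the status byte. Here is a minimal Python sketch of that mapping, assuming the standard ATA status register bit layout. The helper name is made up for illustration, and reporting only Busy when the busy bit is set is an assumption about how the driver prints the register.

```python
# Bit names follow the flag names printed in the kernel messages;
# the bit positions assume the standard ATA status register layout.
STATUS_BITS = [
    (0x40, "DriveReady"),
    (0x20, "DeviceFault"),
    (0x10, "SeekComplete"),
    (0x08, "DataRequest"),
    (0x04, "CorrectedError"),
    (0x02, "Index"),
    (0x01, "Error"),
]
BUSY = 0x80


def decode_ata_status(status):
    """Return the flag names set in an ATA status byte (hypothetical helper)."""
    # While Busy is set the other bits are not meaningful, so report only Busy.
    if status & BUSY:
        return ["Busy"]
    return [name for mask, name in STATUS_BITS if status & mask]


if __name__ == "__main__":
    value = 0x51  # the status value from the message above
    print(f"status=0x{value:02x} {{ {' '.join(decode_ata_status(value))} }}")
```

So 0x51 is bits 6, 4 and 0: the drive is ready, but the last command ended with the error bit set.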

From a bit of searching, I got the output of "ps -e -o
pid,comm,state,wchan=WCHAN-WIDE-COLUMN" and it shows the following output:

 3810 magicdev         D ide_do_drive_cmd
...
 4308 hdparm           D -

(the hdparm being the command I tried running afterwards, and it also
hangs)
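
The same information can be gathered without ps by reading /proc directly. This is only a sketch, assuming a Linux /proc with per-process stat and wchan files; the function names are hypothetical.

```python
import os


def parse_stat(text):
    """Split the contents of /proc/<pid>/stat into (pid, comm, state)."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so locate the last ')' rather than splitting on whitespace.
    open_paren = text.index("(")
    close_paren = text.rindex(")")
    pid = int(text[:open_paren])
    comm = text[open_paren + 1:close_paren]
    state = text[close_paren + 1:].split()[0]
    return pid, comm, state


def uninterruptible_processes(proc_root="/proc"):
    """Yield (pid, comm, wchan) for every process currently in state 'D'."""
    if not os.path.isdir(proc_root):
        return
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited or stat was unreadable
        if state != "D":
            continue
        try:
            with open(os.path.join(proc_root, entry, "wchan")) as f:
                wchan = f.read().strip() or "-"
        except OSError:
            wchan = "-"  # wchan missing or not readable
        yield pid, comm, wchan


if __name__ == "__main__":
    for pid, comm, wchan in uninterruptible_processes():
        print(f"{pid:5d} {comm:<16} D {wchan}")
```

Reading stat and wchan does not touch the drive, so this should not hang the way hdparm did.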

I can temporarily stop this happening by keeping some media inserted
in the drive, but as soon as I eject it and wait, the same thing
happens eventually, usually preceded by this message once or twice:

ATAPI device hdc:
  Error: Not ready -- (Sense key=0x02)
  (vendor-specific error) -- (asc=0x90, ascq=0x3e)
  The failed "<NULL>" packet command was:
  "ff ff ff 7f 9c 45 24 02 94 3e 82 38 00 6c 2d 02 "

which doesn't cause any hangs or problems running something against
the drive device, but after a while it will be followed by the first
error mentioned above and the hanging occurs again.
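
To show how the "Not ready" and "vendor-specific" labels in that message follow from the numbers, here is a small Python sketch. It assumes the standard SCSI sense key table and the convention that additional sense codes 0x80 to 0xff are reserved for vendor use. The function name and output format are illustrative only, not what the kernel does internally.

```python
# Standard SCSI sense keys; values not listed are treated as reserved here.
SENSE_KEYS = {
    0x0: "No sense",
    0x1: "Recovered error",
    0x2: "Not ready",
    0x3: "Medium error",
    0x4: "Hardware error",
    0x5: "Illegal request",
    0x6: "Unit attention",
    0x7: "Data protect",
    0x8: "Blank check",
    0x9: "Vendor specific",
    0xA: "Copy aborted",
    0xB: "Aborted command",
    0xD: "Volume overflow",
    0xE: "Miscompare",
}


def describe_sense(key, asc, ascq):
    """Summarise a sense key / ASC / ASCQ triple (hypothetical helper)."""
    name = SENSE_KEYS.get(key & 0x0F, "Reserved")
    # ASC values from 0x80 upwards are set aside for vendor-specific use.
    kind = "vendor-specific" if asc >= 0x80 else "standard"
    return f"{name} ({kind}, asc=0x{asc:02x}, ascq=0x{ascq:02x})"


if __name__ == "__main__":
    # Values taken from the message above.
    print(describe_sense(0x02, 0x90, 0x3E))
```

With asc=0x90 in the vendor range, only the drive manufacturer's documentation could say what the code actually means.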

It seems to be localised to the CD/DVD drive: I have an IDE Zip
drive on the primary channel that remains working throughout all
of this, and all other I/O is on SCSI and also remains unaffected.

The only other datapoint I have is that when the machine had FC1
installed (I only upgraded recently), I don't remember ever
experiencing the problem.

I'll attach the kernel boot logs and also the output of "lspci -v" in
case that might be useful.

Version-Release number of selected component (if applicable):
kernel-2.6.8-1.521smp

How reproducible:
Always

Steps to Reproduce:
1. Boot with no media in the CD/DVD drive
2. Time passes, Gandalf
3. ps output shows magicdev stuck in 'D' state inside ide_do_drive_cmd
and also some other processes stuck just in 'D' state
Comment 1 Matt Dainty 2004-10-02 18:15:54 EDT
Created attachment 104669
Kernel boot log
Comment 2 Matt Dainty 2004-10-02 18:16:28 EDT
Created attachment 104670
Output of "lspci -v"
Comment 3 Graeme Fowler 2004-11-06 06:05:24 EST
I'm also experiencing this problem, absolutely identically.

Boot messages:

Nov  6 10:41:05 ernie kernel: Linux version 2.6.8-1.521smp
(bhcompile@tweety.build.redhat.com) (gcc version 3.3.3 20040412 (Red
Hat Linux 3.3.3-7)) #1 SMP Mon Aug 16 09:25:06 EDT 2004
<snip lots more output>
Nov  6 10:41:07 ernie kernel: Uniform Multi-Platform E-IDE driver
Revision: 7.00alpha2
Nov  6 10:41:07 ernie kernel: ide: Assuming 33MHz system bus speed for
PIO modes; override with idebus=xx
Nov  6 10:41:07 ernie kernel: ICH5: IDE controller at PCI slot
0000:00:1f.1
Nov  6 10:41:07 ernie kernel: PCI: Enabling device 0000:00:1f.1 (0005
-> 0007)
Nov  6 10:41:07 ernie kernel: ACPI: PCI interrupt 0000:00:1f.1[A] ->
GSI 18 (level, low) -> IRQ 193
Nov  6 10:41:07 ernie kernel: ICH5: chipset revision 2
Nov  6 10:41:07 ernie kernel: ICH5: not 100%% native mode: will probe
irqs later
Nov  6 10:41:07 ernie kernel:     ide0: BM-DMA at 0xfc00-0xfc07, BIOS
settings: hda:DMA, hdb:pio
Nov  6 10:41:07 ernie kernel:     ide1: BM-DMA at 0xfc08-0xfc0f, BIOS
settings: hdc:pio, hdd:pio
Nov  6 10:41:07 ernie kernel: hda: ATAPI CD-RW 52XMax, ATAPI
CD/DVD-ROM drive
Nov  6 10:41:07 ernie kernel: Using cfq io scheduler
Nov  6 10:41:07 ernie kernel: ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
Nov  6 10:41:07 ernie kernel: hda: ATAPI 40X CD-ROM CD-R/RW drive,
2048kB Cache, UDMA(33)
Nov  6 10:41:07 ernie kernel: Uniform CD-ROM driver Revision: 3.20
Nov  6 10:41:07 ernie kernel: ide-floppy driver 0.99.newide
<snip more output>
<time passes>
Nov  6 11:00:17 ernie kernel: ide-cd: cmd 0x3 timed out
Nov  6 11:00:17 ernie kernel: hda: lost interrupt
Nov  6 11:00:17 ernie kernel: hda: request sense failure: status=0x51
{ DriveReady SeekComplete Error }
Nov  6 11:00:17 ernie kernel: hda: request sense failure:
error=0x04Aborted Command

And after this I have:

  PID COMMAND          S WCHAN-WIDE-COLUMN
 2626 magicdev         D ide_do_drive_cmd

Logging out from GNOME now freezes, and shutting the machine down results
in a dirty unmount and journal replay at next boot.

It happens with either UP or SMP kernels, of all FC2 flavours.
Comment 4 Matt Dainty 2004-11-21 18:02:31 EST
Just tried the newer 2.6.9-1.3_FC2smp kernel, and I can confirm the bug is
still present.
 
I now get the following kernel message once after leaving the drive
with no media for a short while, (I don't remember seeing it in the
earlier release or at least phrased as it is):
 
hdc: status timeout: status=0xd8 { Busy }
hdc: status timeout: error=0x04Aborted Command
hdc: DMA disabled
hdc: ATAPI reset complete
 
Then the same thing happens as before, same kernel messages and stuck
magicdev process, etc.
 
Keeping some media in the drive still seems to be the best workaround.
Comment 5 Dave Jones 2005-04-16 00:08:46 EDT
Fedora Core 2 has now reached end of life, and no further updates will be
provided by Red Hat.  The Fedora legacy project will be producing further kernel
updates for security problems only.

If this bug has not been fixed in the latest Fedora Core 2 update kernel, please
try to reproduce it under Fedora Core 3, and reopen if necessary, changing the
product version accordingly.

Thank you.
Comment 6 Matt Dainty 2005-07-17 09:15:30 EDT
Having checked on FC4, this problem is still present. Leave the drive with
nothing inserted, and it goes into a stuck state. The only differences now are
the following error messages:

hdc: DMA timeout retry
hdc: timeout waiting for DMA
hdc: status error: status=0x58 { DriveReady SeekComplete DataRequest }
ide: failed opcode was: unknown
hdc: drive not ready for command
hdc: request sense failure: status=0x51 { DriveReady SeekComplete Error }
hdc: request sense failure: error=0x04 { AbortedCommand }

and s/magicdev/hald-addon-storage/ for the process that periodically tries to
mount the drive.
Comment 7 Matt Dainty 2005-07-24 19:06:21 EDT
In desperation, bought a new NEC 3540A combo DVD burner replacing the Pioneer
unit, and so far the problem has disappeared. PITA as the Pioneer unit showed
this problem from new.

Not sure if the drive really is duff or some bad combination of drive and IDE
controller.
Comment 8 Dave Jones 2005-09-30 02:11:56 EDT
Mass update to all FC4 bugs:

An update has been released (2.6.13-1.1526_FC4) which rebases to a new upstream
kernel (2.6.13.2). As there were ~3500 changes upstream between this and the
previous kernel, it's possible your bug has been fixed already.

Please retest with this update, and update this bug if necessary.

Thanks.
Comment 9 Dave Jones 2005-11-10 14:09:37 EST
2.6.14-1.1637_FC4 has been released as an update for FC4.
Please retest with this update, as a large amount of code has been changed in
this release, which may have fixed your problem.

Thank you.
Comment 10 Dave Jones 2005-12-27 21:10:25 EST
hdc: DMA timeout retry
hdc: timeout waiting for DMA
hdc: status error: status=0x58 { DriveReady SeekComplete DataRequest }

This part I'm not sure about; it could be anything from a bad cable to a bad drive.


ide: failed opcode was: unknown
hdc: drive not ready for command
hdc: request sense failure: status=0x51 { DriveReady SeekComplete Error }
hdc: request sense failure: error=0x04 { AbortedCommand }

This bit is just the result of us sending the drive a command it doesn't know
how to handle, so the drive says "Huh?". It's purely informational, and nothing
to worry about.

Given that you've now switched to different hardware, I guess further debugging
on this issue is impossible, so I'll mark this as closed.

Thanks.
