Bug 473852 - [5.3] Kdump Blocked Forever for IDE Disks Initialization
Keywords:
Status: CLOSED CANTFIX
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.2
Hardware: All
OS: Linux
Priority: high
Severity: high
Target Milestone: rc
Assignee: Red Hat Kernel Manager
QA Contact: Red Hat Kernel QE team
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2008-12-01 03:08 UTC by Qian Cai
Modified: 2009-11-18 12:21 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
kdump now serializes drive creation registration with the rest of the kdump process. Consequently, kdump may hang waiting for IDE drives to be initialized. In these cases, it is recommended that IDE disks not be used with kdump.
Clone Of:
Environment:
Last Closed: 2008-12-10 19:58:01 UTC
Target Upstream Version:
Embargoed:


Description Qian Cai 2008-12-01 03:08:45 UTC
Description of problem:
I have seen the kdump kernel block forever waiting for IDE disks to initialize, no matter which dump target is chosen.

...
RAMDISK driver initialized: 16 RAM disks of 16384K size 4096 blocksize
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
JMB368: IDE controller at PCI slot 0000:09:00.0
ACPI: PCI Interrupt 0000:09:00.0[A] -> GSI 16 (level, low) -> IRQ 169
JMB368: chipset revision 0
JMB368: 100% native mode on irq 169
    ide2: BM-DMA at 0xd000-0xd007, BIOS settings: hde:pio, hdf:pio
    ide3: BM-DMA at 0xd008-0xd00f, BIOS settings: hdg:pio, hdh:pio
ide-floppy driver 0.99.newide
usbcore: registered new driver hiddev
usbcore: registered new driver usbhid
drivers/usb/input/hid-core.c: v2.6:USB HID core driver
PNP: PS/2 Controller [PNP0303:PS2K,PNP0f03:PS2M] at 0x60,0x64 irq 1,12
serio: i8042 KBD port at 0x60,0x64 irq 1
serio: i8042 AUX port at 0x60,0x64 irq 12
mice: PS/2 mouse device common for all mice
md: md driver 0.90.3 MAX_MD_DEVS=256, MD_SB_DISKS=27
md: bitmap version 4.39
TCP bic registered
Initializing IPsec netlink socket
NET: Registered protocol family 1
NET: Registered protocol family 17
Using IPI No-Shortcut mode
Time: tsc clocksource has been installed.
ACPI: (supportsACPI Error (evgpe-0711): No handler or method for
GPE[1D], disabling event [20060707]
 S0 S1 S3 S4 S5)
Freeing unused kernel memory: 228k freed
Write protecting the kernel read-only data: 4294939026k
Mounting proc filesystem
Mounting sysfs filesystem
Creating /dev
Creating initial device nodes
Loading scsi_mod.ko module
SCSI subsystem initialized
Loading sd_mod.ko module
Loading libata.ko module
Loading ata_piix.ko module
ACPI: PCI Interrupt 0000:00:1f.2[B] -> GSI 19 (level, low) -> IRQ 58
ata_piix 0000:00:1f.2: MAP [ P0 P2 P1 P3 ]
scsi0 : ata_piix
scsi1 : ata_piix
ata1: SATA max UDMA/133 bmdma 0xf150 irq 14
ata2: SATA max UDMA/133 bmdma 0xf158 irq 15
ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata1.00: ATA-6: ST3500320NS, SN04, max UDMA/133
ata1.00: 976773168 sectors, multi 16: LBA48 NCQ (depth 0/32)
ata1.00: configured for UDMA/133
ata2: SATA link down (SStatus 0 SControl 300)
  Vendor: ATA       Model: ST3500320NS       Rev: SN04
  Type:   Direct-Access                      ANSI SCSI revision: 05
SCSI device sda: 976773168 512-byte hdwr sectors (500108 MB)
sda: Write Protect is off
SCSI device sda: drive cache: write back
SCSI device sda: 976773168 512-byte hdwr sectors (500108 MB)
sda: Write Protect is off
SCSI device sda: drive cache: write back
 sda: sda1 sda2
sd 0:0:0:0: Attached scsi disk sda
ACPI: PCI Interrupt 0000:00:1f.5[B] -> GSI 19 (level, low) -> IRQ 58
ata_piix 0000:00:1f.5: MAP [ P0 -- P1 -- ]
scsi2 : ata_piix
scsi3 : ata_piix
ata3: SATA max UDMA/133 cmd 0xf130 ctl 0xf120 bmdma 0xf0f0 irq 58
ata4: SATA max UDMA/133 cmd 0xf110 ctl 0xf100 bmdma 0xf0f8 irq 58
ata3: SATA link down (SStatus 0 SControl 300)
ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata4.00: ATAPI: SlimtypeDVD A  DS8A1P, CX17, max UDMA/33
ata4.00: applying bridge limits
ata4.00: configured for UDMA/33
  Vendor: Slimtype  Model: DVD A  DS8A1P     Rev: CX17
  Type:   CD-ROM                             ANSI SCSI revision: 05
Loading jbd.ko module
Loading ext3.ko module
Loading dm-mod.ko module
device-mapper: uevent: version 1.0.3
device-mapper: ioctl: 4.11.5-ioctl (2007-12-12) initialised:
dm-devel
Loading dm-log.ko module
Loading dm-mirror.ko module
Loading dm-zero.ko module
Loading dm-snapshot.ko module
Loading igb.ko module
Intel(R) Gigabit Ethernet Network Driver - version 1.2.45-k2
Copyright (c) 2008 Intel Corporation.
ACPI: PCI Interrupt 0000:01:00.0[A] -> GSI 16 (level, low) -> IRQ 169
igb 0000:01:00.0: Intel(R) Gigabit Ethernet Network Connection
igb 0000:01:00.0: eth0: (PCIe:2.5Gb/s:Width x4) 00:30:48:c3:7d:a4
igb 0000:01:00.0: eth0: PBA No: ffffff-0ff
igb 0000:01:00.0: Using MSI-X interrupts. 1 rx queue(s), 1 tx queue(s)
ACPI: PCI Interrupt 0000:01:00.1[B] -> GSI 17 (level, low) -> IRQ 177
igb 0000:01:00.1: Intel(R) Gigabit Ethernet Network Connection
igb 0000:01:00.1: eth1: (PCIe:2.5Gb/s:Width x4) 00:30:48:c3:7d:a5
igb 0000:01:00.1: eth1: PBA No: ffffff-0ff
igb 0000:01:00.1: Using MSI-X interrupts. 1 rx queue(s), 1 tx queue(s)
Loading sunrpc.ko module
Loading lockd.ko module
Loading fscache.ko module
FS-Cache: Loaded
Loading nfs_acl.ko module
Loading nfs.ko module
Waiting for required block device discovery
Waiting for hda...

I manually dropped to a shell when the kdump kernel failed to find hda.

root:/> ls /dev/hda*
ls: /dev/hda*: No such file or directory

root:/> dmesg
...
<6>JMB368: chipset revision 0
<6>JMB368: 100% native mode on irq 169
<6>    ide2: BM-DMA at 0xd000-0xd007, BIOS settings: hde:pio, hdf:pio
<6>    ide3: BM-DMA at 0xd008-0xd00f, BIOS settings: hdg:pio, hdh:pio
<7>Probing IDE interface ide2...
<7>Probing IDE interface ide3...
<7>Probing IDE interface ide0...
<7>Probing IDE interface ide1...
<7>Probing IDE interface ide2...
<7>Probing IDE interface ide3...

In the normal kernel, however, hda comes up properly.

# dmesg
...
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
JMB368: IDE controller at PCI slot 0000:09:00.0
ACPI: PCI Interrupt 0000:09:00.0[A] -> GSI 16 (level, low) -> IRQ 16
JMB368: chipset revision 0
JMB368: 100% native mode on irq 16
    ide2: BM-DMA at 0xd000-0xd007, BIOS settings: hde:pio, hdf:pio
    ide3: BM-DMA at 0xd008-0xd00f, BIOS settings: hdg:pio, hdh:pio
hda: ST3500320NS, ATA DISK drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
hda: max request size: 512KiB
hda: 976773168 sectors (500107 MB), CHS=60801/255/63
hda: cache flushes supported
 hda: hda1 hda2
ide-floppy driver 0.99.newide
...

# lspci
...
09:00.0 IDE interface: JMicron Technologies, Inc. JMB368 IDE controller
...

Since the kdump kernel blocks forever, there is no way to debug it from there.
In addition, hda1 is mounted as /boot.

# df
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/mapper/VolGroup00-LogVol00
                     468001888   2635388 441209932   1% /
/dev/hda1               101086     22289     73578  24% /boot
tmpfs                  7664368         0   7664368   0% /dev/shm
none                   7664280        40   7664240   1%

Version-Release number of selected component (if applicable):
kernel-PAE-2.6.18-92.el5
kernel-PAE-2.6.18-124.el5
kexec-tools-1.102pre-51.el5

How reproducible:
Most of the time.

Steps to Reproduce:
1. reserve intel-s3ea2-02.rhts.bos.redhat.com which uses /dev/hda1 as /boot.
2. configure with crashkernel=128M@16M
3. echo c >/proc/sysrq-trigger
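
The steps above can be sketched as a small script. This is only a sketch: the helper name has_crashkernel and the CMDLINE_FILE variable are made up for illustration, and the crash trigger is left commented out because it panics the machine immediately.

```shell
#!/bin/sh
# Sketch only: check that memory is reserved for the capture kernel
# before forcing a crash. CMDLINE_FILE and has_crashkernel are
# illustrative names, not part of kexec-tools.
CMDLINE_FILE="${CMDLINE_FILE:-/proc/cmdline}"

has_crashkernel() {
    # Succeeds if a word on the kernel command line starts with crashkernel=
    grep -Eq '(^| )crashkernel=' "$1" 2>/dev/null
}

if has_crashkernel "$CMDLINE_FILE"; then
    echo "crashkernel reservation found"
    # Uncomment to panic the machine and boot the capture kernel:
    # echo c > /proc/sysrq-trigger
else
    echo "no crashkernel= on the command line; kdump cannot start"
fi
```

On the system in this report the check would be expected to pass once crashkernel=128M@16M has been added to the boot entry and the machine rebooted.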
  
Actual results:
Kdump blocked forever.

Expected results:
Kdump kernel captured a VMCore successfully.

Additional info:
Sometimes I have seen the following during a normal kernel boot after a hard reset from the capture kernel.

EXT3-fs: write access will be enabled during recovery.
hda: status timeout: status=0xd0 { Busy }
ide: failed opcode was: unknown
hda: no DRQ after issuing MULTWRITE_EXT
ide0: reset: success
hda: status timeout: status=0xd0 { Busy }
ide: failed opcode was: unknown
hda: no DRQ after issuing MULTWRITE_EXT
ide0: reset: success
hda: status timeout: status=0xd0 { Busy }
ide: failed opcode was: unknown
hda: no DRQ after issuing MULTWRITE_EXT
ide0: reset: success
kjournald starting.  Commit interval 5 seconds

I have also seen some messages in the kdump kernel that never appear in the normal kernel.

...
ACPI Error (evgpe-0711): No handler or method for GPE[1D], disabling
event [20060707]
 S0 S1 S3 S4 S5)
...
irq 114, desc: c22ef780, depth: 1, count: 0, unhandled: 0
->handle_irq():  00000000, 0x0
->chip(): c2284d00, 0xc2284d00
->action(): 00000000
  IRQ_DISABLED set
   IRQ_PENDING set
unexpected IRQ trap at vector 72
irq 114, desc: c22ef780, depth: 1, count: 0, unhandled: 0
->handle_irq():  00000000, 0x0
->chip(): c2284d00, 0xc2284d00
->action(): 00000000
  IRQ_DISABLED set
   IRQ_PENDING set
...

However, I tried adding "noacpi" and "acpi=off" to the kdump kernel command
line without luck. At least we know it is not a regression, because both the
RHEL 5.2 GA kernel and kexec-tools fail there as well.

I have also used smartmontools to check whether the hard disk has a
hardware issue, but it passed the self-test.

# smartctl --test=long /dev/hda

# smartctl -l selftest /dev/hda
smartctl version 5.38 [i686-redhat-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%      2424         -

Comment 1 Neil Horman 2008-12-01 11:39:04 UTC
Try specifying hda=noprobe on the kernel command line.

I don't have the email anymore, but is this the same system that we discussed over email?  If so, this is likely a kernel problem (as I think we discussed), and one for which the only reasonable solution at present is the above addition to the kdump kernel command line.
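
A minimal sketch of adding the option by hand, assuming the installed kexec-tools reads a KDUMP_COMMANDLINE_APPEND variable from /etc/sysconfig/kdump (an assumption to verify against the local init script). The helper append_option is hypothetical; the starting options are the ones visible on the kdump kernel command line later in this report.

```shell
#!/bin/sh
# Sketch only: append an option to a kernel command-line string unless
# an option with the same key is already present. append_option is a
# hypothetical helper name.
append_option() {
    # $1 = current command line, $2 = option such as hda=noprobe
    key="${2%%=*}"
    case " $1 " in
        *" $key="*|*" $key "*) printf '%s\n' "$1" ;;   # already set, keep as is
        *) printf '%s\n' "${1:+$1 }$2" ;;              # append after a space
    esac
}

# Assumed variable name; check /etc/sysconfig/kdump on the target system.
KDUMP_COMMANDLINE_APPEND="irqpoll maxcpus=1 reset_devices"
KDUMP_COMMANDLINE_APPEND=$(append_option "$KDUMP_COMMANDLINE_APPEND" "hda=noprobe")
echo "$KDUMP_COMMANDLINE_APPEND"
```

The kdump service would have to be restarted afterwards so that the capture kernel is reloaded with the new command line.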

Comment 2 Neil Horman 2008-12-02 16:14:40 UTC
Ahh, we're in luck: Jarod has a bug to auto-add the above options via the kdump init script, so this problem should just go away soon.  It's bz 254163.  I'm closing this as a dup of that.  Please re-open if this problem recurs after that goes into the build.  Thanks!

*** This bug has been marked as a duplicate of bug 254163 ***

Comment 3 Qian Cai 2008-12-05 08:06:53 UTC
I am re-opening this bug because the above option does not solve the problem, and it is a regression that kdump blocks forever if the system has IDE disks mounted. I have seen it on other systems as well, for example:

dellgx240.rhts.bos.redhat.com

Linux version 2.6.18-125.el5PAE (mockbuild.redhat.com) (gcc version 4.1.2 20071124 (Red Hat 4.1.2-42)) #1 SMP Mon Dec 1 17:56:48 EST 2008
BIOS-provided physical RAM map:
 BIOS-e820: 0000000000000100 - 00000000000a0000 (usable)
 BIOS-e820: 0000000000100000 - 000000003ff77000 (usable)
 BIOS-e820: 000000003ff77000 - 000000003ff79000 (ACPI NVS)
 BIOS-e820: 000000003ff79000 - 0000000040000000 (reserved)
 BIOS-e820: 00000000fec00000 - 00000000fec10000 (reserved)
 BIOS-e820: 00000000fee00000 - 00000000fee10000 (reserved)
 BIOS-e820: 00000000ffb00000 - 0000000100000000 (reserved)
user-defined physical RAM map:
 user: 0000000000000000 - 00000000000a0000 (usable)
 user: 0000000002000000 - 0000000009f5b000 (usable)
...
Allocating PCI resources starting at 10000000 (gap: 09f5b000:f60a5000)
Detected 1694.614 MHz processor.
Built 1 zonelists.  Total pages: 40795
Kernel command line: ro root=LABEL=/ console=ttyS0,115200 acpi=off irqpoll maxcpus=1 reset_devices  hda=noprobe hdc=cdrom memmap=exactmap memmap=640K@0K memmap=130412K@32768K elfcorehdr=163180K
Misrouted IRQ fixup and polling support enabled
This may significantly impact system performance
ide_setup: hda=noprobe
ide_setup: hdc=cdrom
Enabling fast FPU save and restore... done.
Enabling unmasked SIMD FPU exception support... done.
Initializing CPU#0
...
Loading jbd.ko module
Loading ext3.ko module
Waiting for required block device discovery
Waiting for hda...

Comment 4 Neil Horman 2008-12-05 12:22:40 UTC
Cai, this can't actually be a regression.  If you block forever here waiting for hda to become active, that implies that:
a) hda is required for the proper operation of kdump
b) hda will never actually appear as an available disk in the system under kdump

If the disk is needed and never appears, it couldn't have worked on previous versions of kdump either, since the fact that it never registered as a drive means we can't use it.  We would have just died somewhere else in the dump capture process rather than waiting forever for it to appear.

About the only way it would have worked properly would be if the IDE drive were only a swap device, and then it's just working by a stroke of luck, rather than because of any real code change.

The only other possibility of a regression here is if IDE drives were detected properly in the kdump kernel under previous kernels, in which case this would be a kernel regression, rather than a kdump regression.

This is way too late for any more changes.  We need to wait for these disks to register, or we create more problems down the line.  The fact that one of the disks we need never gets detected by the kernel isn't a kdump problem.  I can work around it there (and we can hope to get lucky that it doesn't break anything else subsequently), but we shouldn't have to do that.  If the noprobe option doesn't fix this, and we need that disk, there's nothing we can do.  If that drive is the dump target, and we don't wait for it there, we'll just fail when we go to mount it later.  I suggest we move this to be a kernel bug and target it for 5.4.  There's nothing I can fairly/safely do in kdump about this.

Comment 5 Qian Cai 2008-12-05 13:51:42 UTC
Neil, it is a regression because the following kdump scenarios, which used to work on systems with IDE disks mounted, will not work anymore.

* dump to a remote host.
* dump to a non-IDE disk.

If I understand correctly, in the previous release kdump could at least save the VMCore to the above targets from the initramfs. We just could not mount the root filesystem and run init to capture the VMCore if the first attempt failed. That is why we have not seen kdump fail on those RHTS systems: the first attempt succeeded.

However, in 5.3 it is worse: kdump does not work for the above scenarios in the first place, so I have seen kdump hang forever on those systems.

If I remember correctly, we discussed that IDE controllers have no easy way to reset a device, so it is probably not surprising that the kdump kernel cannot recognize IDE disks. If so, since we cannot fix it on the kernel side, is it possible to help the situation from the user-space tool? Or, at least, not make it worse?

Comment 6 Neil Horman 2008-12-05 14:29:54 UTC
That's a slippery slope, to say "can we at least not make it worse".  You are correct in your statement that, previously, if the IDE disk in question was a filesystem or raw/swap device that wasn't strictly needed as the primary dump target, you would be able to capture a vmcore.  However, if it was the dump target, it would fail in various mysterious ways, possibly without a clear indicator of what went wrong.  About the only situation that is worse is the one in which we might have needed an IDE disk to capture a dump, but wound up not needing it at the actual dump time (i.e. IDE disk as rootfs, dump target on an NFS share).

That being said, I think we can agree the need to wait for the required disks to be registered with the kernel is valid.  It was added in response to a bug in which we tried to mount a drive before the system detected it was present during a SCSI bus scan.  Now, we could exclude IDE devices from the critical disk wait list (which would be the only possible fix in kdump we could implement), and if it's that important I will, but doing so will re-introduce the potential for bz 446279 for IDE devices when used as kdump targets (in the event they simply take a long time to be discovered, but not indefinitely).

I'll do whichever you feel is most important, but either way, you'll get a regression here.

Comment 7 Qian Cai 2008-12-05 15:22:30 UTC
Thanks for the explanation. I read through bug 446279, and it looks like it is more concerned with SCSI disks (the original patch was polling /proc/scsi/scsi), isn't it? If so, I would suggest excluding IDE disks from the critical disk list. Otherwise, I'll let you decide. If you feel that you need more time to figure out a better solution, I am fine with moving it to 5.4.

Comment 8 Neil Horman 2008-12-05 16:16:32 UTC
The bug is reported against SCSI disks specifically there, but the problem is generic in nature.  The issue is the asynchronous nature of module loading against disk registration.  It's not /proc/scsi/scsi that we're looking at in that bug, it's /sys/block/<device>.  When you load a module that looks for devices on a bus (be it ide/scsi/firewire/usb/etc), the registration of those discovered devices is asynchronous to the creation of those drives in the kernel.  As such, it's possible for us to start performing operations on drives before the kernel knows they exist, so we introduced the concept of the critical disks list to provide that synchronization.  This allows us to wait until all the disks we need to perform all possible operations during a kdump have appeared before we proceed.
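
The waiting described above can be sketched roughly as follows. This is not the kdump initrd code: wait_for_disks, SYSFS_BLOCK and the timeout argument are illustrative, and as this bug shows, the real wait has no such time limit.

```shell
#!/bin/sh
# Sketch only: wait until every device in a "critical disks" list has a
# directory under sysfs. wait_for_disks, SYSFS_BLOCK and the timeout
# are illustrative names, not taken from kexec-tools.
SYSFS_BLOCK="${SYSFS_BLOCK:-/sys/block}"

wait_for_disks() {
    # $1 = timeout in seconds, remaining arguments = device names
    timeout="$1"; shift
    elapsed=0
    for dev in "$@"; do
        echo "Waiting for $dev..."
        while [ ! -d "$SYSFS_BLOCK/$dev" ]; do
            if [ "$elapsed" -ge "$timeout" ]; then
                echo "Gave up on $dev after ${elapsed}s" >&2
                return 1
            fi
            sleep 1
            elapsed=$((elapsed + 1))
        done
    done
    return 0
}
```

A call such as "wait_for_disks 30 sda hda" would return as soon as both directories exist, or fail after roughly 30 seconds in total. Without the timeout branch, a disk that never registers (hda in this report) keeps the loop spinning indefinitely.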

So, I'll filter out IDE devices from that list if this gets approved as a blocker, but fair warning: it will be possible for IDE drives that are working properly, but slow to be detected, to cause kdump breaks in a random and haphazard fashion, and those bugs will have to be closed with a CANTFIX.
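
A minimal sketch of such a filter, assuming the critical disks are passed around as plain device names and that IDE disks can be recognised by the hd prefix. filter_ide_disks is a hypothetical helper, not the change that would go into kexec-tools.

```shell
#!/bin/sh
# Sketch only: drop IDE-style names (hda, hdb, ...) from a list of
# critical disks. filter_ide_disks is a hypothetical helper.
filter_ide_disks() {
    kept=""
    for dev in "$@"; do
        case "$dev" in
            hd[a-z]*) ;;                       # IDE disk: do not wait for it
            *) kept="${kept:+$kept }$dev" ;;   # anything else stays critical
        esac
    done
    printf '%s\n' "$kept"
}

filter_ide_disks sda hda sdb hdc   # prints "sda sdb"
```

The trade-off is the one stated above: an IDE dump target that is merely slow to register would no longer be waited for.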

Comment 9 Neil Horman 2008-12-10 19:58:01 UTC
Release note added. If any revisions are required, please set the 
"requires_release_notes" flag to "?" and edit the "Release Notes" field accordingly.
All revisions will be proofread by the Engineering Content Services team.

New Contents:
During the course of kdumping it is possible for IDE disks to not properly re-appear during kdump kernel boot, due to legacy issues in resetting controllers.  Since, for various other bugs, kdump now serializes drive creation registration, with the rest of th kdump process, you may notice that  kdump hangs waiting for ide drives to be created.  If this is the case, it is recommended that you not use IDE disks with kdump.

Comment 11 Ryan Lerch 2009-01-12 22:38:03 UTC
Release note updated. If any revisions are required, please set the 
"requires_release_notes"  flag to "?" and edit the "Release Notes" field accordingly.
All revisions will be proofread by the Engineering Content Services team.

Diffed Contents:
@@ -1 +1 @@
-During the course of kdumping it is possible for IDE disks to not properly re-appear during kdump kernel boot, due to legacy issues in resetting controllers.  Since, for various other bugs, kdump now serializes drive creation registration, with the rest of th kdump process, you may notice that  kdump hangs waiting for ide drives to be created.  If this is the case, it is recommended that you not use IDE disks with kdump.
+kdump now serializes drive creation registration with the rest of the kdump process. Consequently, kdump may hang waiting for IDE drives to be initialized. In these cases, it is recommended that IDE disks not be used with kdump.

Comment 12 Bernd Schubert 2009-11-18 12:21:49 UTC
I have no idea why, but kdump hangs here while probing hda. In the default environment, there is no hda:

[root@mds1 ~]# ll /dev/cdrom
lrwxrwxrwx 1 root root 4 Nov 18 11:48 /dev/cdrom -> scd1
[root@mds1 ~]# ll /dev/scd1
brw-rw---- 1 root disk 11, 1 Nov 18 11:48 /dev/scd1

[root@mds1 ~]# ll /dev/hda
ls: /dev/hda: No such file or directory

But kdump still fails to probe hda. As far as I can see, there is no config to set kdump-specific options. And if I set it at the beginning of the init script, these lines would overwrite it:

        # We don't find cdrom drive.
        if [ $COUNTER -eq 0 ]; then
                KDUMP_IDE_NOPROBE_COMMANDLINE=""
        fi
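
The overwrite can be reproduced in isolation with a few lines. This is a sketch only: the variable name and the test are copied from the snippet above, while the initial values are made up for illustration.

```shell
#!/bin/sh
# Sketch only: show that a value set by hand does not survive the check.
KDUMP_IDE_NOPROBE_COMMANDLINE="hda=noprobe"   # set by hand earlier in the script
COUNTER=0                                     # no cdrom drive was counted

# We don't find cdrom drive.
if [ "$COUNTER" -eq 0 ]; then
    KDUMP_IDE_NOPROBE_COMMANDLINE=""
fi

echo "KDUMP_IDE_NOPROBE_COMMANDLINE='$KDUMP_IDE_NOPROBE_COMMANDLINE'"
```

With COUNTER at 0 the variable ends up empty, so the hand-set hda=noprobe never reaches the kdump kernel command line.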

