Bug 682110
Summary: kdump doesn't work on megaraid_sas
Product: Red Hat Enterprise Linux 6
Component: kernel
Version: 6.1
Hardware: All
OS: Linux
Status: CLOSED ERRATA
Severity: high
Priority: high
Reporter: Chao Ye <cye>
Assignee: Tomas Henzl <thenzl>
QA Contact: Red Hat Kernel QE team <kernel-qe>
CC: ajb2, bugproxy, czhang, emcnabb, haruo.tomita, jjarvis, jkachuck, phan, qcai, rlerch
Target Milestone: rc
Target Release: ---
Keywords: Regression
Fixed In Version: kernel-2.6.32-130.el6
Doc Type: Bug Fix
Last Closed: 2011-05-19 12:27:49 UTC
Bug Blocks: 684385, 692099
Description
Chao Ye
2011-03-04 07:20:32 UTC
Same issue found on intel-s3e36-01.rhts.eng.nay.redhat.com when triggering a crash via the NMI button:
=========================================================
ACPI: HARDWARE addr space,NOT supported yet
processor LNXCPU:09: registered as cooling_device9
ACPI: HARDWARE addr space,NOT supported yet
processor LNXCPU:0a: registered as cooling_device10
ACPI: HARDWARE addr space,NOT supported yet
processor LNXCPU:0b: registered as cooling_device11
ACPI: HARDWARE addr space,NOT supported yet
processor LNXCPU:0c: registered as cooling_device12
ACPI: HARDWARE addr space,NOT supported yet
processor LNXCPU:0d: registered as cooling_device13
ACPI: HARDWARE addr space,NOT supported yet
processor LNXCPU:0e: registered as cooling_device14
ACPI: HARDWARE addr space,NOT supported yet
processor LNXCPU:0f: registered as cooling_device15
ACPI: HARDWARE addr space,NOT supported yet
processor LNXCPU:10: registered as cooling_device16
ACPI: HARDWARE addr space,NOT supported yet
processor LNXCPU:11: registered as cooling_device17
ACPI: HARDWARE addr space,NOT supported yet
processor LNXCPU:12: registered as cooling_device18
ACPI: HARDWARE addr space,NOT supported yet
processor LNXCPU:13: registered as cooling_device19
ACPI: HARDWARE addr space,NOT supported yet
processor LNXCPU:14: registered as cooling_device20
ACPI: HARDWARE addr space,NOT supported yet
processor LNXCPU:15: registered as cooling_device21
ACPI: HARDWARE addr space,NOT supported yet
processor LNXCPU:16: registered as cooling_device22
ACPI: HARDWARE addr space,NOT supported yet
processor LNXCPU:17: registered as cooling_device23
ACPI: HARDWARE addr space,NOT supported yet
processor LNXCPU:18: registered as cooling_device24
ACPI: HARDWARE addr space,NOT supported yet
processor LNXCPU:19: registered as cooling_device25
ACPI: HARDWARE addr space,NOT supported yet
processor LNXCPU:1a: registered as cooling_device26
ACPI: HARDWARE addr space,NOT supported yet
processor LNXCPU:1b: registered as cooling_device27
ACPI: HARDWARE addr space,NOT supported yet
processor LNXCPU:1c: registered as cooling_device28
ACPI: HARDWARE addr space,NOT supported yet
processor LNXCPU:1d: registered as cooling_device29
ACPI: HARDWARE addr space,NOT supported yet
processor LNXCPU:1e: registered as cooling_device30
ACPI: HARDWARE addr space,NOT supported yet
processor LNXCPU:1f: registered as cooling_device31
ACPI: HARDWARE addr space,NOT supported yet
processor LNXCPU:20: registered as cooling_device32
ACPI: HARDWARE addr space,NOT supported yet
processor LNXCPU:21: registered as cooling_device33
ACPI: HARDWARE addr space,NOT supported yet
processor LNXCPU:22: registered as cooling_device34
ACPI: HARDWARE addr space,NOT supported yet
processor LNXCPU:23: registered as cooling_device35
ACPI: HARDWARE addr space,NOT supported yet
processor LNXCPU:24: registered as cooling_device36
ACPI: HARDWARE addr space,NOT supported yet
processor LNXCPU:25: registered as cooling_device37
ACPI: HARDWARE addr space,NOT supported yet
processor LNXCPU:26: registered as cooling_device38
ACPI: HARDWARE addr space,NOT supported yet
processor LNXCPU:27: registered as cooling_device39
ACPI: HARDWARE addr space,NOT supported yet
processor LNXCPU:28: registered as cooling_device40
ACPI: HARDWARE addr space,NOT supported yet
processor LNXCPU:29: registered as cooling_device41
ACPI: HARDWARE addr space,NOT supported yet
processor LNXCPU:2a: registered as cooling_device42
ACPI: HARDWARE addr space,NOT supported yet
processor LNXCPU:2b: registered as cooling_device43
ACPI: HARDWARE addr space,NOT supported yet
processor LNXCPU:2c: registered as cooling_device44
ACPI: HARDWARE addr space,NOT supported yet
processor LNXCPU:2d: registered as cooling_device45
ACPI: HARDWARE addr space,NOT supported yet
processor LNXCPU:2e: registered as cooling_device46
ACPI: HARDWARE addr space,NOT supported yet
processor LNXCPU:2f: registered as cooling_device47
ACPI: HARDWARE addr space,NOT supported yet
processor LNXCPU:30: registered as cooling_device48
ACPI: HARDWARE addr space,NOT supported yet
processor LNXCPU:31: registered as cooling_device49
ACPI: HARDWARE addr space,NOT supported yet
processor LNXCPU:32: registered as cooling_device50
ACPI: HARDWARE addr space,NOT supported yet
processor LNXCPU:33: registered as cooling_device51
ACPI: HARDWARE addr space,NOT supported yet
processor LNXCPU:34: registered as cooling_device52
ACPI: HARDWARE addr space,NOT supported yet
processor LNXCPU:35: registered as cooling_device53
ACPI: HARDWARE addr space,NOT supported yet
processor LNXCPU:36: registered as cooling_device54
ACPI: HARDWARE addr space,NOT supported yet
processor LNXCPU:37: registered as cooling_device55
ACPI: HARDWARE addr space,NOT supported yet
processor LNXCPU:38: registered as cooling_device56
ACPI: HARDWARE addr space,NOT supported yet
processor LNXCPU:39: registered as cooling_device57
ACPI: HARDWARE addr space,NOT supported yet
processor LNXCPU:3a: registered as cooling_device58
ACPI: HARDWARE addr space,NOT supported yet
processor LNXCPU:3b: registered as cooling_device59
ACPI: HARDWARE addr space,NOT supported yet
processor LNXCPU:3c: registered as cooling_device60
ACPI: HARDWARE addr space,NOT supported yet
processor LNXCPU:3d: registered as cooling_device61
ACPI: HARDWARE addr space,NOT supported yet
processor LNXCPU:3e: registered as cooling_device62
ACPI: HARDWARE addr space,NOT supported yet
processor LNXCPU:3f: registered as cooling_device63
ERST: Error Record Serialization Table (ERST) support is initialized.
Non-volatile memory driver v1.3
Linux agpgart interface v0.103
tpm_tis 00:0a: 1.2 TPM (device-id 0x4A10, rev-id 78)
crash memory driver: version 1.1
Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
serial8250: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
00:08: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
00:09: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
brd: module loaded
loop: module loaded
input: Macintosh mouse button emulation as /devices/virtual/input/input2
Fixed MDIO Bus: probed
ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
ehci_hcd 0000:00:1a.7: PCI INT C -> GSI 18 (level, low) -> IRQ 18
ehci_hcd 0000:00:1a.7: EHCI Host Controller
ehci_hcd 0000:00:1a.7: new USB bus registered, assigned bus number 1
ehci_hcd 0000:00:1a.7: debug port 1
ehci_hcd 0000:00:1a.7: irq 18, io mem 0x9bc01000
ehci_hcd 0000:00:1a.7: USB 2.0 started, EHCI 1.00
usb usb1: New USB device found, idVendor=1d6b, idProduct=0002
usb usb1: New USB device strings: Mfr=3, Product=2, SerialNumber=1
usb usb1: Product: EHCI Host Controller
usb usb1: Manufacturer: Linux 2.6.32-122.el6.x86_64 ehci_hcd
usb usb1: SerialNumber: 0000:00:1a.7
usb usb1: configuration #1 chosen from 1 choice
hub 1-0:1.0: USB hub found
hub 1-0:1.0: 6 ports detected
ehci_hcd 0000:00:1d.7: PCI INT A -> GSI 23 (level, low) -> IRQ 23
ehci_hcd 0000:00:1d.7: EHCI Host Controller
ehci_hcd 0000:00:1d.7: new USB bus registered, assigned bus number 2
ehci_hcd 0000:00:1d.7: debug port 1
ehci_hcd 0000:00:1d.7: irq 23, io mem 0x9bc00000
ehci_hcd 0000:00:1d.7: USB 2.0 started, EHCI 1.00
usb usb2: New USB device found, idVendor=1d6b, idProduct=0002
usb usb2: New USB device strings: Mfr=3, Product=2, SerialNumber=1
usb usb2: Product: EHCI Host Controller
usb usb2: Manufacturer: Linux 2.6.32-122.el6.x86_64 ehci_hcd
usb usb2: SerialNumber: 0000:00:1d.7
usb usb2: configuration #1 chosen from 1 choice
hub 2-0:1.0: USB hub found
hub 2-0:1.0: 6 ports detected
ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver
uhci_hcd: USB Universal Host Controller Interface driver
uhci_hcd 0000:00:1a.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
uhci_hcd 0000:00:1a.0: UHCI Host Controller
uhci_hcd 0000:00:1a.0: new USB bus registered, assigned bus number 3
uhci_hcd 0000:00:1a.0: irq 16, io base 0x000060c0
usb usb3: New USB device found, idVendor=1d6b, idProduct=0001
usb usb3: New USB device strings: Mfr=3, Product=2, SerialNumber=1
usb usb3: Product: UHCI Host Controller
usb usb3: Manufacturer: Linux 2.6.32-122.el6.x86_64 uhci_hcd
usb usb3: SerialNumber: 0000:00:1a.0
usb usb3: configuration #1 chosen from 1 choice
hub 3-0:1.0: USB hub found
hub 3-0:1.0: 2 ports detected
uhci_hcd 0000:00:1a.1: PCI INT B -> GSI 21 (level, low) -> IRQ 21
uhci_hcd 0000:00:1a.1: UHCI Host Controller
uhci_hcd 0000:00:1a.1: new USB bus registered, assigned bus number 4
uhci_hcd 0000:00:1a.1: irq 21, io base 0x000060a0
usb usb4: New USB device found, idVendor=1d6b, idProduct=0001
usb usb4: New USB device strings: Mfr=3, Product=2, SerialNumber=1
usb usb4: Product: UHCI Host Controller
usb usb4: Manufacturer: Linux 2.6.32-122.el6.x86_64 uhci_hcd
usb usb4: SerialNumber: 0000:00:1a.1
usb usb4: configuration #1 chosen from 1 choice
hub 4-0:1.0: USB hub found
hub 4-0:1.0: 2 ports detected
uhci_hcd 0000:00:1a.2: PCI INT D -> GSI 19 (level, low) -> IRQ 19
uhci_hcd 0000:00:1a.2: UHCI Host Controller
uhci_hcd 0000:00:1a.2: new USB bus registered, assigned bus number 5
uhci_hcd 0000:00:1a.2: irq 19, io base 0x00006080
usb usb5: New USB device found, idVendor=1d6b, idProduct=0001
usb usb5: New USB device strings: Mfr=3, Product=2, SerialNumber=1
usb usb5: Product: UHCI Host Controller
usb usb5: Manufacturer: Linux 2.6.32-122.el6.x86_64 uhci_hcd
usb usb5: SerialNumber: 0000:00:1a.2
usb usb5: configuration #1 chosen from 1 choice
hub 5-0:1.0: USB hub found
hub 5-0:1.0: 2 ports detected
uhci_hcd 0000:00:1d.0: PCI INT A -> GSI 23 (level, low) -> IRQ 23
uhci_hcd 0000:00:1d.0: UHCI Host Controller
uhci_hcd 0000:00:1d.0: new USB bus registered, assigned bus number 6
uhci_hcd 0000:00:1d.0: irq 23, io base 0x00006060
usb usb6: New USB device found, idVendor=1d6b, idProduct=0001
usb usb6: New USB device strings: Mfr=3, Product=2, SerialNumber=1
usb usb6: Product: UHCI Host Controller
usb usb6: Manufacturer: Linux 2.6.32-122.el6.x86_64 uhci_hcd
usb usb6: SerialNumber: 0000:00:1d.0
usb usb6: configuration #1 chosen from 1 choice
hub 6-0:1.0: USB hub found
hub 6-0:1.0: 2 ports detected
uhci_hcd 0000:00:1d.1: PCI INT B -> GSI 19 (level, low) -> IRQ 19
uhci_hcd 0000:00:1d.1: UHCI Host Controller
uhci_hcd 0000:00:1d.1: new USB bus registered, assigned bus number 7
uhci_hcd 0000:00:1d.1: irq 19, io base 0x00006040
usb usb7: New USB device found, idVendor=1d6b, idProduct=0001
usb usb7: New USB device strings: Mfr=3, Product=2, SerialNumber=1
usb usb7: Product: UHCI Host Controller
usb usb7: Manufacturer: Linux 2.6.32-122.el6.x86_64 uhci_hcd
usb usb7: SerialNumber: 0000:00:1d.1
usb usb7: configuration #1 chosen from 1 choice
hub 7-0:1.0: USB hub found
hub 7-0:1.0: 2 ports detected
uhci_hcd 0000:00:1d.2: PCI INT C -> GSI 18 (level, low) -> IRQ 18
uhci_hcd 0000:00:1d.2: UHCI Host Controller
uhci_hcd 0000:00:1d.2: new USB bus registered, assigned bus number 8
uhci_hcd 0000:00:1d.2: irq 18, io base 0x00006020
usb usb8: New USB device found, idVendor=1d6b, idProduct=0001
usb usb8: New USB device strings: Mfr=3, Product=2, SerialNumber=1
usb usb8: Product: UHCI Host Controller
usb usb8: Manufacturer: Linux 2.6.32-122.el6.x86_64 uhci_hcd
usb usb8: SerialNumber: 0000:00:1d.2
usb usb8: configuration #1 chosen from 1 choice
hub 8-0:1.0: USB hub found
hub 8-0:1.0: 2 ports detected
PNP: No PS/2 controller found. Probing ports directly.
mice: PS/2 mouse device common for all mice
rtc_cmos 00:03: RTC can wake from S4
rtc_cmos 00:03: rtc core: registered rtc_cmos as rtc0
rtc0: alarms up to one month, y3k, 114 bytes nvram, hpet irqs
cpuidle: using governor ladder
cpuidle: using governor menu
usbcore: registered new interface driver hiddev
usbcore: registered new interface driver usbhid
usbhid: v2.6:USB HID core driver
TCP cubic registered
Initializing XFRM netlink socket
NET: Registered protocol family 17
registered taskstats version 1
rtc_cmos 00:03: setting system clock to 2011-03-15 01:31:00 UTC (1300152660)
Initalizing network drop monitor service
Freeing unused kernel memory: 1228k freed
Write protecting the kernel read-only data: 10240k
Freeing unused kernel memory: 1120k freed
Freeing unused kernel memory: 1800k freed
Mounting proc fiusb 4-1: new full speed USB device using uhci_hcd and address 2
lesystem
Mounting sysfs filesystem
Creating /dev
Creating initial device nodes
Free memory/Total memory (free %): 65560 / 108284 ( 60.5445 )
Loading jbd2.ko module
Loading mbcache.ko module
Loading ext4.ko module
Loading crc-t10dif.ko module
Loading sd_mod.ko module
Loading ata_generic.ko module
Loading dm-mod.ko module
device-mapper: uevent: version 1.0.3
device-mapper: ioctl: 4.20.6-ioctl (2011-02-02) initialised: dm-devel
Loading dm-log.ko module
Loading dm-region-hash.ko module
Loading dm-mirror.ko module
Loading dm-zero.ko module
Loading dm-snapshot.ko module
usb 4-1: New USB device found, idVendor=046b, idProduct=ff10
usb 4-1: New USB device strings: Mfr=1, Product=2, SerialNumber=3
usb 4-1: Product: Virtual Keyboard and Mouse
usb 4-1: Manufacturer: American Megatrends Inc.
usb 4-1: SerialNumber: serial
Loading autofs4.ko module
Loading sunrpc.ko module
usb 4-1: configuration #1 chosen from 1 choice
RPC: Registered udp transport module.
RPC: Registered tcp transport module.
RPC: Registered tcp NFSv4.1 backchannel transport module.
Loading freq_table.ko module
input: American Megatrends Inc. Virtual Keyboard and Mouse as /devices/pci0000:00/0000:00:1a.1/usb4/4-1/4-1:1.0/input/input3
Loading ipv6.ko module
generic-usb 0003:046B:FF10.0001: input,hidraw0: USB HID v1.10 Keyboard [American Megatrends Inc. Virtual Keyboard and Mouse] on usb-0000:00:1a.1-1/input0
NET: Registered protocol family 10
lo: Disabled Privacy Extensions
Loading microcode.ko module
microcode: CPU0 sig=0x206e6, pf=0x4, revision=0x6
input: American Megatrends Inc. Virtual Keyboard and Mouse as /devices/pci0000:00/0000:00:1a.1/usb4/4-1/4-1:1.1/input/input4
platform microcode: firmware: requesting intel-ucode/06-2e-06
generic-usb 0003:046B:FF10.0002: input,hidraw1: USB HID v1.10 Mouse [American Megatrends Inc. Virtual Keyboard and Mouse] on usb-0000:00:1a.1-1/input1
usb 7-2: new low speed USB device using uhci_hcd and address 2
usb 7-2: New USB device found, idVendor=0624, idProduct=0307
usb 7-2: New USB device strings: Mfr=1, Product=2, SerialNumber=0
usb 7-2: Product: Avocent DSRIQ-USB
usb 7-2: Manufacturer: Avocent
usb 7-2: configuration #1 chosen from 1 choice
input: Avocent Avocent DSRIQ-USB as /devices/pci0000:00/0000:00:1d.1/usb7/7-2/7-2:1.0/input/input5
generic-usb 0003:0624:0307.0003: input,hidraw2: USB HID v1.10 Keyboard [Avocent Avocent DSRIQ-USB] on usb-0000:00:1d.1-2/input0
input: Avocent Avocent DSRIQ-USB as /devices/pci0000:00/0000:00:1d.1/usb7/7-2/7-2:1.1/input/input6
generic-usb 0003:0624:0307.0004: input,hidraw3: USB HID v1.10 Mouse [Avocent Avocent DSRIQ-USB] on usb-0000:00:1d.1-2/input1
Microcode Update Driver: v2.00 <tigran.co.uk>, Peter Oruba
Loading hed.ko module
Loading i2c-core.ko module
Loading iTCO_vendor_support.ko module
iTCO_vendor_support: vendor-support=0
Loading edac_core.ko module
EDAC MC: Ver: 2.1.0 Mar 9 2011
Loading sg.ko module
Loading dca.ko module
dca service started, version 1.12.1
Loading cdrom.ko module
Loading pata_acpi.ko module
pata_acpi 0000:00:1f.2: PCI INT A -> GSI 19 (level, low) -> IRQ 19
pata_acpi 0000:00:1f.2: PCI INT A disabled
pata_acpi 0000:00:1f.5: PCI INT D -> GSI 21 (level, low) -> IRQ 21
pata_acpi 0000:00:1f.5: PCI INT D disabled
Loading ata_piix.ko module
ata_piix 0000:00:1f.2: PCI INT A -> GSI 19 (level, low) -> IRQ 19
ata_piix 0000:00:1f.2: MAP [ P0 P2 P1 P3 ]
scsi0 : ata_piix
scsi1 : ata_piix
ata1: SATA max UDMA/133 cmd 0x6138 ctl 0x614c bmdma 0x6110 irq 19
ata2: SATA max UDMA/133 cmd 0x6130 ctl 0x6148 bmdma 0x6118 irq 19
ata_piix 0000:00:1f.5: PCI INT D -> GSI 21 (level, low) -> IRQ 21
ata_piix 0000:00:1f.5: MAP [ P0 -- P1 -- ]
scsi2 : ata_piix
scsi3 : ata_piix
ata3: SATA max UDMA/133 cmd 0x6128 ctl 0x6144 bmdma 0x60f0 irq 21
ata4: SATA max UDMA/133 cmd 0x6120 ctl 0x6140 bmdma 0x60f8 irq 21
ata4: SATA link down (SStatus 0 SControl 300)
ata3: SATA link down (SStatus 4 SControl 300)
ata2.00: SATA link down (SStatus 0 SControl 300)
ata2.01: SATA link down (SStatus 0 SControl 300)
ata1.00: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata1.01: SATA link down (SStatus 0 SControl 300)
ata1.00: ATAPI: Optiarc DVD RW AD-7580S, FX04, max UDMA/100
ata1.00: configured for UDMA/100
scsi 0:0:0:0: CD-ROM Optiarc DVD RW AD-7580S FX04 PQ: 0 ANSI: 5
scsi 0:0:0:0: Attached scsi generic sg0 type 5
Loading megaraid_sas.ko module
megasas: 00.00.05.29-RH1 Tue. Dec. 7 17:00:00 PDT 2010
megasas: 0x1000:0x0079:0x8086:0x9261: bus 7:slot 0:func 0
megaraid_sas 0000:07:00.0: PCI INT A -> GSI 24 (level, low) -> IRQ 24
megasas: Waiting for FW to come to ready state
megasas: FW now in Ready state
megasas_init_mfi: fw_support_ieee=0
megasas: INIT adapter done
megaraid_sas: fw state:c0000000
megasas: fwstate:c0000000, dis_OCR=0
DRHD: handling fault status reg 102
INTR-REMAP: Request device [[01:00.0] fault index 32
INTR-REMAP:[fault reason 34] Present field in the IRTE entry is clear
INFO: task insmod:201 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
insmod D 0000000000000000 0 201 1 0x00000000
ffff880006f6bbb8 0000000000000082 ffff880004015f40 0000000000000000
0000000000000000 0000000000000082 ffff880006f6bb98 ffffffff8105c5cc
ffff880006f4b038 ffff880006f6bfd8 000000000000f558 ffff880006f4b038
Call Trace:
[<ffffffff8105c5cc>] ? try_to_wake_up+0xcc/0x400
[<ffffffff8108e1ee>] ? prepare_to_wait+0x4e/0x80
[<ffffffffa01b7745>] megasas_issue_blocked_cmd+0x85/0xc0 [megaraid_sas]
[<ffffffff8108df00>] ? autoremove_wake_function+0x0/0x40
[<ffffffffa01bacc3>] megasas_start_aen+0x143/0x2d0 [megaraid_sas]
[<ffffffffa01b8500>] ? megasas_isr+0x0/0x1f0 [megaraid_sas]
[<ffffffffa01c1b13>] megasas_probe_one+0xc2d/0xee4 [megaraid_sas]
[<ffffffff8127f4e7>] local_pci_probe+0x17/0x20
[<ffffffff812806d1>] pci_device_probe+0x101/0x120
[<ffffffff81339e52>] ? driver_sysfs_add+0x62/0x90
[<ffffffff81339ff0>] driver_probe_device+0xa0/0x2a0
[<ffffffff8133a29b>] __driver_attach+0xab/0xb0
[<ffffffff8133a1f0>] ? __driver_attach+0x0/0xb0
[<ffffffff81339254>] bus_for_each_dev+0x64/0x90
[<ffffffff81339d8e>] driver_attach+0x1e/0x20
[<ffffffff81339690>] bus_add_driver+0x200/0x300
[<ffffffff8133a5c6>] driver_register+0x76/0x140
[<ffffffff81174e7b>] ? __register_chrdev+0x8b/0xf0
[<ffffffff81280936>] __pci_register_driver+0x56/0xd0
[<ffffffffa01cc000>] ? megasas_init+0x0/0x1ea [megaraid_sas]
[<ffffffffa01cc000>] ? megasas_init+0x0/0x1ea [megaraid_sas]
[<ffffffffa01cc0a1>] megasas_init+0xa1/0x1ea [megaraid_sas]
[<ffffffff8100204c>] do_one_initcall+0x3c/0x1d0
[<ffffffff810ac7ff>] sys_init_module+0xdf/0x250
[<ffffffff8100b172>] system_call_fastpath+0x16/0x1b
INFO: task insmod:201 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
insmod D 0000000000000000 0 201 1 0x00000000
ffff880006f6bbb8 0000000000000082 ffff880004015f40 0000000000000000
0000000000000000 0000000000000082 ffff880006f6bb98 ffffffff8105c5cc
ffff880006f4b038 ffff880006f6bfd8 000000000000f558 ffff880006f4b038
Call Trace:
[<ffffffff8105c5cc>] ? try_to_wake_up+0xcc/0x400
[<ffffffff8108e1ee>] ? prepare_to_wait+0x4e/0x80
[<ffffffffa01b7745>] megasas_issue_blocked_cmd+0x85/0xc0 [megaraid_sas]
[<ffffffff8108df00>] ? autoremove_wake_function+0x0/0x40
[<ffffffffa01bacc3>] megasas_start_aen+0x143/0x2d0 [megaraid_sas]
[<ffffffffa01b8500>] ? megasas_isr+0x0/0x1f0 [megaraid_sas]
[<ffffffffa01c1b13>] megasas_probe_one+0xc2d/0xee4 [megaraid_sas]
[<ffffffff8127f4e7>] local_pci_probe+0x17/0x20
[<ffffffff812806d1>] pci_device_probe+0x101/0x120
[<ffffffff81339e52>] ? driver_sysfs_add+0x62/0x90
[<ffffffff81339ff0>] driver_probe_device+0xa0/0x2a0
[<ffffffff8133a29b>] __driver_attach+0xab/0xb0
[<ffffffff8133a1f0>] ? __driver_attach+0x0/0xb0
[<ffffffff81339254>] bus_for_each_dev+0x64/0x90
[<ffffffff81339d8e>] driver_attach+0x1e/0x20
[<ffffffff81339690>] bus_add_driver+0x200/0x300
[<ffffffff8133a5c6>] driver_register+0x76/0x140
[<ffffffff81174e7b>] ? __register_chrdev+0x8b/0xf0
[<ffffffff81280936>] __pci_register_driver+0x56/0xd0
[<ffffffffa01cc000>] ? megasas_init+0x0/0x1ea [megaraid_sas]
[<ffffffffa01cc000>] ? megasas_init+0x0/0x1ea [megaraid_sas]
[<ffffffffa01cc0a1>] megasas_init+0xa1/0x1ea [megaraid_sas]
[<ffffffff8100204c>] do_one_initcall+0x3c/0x1d0
[<ffffffff810ac7ff>] sys_init_module+0xdf/0x250
[<ffffffff8100b172>] system_call_fastpath+0x16/0x1b
INFO: task insmod:201 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
insmod D 0000000000000000 0 201 1 0x00000000
ffff880006f6bbb8 0000000000000082 ffff880004015f40 0000000000000000
0000000000000000 0000000000000082 ffff880006f6bb98 ffffffff8105c5cc
ffff880006f4b038 ffff880006f6bfd8 000000000000f558 ffff880006f4b038
Call Trace:
[<ffffffff8105c5cc>] ? try_to_wake_up+0xcc/0x400
[<ffffffff8108e1ee>] ? prepare_to_wait+0x4e/0x80
[<ffffffffa01b7745>] megasas_issue_blocked_cmd+0x85/0xc0 [megaraid_sas]
[<ffffffff8108df00>] ? autoremove_wake_function+0x0/0x40
[<ffffffffa01bacc3>] megasas_start_aen+0x143/0x2d0 [megaraid_sas]
[<ffffffffa01b8500>] ? megasas_isr+0x0/0x1f0 [megaraid_sas]
[<ffffffffa01c1b13>] megasas_probe_one+0xc2d/0xee4 [megaraid_sas]
[<ffffffff8127f4e7>] local_pci_probe+0x17/0x20
[<ffffffff812806d1>] pci_device_probe+0x101/0x120
[<ffffffff81339e52>] ? driver_sysfs_add+0x62/0x90
[<ffffffff81339ff0>] driver_probe_device+0xa0/0x2a0
[<ffffffff8133a29b>] __driver_attach+0xab/0xb0
[<ffffffff8133a1f0>] ? __driver_attach+0x0/0xb0
[<ffffffff81339254>] bus_for_each_dev+0x64/0x90
[<ffffffff81339d8e>] driver_attach+0x1e/0x20
[<ffffffff81339690>] bus_add_driver+0x200/0x300
[<ffffffff8133a5c6>] driver_register+0x76/0x140
[<ffffffff81174e7b>] ? __register_chrdev+0x8b/0xf0
[<ffffffff81280936>] __pci_register_driver+0x56/0xd0
[<ffffffffa01cc000>] ? megasas_init+0x0/0x1ea [megaraid_sas]
[<ffffffffa01cc000>] ? megasas_init+0x0/0x1ea [megaraid_sas]
[<ffffffffa01cc0a1>] megasas_init+0xa1/0x1ea [megaraid_sas]
[<ffffffff8100204c>] do_one_initcall+0x3c/0x1d0
[<ffffffff810ac7ff>] sys_init_module+0xdf/0x250
[<ffffffff8100b172>] system_call_fastpath+0x16/0x1b
DRHD: handling fault status reg 202
INTR-REMAP: Request device [[01:00.0] fault index 33
INTR-REMAP:[fault reason 34] Present field in the IRTE entry is clear
DRHD: handling fault status reg 302
INTR-REMAP: Request device [[01:00.0] fault index 35
INTR-REMAP:[fault reason 34] Present field in the IRTE entry is clear
DRHD: handling fault status reg 402
INTR-REMAP: Request device [[01:00.0] fault index 31
INTR-REMAP:[fault reason 34] Present field in the IRTE entry is clear
DRHD: handling fault status reg 502
INTR-REMAP: Request device [[01:00.0] fault index 37
INTR-REMAP:[fault reason 34] Present field in the IRTE entry is clear
INFO: task insmod:201 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
insmod D 0000000000000000 0 201 1 0x00000000
ffff880006f6bbb8 0000000000000082 ffff880004015f40 0000000000000000
0000000000000000 0000000000000082 ffff880006f6bb98 ffffffff8105c5cc
ffff880006f4b038 ffff880006f6bfd8 000000000000f558 ffff880006f4b038
Call Trace:
[<ffffffff8105c5cc>] ? try_to_wake_up+0xcc/0x400
[<ffffffff8108e1ee>] ? prepare_to_wait+0x4e/0x80
[<ffffffffa01b7745>] megasas_issue_blocked_cmd+0x85/0xc0 [megaraid_sas]
[<ffffffff8108df00>] ? autoremove_wake_function+0x0/0x40
[<ffffffffa01bacc3>] megasas_start_aen+0x143/0x2d0 [megaraid_sas]
[<ffffffffa01b8500>] ? megasas_isr+0x0/0x1f0 [megaraid_sas]
[<ffffffffa01c1b13>] megasas_probe_one+0xc2d/0xee4 [megaraid_sas]
[<ffffffff8127f4e7>] local_pci_probe+0x17/0x20
[<ffffffff812806d1>] pci_device_probe+0x101/0x120
[<ffffffff81339e52>] ? driver_sysfs_add+0x62/0x90
[<ffffffff81339ff0>] driver_probe_device+0xa0/0x2a0
[<ffffffff8133a29b>] __driver_attach+0xab/0xb0
[<ffffffff8133a1f0>] ? __driver_attach+0x0/0xb0
[<ffffffff81339254>] bus_for_each_dev+0x64/0x90
[<ffffffff81339d8e>] driver_attach+0x1e/0x20
[<ffffffff81339690>] bus_add_driver+0x200/0x300
[<ffffffff8133a5c6>] driver_register+0x76/0x140
[<ffffffff81174e7b>] ? __register_chrdev+0x8b/0xf0
[<ffffffff81280936>] __pci_register_driver+0x56/0xd0
[<ffffffffa01cc000>] ? megasas_init+0x0/0x1ea [megaraid_sas]
[<ffffffffa01cc000>] ? megasas_init+0x0/0x1ea [megaraid_sas]
[<ffffffffa01cc0a1>] megasas_init+0xa1/0x1ea [megaraid_sas]
[<ffffffff8100204c>] do_one_initcall+0x3c/0x1d0
[<ffffffff810ac7ff>] sys_init_module+0xdf/0x250
[<ffffffff8100b172>] system_call_fastpath+0x16/0x1b

RHEL6.1-20110311.3 installed, system reserved for debug.

*** Bug 688334 has been marked as a duplicate of this bug. ***

Created attachment 485874 [details]
serial console log
Created attachment 485875 [details]
sosreport
------- Comment From masbock.com 2011-03-17 13:50 EDT------- From the call traces it looks like the driver is waiting for an interrupt in response to megasas_issue_blocked_cmd. Perhaps this is an issue with MSI-x initialization in the kdump kernel. Therefore it might be worthwhile to boot both kernels with megaraid_sas.disable_msix=1 to see if the problem goes away. (This is not meant as a workaround, only a debugging step.) ------- Comment From masbock.com 2011-03-17 20:24 EDT------- This is a snippet from the kernel messages: It looks like megaraid_sas allocated a legacy (non-msix) interrupt vector. I'd guess that the device still thinks it's using MSI-x The following is a vt-d error. It is the result of devices still using MSI vectors activate by the main kernel for which there are no interrupt remapping table entries. This one could have been caused by the MSI in response to the megaraid device to the megasas_issue_blocked_cmd. (But it could also belong to the network device) Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team. New Contents: Cause Loading megaraid_sas driver and hardware in kdump kernel. Consequence Kdump kernel is unable to process any further, and insmod command is blocked with those messages. INFO: task insmod:201 blocked for more than 120 seconds. Workaround None found yet. ------- Comment From shubgoya.com 2011-03-18 08:25 EDT------- There is one more addition to the platform list affected by this issue. Below are the details:- x3650M3 - 01:00.0 RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS 9240 (rev 03) Thx, Shubham ------- Comment From masbock.com 2011-03-18 14:27 EDT------- (In reply to comment #16) > From the call traces it looks like the driver is waiting for an interrupt in > response to megasas_issue_blocked_cmd. Perhaps this is an issue with MSI-x > initialization in the kdump kernel. 
> Therefore it might be worthwhile to boot both kernels with > megaraid_sas.disable_msix=1 to see if the problem goes away. (This is not meant > as a workaround, only a debugging step.) The boot parameter really is megaraid_sas.msix_disable=1 (I had it backwards). Shubham just verified that this works. This means that if the megaraid_sas driver doesn't use MSI-x the kdump kernel boots. As a next step we should back port the following patch: [PATCH 5/15] megaraid_sas: Fix probe_one to clear MSI-X flags in kdump http://marc.info/?l=linux-scsi&m=129816856305895&w=2 to see if that fixes the issues. ------- Comment From masbock.com 2011-03-21 13:21 EDT------- (In reply to comment #21) > (In reply to comment #20) > > (In reply to comment #16) > > As a next step we should back port the following patch: > > > > [PATCH 5/15] megaraid_sas: Fix probe_one to clear MSI-X flags in kdump > > http://marc.info/?l=linux-scsi&m=129816856305895&w=2 > > > > to see if that fixes the issues. > > > Is this patch accepted in mainline? if we want this bug fixed in 6.1, we need > to pack port it and test it asap. please give an estimate as when the pack port > patch will be ready. > I thought I had seen the patch last Friday on kernel.org in 2.6.39-git8. But I don't see it anymore in git10. Would it be possible to have somebody from LSI to comment on the status of this patch in main-line? We can provide a back port and test results for the patch quickly. ------- Comment From masbock.com 2011-03-21 13:54 EDT------- I just verified that the patch is indeed in the latest main-line kernel git tree. I can be found here: http://www.kernel.org/diff/diffview.cgi?file=/pub/linux/kernel/v2.6/snapshots/patch-2.6.38-git10.bz2 Technical note updated. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team. Diffed Contents: @@ -1,9 +1,3 @@ -Cause - Loading megaraid_sas driver and hardware in kdump kernel. 
-Consequence - Kdump kernel is unable to process any further, and insmod command is blocked with those messages. +Loading the megaraid_sas driver on the kdump kernel will result in the insmod command being blocked, returning messages similar to: -INFO: task insmod:201 blocked for more than 120 seconds. +INFO: task insmod:201 blocked for more than 120 seconds.- -Workaround - None found yet. Created attachment 486704 [details] Patch to reset msix in megaraid_sas driver in kdump kernel ------- Comment on attachment From masbock.com 2011-03-21 19:24 EDT------- This is a back port of a patch that resets msix in the megaraid_sas driver in the kdump kernel. The patch was originally posted by Adam Radford from LSI here: http://marc.info/?l=linux-scsi&m=129816856305895&w=2 "Back port" is really an exaggeration, it's only shifted by 4 lines. This patch depends on the "reset_devices" boot parameter being passed. This parameter is by default passed in the kdump kernel. Therefore no special documentation is needed. It tested this patch with the linux-2.6.32-122.el6.x86_64 kernel and obtained a vmcore. The megaraid_sas driver worked fine in the kdump kernel with the patch. This patches fixes the issue. ------- Comment From tpnoonan.com 2011-03-22 15:58 EDT------- red hat please consider as exception for rhel6.1, thnx Hello, This is requested as a blocker for 6.1. Thank You Joe Kachuck I posted the patch for review. This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release. Deleted Technical Notes Contents. 
Old Contents:
Loading the megaraid_sas driver on the kdump kernel will result in the insmod command being blocked, returning messages similar to:

INFO: task insmod:201 blocked for more than 120 seconds.

------- Comment From ramesyr1.com 2011-03-29 00:07 EDT-------
I tested the patch on the Linux 2.6.32-122.el6.x86_64 kernel. After triggering kdump, the megaraid_sas driver loads without any issues, but no vmcore is generated. I could only see an empty directory under /var/crash with no vmcore in it.

FYI,
[root@mx3850x5 crash]# cd 127.0.0.1-2011-03-23-20:17:28 ;ll
total 0

I also noticed from the console log that, while loading the SELinux policy, the makedumpfile process is killed by the OOM killer while it is trying to save the vmcore.

Based on the above observation, I disabled the SELinux policy and triggered kdump again, but the system hung (console log for the hang point attached). After a long time I rebooted the machine manually, and the interesting thing was that it had generated a vmcore of zero size.

FYI,
----SELinux disabled----
[root@mx3850x5 ~]# getenforce
Permissive

-----Zero size vmcore-------
[root@mx3850x5 ~]# cd /var/crash/2011-03-23-21:44/ ; ll
total 0
-rw-------. 1 root root 0 Mar 23 21:44 vmcore.

Machine Type = X3850-X5
system memory = 256GB

* complete serial console log is attached.

Created attachment 488322 [details]
serial console log
------- Comment (attachment only) From ramesyr1.com 2011-03-29 00:11 EDT-------
(In reply to comment #25)
> ------- Comment From ramesyr1.com 2011-03-29 00:07 EDT-------
> I tested the patch on Linux 2.6.32-122.el6.x86_64 kernel, after triggering the
> kdump megaraid_sas driver is loading without any issues but vmcore is not
> generated. I could only see an empty directory under /var/crash with no vmcore
> in it.
> system memory = 256GB

This looks like another issue. Before we open another bug report, please try to reserve a larger memory size for the kdump kernel. It can happen in some corner cases that the default value isn't enough. Try some larger values; the kernel option is crashkernel, so for example crashkernel=256M@256M or higher.

------- Comment From ramesyr1.com 2011-03-29 08:35 EDT-------
(In reply to comment #32)
>
> This looks like another issue, please try, before we open another bug report, to
> reserve a larger memory size for the kdump kernel. It can happen in some corner
> cases that default value isn't enough. Try some larger values, the kernel
> option is crashkernel, so for example crashkernel=256M@256M or higher.

I tried with crashkernel=512M and the issue goes away; however, vmcore generation fails with crashkernel=auto.

But according to the kdump documentation at Documentation/kdump/kdump.txt from RHEL6, crashkernel=auto reserves memory according to the following table:

Memory size     Reserved memory
===========     ===============
[4G, 12G)       256M
[12G, 128G)     512M
[128G, 256G)    768M
[256G, 378G)    1024M
[378G, 512G)    1536M
[512G, 768G)    2048M
[768G, )        3072M

According to the above table, 1024M should have been reserved for the crash kernel when crashkernel=auto is specified, but it is reserving only 129M.

Since the auto feature does not work as described in the kdump documentation, this looks to be a BUG in the crashkernel=auto feature. Shall I go ahead and raise a separate BUG for this issue?

(In reply to comment #28)
> Since the auto feature does not work as expected in kdump documentation, this
> looks to be BUG in crashkernel=auto feature.
> Shall I go ahead and raise a
> separate BUG for this issue?

Yes please, open a new Bug for this issue.

I'm seeing the same problem on RHEL 5.6

(In reply to comment #30)
> I'm seeing the same problem on RHEL 5.6

Do you mean the MSI-X issue or the problem with the size of the crashkernel?

The MSI-X issue.

(In reply to comment #32)
> The msi-x issue.

Let us continue this RHEL5 issue here -> bz#692099#c5

------- Comment From iranna.ankad.com 2011-04-06 09:39 EDT-------
(In reply to comment #35)
> (In reply to comment #28)
> > Since the auto feature does not work as expected in kdump documentation, this
> > looks to be BUG in crashkernel=auto feature. Shall I go ahead and raise a
> > separate BUG for this issue?
>
> Yes please, open a new Bug for this issue.

Hello Red Hat, FYI,
We have opened the bugzilla below to track this new issue.

Bug 71062 - RH692764: crashkernel=auto feature does not work as per the kdump documentation

Now, coming back to the original issue for which we are using this bugzilla (i.e. kdump hang on megaraid), the patch which we have submitted/tested is still under your review. Please let us know further status on that.

Thanks!

(In reply to comment #34)
> Hello Red Hat FYI,
> We have opened below bugzilla to track this new issue.
>
> Bug 71062 - RH692764 :crashkernel=auto feature does not work as per the
> kdump documentation

I hope I added the right person to this bug.

> Now, coming back to the original issue for which we are using this bugzilla
> (i.e kdump hang on megaraid),
> the patch which we have submitted/tested is still under your review. Please let
> us know further status on that.

The patch is in POSTED state; it will be updated accordingly when the patch gets included.

This fix is approved and planned for inclusion in snapshot 3.

Patch(es) available on kernel-2.6.32-130.el6

------- Comment From iranna.ankad.com 2011-04-08 06:43 EDT-------
(In reply to comment #42)
> This fix is approved and planned for inclusion in snapshot 3.
> Patch(es) available on kernel-2.6.32-130.el6

Red Hat,
Thanks for the updates. We will retest once snap3 is available.

------- Comment From ramesyr1.com 2011-04-14 07:46 EDT-------
I verified kdump on snap3; kdump worked fine and generated the vmcore successfully.

Thanks!

------- Comment From iranna.ankad.com 2011-04-14 10:16 EDT-------
(In reply to comment #46)
> I verified the kdump on snap3 , kdump worked fine and generated the vmcore
> successfully.
>
> Thanks !

Ramesh,
Thanks for the snap3 results.

Red Hat,
We can close this bug now. Thanks!

------- Comment From iranna.ankad.com 2011-04-14 10:16 EDT-------
Closing...

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-0542.html
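
For reference, the crashkernel=auto table quoted earlier in this thread can be written as a small lookup, which makes it easy to check what reservation the documentation implies for a given machine. This is only a minimal Python sketch of the quoted table, not the kernel's implementation: the names AUTO_TABLE and expected_reservation_m are hypothetical, and returning None below 4G is an assumption, since the quoted table has no row for that range.

```python
# Thresholds copied from the crashkernel=auto table quoted in this thread
# (attributed there to Documentation/kdump/kdump.txt in RHEL6).
# Each entry is (inclusive lower bound in G, reserved memory in M).
AUTO_TABLE = [
    (4, 256),
    (12, 512),
    (128, 768),
    (256, 1024),
    (378, 1536),
    (512, 2048),
    (768, 3072),
]


def expected_reservation_m(system_memory_g):
    """Return the reservation (in M) the quoted table implies.

    Returns None below 4G; that is an assumption of this sketch,
    because the quoted table does not cover that range.
    """
    reserved = None
    for lower_bound_g, reserved_m in AUTO_TABLE:
        # Rows are sorted, so the last row whose lower bound is
        # reached is the matching half-open interval.
        if system_memory_g >= lower_bound_g:
            reserved = reserved_m
    return reserved


# The machine reported in this bug has 256GB of system memory.
print(expected_reservation_m(256))
```

For the 256GB system in this report the lookup gives 1024, which matches the 1024M the reporter expected rather than the 129M that was observed.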