Description of problem:
On a PRIMERGY RX600S4, we observe that kdump over NFS fails.

Version-Release number of selected component (if applicable):
kernel 2.6.32-220.el6
kexec-tools-2.0.0-209.el6

How reproducible:
always

Steps to Reproduce:
1. configure kdump via NFS
2. trigger a crash dump via alt-sysrq

Actual results:
See description

Expected results:
kdump successfully written

Additional info:
kdump on local disk succeeds
Created attachment 538584 [details] sosreport
Created attachment 538589 [details] serial log of kdump attempt

You can see that the last messages displayed are from FS-Cache and USB devices.

FS-Cache: Netfs 'nfs' registered for caching
input: Avocent FSC A3C40047297 as /devices/pci0000:00/0000:00:1d.1/usb3/3-1/3-1:1.0/input/input5
generic-usb 0003:0624:0327.0003: input,hidraw2: USB HID v1.10 Keyboard [Avocent FSC A3C40047297] on usb-0000:00:1d.1-1/input0
input: Avocent FSC A3C40047297 as /devices/pci0000:00/0000:00:1d.1/usb3/3-1/3-1:1.1/input/input6
generic-usb 0003:0624:0327.0004: input,hidraw3: USB HID v1.10 Mouse [Avocent FSC A3C40047297] on usb-0000:00:1d.1-1/input1
Hello, I have two questions:

1. What are the following messages in the first kernel?
2. Does this only happen when you dump over NFS?
1. See dmesg file in sosreport:

input: Avocent FSC A3C40047297 as /devices/pci0000:00/0000:00:1d.1/usb3/3-1/3-1:1.1/input/input6
generic-usb 0003:0624:0327.0004: input,hidraw3: USB HID v1.10 Mouse [Avocent FSC A3C40047297] on usb-0000:00:1d.1-1/input1
alloc irq_desc for 40 on node -1
alloc kstat_irqs on node -1
lpfc 0000:64:00.0: irq 40 for MSI/MSI-X
ata_piix 0000:00:1f.2: version 2.13
ata_piix 0000:00:1f.2: PCI INT A -> GSI 17 (level, low) -> IRQ 17
ata_piix 0000:00:1f.2: MAP [ P0 P2 P1 P3 ]
ata_piix 0000:00:1f.2: setting latency timer to 64
scsi3 : ata_piix
scsi4 : ata_piix
ata1: SATA max UDMA/133 cmd 0x1f0 ctl 0x3f6 bmdma 0x1880 irq 14
ata2: SATA max UDMA/133 cmd 0x170 ctl 0x376 bmdma 0x1888 irq 15
lpfc 0000:64:00.0: 0:1303 Link Up Event x1 received Data: x1 x1 x8 x2 x0 x0 0
lpfc 0000:64:00.0: 0:(0):2858 FLOGI failure Status:x3/x18 TMO:x0
lpfc 0000:64:00.0: 0:(0):2858 FLOGI failure Status:x3/x18 TMO:x0

2. Yes, so I'm told (ssh to be clarified, dump on local disk works fine).

I was just told that the NFS dump worked on another similar machine. The difference between the "good" and "bad" case is that in the "bad" case, a number of additional controllers were in the system:

1 Emulex LPe 1150
1 LSI SAS 8880 EM2
1 Intel Pro 1000 PT Quad Port
1 Intel 10 GB XF SR LAN Controller

That, together with the information under 1.) above, makes me think that lpfc may be involved. Another possible error cause is the presence of the additional LAN controllers that may be causing confusion about the LAN interface.
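To compare the "good" and "bad" machines, it may help to pull only the lpfc FLOGI failure lines out of each dmesg capture. The following is a minimal sketch, not part of the original report: the function name `flogi_failures` and the regular expression are hypothetical, and the pattern assumes the message layout seen in the dmesg excerpt above.

```python
import re

# Assumed layout, taken from the dmesg excerpt in this report:
#   lpfc 0000:64:00.0: 0:(0):2858 FLOGI failure Status:x3/x18 TMO:x0
FLOGI_RE = re.compile(
    r"\blpfc (?P<pci>[0-9a-f:.]+): .*FLOGI failure Status:(?P<status>\S+)"
)

def flogi_failures(lines):
    """Return a (pci_address, status) tuple for each lpfc FLOGI failure line."""
    hits = []
    for line in lines:
        match = FLOGI_RE.search(line)
        if match:
            hits.append((match.group("pci"), match.group("status")))
    return hits

# Lines copied from the dmesg excerpt; only the last two are FLOGI failures.
sample = [
    "lpfc 0000:64:00.0: irq 40 for MSI/MSI-X",
    "lpfc 0000:64:00.0: 0:1303 Link Up Event x1 received Data: x1 x1 x8 x2 x0 x0 0",
    "lpfc 0000:64:00.0: 0:(0):2858 FLOGI failure Status:x3/x18 TMO:x0",
    "lpfc 0000:64:00.0: 0:(0):2858 FLOGI failure Status:x3/x18 TMO:x0",
]

for pci, status in flogi_failures(sample):
    print(pci, status)
```

Running the same filter over the dmesg of a machine without the Emulex controller would show whether the FLOGI failures are specific to the "bad" configuration.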
The problem also occurs with kdump over ssh.
I have done several kdump attempts on the said-to-be-affected machine, and all worked fine. Hang on.
Given that this also occurs with kdump over ssh, I'm going to declare this "not a NFS bug" and reset the owner back to kernel-mgr. Also cc'ing Rob Evers since he seems to have done some work recently on the lpfc driver.
The problem isn't reproducible any more. I am sorry for bothering you.