Description of problem:
When zFCP loads (anaconda or live system), the kernel panics. I can reproduce this on two of my zFCP-enabled guests, and on my guests only.

Here's the panic:

scsi0 : zfcp
operand exception: 0015 [#1]
CPU: 0 Tainted: G
Process ksoftirqd/0 (pid: 3, task: 00000000007fe618, ksp: 0000000001f1fd90)
Krnl PSW : 0704000180000000 0000000040897130 (tiqdio_tl+0x34c/0x267c [qdio])
Krnl GPRS: 0000000000000002 000000000001000b 00000000ffffffff 00000000ffffffff
           00000000ffffffff 000000000001000b 00000000408a8c00 0000000000000000
           000000003e005000 0000000000000000 0000000000000040 00000000408a4818
           0000000040888000 000000004089b740 0000000001f0bec8 0000000001f0bde0
Krnl Code: b2 22 00 50 88 50 00 1c a7 f4 00 aa bf bf 81 d0 a7 74 00 a6
Call Trace:
([<00000000001a5ff8>] ccw_device_timeout+0x0/0x84)
 [<0000000000043eac>] tasklet_hi_action+0x108/0x1cc
 [<00000000000433da>] __do_softirq+0xba/0x190
 [<000000000001ec8a>] do_softirq+0x8a/0xb0
([<00000003003b0007>] 0x3003b0007)
 [<000000000004355c>] ksoftirqd+0xac/0x13c
 [<0000000000055d94>] kthread+0x118/0x14c
 [<000000000001859e>] kernel_thread_starter+0x6/0xc
 [<0000000000018598>] kernel_thread_starter+0x0/0xc
<0>Kernel panic - not syncing: Fatal exception in interrupt
01: HCPGSP2629I The virtual machine is placed in CP mode due to a SIGP stop from CPU 00.
00: HCPGIR450W CP entered; disabled wait PSW 00020001 80000000 00000000 00015EF8

Relevant FCP options:
FCP_1="0.0.4206 0X01 0X5005076300C4156D 0X0 0X5308000000000000"
FCP_2="0.0.4207 0X02 0X5005076300C4156D 0X1 0X5309000000000000"

If you need access to the guests, ping me via e-mail :)
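For anyone comparing the FCP options above against the storage configuration, here is a minimal Python sketch that splits an FCP_n value into its five fields. The field order (device bus ID, SCSI ID, WWPN, SCSI LUN, FCP LUN) is an assumption based on the usual RHEL 5 parameter-file layout and is not confirmed anywhere in this report. The names FcpEntry and parse_fcp are hypothetical helpers, not part of anaconda or zfcp.

```python
from typing import NamedTuple


class FcpEntry(NamedTuple):
    """One FCP_n parameter-file entry (field order is an assumption)."""
    device: str    # CCW device bus ID, e.g. "0.0.4206"
    scsi_id: int   # SCSI ID
    wwpn: int      # world wide port name of the storage port
    scsi_lun: int  # SCSI LUN as seen by Linux
    fcp_lun: int   # FCP LUN on the storage side


def parse_fcp(value: str) -> FcpEntry:
    """Split an FCP_n value into five whitespace-separated fields.

    All fields except the device bus ID are parsed as hexadecimal;
    int(..., 16) accepts both the "0x" and "0X" prefixes.
    """
    fields = value.split()
    if len(fields) != 5:
        raise ValueError(f"expected 5 fields, got {len(fields)}: {value!r}")
    device, scsi_id, wwpn, scsi_lun, fcp_lun = fields
    return FcpEntry(
        device,
        int(scsi_id, 16),
        int(wwpn, 16),
        int(scsi_lun, 16),
        int(fcp_lun, 16),
    )


if __name__ == "__main__":
    entry = parse_fcp("0.0.4206 0X01 0X5005076300C4156D 0X0 0X5308000000000000")
    print(entry.device, hex(entry.wwpn), hex(entry.fcp_lun))
```

Parsing both entries this way makes it easy to see that FCP_1 and FCP_2 use different devices (0.0.4206 and 0.0.4207) but the same WWPN.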
We've fixed a similar problem in RHEL5.3. What is the exact kernel version that panics?
(In reply to comment #2)
> We've fixed a similar problem in RHEL5.3. What is the exact kernel version that
> panics?

This is from kicking off an install from anaconda.

Kernel Version 2.6.18-128.el5 (5.3 Anaconda):

Starting graphical installation...
scsi0 : zfcp
operand exception: 0015 [#1]
CPU: 0 Tainted: G
Process ksoftirqd/0 (pid: 3, task: 00000000007fe618, ksp: 0000000001f1fd90)
Krnl PSW : 0704000180000000 0000000040897130 (tiqdio_tl+0x34c/0x267c [qdio])
Krnl GPRS: 0000000000000002 0000000000010007 00000000ffffffff 00000000ffffffff
           00000000ffffffff 0000000000010007 00000000408a8c00 0000000000000000
           000000003e78d000 0000000000000000 0000000000000040 00000000408a4818
           0000000040888000 000000004089b740 0000000001f0bec8 0000000001f0bde0
Krnl Code: b2 22 00 50 88 50 00 1c a7 f4 00 aa bf bf 81 d0 a7 74 00 a6
Call Trace:
([<00000000001a5ff8>] ccw_device_timeout+0x0/0x84)
 [<0000000000043eac>] tasklet_hi_action+0x108/0x1cc
 [<00000000000433da>] __do_softirq+0xba/0x190
 [<000000000001ec8a>] do_softirq+0x8a/0xb0
([<00000003003b0007>] 0x3003b0007)
 [<000000000004355c>] ksoftirqd+0xac/0x13c
 [<0000000000055d94>] kthread+0x118/0x14c
 [<000000000001859e>] kernel_thread_starter+0x6/0xc
 [<0000000000018598>] kernel_thread_starter+0x0/0xc
<0>Kernel panic - not syncing: Fatal exception in interrupt
01: HCPGSP2629I The virtual machine is placed in CP mode due to a SIGP stop from CPU 00.
00: HCPGIR450W CP entered; disabled wait PSW 00020001 80000000 00000000 00015EF8

Kernel Version 2.6.18-156.el5 (5.4 Nightly Anaconda):

The VNC server is now running.
Starting graphical installation...
operand exception: 0015 [#1]
CPU: 1 Tainted: G 2.6.18-156.el5 #1
Process ksoftirqd/1 (pid: 5, task: 0000000001f69888, ksp: 0000000001f6fd90)
Krnl PSW : 0704000180000000 000000004088eb3c (tiqdio_tl+0x34c/0x287c [qdio])
Krnl GPRS: 0000000000000002 0000000000010007 00000000ffffffff 00000000ffffffff
           00000000ffffffff 0000000000010007 00000000408a0a00 0000000000000000
           000000003e0a8000 0000000000000000 0000000000000040 000000004089c620
           000000004087f000 0000000040893378 0000000001f4fec8 0000000001f4fdd8
Krnl Code: b2 22 00 50 88 50 00 1c a7 f4 00 aa bf bf 81 d0 a7 74 00 a6
Call Trace:
([<000000000027d798>] 0x27d798)
 [<0000000000045078>] tasklet_hi_action+0x108/0x1cc
 [<00000000000445a6>] __do_softirq+0xba/0x190
 [<000000000001ecda>] do_softirq+0x8a/0xb0
([<00000003003cc007>] 0x3003cc007)
 [<0000000000044728>] ksoftirqd+0xac/0x13c
 [<0000000000057018>] kthread+0x118/0x14c
 [<00000000000185ae>] kernel_thread_starter+0x6/0xc
 [<00000000000185a8>] kernel_thread_starter+0x0/0xc
<0>Kernel panic - not syncing: Fatal exception in interrupt
00: HCPGSP2629I The virtual machine is placed in CP mode due to a SIGP stop from CPU 01.
01: HCPGIR450W CP entered; disabled wait PSW 00020001 80000000 00000000 0001F02A
------- Comment From ursula.braun.com 2009-07-16 08:28 EDT-------
Is it possible to take a dump after the panic occurred?
Arlinton, can you provide the dump requested in comment #6?
------- Comment From mgrf.com 2009-09-09 05:38 EDT-------
(In reply to comment #9)
> Arlinton, can you provide the dump requested in comment #6?
>

Please provide a dump to enable debugging, Thx
(In reply to comment #10)
> ------- Comment From mgrf.com 2009-09-09 05:38 EDT-------
> (In reply to comment #9)
> > Arlinton, can you provide the dump requested in comment #6?
> >
>
> Please provide a dump to enable debugging, Thx

Unfortunately, during the last outage we did a power-on reset of the z9 and the problem has 'disappeared' (this has happened before, hence the word "Rare" in the topic). Stay tuned as I try to replicate the issue.
(In reply to comment #10)
> ------- Comment From mgrf.com 2009-09-09 05:38 EDT-------
> (In reply to comment #9)
> > Arlinton, can you provide the dump requested in comment #6?
> >
>
> Please provide a dump to enable debugging, Thx

For when this becomes reproducible again, what is the recommended way to dump the memory?
@IBM (ursula.braun.com) Please review and respond to comment #12. Thanks.
------- Comment From ursula.braun.com 2009-11-18 05:39 EDT-------
Dump handling for RHEL5 is described here:
http://www.ibm.com/developerworks/linux/linux390/october2005_documentation.html
------- Comment From mgrf.com 2010-04-01 06:57 EDT-------
(In reply to comment #4)
> Dump handling for RHEL5 is described here:
> http://www.ibm.com/developerworks/linux/linux390/october2005_documentation.html

Hello Red Hat, any more questions?
------- Comment From 2010-09-21 05:26 EDT-------
Hello Red Hat,

Can we close this bug if it can no longer be reproduced?

Thanks,
Muni