505555 – Rare: When zFCP.ko loads, kernel panics.

Keywords:
Status:	CLOSED NOTABUG
Alias:	None
Product:	Red Hat Enterprise Linux 5
Classification:	Red Hat
Component:	kernel
Sub Component:
Version:	5.3
Hardware:	s390x
OS:	Linux
Priority:	high
Severity:	high
Target Milestone:	rc
Target Release:	---
Assignee:	Hans-Joachim Picht
QA Contact:	Red Hat Kernel QE team
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	533192

Reported:	2009-06-12 12:34 UTC by Arlinton Bourne
Modified:	2010-12-08 14:06 UTC
CC List:	8 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2010-11-30 13:51:43 UTC
Target Upstream Version:
Embargoed:
Dependent Products:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
IBM Linux Technology Center	54569	0	None	None	None	Never

Description Arlinton Bourne 2009-06-12 12:34:16 UTC

Description of problem:
When zFCP loads (in anaconda or on a live system), the kernel panics. I can reproduce this on two of my zFCP-enabled guests, and on my guests only.

Here's the panic:

scsi0 : zfcp 
operand exception: 0015 [#1]
CPU:    0    Tainted: G
Process ksoftirqd/0 (pid: 3, task: 00000000007fe618, ksp: 0000000001f1fd90)
Krnl PSW : 0704000180000000 0000000040897130 (tiqdio_tl+0x34c/0x267c [qdio])
Krnl GPRS: 0000000000000002 000000000001000b 00000000ffffffff 00000000ffffffff
           00000000ffffffff 000000000001000b 00000000408a8c00 0000000000000000
           000000003e005000 0000000000000000 0000000000000040 00000000408a4818
           0000000040888000 000000004089b740 0000000001f0bec8 0000000001f0bde0
Krnl Code: b2 22 00 50 88 50 00 1c a7 f4 00 aa bf bf 81 d0 a7 74 00 a6
Call Trace:
([<00000000001a5ff8>] ccw_device_timeout+0x0/0x84)
 [<0000000000043eac>] tasklet_hi_action+0x108/0x1cc
 [<00000000000433da>] __do_softirq+0xba/0x190
 [<000000000001ec8a>] do_softirq+0x8a/0xb0
([<00000003003b0007>] 0x3003b0007)
 [<000000000004355c>] ksoftirqd+0xac/0x13c
 [<0000000000055d94>] kthread+0x118/0x14c
 [<000000000001859e>] kernel_thread_starter+0x6/0xc
 [<0000000000018598>] kernel_thread_starter+0x0/0xc
 
 <0>Kernel panic - not syncing: Fatal exception in interrupt 
01: HCPGSP2629I The virtual machine is placed in CP mode due to a SIGP stop from
 CPU 00.
00: HCPGIR450W CP entered; disabled wait PSW 00020001 80000000 00000000 00015EF8

Relevant FCP options:
FCP_1="0.0.4206 0X01 0X5005076300C4156D 0X0 0X5308000000000000"
FCP_2="0.0.4207 0X02 0X5005076300C4156D 0X1 0X5309000000000000"
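For anyone comparing these entries against other guests, here is a minimal Python sketch that splits an FCP_n line into its five fields. The field names (device, SCSI ID, WWPN, SCSI LUN, FCP LUN) are an assumption based on the usual RHEL 5 parameter-file layout, and `FcpEntry` and `parse_fcp_line` are hypothetical helper names, not part of any existing tool.

```python
import re
from typing import NamedTuple


class FcpEntry(NamedTuple):
    """One FCP_n entry; field names are assumed, not taken from this report."""
    device: str    # CCW bus ID of the FCP adapter, e.g. 0.0.4206
    scsi_id: int
    wwpn: int      # world wide port name of the storage port
    scsi_lun: int
    fcp_lun: int


# Matches lines of the form FCP_<n>="<five whitespace-separated fields>"
_FCP_LINE = re.compile(r'^FCP_(\d+)="([^"]*)"$')


def parse_fcp_line(line: str) -> FcpEntry:
    """Parse a single FCP_n="..." line into an FcpEntry."""
    match = _FCP_LINE.match(line.strip())
    if match is None:
        raise ValueError(f"not an FCP_n line: {line!r}")
    fields = match.group(2).split()
    if len(fields) != 5:
        raise ValueError(f"expected 5 fields, got {len(fields)}: {line!r}")
    device, *numbers = fields
    # int(..., 16) accepts both the 0x and the 0X prefix used above.
    scsi_id, wwpn, scsi_lun, fcp_lun = (int(n, 16) for n in numbers)
    return FcpEntry(device, scsi_id, wwpn, scsi_lun, fcp_lun)


if __name__ == "__main__":
    entry = parse_fcp_line(
        'FCP_1="0.0.4206 0X01 0X5005076300C4156D 0X0 0X5308000000000000"'
    )
    print(entry.device, hex(entry.wwpn), hex(entry.fcp_lun))
```

Parsed this way, both entries above point at the same WWPN through two different adapters (0.0.4206 and 0.0.4207), each with its own FCP LUN.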

If you need access to the guests ping me via e-mail :)

Comment 2 Jan Glauber 2009-07-06 12:27:46 UTC

We've fixed a similar problem in RHEL5.3. What is the exact kernel version that panics?

Comment 3 Arlinton Bourne 2009-07-06 22:20:17 UTC

(In reply to comment #2)
> We've fixed a similar problem in RHEL5.3. What is the exact kernel version that
> panics?  

This is from kicking off an install from anaconda.

Kernel Version 2.6.18-128.el5 (5.3 Anaconda):
Starting graphical installation... 
scsi0 : zfcp 
operand exception: 0015 [#1]
CPU:    0    Tainted: G
Process ksoftirqd/0 (pid: 3, task: 00000000007fe618, ksp: 0000000001f1fd90)
Krnl PSW : 0704000180000000 0000000040897130 (tiqdio_tl+0x34c/0x267c [qdio])
Krnl GPRS: 0000000000000002 0000000000010007 00000000ffffffff 00000000ffffffff
           00000000ffffffff 0000000000010007 00000000408a8c00 0000000000000000
           000000003e78d000 0000000000000000 0000000000000040 00000000408a4818
           0000000040888000 000000004089b740 0000000001f0bec8 0000000001f0bde0
Krnl Code: b2 22 00 50 88 50 00 1c a7 f4 00 aa bf bf 81 d0 a7 74 00 a6
Call Trace:
([<00000000001a5ff8>] ccw_device_timeout+0x0/0x84)
 [<0000000000043eac>] tasklet_hi_action+0x108/0x1cc
 [<00000000000433da>] __do_softirq+0xba/0x190
 [<000000000001ec8a>] do_softirq+0x8a/0xb0
([<00000003003b0007>] 0x3003b0007)
 [<000000000004355c>] ksoftirqd+0xac/0x13c
 [<0000000000055d94>] kthread+0x118/0x14c
 [<000000000001859e>] kernel_thread_starter+0x6/0xc
 [<0000000000018598>] kernel_thread_starter+0x0/0xc
 
 <0>Kernel panic - not syncing: Fatal exception in interrupt 
01: HCPGSP2629I The virtual machine is placed in CP mode due to a SIGP stop from
 CPU 00.
00: HCPGIR450W CP entered; disabled wait PSW 00020001 80000000 00000000 00015EF8


Kernel Version 2.6.18-156.el5 (5.4 Nightly Anaconda): 
The VNC server is now running. 
Starting graphical installation... 
operand exception: 0015 [#1]
CPU: 1 Tainted: G      2.6.18-156.el5 #1
Process ksoftirqd/1 (pid: 5, task: 0000000001f69888, ksp: 0000000001f6fd90)
Krnl PSW : 0704000180000000 000000004088eb3c (tiqdio_tl+0x34c/0x287c [qdio])
Krnl GPRS: 0000000000000002 0000000000010007 00000000ffffffff 00000000ffffffff
           00000000ffffffff 0000000000010007 00000000408a0a00 0000000000000000
           000000003e0a8000 0000000000000000 0000000000000040 000000004089c620
           000000004087f000 0000000040893378 0000000001f4fec8 0000000001f4fdd8
Krnl Code: b2 22 00 50 88 50 00 1c a7 f4 00 aa bf bf 81 d0 a7 74 00 a6
Call Trace:
([<000000000027d798>] 0x27d798)
 [<0000000000045078>] tasklet_hi_action+0x108/0x1cc
 [<00000000000445a6>] __do_softirq+0xba/0x190
 [<000000000001ecda>] do_softirq+0x8a/0xb0
([<00000003003cc007>] 0x3003cc007)
 [<0000000000044728>] ksoftirqd+0xac/0x13c
 [<0000000000057018>] kthread+0x118/0x14c
 [<00000000000185ae>] kernel_thread_starter+0x6/0xc
 [<00000000000185a8>] kernel_thread_starter+0x0/0xc
 
 <0>Kernel panic - not syncing: Fatal exception in interrupt 
00: HCPGSP2629I The virtual machine is placed in CP mode due to a SIGP stop from
 CPU 01.
01: HCPGIR450W CP entered; disabled wait PSW 00020001 80000000 00000000 0001F02A

Comment 6 IBM Bug Proxy 2009-07-16 12:30:39 UTC

------- Comment From ursula.braun.com 2009-07-16 08:28 EDT-------
Is it possible to take a dump after the panic occurred?

Comment 9 David Kovalsky 2009-09-03 08:34:01 UTC

Arlinton, can you provide the dump requested in comment #6?

Comment 10 IBM Bug Proxy 2009-09-09 09:40:33 UTC

------- Comment From mgrf.com 2009-09-09 05:38 EDT-------
(In reply to comment #9)
> Arlinton, can you provide the dump requested in comment #6?
>

Please provide a dump so that we can debug this. Thanks.

Comment 11 Arlinton Bourne 2009-09-09 15:10:04 UTC

(In reply to comment #10)
> ------- Comment From mgrf.com 2009-09-09 05:38 EDT-------
> (In reply to comment #9)
> > Arlinton, can you provide the dump requested in comment #6?
> >
> 
> Please provide dump to enable for debugging, Thx  

Unfortunately, during the last outage we did a power-on reset of the z9 and the problem has 'disappeared' (this has happened before, hence the word "Rare" in the summary). Stay tuned while I try to reproduce the issue.

Comment 12 Arlinton Bourne 2009-09-12 08:37:13 UTC

(In reply to comment #10)
> ------- Comment From mgrf.com 2009-09-09 05:38 EDT-------
> (In reply to comment #9)
> > Arlinton, can you provide the dump requested in comment #6?
> >
> 
> Please provide dump to enable for debugging, Thx  

For when this becomes reproducible again: what is the recommended way to dump the memory?

Comment 17 Chris Ward 2009-11-18 10:08:17 UTC

@IBM (ursula.braun.com)

Please review and respond to comment #12. Thanks.

Comment 18 IBM Bug Proxy 2009-11-18 10:40:47 UTC

------- Comment From ursula.braun.com 2009-11-18 05:39 EDT-------
Dump handling for RHEL5 is described here:
http://www.ibm.com/developerworks/linux/linux390/october2005_documentation.html

Comment 23 IBM Bug Proxy 2010-04-01 11:00:48 UTC

------- Comment From mgrf.com 2010-04-01 06:57 EDT-------
(In reply to comment #4)
> Dump handling for RHEL5 is described here:
> http://www.ibm.com/developerworks/linux/linux390/october2005_documentation.html

Hello Red Hat,
any more questions?

Comment 27 IBM Bug Proxy 2010-09-21 09:31:50 UTC

------- Comment From  2010-09-21 05:26 EDT-------
Hello Red Hat,

Can we close this bug if it is no longer reproducible?

Thanks
Muni
