Bug 157314

Summary: Kernel panic when loading qla2300 with qla2300_conf
Product: Red Hat Enterprise Linux 3
Reporter: Philip Pokorny <ppokorny>
Component: kernel
Assignee: Tom Coughlan <coughlan>
Status: CLOSED WONTFIX
QA Contact: Brian Brock <bbrock>
Severity: high
Priority: medium
Version: 3.0
CC: jparadis, mcohen, petrides
Hardware: x86_64
OS: Linux
Doc Type: Bug Fix
Last Closed: 2007-10-19 19:02:33 UTC
Attachments:
  kernel console log during boot with the kernel panic (flags: none)
  /var/log/messages with unsuccessful and successful boots (flags: none)
  qla2300_conf.o that causes the panic (flags: none)

Description Philip Pokorny 2005-05-10 16:08:35 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.2.1) Gecko/20030225

Description of problem:
When loading qla2300 with qla2300_conf, the kernel panics with the following:

Unable to handle kernel NULL pointer dereference at virtual address 0000000000000018
 printing rip:
ffffffffa00c2845
PML4 1e535067 PGD 1ffcbe067 PMD 0
Oops: 0000
CPU 0
Pid: 1377, comm: modprobe Not tainted
RIP: 0010:[<ffffffffa00c2845>]{:qla2300:qla2x00_find_matching_lun_by_num+69}
RSP: 0000:00000101ffcf9bd0  EFLAGS: 00010287
RAX: 00000101ff7ea1d0 RBX: 00000101ff7e0000 RCX: 0000000000000001
RDX: 0000000000000000 RSI: 00000101ff7ea0c0 RDI: 0000000000000000
RBP: 00000101ffc65000 R08: 0000000000000000 R09: 0000000000000001
R10: 000001020c4c6000 R11: 00000101ff7ea304 R12: 00000101ff7ea0c0
R13: 0000000000000000 R14: 0000000000000000 R15: 00000101ff7ea1c0
FS:  0000002a958cb4c0(0000) GS:ffffffff805dd4c0(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000018 CR3: 0000000000101000 CR4: 00000000000006e0
                                                                                
Call Trace: [<ffffffffa00c2de5>]{:qla2300:qla2x00_map_a_oslun+133}
       [<ffffffffa00c2bad>]{:qla2300:qla2x00_map_or_failover_oslun+45}
       [<ffffffffa00c2d43>]{:qla2300:qla2x00_map_os_luns+291}
       [<ffffffffa00c2b42>]{:qla2300:qla2x00_map_os_targets+162}
       [<ffffffffa00c262c>]{:qla2300:qla2x00_update_mp_host+140}
       [<ffffffffa014aa10>]{:qla2300:QLBoardTbl_fc+48} [<ffffffffa00bf8d5>]{:qla2300:qla2x00_cfg_path_discovery+165}
       [<ffffffffa00bf824>]{:qla2300:qla2x00_cfg_init+52}
       [<ffffffffa00ad041>]{:qla2300:qla2x00_detect+2081}
       [<ffffffffa014ad40>]{:qla2300:driver_template+0} [<ffffffff80154e38>]{__alloc_pages+152}
       [<ffffffff8014a1ca>]{__vmalloc+442} [<ffffffffa014ad40>]{:qla2300:driver_template+0}
       [<ffffffffa0002373>]{:scsi_mod:scsi_register_host+115}
       [<ffffffffa00be3b0>]{:qla2300:init_this_scsi_driver+32}
       [<ffffffff80125fb6>]{sys_init_module+1686} [<ffffffffa00a30b8>]
       [<ffffffff80110177>]{system_call+119}
Process modprobe (pid: 1377, stackpage=101ffcf9000)
Stack: 00000101ffcf9bd0 0000000000000000 ffffffffa00c2de5 0000000000000007
       00000000000000ff 0000000000000000 0000ffff802b85e8 000001020c4c6000
       00000101ff7ea0c0 000001020c4c6000 0000000000000000 0000000000000000
       0000000000000000 000001000e318100 ffffffffa00c2bad 0000000000000001
       00000101ff7ea0c0 000001020c4c6000 0000000000000000 00000101ff7ea0c0
       ffffffffa00c2d43 0000000000000246 000001020c4c7000 00000101ff7ea0c0
       00000101ffc65000 0000000000000000 000001020c4c6000 000001000e318100
       ffffffffa00c2b42 0000000000000001 0000000000000000 000001000e318228
       000001020c4c6000 0000000000000002 000001000e318100 0000000000000001
       ffffffffa00c262c 000001020c4c6000 000001020c4c6000 000001000e318100
Call Trace: [<ffffffffa00c2de5>]{:qla2300:qla2x00_map_a_oslun+133}
       [<ffffffffa00c2bad>]{:qla2300:qla2x00_map_or_failover_oslun+45}
       [<ffffffffa00c2d43>]{:qla2300:qla2x00_map_os_luns+291}
       [<ffffffffa00c2b42>]{:qla2300:qla2x00_map_os_targets+162}
       [<ffffffffa00c262c>]{:qla2300:qla2x00_update_mp_host+140}
       [<ffffffffa014aa10>]{:qla2300:QLBoardTbl_fc+48} [<ffffffffa00bf8d5>]{:qla2300:qla2x00_cfg_path_discovery+165}
       [<ffffffffa00bf824>]{:qla2300:qla2x00_cfg_init+52}
       [<ffffffffa00ad041>]{:qla2300:qla2x00_detect+2081}
       [<ffffffffa014ad40>]{:qla2300:driver_template+0} [<ffffffff80154e38>]{__alloc_pages+152}
       [<ffffffff8014a1ca>]{__vmalloc+442} [<ffffffffa014ad40>]{:qla2300:driver_template+0}
       [<ffffffffa0002373>]{:scsi_mod:scsi_register_host+115}
       [<ffffffffa00be3b0>]{:qla2300:init_this_scsi_driver+32}
       [<ffffffff80125fb6>]{sys_init_module+1686} [<ffffffffa00a30b8>]
       [<ffffffff80110177>]{system_call+119}
                                                                                
Code: 66 41 3b 78 18 75 07 ba 01 00 00 00 eb 08 4d 8b 00 49 39 c0
                                                                                
Kernel panic: Fatal exception
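
The faulting address can be cross-checked against the register dump. The sketch below is a hand decode, not disassembler output, and it assumes that the "Code:" line of this oops format starts at the faulting RIP. Under that assumption the first five bytes (66 41 3b 78 18) read as a 16-bit compare against memory at R08 + 0x18. With R08 = 0 that matches CR2 = 0x18, which is consistent with a NULL structure pointer being dereferenced at offset 0x18. The function names are illustrative only.

```python
import re

# Lines copied from the oops above; only the ones this sketch needs.
OOPS_EXCERPT = """\
RBP: 00000101ffc65000 R08: 0000000000000000 R09: 0000000000000001
CR2: 0000000000000018 CR3: 0000000000101000 CR4: 00000000000006e0
Code: 66 41 3b 78 18 75 07 ba 01 00 00 00 eb 08 4d 8b 00 49 39 c0
"""


def parse_registers(text):
    """Map each 'NAME: 16-hex-digits' pair in the oops to an integer."""
    pairs = re.findall(r"\b([A-Z][A-Z0-9]{1,2}):\s+([0-9a-f]{16})\b", text)
    return {name: int(value, 16) for name, value in pairs}


def parse_code_bytes(text):
    """Return the bytes listed on the 'Code:' line."""
    match = re.search(r"^Code:((?: [0-9a-f]{2})+)", text, re.MULTILINE)
    return bytes.fromhex(match.group(1).replace(" ", ""))


def first_operand_address(code, regs):
    """Hand-decode 'prefix REX opcode ModRM disp8' and compute base + disp8.

    Only handles the shape seen here: a 0x66 prefix, a REX byte, a one-byte
    opcode, and a ModRM byte with mod=01 (register base plus 8-bit offset).
    """
    prefix, rex, _opcode, modrm, disp8 = code[:5]
    assert prefix == 0x66 and (rex & 0xF0) == 0x40
    assert modrm >> 6 == 0b01
    base_index = (modrm & 0b111) + 8 * (rex & 1)  # REX.B extends the base
    assert base_index == 8  # this sketch only knows how to look up R08
    return regs["R08"] + disp8


regs = parse_registers(OOPS_EXCERPT)
code = parse_code_bytes(OOPS_EXCERPT)
fault_address = first_operand_address(code, regs)
print(hex(fault_address), fault_address == regs["CR2"])
```

This only shows that the register contents and the fault address agree. It does not identify which driver structure was NULL.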



Version-Release number of selected component (if applicable):
2.4.21-20.ELsmp

How reproducible:
Always

Steps to Reproduce:
Install RHEL3 U3.  Enable failover with the following in /etc/modules.conf
  options qla2300 ql2xfailover=1 ConfigRequired=1 displayConfig=1 ql2xopts=

Then configure using SANsurfer (2.0.27) from QLogic for the Red Hat-supplied 7.0.x driver (which creates qla2300_conf and changes the option line to read 'ql2xuseextopts=1').

I have commented out "alias scsi_hostadapter1 qla2300" and rebuilt the initrd so that qla2300 is not loaded from the initrd.  Rather, /etc/rc.modules contains:

   modprobe qla2300

That way I don't have to rebuild the initrd when the config (qla2300_conf) changes.

Now reboot the node.

When the qla2300 driver loads, the system will reliably kernel panic.


Actual Results:  Kernel panic

Expected Results:  Driver loads with failover and load balance configuration.

Additional info:

Setting mem=3000m (to limit RAM to less than 4G) did not make a difference.  (Did I say that these machines have 12G of RAM in them?)

The error happens with both 7.0.xxx (stock Red Hat EL3 U3) and 7.3.xxx (from QLogic).

The problem is specific to use of qla2300_conf (the extended options module).  If I disable it by setting 'ConfigRequired=0' and 'ql2xuseextopts=0' while leaving 'ql2xfailover=1', then the failover config is automatically detected (without load-balance) correctly and the module does not panic.
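
As a rough illustration of the distinction above, the following sketch parses a modules.conf "options" line and reports whether it asks the driver to use the extended configuration. The helper names are hypothetical, and the second example line is an assumed rendering of the workaround settings described in this report, not a line copied from a real system.

```python
def parse_options(line):
    """Split a modules.conf 'options <module> key=value ...' line."""
    fields = line.split()
    if len(fields) < 2 or fields[0] != "options":
        raise ValueError("not an options line")
    params = {}
    for field in fields[2:]:
        key, _, value = field.partition("=")
        params[key] = value
    return fields[1], params


def uses_extended_config(params):
    """True if either setting named in this report enables qla2300_conf."""
    return (params.get("ConfigRequired") == "1"
            or params.get("ql2xuseextopts") == "1")


# Line quoted in the steps to reproduce (panics once qla2300_conf exists).
reported = "options qla2300 ql2xfailover=1 ConfigRequired=1 displayConfig=1 ql2xopts="
# Assumed form of the workaround: failover on, extended config off.
workaround = "options qla2300 ql2xfailover=1 ConfigRequired=0 ql2xuseextopts=0"

for line in (reported, workaround):
    module, params = parse_options(line)
    print(module, params.get("ql2xfailover"), uses_extended_config(params))
```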

The problem can be reliably reproduced.  It has happened on 3 different machines with identical hardware configs.

Comment 1 Philip Pokorny 2005-05-10 16:10:20 UTC
Created attachment 114207
kernel console log during boot with the kernel panic

Comment 2 Philip Pokorny 2005-05-10 16:11:49 UTC
Created attachment 114208
/var/log/messages with unsuccessful and successful boots

Here is a compressed /var/log/messages from a machine that can be made to
crash.

The last boot was with qla2300_conf disabled so that the system would not
kernel panic.

Comment 3 Philip Pokorny 2005-05-12 15:27:22 UTC
Here is the configuration from the qla2300_conf.o that was created by SANsurfer:

scsi-qla0-adapter-port=210000e08b1ddc47\;
scsi-qla0-tgt-0-di-0-node=2000000bb5209b5d\;
scsi-qla0-tgt-0-di-0-port=2100000bb5209b5d\;
scsi-qla0-tgt-0-di-0-pid=010fe1\;
scsi-qla0-tgt-0-di-0-preferred=ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff\;
scsi-qla0-tgt-0-di-0-control=00\;
scsi-qla0-tgt-0-di-0-node=2000000bb5209b5d\;
scsi-qla0-tgt-0-di-0-port=2100000bb5209b5d\;
scsi-qla0-tgt-0-di-0-pid=010fe1\;
scsi-qla0-tgt-0-di-0-preferred=ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff\;
scsi-qla0-tgt-0-di-0-control=00\;
scsi-qla1-adapter-port=210100e08b3ddc47\;
scsi-qla1-tgt-0-di-0-node=2000000bb5209b5d\;
scsi-qla1-tgt-0-di-0-port=2200000bb5209b5d\;
scsi-qla1-tgt-0-di-0-pid=010fe0\;
scsi-qla1-tgt-0-di-0-preferred=0000000000000000000000000000000000000000000000000000000000000000\;
scsi-qla1-tgt-0-di-0-control=80\;
scsi-qla1-tgt-0-di-1-node=2000000bb5209b5d\;
scsi-qla1-tgt-0-di-1-port=2200000bb5209b5d\;
scsi-qla1-tgt-0-di-1-pid=010fe0\;
scsi-qla1-tgt-0-di-1-preferred=0000000000000000000000000000000000000000000000000000000000000000\;
scsi-qla1-tgt-0-di-1-control=80\;
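
One way to sanity-check a dump like this is to look for keys that occur more than once. The sketch below parses the entries exactly as quoted above and lists any repeated key. In this dump the qla0 device entries all use the index di-0 twice, while qla1 uses di-0 and di-1. Whether that repetition is related to the panic, or is just an artifact of how the configuration was pasted here, is not established by this report. The function name is illustrative.

```python
from collections import Counter

# Configuration exactly as quoted in the comment above.
CONF_TEXT = r"""
scsi-qla0-adapter-port=210000e08b1ddc47\;
scsi-qla0-tgt-0-di-0-node=2000000bb5209b5d\;
scsi-qla0-tgt-0-di-0-port=2100000bb5209b5d\;
scsi-qla0-tgt-0-di-0-pid=010fe1\;
scsi-qla0-tgt-0-di-0-preferred=ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff\;
scsi-qla0-tgt-0-di-0-control=00\;
scsi-qla0-tgt-0-di-0-node=2000000bb5209b5d\;
scsi-qla0-tgt-0-di-0-port=2100000bb5209b5d\;
scsi-qla0-tgt-0-di-0-pid=010fe1\;
scsi-qla0-tgt-0-di-0-preferred=ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff\;
scsi-qla0-tgt-0-di-0-control=00\;
scsi-qla1-adapter-port=210100e08b3ddc47\;
scsi-qla1-tgt-0-di-0-node=2000000bb5209b5d\;
scsi-qla1-tgt-0-di-0-port=2200000bb5209b5d\;
scsi-qla1-tgt-0-di-0-pid=010fe0\;
scsi-qla1-tgt-0-di-0-preferred=0000000000000000000000000000000000000000000000000000000000000000\;
scsi-qla1-tgt-0-di-0-control=80\;
scsi-qla1-tgt-0-di-1-node=2000000bb5209b5d\;
scsi-qla1-tgt-0-di-1-port=2200000bb5209b5d\;
scsi-qla1-tgt-0-di-1-pid=010fe0\;
scsi-qla1-tgt-0-di-1-preferred=0000000000000000000000000000000000000000000000000000000000000000\;
scsi-qla1-tgt-0-di-1-control=80\;
"""


def parse_entries(text):
    """Return (key, value) pairs, dropping the trailing '\\;' terminator."""
    entries = []
    for line in text.splitlines():
        line = line.strip().rstrip("\\;")
        if not line:
            continue
        key, _, value = line.partition("=")
        entries.append((key, value))
    return entries


entries = parse_entries(CONF_TEXT)
counts = Counter(key for key, _ in entries)
repeated = sorted(key for key, n in counts.items() if n > 1)
for key in repeated:
    print(f"{key} appears {counts[key]} times")
```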

Comment 4 Philip Pokorny 2005-05-12 15:29:48 UTC
Created attachment 114299
qla2300_conf.o that causes the panic

This qla2300_conf.o was created by SANsurfer after performing a typical
configuration of the HBA.

Comment 5 Tom Coughlan 2005-08-22 15:21:28 UTC
Philip,

RHEL 3 U6 is available for beta test now. It has the latest version of the
QLogic driver. Would you be able to test this and see if it fixes your problem?

Thanks,

Tom

Comment 7 RHEL Program Management 2007-10-19 19:02:33 UTC
This bug is filed against RHEL 3, which is in maintenance phase.
During the maintenance phase, only security errata and select mission
critical bug fixes will be released for enterprise products. Since
this bug does not meet those criteria, it is now being closed.
 
For more information on the RHEL errata support policy, please visit:
http://www.redhat.com/security/updates/errata/
 
If you feel this bug is indeed mission critical, please contact your
support representative. You may be asked to provide detailed
information on how this bug is affecting you.