Description of problem:

What is an in-band management connection?
-- It is a method of managing a storage array in which a storage management station sends commands to the storage array through the host I/O connection to the controller. For the in-band connection to work, an Access volume must be mapped to the host. The Access volume is created by the storage array to establish communication between the host and the storage array. An Access volume is required only for in-band management.

What is a Controller FW download?
-- The storage array has two controllers. During an in-band FW download using Simplicity (the GUI management tool for the storage array), the user chooses the file and starts the FW download on both controllers. After the download is complete on both controllers, one controller (ControllerA) reboots and, once it is back, the alternate controller (ControllerB) reboots.

Procedure for reproducing:
1. It is a 1x1 setup, with one host connected to one storage array.
2. On the storage array, there is only the Access LUN.
3. Map the Access LUN to the host.
4. Reboot the host.
5. Once the host is back online, it sees only the Access LUN, on dual paths, as below:

<n/a> (/dev/sg3) [Storage Array , Virtual Disk Access, LUN 31, Virtual Disk ID <6001372000ffe36f0000000000000000>]
<n/a> (/dev/sg5) [Storage Array , Virtual Disk Access, LUN 31, Virtual Disk ID <6001372000ffe36f0000000000000000>]

As can be seen above, the host sees only two sg devices, because only the Access LUN is mapped to the host.
6. With the above configuration, the user starts the Controller FW download from the GUI (Simplicity), and the host then panics.
The panic stack is as below, collected from the serial console redirection output:

[root@chap ~]# Unable to handle kernel paging request at 0000000014004013 RIP:
<ffffffffa002f31d>{:sg:sg_common_write+2179}
PML4 1218fc067 PGD 11d927067 PMD 0
Oops: 0000 [1] SMP
CPU 0
Modules linked in: md5 ipv6 parport_pc lp parport autofs4 i2c_dev i2c_core smbfs sunrpc crc32c libcrc32c iscsi_sfnet scsi_transport_iscsi ds yenta_socket pcmcia_core joydev dm_multipath button battery ac uhci_hcd ehci_hcd hw_random e1000 bnx2 dm_snapshot dm_zero dm_mirror ext3 jbd dm_mod mppVhba(U) megaraid_sas mppUpper(U) sg sd_mod scsi_mod
Pid: 6732, comm: java Not tainted 2.6.9-42.ELsmp
RIP: 0010:[<ffffffffa002f31d>] <ffffffffa002f31d>{:sg:sg_common_write+2179}
RSP: 0018:000001011cf33b58 EFLAGS: 00010202
RAX: 0000000000000002 RBX: 000001011c110000 RCX: 0000000014004010
RDX: 000001011df76088 RSI: 0000000008099600 RDI: 000001011c118000
RBP: 0000000000008200 R08: 0600ed1100002d93 R09: 2528000200000000
R10: 1800a62704000434 R11: 0000cf9102001927 R12: 000001011c115020
R13: 000000001f006014 R14: 0000000000000000 R15: 0000000000000001
FS: 0000000000000000(0000) GS:ffffffff804e5080(005b) knlGS:00000000eb0fdbb0
CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033
CR2: 0000000014004013 CR3: 0000000000101000 CR4: 00000000000006e0
Process java (pid: 6732, threadinfo 000001011cf32000, task 000001012cec4030)
Stack: 0000010133da76c0 0000000100000000 000001011df76088 000001011df760a8
       000001012cec4030 00000100010447e0 000001011df76038 000001011df76088
       0000820038ad0940 000001011df76000
Call Trace:
<ffffffffa002f77f>{:sg:sg_new_write+580}
<ffffffffa002f9fa>{:sg:sg_ioctl+595}
<ffffffff802a72c8>{sock_recvmsg+284}
<ffffffff8030a14d>{thread_return+88}
<ffffffff8010ed22>{__switch_to+306}
<ffffffff8030a0f5>{thread_return+0}
<ffffffff80135752>{autoremove_wake_function+0}
<ffffffff802a6ecb>{sockfd_lookup+16}
<ffffffff80135752>{autoremove_wake_function+0}
<ffffffff802a8734>{sys_recvfrom+243}
<ffffffff80135752>{autoremove_wake_function+0}
<ffffffff8017a358>{fget+75}
<ffffffff8018ae05>{sys_ioctl+853}
<ffffffff8012a122>{sg_ioctl_trans+832}
<ffffffff8019e8ac>{compat_sys_ioctl+235}
<ffffffff80125bbb>{sysenter_do_call+27}

Code: 48 0f b6 41 03 48 8b 14 c5 c0 e0 48 80 48 b8 b7 6d db b6 6d
RIP <ffffffffa002f31d>{:sg:sg_common_write+2179} RSP <000001011cf33b58>
CR2: 0000000014004013
<0>Kernel panic - not syncing: Oops

7. As can be seen from the stack output above, the faulting instruction pointer (RIP) is in "sg_common_write", which was called from "sg_new_write".

Details of the host that panicked:

[root@chap ~]# uname -a
Linux chap.boldvt.co.lsil.com 2.6.9-42.ELsmp #1 SMP Wed Jul 12 23:32:02 EDT 2006 x86_64 x86_64 x86_64 GNU/Linux
[root@chap ~]# cat /etc/redhat-release
Red Hat Enterprise Linux AS release 4 (Nahant Update 4)
[root@chap ~]#
[root@chap ~]# sginfo -v
Sginfo version 2.02 [20031215]
[root@chap ~]#
[root@chap ~]# lsmod | grep -i iscsi
iscsi_sfnet 95197 3
scsi_transport_iscsi 13377 1 iscsi_sfnet
scsi_mod 141457 6 iscsi_sfnet,mppVhba,megaraid_sas,mppUpper,sg,sd_mod
[root@chap ~]# modinfo iscsi_sfnet
filename: /lib/modules/2.6.9-42.ELsmp/kernel/drivers/scsi/iscsi_sfnet/iscsi_sfnet.ko
parm: max_initial_login_retries:Max number of times to retry logging into a target for the first time before giving up. The default is 3. Set to -1 for no limit
version: 4:0.1.11-3 BA273FAEA64EA20472A07EC
license: GPL
description: iSCSI initiator
author: Mike Christie and Cisco Systems, Inc.
depends: scsi_transport_iscsi,scsi_mod
vermagic: 2.6.9-42.ELsmp SMP gcc-3.4
[root@chap ~]# modinfo scsi_transport_iscsi
filename: /lib/modules/2.6.9-42.ELsmp/kernel/drivers/scsi/scsi_transport_iscsi.ko
license: GPL
description: iSCSI Transport Attributes
author: Mike Christie
depends:
vermagic: 2.6.9-42.ELsmp SMP gcc-3.4
[root@chap ~]#

Version-Release number of selected component (if applicable):

How reproducible:
Always

Steps to Reproduce:
The reproduction steps are as above; they are restated here:
1. It is a 1x1 setup, with one host connected to one array.
2. Map the Access LUN to the host and reboot the host.
3. Make sure that the host sees the Access LUN.
4. Make sure the host can connect to Simplicity (the storage array management GUI) through the Access LUN.
5. Then start the Ctlr FW download using Simplicity. The host will panic with the stack shown above.

Actual results:
Panic

Expected results:
There shouldn't be a panic. The Ctlr FW download should complete successfully.

Additional info:
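A possible reading of the oops above, offered as a sketch rather than a confirmed diagnosis: if the "Code:" bytes start at the faulting RIP (an assumption about how this kernel prints them), the first instruction decodes as a one-byte load from RCX + 3, and RCX + 3 equals the CR2 fault address. The Python below only checks that arithmetic against values copied from the oops. The helper names (read_reg, read_code, decode_movzx_disp8) are made up for this illustration, and the decoder handles only the single instruction form seen here.

```python
import re

# Excerpt copied verbatim from the oops in this report.
OOPS = """\
RAX: 0000000000000002 RBX: 000001011c110000 RCX: 0000000014004010
CR2: 0000000014004013 CR3: 0000000000101000 CR4: 00000000000006e0
Code: 48 0f b6 41 03 48 8b 14 c5 c0 e0 48 80 48 b8 b7 6d db b6 6d
"""

# Register numbering used by the ModRM byte when no REX.R/REX.B bit is set.
REG64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]


def read_reg(name, text):
    """Return the hex value printed after 'NAME:' as an integer."""
    m = re.search(r"\b%s: ([0-9a-f]+)" % re.escape(name), text)
    if m is None:
        raise ValueError("register %s not found" % name)
    return int(m.group(1), 16)


def read_code(text):
    """Return the bytes of the 'Code:' line."""
    m = re.search(r"^Code: (.*)$", text, re.M)
    if m is None:
        raise ValueError("no Code: line found")
    return bytes.fromhex(m.group(1))  # fromhex skips the spaces


def decode_movzx_disp8(code):
    """Decode only 'REX.W 0F B6 /r' with a [base + disp8] operand.

    Returns (destination register, base register, displacement).
    """
    rex, op_hi, op_lo, modrm, disp = code[:5]
    if (rex, op_hi, op_lo) != (0x48, 0x0F, 0xB6):
        raise ValueError("not REX.W MOVZX r64, r/m8")
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 1 or rm == 4:  # rm == 4 would mean a SIB byte follows
        raise ValueError("only the [base + disp8] form is handled")
    if disp >= 0x80:  # disp8 is a signed byte
        disp -= 0x100
    return REG64[reg], REG64[rm], disp


if __name__ == "__main__":
    dest, base, disp = decode_movzx_disp8(read_code(OOPS))
    computed = read_reg(base.upper(), OOPS) + disp
    print("load into %s from %s%+d = %#x" % (dest, base, disp, computed))
    print("matches CR2:", computed == read_reg("CR2", OOPS))
```

If the assumption holds, sg_common_write faulted while reading one byte through a pointer held in RCX (0x14004010), which is not a valid kernel address. That fits a read through a stale or corrupted pointer, but the oops alone does not show which structure the pointer came from.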
*** Bug 237554 has been marked as a duplicate of this bug. ***
*** Bug 237555 has been marked as a duplicate of this bug. ***
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.
Closing per K.H. Tan in an email.
*** This bug has been marked as a duplicate of 239447 ***