Description of problem:
The system panics with the following message on the console if a Fibre Channel disk disappears in the middle of a LUN scan.

rport-6:0-2: blocked FC remote port time out: saving binding
lpfc 0000:07:00.0: 0:0203 Devloss timeout on WWPN 50:0:1f:e1:50:6:54:88 NPort x610513 Data: x8 x7 x4
Unable to handle kernel NULL pointer dereference at 0000000000000060 RIP:
 [<ffffffff80061625>] mutex_lock+0x10/0x1d
PGD 11423f067 PUD 11415a067 PMD 0
Oops: 0002 [1] SMP
last sysfs file: /class/scsi_host/host6/scan
CPU 2
Modules linked in: lpfc(U) nfs lockd fscache nfs_acl autofs4 hidp rfcomm l2cap bluetooth sunrpc ipv6 dm_mirror dmd
Pid: 4451, comm: bash Not tainted 2.6.18-8.el5 #1
RIP: 0010:[<ffffffff80061625>] [<ffffffff80061625>] mutex_lock+0x10/0x1d
RSP: 0018:ffff810115ed9dd8 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000000060 RCX: 00000000ffffffff
RDX: 0000000000000000 RSI: 000000002f9806d8 RDI: 0000000000000060
RBP: 0000000000000060 R08: 0000000000000001 R09: 000000000000003c
R10: 0000000000000000 R11: ffffffff8807c088 R12: ffff81012f9806f8
R13: 0000000000000001 R14: 00000000ffffffff R15: 0000000000000000
FS: 00002aaaaaabbdb0(0000) GS:ffff81012fcd7e40(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000060 CR3: 0000000118629000 CR4: 00000000000006e0
Process bash (pid: 4451, threadinfo ffff810115ed8000, task ffff81012f9ea080)
Stack: 0000000000000001 00000000ffffffff 0000000000000000 ffffffff8807bea2
 2f9806d8ffffffd8 ffff81012f980698 00000000ffffffff 00000000ffffffff
 ffff81010deea000 00000000ffffffff ffff81012592fd80 ffffffff880f6f75
Call Trace:
 [<ffffffff8807bea2>] :scsi_mod:scsi_scan_target+0x4e/0x83
 [<ffffffff880f6f75>] :scsi_transport_fc:fc_user_scan+0x55/0x85
 [<ffffffff8807c808>] :scsi_mod:store_scan+0x9b/0xc5
 [<ffffffff800fa3a4>] sysfs_write_file+0xb9/0xe8
 [<ffffffff80016121>] vfs_write+0xce/0x174
 [<ffffffff800169b2>] sys_write+0x45/0x6e
 [<ffffffff8005b2c1>] tracesys+0xd1/0xdc

Code: f0 ff 0f 0f 88 8d 01 00 00 59 5e 5b c3 41 54 55 48 89 fd 53
RIP [<ffffffff80061625>] mutex_lock+0x10/0x1d
 RSP <ffff810115ed9dd8>
CR2: 0000000000000060
 <0>Kernel panic - not syncing: Fatal exception

Version-Release number of selected component (if applicable):
RHEL5 GA 2.6.18-8.el5

How reproducible:
100% reproducible.

Steps to Reproduce:
1. Connect an Emulex lpfc HBA to a SAN with at least one storage array visible to the HBA and at least one LUN presented to the HBA.
2. Make sure that the SCSI midlayer can see the SCSI LUN using the "cat /proc/scsi/scsi" command.
3. Unplug the Fibre Channel cable connected to the HBA.
4. Run the following command immediately after unplugging the cable:
   "echo '- - -' > /sys/class/scsi_host/host<host_no>/scan"
   where <host_no> is the SCSI host number assigned to the lpfc HBA.
5. The LUN scan will wait until the devloss timer expires.
6. Wait at least 30 seconds for the dev_loss timer to expire.

Actual results:
The system panicked with the same stack trace as shown in the description above (NULL pointer dereference at 0000000000000060 in mutex_lock, called from scsi_scan_target via fc_user_scan).

Expected results:
The LUN scan completes with no panics.

Additional info:
This problem has been fixed in RHEL5.1 with the fix for bug 246023.

Don, the relevant patch tracking file is:
scsi_tranport_fc-check-portstates-before-invoking-target-scan.patch

*** This bug has been marked as a duplicate of 246023 ***
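
For readers without access to the patch, the idea suggested by its file name (check the port state before invoking a target scan) can be modelled in user space roughly as follows. This is only a sketch under assumptions: the struct, the enum values, and the function names are hypothetical stand-ins, not the kernel's definitions and not the contents of the actual patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified port states (not the kernel's enum). */
enum port_state { PORT_ONLINE, PORT_BLOCKED, PORT_NOT_PRESENT };

/* Hypothetical stand-in for a remote port as seen by the scan loop. */
struct remote_port {
    int target_id;            /* -1 means no SCSI target is bound */
    enum port_state state;    /* current state of the remote port */
};

/* Return 1 only if it is safe to scan this port, 0 otherwise. */
static int port_is_scannable(const struct remote_port *rp)
{
    assert(rp != NULL);
    if (rp->target_id == -1)
        return 0;                     /* nothing to scan behind this port */
    return rp->state == PORT_ONLINE;  /* skip blocked or removed ports */
}

/* Walk all ports and count those that would actually be scanned. */
static size_t scan_online_ports(const struct remote_port *ports, size_t n)
{
    size_t scanned = 0;

    for (size_t i = 0; i < n; i++) {
        if (!port_is_scannable(&ports[i]))
            continue;                 /* the check that guards the scan */
        /* A real implementation would start the target scan here. */
        scanned++;
    }
    return scanned;
}
```

In this model, a port whose cable was pulled (blocked, or already removed after the devloss timeout) is skipped instead of being handed to the scan routine, which is the situation the reproduction steps above create.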