Description of problem:
When changing the device handler from scsi_dh_alua to scsi_dh_emc for EMC VNX LUNs, I got a kernel panic:
===
<1>BUG: unable to handle kernel NULL pointer dereference at 0000000000000038
<1>IP: [<ffffffffa01a352c>] multipath_iterate_devices+0x3c/0xa0 [dm_multipath]
<4>PGD 2380a2067 PUD 2391e0067 PMD 0
<4>Oops: 0000 [#1] SMP
<4>last sysfs file: /sys/devices/pci0000:00/0000:00:03.0/0000:07:00.0/host10/rport-10:0-4/target10:0:4/10:0:4:1/block/sdk/dev
<4>CPU 1
<4>Modules linked in: bfa bridge bnx2fc cnic uio fcoe libfcoe 8021q libfc garp stp llc sunrpc cpufreq_ondemand acpi_cpufreq freq_table mperf dm_round_robin ipv6 dm_multipath bna igb dca ptp pps_core microcode serio_raw sg iTCO_wdt iTCO_vendor_support i7core_edac edac_core shpchp ext4 mbcache jbd2 sd_mod crc_t10dif sr_mod cdrom ahci scsi_transport_fc scsi_tgt dm_mirror dm_region_hash dm_log dm_mod scsi_dh_alua scsi_dh_rdac scsi_dh_emc [last unloaded: bfa]
<4>
<4>Pid: 10877, comm: multipath Tainted: G W --------------- 2.6.32-355.el6.x86_64 #1 HP ProLiant DL160 G6
<4>RIP: 0010:[<ffffffffa01a352c>] [<ffffffffa01a352c>] multipath_iterate_devices+0x3c/0xa0 [dm_multipath]
<4>RSP: 0018:ffff88023808bc78 EFLAGS: 00010213
<4>RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000028
<4>RDX: 0000000000000000 RSI: ffffffffa001be40 RDI: ffffc900124df040
<4>RBP: ffff88023808bcb8 R08: ffff88023c402400 R09: 0000000000000000
<4>R10: 0000000000000000 R11: 0000000000000000 R12: ffffc900124df040
<4>R13: ffff88023808bcd4 R14: ffffffffa001be40 R15: ffff88023808bcd4
<4>FS: 00007f4195cba7a0(0000) GS:ffff88002f620000(0000) knlGS:0000000000000000
<4>CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>CR2: 0000000000000038 CR3: 00000002393ea000 CR4: 00000000000007e0
<4>DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
<4>DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
<4>Process multipath (pid: 10877, threadinfo ffff88023808a000, task ffff880237788ae0)
<4>Stack:
<4> ffff880239393c38 0000000000000000 ffff88023808bce8 0000000000000000
<4><d> ffff88023ae24600 ffff88023808bcd4 0000000000000002 ffffffffffffffea
<4><d> ffff88023808bcf8 ffffffffa001bec3 ffffc90012071040 0000000000000000
<4>Call Trace:
<4> [<ffffffffa001bec3>] dm_table_has_no_data_devices+0x63/0x90 [dm_mod]
<4> [<ffffffffa001a9b8>] dm_swap_table+0x58/0x2e0 [dm_mod]
<4> [<ffffffffa001c168>] ? dm_table_postsuspend_targets+0x18/0x20 [dm_mod]
<4> [<ffffffffa001a70c>] ? dm_suspend+0x3c/0x290 [dm_mod]
<4> [<ffffffff81063310>] ? default_wake_function+0x0/0x20
<4> [<ffffffffa0020baf>] dev_suspend+0x12f/0x250 [dm_mod]
<4> [<ffffffffa00219d4>] ctl_ioctl+0x1b4/0x270 [dm_mod]
<4> [<ffffffffa0020a80>] ? dev_suspend+0x0/0x250 [dm_mod]
<4> [<ffffffffa0021aa3>] dm_ctl_ioctl+0x13/0x20 [dm_mod]
<4> [<ffffffff81194d42>] vfs_ioctl+0x22/0xa0
<4> [<ffffffff81194ee4>] do_vfs_ioctl+0x84/0x580
<4> [<ffffffff81195461>] sys_ioctl+0x81/0xa0
<4> [<ffffffff810dc565>] ? __audit_syscall_exit+0x265/0x290
<4> [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
<4>Code: 0f 1f 44 00 00 48 8b 47 38 49 89 d7 49 89 fc 49 89 f6 48 8b 50 38 48 83 c0 38 48 89 45 c0 48 39 c2 48 89 55 c8 74 64 48 8b 45 c8 <48> 8b 58 38 49 89 c5 49 83 c5 38 4c 39 eb 75 0c eb 3a 66 90 48
<1>RIP [<ffffffffa01a352c>] multipath_iterate_devices+0x3c/0xa0 [dm_multipath]
<4> RSP <ffff88023808bc78>
<4>CR2: 0000000000000038
====
I will upload the kdump vmcore and dmesg in the next comment.

Version-Release number of selected component (if applicable):
kernel -355.

How reproducible:
Not sure.

Steps to Reproduce:
1. Disable LUNZ (commpath) on the EMC VNX/CX and enable ALUA (failover mode 4).
2. Use this multipath configuration:
====
devices {
    # Device attributes for EMC CLARiiON ALUA (failover mode 4)
    device {
        vendor "DGC"
        product "*"
        path_grouping_policy group_by_prio
        prio alua
        hardware_handler "1 alua"
        #features #"1 queue_if_no_path"
        path_checker tur
        no_path_retry queue
        fast_io_fail_tmo 8
        dev_loss_tmo 999
        failback immediate
        product_blacklist "LUNZ"
    }
}
====
3. Start multipathd.
4. Remove all multipath maps with "multipath -F".
5. On the EMC VNX/CX, enable LUNZ.
6. Remove the configuration above so that the built-in configuration for EMC CX is used.
7. Run "multipath -r".

Actual results:
kernel panic.

Expected results:
no kernel panic

Additional info:
I did a bit of crash analysis:

 #8 [ffff88023808bbc0] page_fault at ffffffff81510045
    [exception RIP: multipath_iterate_devices+60]
    RIP: ffffffffa01a352c  RSP: ffff88023808bc78  RFLAGS: 00010213
    RAX: 0000000000000000  RBX: 0000000000000000  RCX: 0000000000000028
    RDX: 0000000000000000  RSI: ffffffffa001be40  RDI: ffffc900124df040
    RBP: ffff88023808bcb8   R8: ffff88023c402400   R9: 0000000000000000
    R10: 0000000000000000  R11: 0000000000000000  R12: ffffc900124df040
    R13: ffff88023808bcd4  R14: ffffffffa001be40  R15: ffff88023808bcd4
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018

RDI is the dm_target structure passed to multipath_iterate_devices.

crash> dm_target ffffc900124df040
struct dm_target {
  features = 0,
  table = 0xffff88023ae24600,
  type = 0xffffffffa01a6820,
  begin = 0,
  len = 62914560,
  split_io = 0,
  num_flush_requests = 1,
  num_discard_requests = 1,
  private = 0xffff880239393c00,
  error = 0xffffffffa00239d5 "Unknown error",
  discards_supported = 0,
  flush_supported = 0,
  split_discard_requests = 0,
  discard_zeroes_data_unsupported = 0
}

Running "crash> multipath 0xffff880239393c00" doesn't yield memory that looks to be valid.

The ultimate NULL pointer is due to the pg->pgpaths dereference in multipath_iterate_devices:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000038

crash> struct priority_group -o
struct priority_group {
   [0x0] struct list_head list;
  [0x10] struct multipath *m;
  [0x18] struct path_selector ps;
  [0x28] unsigned int pg_num;
  [0x2c] unsigned int bypassed;
  [0x30] unsigned int nr_pgpaths;
  [0x38] struct list_head pgpaths;
}

pgpaths sits at offset 0x38, so with pg == NULL (RAX is 0 at the faulting instruction) the read of pg->pgpaths.next touches address 0x38, which matches CR2.

But again, I don't think ti->private is a valid multipath structure.
static int multipath_iterate_devices(struct dm_target *ti,
                                     iterate_devices_callout_fn fn, void *data)
{
        struct multipath *m = ti->private;
        struct priority_group *pg;
        struct pgpath *p;
        int ret = 0;

        list_for_each_entry(pg, &m->priority_groups, list) {
                list_for_each_entry(p, &pg->pgpaths, list) {
                        ret = fn(ti, p->path.dev, ti->begin, ti->len, data);
                        if (ret)
                                goto out;
                }
        }

out:
        return ret;
}
Gris, I don't have access to a system to try to reproduce this yet. It would be ideal if we could get comprehensive multipathd logging when you perform this sequence. In particular we need to see the libdevmapper logging that shows the DM table line that is being passed down from multipathd to the kernel.

Given that you've flushed all multipath tables (via multipath -F), the multipath -r should just trigger the equivalent of starting anew. But it could be that multipathd is keeping some state for these devices. So in addition to getting multipathd logging from the original sequence described in comment#0, I'd recommend killing multipathd and restarting the service (instead of multipath -r). If all works fine, that at least tells us that multipathd's handling of this corner case is playing a role in this.
At the default log level, multipathd will log the table information after it loads it. multipath will instead pretty print the topology. Unfortunately, neither of these happens until after the table load completes, which is after the panic. I can make some debug packages that will print out the table before multipath tries a create or reload.
(In reply to comment #4)
> At the default log level, multipathd will log the table information after it
> loads it. multipath will instead pretty print the topology. Unfortunately,
> neither of these happen until after the table load completes, which is after
> the panic. I can make some debug packages that will print out the table
> before multipath tries a create or reload.

OK, that would be helpful.
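Until the debug packages exist, one stopgap is to record the tables that are already loaded right before each step of the test. This is only a sketch: it does not show the table that multipath is about to load (which is what we ultimately need), the function name and default log path are made up for illustration, and it assumes `dmsetup` is available and run as root.

```shell
# Append a timestamped dump of the currently loaded device-mapper tables
# to a log file, then flush to disk so the record survives a panic in the
# following step.
# $1: log file path (the default below is a hypothetical location).
snapshot_dm_tables() {
    log=${1:-/var/tmp/dm-tables.log}
    {
        printf '=== %s ===\n' "$(date '+%Y-%m-%d %H:%M:%S')"
        dmsetup table
    } >> "$log" 2>&1
    sync
}
```

Calling `snapshot_dm_tables` immediately before "multipath -r" (or before the service restart) leaves the last known table state on disk even if the next command panics the kernel.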
Issue reproduced with two scripts running at the same time (it takes about 2 hours to hit this race):

1. emc_vnx_fcoe_target_port_link_up_down.sh
   Brings all target ports down and back up on each SP of the EMC array, with a random interval (100 to 299 seconds) between changes.
====
for X in $(seq 1 100); do
    for SPX in SPA SPB; do
        for PORT in 0 1; do
            libsan_utils -c link_down -a "emc_vnx_nay_${SPX}_FCoE${PORT}"
        done
        sleep $(($RANDOM % 200 + 100))
        for PORT in 0 1; do
            libsan_utils -c link_up -a "emc_vnx_nay_${SPX}_FCoE${PORT}"
        done
        sleep $(($RANDOM % 200 + 100))
    done
done
====

2. BZ_902595_alua_2_emc.sh
   Switches the multipath configuration from ALUA to EMC mode:
   1. Disable LUNZ, using the ALUA (scsi_dh_alua) configuration.
   2. Disable LUNZ, using the EMC (scsi_dh_emc) configuration. (incorrect config)
   3. Enable LUNZ, using the EMC (scsi_dh_emc) configuration. <===== This is when the panic happens.
   4. Enable LUNZ, using the ALUA (scsi_dh_alua) configuration.

Since BZ_902595_alua_2_emc.sh contains the password of our storage array, it is only available at an internal URL:
http://lacrosse.corp.redhat.com/~fge/tmp/BZ_902595/BZ_902595_alua_2_emc.sh

I will provide the debug log once the debug multipath package is rebuilt.

I will also try the suggestion from comment #3 (killing multipathd and restarting the service instead of running "multipath -r") and update later.
(In reply to comment #3)
> I'd recommend killing multipathd and restarting the service
> (instead of multipath -r). If all works fine that at least tells us that
> multipathd's handling of this corner case is playing a role in this.

Mike Snitzer,
I got another kernel panic after changing "multipath -r" to "service multipathd restart". It is filed as Bug #912245.
The multipathd restart script is:
http://lacrosse.corp.redhat.com/~fge/tmp/BZ_902595/BZ_902595_alua_2_emc_multipathd_restart.sh

Thanks.
Do you have an earlier RHEL6.4 kernel that actually worked well with these various multipathd restart tests? It would be useful to attempt to isolate when you started seeing these problems. The nature of the failures is quite different each time (crash from comment#0 as compared to bug#912245).
Also, are all these crashes occurring on the same host? If so, have any of these crashes been reproduced on a different host?
Triggered a similar crash (multipath_iterate_devices+0x3c/0xa0) on kernel -279 (RHEL 6.3 GA) using "service multipathd restart":
http://lacrosse.corp.redhat.com/~fge/tmp/BZ_902595/kernel-279/

When using the "multipath -r" test script (BZ_902595_alua_2_emc.sh) on kernel -279, "multipath -r" hangs.

Yes, that's on the same host. I will try another host.
Let me know if you still need me to bisect on older kernels.
Mike Snitzer,
On a different host (qla2xxx FC HBA; the previous crashes in this bug were found on a host with a bfa FCoE HBA):
"multipath -r" does not panic the kernel (I ran it for 10 hours, about 1000 iterations).
"service multipathd restart" crashes the kernel:
https://bugzilla.redhat.com/show_bug.cgi?id=912245#c4
The patches I worked on to address this BZ never got included upstream: http://www.redhat.com/archives/dm-devel/2013-April/msg00039.html http://www.redhat.com/archives/dm-devel/2013-April/msg00040.html And I followed up with: http://www.redhat.com/archives/dm-devel/2013-April/msg00126.html The end of that last message stated: "I'm now inclined to not care about this issue. Take away is: don't switch the device handler (attach the correct one from the start)." That may not be a satisfying conclusion but with the scsi_dh attachment fixes/changes that went into RHEL6 users really shouldn't need to change the scsi_dh -- the correct scsi_dh should be attached from the beginning.
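To confirm which handler is attached to each path before multipath touches the devices, something along these lines can help. It is a minimal sketch, assuming the kernel exposes the handler name in the `dh_state` sysfs attribute of each SCSI device (reading "detached" when none is attached); the function name is made up for illustration.

```shell
# Print "<device> <handler>" for every sd* block device that exposes a
# dh_state attribute. $1 is the sysfs block directory (default /sys/block),
# which is a parameter only so the function can be tried on a copy.
show_dh_state() {
    sysfs_block=${1:-/sys/block}
    for state_file in "$sysfs_block"/sd*/device/dh_state; do
        # Skip the unexpanded glob and devices without the attribute.
        [ -r "$state_file" ] || continue
        dev=${state_file#"$sysfs_block"/}
        dev=${dev%%/*}
        printf '%s %s\n' "$dev" "$(cat "$state_file")"
    done
}
```

Running `show_dh_state` with no arguments lists the live devices; a path that is meant to use scsi_dh_emc but shows a different handler would be the case this bug warns about.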