Description of problem:

On ALUA enabled setups, it is seen that dm-multipath fails to update its maps
after a path state transition (when an active/optimized path transitions to an
active/non-optimized path and vice versa).

Version-Release number of selected component (if applicable):

RHEL 5.4 GA (2.6.18-164.el5)
device-mapper-multipath-0.4.7-30.el5
iscsi-initiator-utils-6.2.0.871-0.10.el5

ALUA settings are used in multipath.conf: the ALUA priority callout
(/sbin/mpath_prio_alua) is used with group_by_prio enabled, along with the
ALUA hardware handler (as per bug 562080).

How reproducible:

Always

Steps to Reproduce:

1. Map an iSCSI LUN (with ALUA enabled) to a RHEL 5.4 host. In this case, I
   have 1 active/optimized path + 4 active/non-optimized paths to the LUN.
   Configure dm-multipath on it as follows:

# multipath -ll
mpath1 (360a98000572d42746b4a555039386553) dm-3 NETAPP,LUN
[size=2.0G][features=1 queue_if_no_path][hwhandler=1 alua][rw]
\_ round-robin 0 [prio=50][enabled]
 \_ 11:0:0:1 sdk 8:160 [active][ready]
\_ round-robin 0 [prio=40][enabled]
 \_ 7:0:0:1  sdg 8:96  [active][ready]
 \_ 8:0:0:1  sdh 8:112 [active][ready]
 \_ 9:0:0:1  sdi 8:128 [active][ready]
 \_ 10:0:0:1 sdj 8:144 [active][ready]

The individual path priority weights and RTPGs are as follows:

# /sbin/mpath_prio_alua -v /dev/sdk
Target port groups are implicitly supported.
Reported target port group is 4 [active/optimized]
50

# /sbin/mpath_prio_alua -v /dev/sdg
Target port groups are implicitly supported.
Reported target port group is 2 [active/non-optimized]
10

# /sbin/mpath_prio_alua -v /dev/sdh
Target port groups are implicitly supported.
Reported target port group is 1 [active/non-optimized]
10

# /sbin/mpath_prio_alua -v /dev/sdi
Target port groups are implicitly supported.
Reported target port group is 3 [active/non-optimized]
10

# /sbin/mpath_prio_alua -v /dev/sdj
Target port groups are implicitly supported.
Reported target port group is 1 [active/non-optimized]
10

2. Now run IO on the above multipath device. iostat shows:

Before path state transition:

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          48.25    0.00   51.75    0.00    0.00    0.00

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sdg               0.00         0.00         0.00          0          0
sdi               0.00         0.00         0.00          0          0
sdh               0.00         0.00         0.00          0          0
sdk             614.50         0.00     17184.00          0      34368
sdj               0.00         0.00         0.00          0          0

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          28.32    0.00   43.86   17.29    0.00   10.53

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sdg               0.00         0.00         0.00          0          0
sdi               0.00         0.00         0.00          0          0
sdh               0.00         0.00         0.00          0          0
sdk            5826.37         0.00    204573.13          0     411192
sdj               0.00         0.00         0.00          0          0

IO is running fine till here.

3. Now trigger a path state transition on the target storage array. In this
   case, the active/optimized path transitions to RTPG 2, i.e. sdg, and the
   original active/optimized path in RTPG 4, i.e. sdk, transitions to an
   active/non-optimized path as shown below:

# /sbin/mpath_prio_alua -v /dev/sdk
Target port groups are implicitly supported.
Reported target port group is 4 [active/non-optimized]
10

# /sbin/mpath_prio_alua -v /dev/sdg
Target port groups are implicitly supported.
Reported target port group is 2 [active/optimized]
50

# /sbin/mpath_prio_alua -v /dev/sdh
Target port groups are implicitly supported.
Reported target port group is 1 [active/non-optimized]
10

# /sbin/mpath_prio_alua -v /dev/sdi
Target port groups are implicitly supported.
Reported target port group is 3 [active/non-optimized]
10

# /sbin/mpath_prio_alua -v /dev/sdj
Target port groups are implicitly supported.
Reported target port group is 1 [active/non-optimized]
10

But iostat now shows the following:

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           3.99    0.00   12.22   83.29    0.00    0.50

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sdg             725.50         0.00     23552.00          0      47104
sdi             736.00         0.00     25656.00          0      51312
sdh             639.50         0.00     23552.00          0      47104
sdk               0.00         0.00         0.00          0          0
sdj             752.00         0.00     26992.00          0      53984

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          31.58    0.00   40.60   27.57    0.00    0.25

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sdg             619.00         0.00     22016.00          0      44032
sdi             702.50         0.00     22016.00          0      44032
sdh             672.50         0.00     22016.00          0      44032
sdk               0.00         0.00         0.00          0          0
sdj             694.00         0.00     22168.00          0      44336

And multipath -ll shows:

# multipath -ll
mpath1 (360a98000572d42746b4a555039386553) dm-3 NETAPP,LUN
[size=2.0G][features=1 queue_if_no_path][hwhandler=1 alua][rw]
\_ round-robin 0 [prio=10][enabled]
 \_ 11:0:0:1 sdk 8:160 [active][ready]
\_ round-robin 0 [prio=80][active]
 \_ 7:0:0:1  sdg 8:96  [active][ready]
 \_ 8:0:0:1  sdh 8:112 [active][ready]
 \_ 9:0:0:1  sdi 8:128 [active][ready]
 \_ 10:0:0:1 sdj 8:144 [active][ready]

The multipath path groups are clearly wrong. IO is now running through all the
underlying devices of the 2nd path group, i.e. sdg, sdh, sdi and sdj, whereas
it should actually have been running on sdg alone (since that is the only
active/optimized path available now).

Actual results:

After the path state transition, IO is running on all underlying paths of the
2nd path group.

Expected results:

After the path state transition, IO should have been running on the
active/optimized path alone.

Additional info:

Restarting the multipathd daemon or running multipathd -k"reconfigure"
properly reconfigures the multipath maps. But this should have been handled
automatically by dm-multipath.
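For reference, the workaround mentioned above amounts to the following commands
(a sketch only, assuming the map name mpath1 from this setup):

# service multipathd restart

or, without restarting the daemon:

# multipathd -k"reconfigure"
# multipath -ll mpath1

Both simply force multipathd to re-read the path priorities and rebuild the
path groups; the bug is that this regrouping does not happen on its own after
the ALUA state change.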
Created attachment 395088 [details] Multipath.conf for the above scenario
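For reference, the ALUA settings described in the problem statement would
typically look like the following device stanza (an illustrative sketch only;
the actual configuration used is the attached multipath.conf, the vendor and
product strings are taken from the NETAPP,LUN output above, and the remaining
values are assumptions):

devices {
        device {
                vendor                  "NETAPP"
                product                 "LUN"
                path_grouping_policy    group_by_prio
                prio_callout            "/sbin/mpath_prio_alua /dev/%n"
                hardware_handler        "1 alua"
                path_checker            tur
                failback                immediate
        }
}

With group_by_prio, paths reporting the same ALUA priority are grouped
together and IO is sent to the highest-priority (active/optimized) group.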
NetApp: Is this item blocking your cert for 5.5?
(In reply to comment #10)
> NetApp: Is this item blocking your cert for 5.5?

For NetApp, this is definitely a must-have feature for 5.6 and a good-to-have feature for 5.5.z.
Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
Previously, device-mapper multipath failed to update its maps after a path state transition. With this update, multipathd automatically updates the path groups when a path priority changes.
Reminder! There should be a fix present for this BZ in snapshot 3 -- unless otherwise noted in a previous comment. Please test and update this BZ with test results as soon as possible.
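One possible way to verify the fix on this setup (a sketch, reusing the device
and map names from the description): trigger the same path state transition on
the array and confirm that multipathd regroups the paths without any manual
reconfigure, for example:

# /sbin/mpath_prio_alua -v /dev/sdg    (should now report active/optimized, priority 50)
# multipath -ll mpath1                 (sdg should be the only path in the highest-priority, active path group)
# iostat 2                             (IO should be flowing through sdg alone)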
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHEA-2011-0074.html