Description of problem:
Events from udev can arrive in a different order than device mapper creates them. This can create a scenario where the current implementation of the multipath health report shows a wrong number of valid_paths. Eventually, the engine could create wrong events (a minimal simulation of the reordering is sketched at the end of this comment).

How reproducible:
Block several paths of the same device at the same time.

Steps to Reproduce:
1. Offline several paths of the same multipath device:

   echo "offline" > /sys/block/sdb/device/state
   echo "offline" > /sys/block/sdj/device/state
   echo "offline" > /sys/block/sdr/device/state

2. Check the multipath topology; all three paths are reported as failed:

   3514f0c5a516008d4 dm-1 XtremIO ,XtremApp
   size=150G features='0' hwhandler='0' wp=rw
   `-+- policy='queue-length 0' prio=0 status=active
     |- 3:0:0:1 sdb 8:16   failed faulty offline
     |- 4:0:0:1 sdj 8:144  failed faulty offline
     `- 5:0:0:1 sdr 65:16  failed faulty offline

Actual results:
multipath_health map: {'valid_paths': 2, 'failed_paths': [u'sdb', u'sdj', u'sdr']}

Expected results:
multipath_health map: {'valid_paths': 0, 'failed_paths': [u'sdb', u'sdj', u'sdr']}

Additional info:
This does not always reproduce.
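To make the reordering concrete, below is a minimal Python simulation of the race. It assumes each device-mapper udev event carries the map's event sequence number and current valid path count (along the lines of the DM_SEQNUM and DM_NR_VALID_PATHS properties); the class and method names are illustrative sketches, not vdsm's actual monitor code.

class MultipathStatus(object):
    def __init__(self):
        self.valid_paths = None
        self.failed_paths = set()
        self.last_seqnum = -1

    def handle_event_naive(self, seqnum, valid_paths, failed_path):
        # Applies whatever arrives last: a stale event overwrites the
        # newer, lower valid_paths value.
        self.failed_paths.add(failed_path)
        self.valid_paths = valid_paths

    def handle_event_ordered(self, seqnum, valid_paths, failed_path):
        # Ignores events older than the newest sequence number seen so far.
        self.failed_paths.add(failed_path)
        if seqnum > self.last_seqnum:
            self.last_seqnum = seqnum
            self.valid_paths = valid_paths

# Three paths fail almost simultaneously; device mapper generates events
# with sequence numbers 1..3, but udev delivers them reordered.
events = [(2, 1, "sdj"), (3, 0, "sdr"), (1, 2, "sdb")]

naive = MultipathStatus()
ordered = MultipathStatus()
for seqnum, valid, path in events:
    naive.handle_event_naive(seqnum, valid, path)
    ordered.handle_event_ordered(seqnum, valid, path)

print(naive.valid_paths)    # 2 -- wrong, the stale event was applied last
print(ordered.valid_paths)  # 0 -- expected

Applying events in plain arrival order ends with valid_paths=2 even though all three paths have failed, exactly as in the actual results above; discarding events whose sequence number is older than the newest one already seen yields the expected 0.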
Trying 4.2.1, this should be an easy fix.
--------------------------------------
Tested with the following code:
----------------------------------------
rhvm-4.2.1-0.2.el7.noarch
vdsm-4.20.11-1.el7ev.x86_64

Tested with the following scenario:

Steps to Reproduce:
1. echo offline to multipath devices state files

Actual results:
"multipathHealth": {
    "3514f0c5a51600283": {
        "valid_paths": 1,
        "failed_paths": [
            "sdb",
            "sdj",
            "sdz"
        ]
    }
}

Expected results:

Moving to VERIFIED!
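For completeness, here is a small hypothetical helper that inspects a multipathHealth map like the one above and flags devices that have lost some or all of their paths. The input shape is copied from the output in this comment; the function name is illustrative only.

def summarize_multipath_health(health):
    # Report devices with failed paths; highlight maps with no valid path left.
    for guid, info in health.items():
        valid = info.get("valid_paths", 0)
        failed = info.get("failed_paths", [])
        if valid == 0 and failed:
            print("%s: ALL paths failed (%s)" % (guid, ", ".join(failed)))
        elif failed:
            print("%s: %d valid path(s) left, failed: %s"
                  % (guid, valid, ", ".join(failed)))

health = {
    "3514f0c5a51600283": {
        "valid_paths": 1,
        "failed_paths": ["sdb", "sdj", "sdz"],
    },
}
summarize_multipath_health(health)
# 3514f0c5a51600283: 1 valid path(s) left, failed: sdb, sdj, sdz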
This bugzilla is included in the oVirt 4.2.1 release, published on February 12th 2018. Since the problem described in this bug report should be resolved in oVirt 4.2.1, it has been closed with a resolution of CURRENT RELEASE. If the solution does not work for you, please open a new bug report.