Bug 1526010
| Summary: | Storage: Incorrect valid_paths in multipath events in some cases when several paths change states at the same time | | |
|---|---|---|---|
| Product: | [oVirt] vdsm | Reporter: | Fred Rolland <frolland> |
| Component: | General | Assignee: | Fred Rolland <frolland> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Lilach Zitnitski <lzitnits> |
| Severity: | medium | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 4.20.15 | CC: | bmarzins, bugs, nsoffer |
| Target Milestone: | ovirt-4.2.1 | Flags: | rule-engine: ovirt-4.2+ |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2018-02-12 11:47:30 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | Storage | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Trying 4.2.1, this should be an easy fix.

--------------------------------------

Tested with the following code:
----------------------------------------
rhvm-4.2.1-0.2.el7.noarch
vdsm-4.20.11-1.el7ev.x86_64
Tested with the following scenario:
Steps to Reproduce:
1. echo "offline" to the state files of several paths of the same multipath device (see the sketch below)
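A minimal sketch of step 1, for reference only; the path names sdb, sdj and sdz are taken from the actual results below and must be replaced with the paths of the device under test (requires root):

# Fail several paths of the same multipath device at (almost) the same
# time by writing "offline" to their SCSI state files.
for dev in ("sdb", "sdj", "sdz"):
    with open("/sys/block/%s/device/state" % dev, "w") as f:
        f.write("offline")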
Actual results:
"multipathHealth": {
"3514f0c5a51600283": {
"valid_paths": 1,
"failed_paths": [
"sdb",
"sdj",
"sdz"
]
}
Expected results:
Moving to VERIFIED!
--------------------------------------

This bugzilla is included in the oVirt 4.2.1 release, published on Feb 12th 2018. Since the problem described in this bug report should be resolved in the oVirt 4.2.1 release, it has been closed with a resolution of CURRENT RELEASE. If the solution does not work for you, please open a new bug report.
Description of problem:
The events from udev can arrive in a different order than device mapper creates them. This can create a scenario in which the current implementation of the multipath health report shows a wrong number of valid_paths. As a result, the engine could create wrong events.

How reproducible:
Block several paths of the same device at the same time.

Steps to Reproduce:
1. echo "offline" > /sys/block/sdb/device/state
   echo "offline" > /sys/block/sdj/device/state
   echo "offline" > /sys/block/sdr/device/state

Multipath topology after the paths fail:

3514f0c5a516008d4 dm-1 XtremIO ,XtremApp
size=150G features='0' hwhandler='0' wp=rw
`-+- policy='queue-length 0' prio=0 status=active
  |- 3:0:0:1 sdb 8:16   failed faulty offline
  |- 4:0:0:1 sdj 8:144  failed faulty offline
  `- 5:0:0:1 sdr 65:16  failed faulty offline

Actual results:
multipath_health map: {'valid_paths': 2, 'failed_paths': [u'sdb', u'sdj', u'sdr']}

Expected results:
multipath_health map: {'valid_paths': 0, 'failed_paths': [u'sdb', u'sdj', u'sdr']}

Additional info:
This does not always reproduce.
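Below is a minimal, hypothetical sketch of the failure mode described above. It is not vdsm's actual multipath health code; the class, the method names and the assumption that each udev event carries a valid_paths count are made up only to illustrate why out-of-order event delivery can leave a stale count, and why recomputing the count from the set of failed paths is order-independent.

# Hypothetical illustration only -- not vdsm's real implementation.

class DeviceHealth(object):
    """Health state of one multipath device (e.g. 3514f0c5a516008d4)."""

    def __init__(self, total_paths):
        self.total_paths = total_paths
        self.failed_paths = set()
        self.valid_paths = total_paths

    def on_failed_path_event(self, path, event_valid_paths):
        # Fragile: trust the valid_paths value carried by each event.
        # If the udev events arrive in a different order than device
        # mapper created them, the last event applied carries a stale
        # count.
        self.failed_paths.add(path)
        self.valid_paths = event_valid_paths

    def recomputed_valid_paths(self):
        # Order-independent: derive the count from the set of failed
        # paths instead of trusting per-event counters.
        return self.total_paths - len(self.failed_paths)


if __name__ == "__main__":
    health = DeviceHealth(total_paths=3)
    # Three paths fail at the same time, but the event created first
    # (when 2 paths were still valid) is delivered last.
    for path, event_valid_paths in [("sdj", 1), ("sdr", 0), ("sdb", 2)]:
        health.on_failed_path_event(path, event_valid_paths)
    print({"valid_paths": health.valid_paths,               # 2 -- wrong
           "failed_paths": sorted(health.failed_paths)})
    print({"valid_paths": health.recomputed_valid_paths(),  # 0 -- expected
           "failed_paths": sorted(health.failed_paths)})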