Bug 1786999

Summary: Multipath status changes are not displayed in Engine events
Product: [oVirt] ovirt-engine
Component: BLL.Storage
Version: 4.4.0
Status: CLOSED CURRENTRELEASE
Severity: medium
Priority: unspecified
Hardware: Unspecified
OS: Unspecified
Reporter: Amit Bawer <abawer>
Assignee: Ahmad Khiet <akhiet>
QA Contact: Evelina Shames <eshames>
CC: bugs, eshenitz, frolland, michal.skrivanek, nsoffer, tnisan
Target Milestone: ovirt-4.4.0
Flags: pm-rhel: ovirt-4.4+
oVirt Team: Storage
Type: Bug
Last Closed: 2020-05-20 19:57:45 UTC

Description Amit Bawer 2019-12-30 09:12:06 UTC
Description of problem:

Multipath status changes are not displayed in the engine events, even though
the multipathHealth information is passed to the engine by the GetStats API.


Version-Release number of selected component (if applicable):


ovirt-engine-4.4.0-0.0.master.20191222221018.gitc154119.el7.noarch
vdsm-4.40.0-1441.git37590dc04.el8.x86_64

How reproducible: 100%


Steps to Reproduce:

1. Change the multipath device status on the host:

echo offline > /sys/dev/block/X\:Y/device/state
where X, Y are the device major and minor numbers respectively.

2. vdsm.log will show the failing paths in the multipath health report.

3. engine.log in debug mode will show the RPC content with the received multipathHealth status.

4. Multipath messages only appear in the Engine events log if the host is re-activated.

Actual results:

The event viewer in the engine web UI does not show the multipath status change message
unless the host is re-activated (i.e. put into maintenance and activated again).

Expected results:

The event viewer should show status change messages accordingly, for example:

    Moving from "all paths are ok" state to "some paths are failed" - here we want to be able to warn the user about trouble.

    Change in the number of failed paths - again we can give a useful event about the state becoming worse or better.

    Moving from "some paths are failed" to "all paths are failed" - 20 seconds after that event, VMs will start to pause or have I/O errors.

    Moving from "all paths are failed" to "some paths are failed" - here we can resume paused VMs that use the multipath device.

    Moving from "some paths are failed" to "all paths are ok" - here we can tell the user that everything is back to normal.
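
The transitions listed above could be classified roughly as in the following sketch. It is only an illustration of the expected behaviour, not the engine's code: the class MultipathTransition, its enums, and the valid/failed path counts as inputs are all hypothetical names and assumptions made for this example.

```java
import java.util.Optional;

/** Hypothetical helper, not part of ovirt-engine: classifies a change in path counts. */
final class MultipathTransition {
    enum State { ALL_OK, SOME_FAILED, ALL_FAILED }

    enum Event {
        PATHS_DEGRADED,            // all ok -> some failed
        MORE_PATHS_FAILED,         // some failed, count went up
        FEWER_PATHS_FAILED,        // some failed, count went down
        ALL_PATHS_FAILED,          // any state -> all failed
        PATHS_PARTIALLY_RESTORED,  // all failed -> some failed
        ALL_PATHS_RESTORED         // any state -> all ok
    }

    /** Assumption: a device with no failed paths counts as "all ok". */
    static State stateOf(int validPaths, int failedPaths) {
        if (failedPaths == 0) {
            return State.ALL_OK;
        }
        return validPaths == 0 ? State.ALL_FAILED : State.SOME_FAILED;
    }

    /** Returns the event to report, or empty if nothing worth reporting changed. */
    static Optional<Event> classify(int prevValid, int prevFailed, int curValid, int curFailed) {
        State prev = stateOf(prevValid, prevFailed);
        State cur = stateOf(curValid, curFailed);

        if (prev == cur) {
            // Same state: only a change in the number of failed paths is interesting.
            if (cur == State.SOME_FAILED && curFailed != prevFailed) {
                return Optional.of(curFailed > prevFailed
                        ? Event.MORE_PATHS_FAILED
                        : Event.FEWER_PATHS_FAILED);
            }
            return Optional.empty();
        }

        switch (cur) {
            case ALL_OK:
                return Optional.of(Event.ALL_PATHS_RESTORED);
            case ALL_FAILED:
                return Optional.of(Event.ALL_PATHS_FAILED);
            default:
                // cur is SOME_FAILED: direction depends on where we came from.
                return Optional.of(prev == State.ALL_OK
                        ? Event.PATHS_DEGRADED
                        : Event.PATHS_PARTIALLY_RESTORED);
        }
    }
}
```

Direct jumps that the list does not mention (for example "all paths are ok" straight to "all paths are failed") are mapped here by their target state, which is a choice made for this sketch only.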


Additional info:

I suspect that the MultipathHealthHandler.java flow [1] silently discards the status updates.

[1] https://github.com/oVirt/ovirt-engine/blob/ede62008318d924556bc9dfc5710d90e9519670d/backend/manager/modules/vdsbroker/src/main/java/org/ovirt/engine/core/vdsbroker/vdsbroker/MultipathHealthHandler.java#L41
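
To illustrate what "not discarding" could mean, here is a minimal sketch that remembers the last report per host and returns a message whenever a device's failed-path count changes between two stats reports. This is not the code behind [1] and not necessarily how the fix was done. MultipathHealthTracker, its update method, and the device-to-failed-count map are hypothetical, and the sketch assumes that a device missing from a report has no failed paths.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch, not the real MultipathHealthHandler. */
final class MultipathHealthTracker {
    // hostId -> (device id -> failed path count) as of the last report
    private final Map<String, Map<String, Integer>> lastReport = new HashMap<>();

    /**
     * Compares the new report with the previous one for this host and returns
     * one message per device whose failed-path count changed.
     */
    List<String> update(String hostId, Map<String, Integer> failedPathsByDevice) {
        Map<String, Integer> previous =
                lastReport.getOrDefault(hostId, Collections.<String, Integer>emptyMap());
        List<String> events = new ArrayList<>();

        // Devices in the current report: report any change in the failed count.
        for (Map.Entry<String, Integer> entry : failedPathsByDevice.entrySet()) {
            int before = previous.getOrDefault(entry.getKey(), 0);
            int now = entry.getValue();
            if (now != before) {
                events.add(String.format("%s: failed paths %d -> %d", entry.getKey(), before, now));
            }
        }

        // Assumption: a device that dropped out of the report is healthy again.
        for (Map.Entry<String, Integer> entry : previous.entrySet()) {
            if (!failedPathsByDevice.containsKey(entry.getKey()) && entry.getValue() > 0) {
                events.add(String.format("%s: failed paths %d -> 0", entry.getKey(), entry.getValue()));
            }
        }

        // Keep a copy so later changes to the caller's map do not affect the comparison.
        lastReport.put(hostId, new HashMap<>(failedPathsByDevice));
        return events;
    }
}
```

With state kept this way, a change shows up on the next stats report rather than only when the host is re-activated.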

Comment 2 Sandro Bonazzola 2020-05-20 19:57:45 UTC
This bugzilla is included in oVirt 4.4.0 release, published on May 20th 2020.

Since the problem described in this bug report should be
resolved in oVirt 4.4.0 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.