Bug 1542886

Summary: [downstream clone - 4.1.10] "Failed to determine the metadata devices of Storage Domain" error is shown for every storage domain in 4.1 environment with 4.0 hosts
Product: Red Hat Enterprise Virtualization Manager Reporter: rhev-integ
Component: ovirt-engine Assignee: Tal Nisan <tnisan>
Status: CLOSED ERRATA QA Contact: Elad <ebenahar>
Severity: medium Docs Contact:
Priority: high    
Version: 4.1.8 CC: amureini, audgiri, lsurette, ratamir, rbalakri, Rhev-m-bugs, srevivo, tnisan, ykaul, ylavi
Target Milestone: ovirt-4.1.10 Keywords: Regression, ZStream
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1542034 Environment:
Last Closed: 2018-03-20 16:37:08 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Storage RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1542034    
Bug Blocks:    

Description rhev-integ 2018-02-07 09:38:39 UTC
+++ This bug is a downstream clone. The original bug is: +++
+++   bug 1542034 +++
======================================================================

Description of problem:

"Failed to determine the metadata devices of Storage Domain" is shown in the events panel for every storage domain in a 4.1 environment that has hosts running a vdsm version earlier than 4.19.x.

"StorageDomain.getInfo" returns "vgMetadataDevice" and "metadataDevice" only from vdsm 4.19.x onward. vdsm 4.18.x does not report these values, as seen below.

===
jsonrpc.Executor/5::DEBUG::2018-02-05 17:56:42,685::__init__::555::jsonrpc.JsonRpcServer::(_handle_request) Return 'StorageDomain.getInfo' in bridge with {'uuid': 'ad063d42-ba9e-4515-929d-3f6dd0c12628', 'vguuid': 'X7CUCg-yFp6-juim-4WUZ-ROFV-3sRd-GKsAWz', 'state': 'OK', 'version': '3', 'role': 'Master', 'type': 'ISCSI', 'class': 'Data', 'pool': ['dae2555d-f30e-4660-8581-63cc16474d5d'], 'name': 'test_storage'}
===
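
The missing fields can be confirmed directly from the reply. The following is a minimal Python sketch, not vdsm or engine code: it takes the reply dictionary copied from the 4.18.x log above and lists which of the two metadata keys are absent. The helper name `missing_metadata_keys` is hypothetical.

```python
# Keys that, per this report, only vdsm 4.19.x and later include in the reply.
METADATA_KEYS = ("vgMetadataDevice", "metadataDevice")


def missing_metadata_keys(info):
    """Return the metadata keys that are absent or None in a getInfo-style reply."""
    return [key for key in METADATA_KEYS if info.get(key) is None]


# Reply copied from the vdsm 4.18.x log above.
reply_4_18 = {
    'uuid': 'ad063d42-ba9e-4515-929d-3f6dd0c12628',
    'vguuid': 'X7CUCg-yFp6-juim-4WUZ-ROFV-3sRd-GKsAWz',
    'state': 'OK',
    'version': '3',
    'role': 'Master',
    'type': 'ISCSI',
    'class': 'Data',
    'pool': ['dae2555d-f30e-4660-8581-63cc16474d5d'],
    'name': 'test_storage',
}

print(missing_metadata_keys(reply_4_18))
```

For the 4.18.x reply both keys are reported as missing, which is the condition that leads the engine to raise the event.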

As per https://gerrit.ovirt.org/#/c/77085/, the engine raises the error mentioned above if either vgMetadataDevice or metadataDevice is null. Since vdsm 4.18.x doesn't return any values for these fields, we get the error below in the engine log and the events panel.

===
2018-02-05 07:26:42,333-05 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetStorageDomainInfoVDSCommand] (org.ovirt.thread.pool-6-thread-11) [2c9783ac] START, HSMGetStorageDomainInfoVDSCommand(HostName = 10.74.130.111, HSMGetStorageDomainInfoVDSCommandParameters:{runAsync='true', hostId='dba8d52f-1524-4f6c-8e30-0115de2990a4', storageDomainId='92227e03-01c8-4f41-999e-032175f20116'}), log id: 7c3b7ca1

2018-02-05 07:26:42,336-05 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetStorageDomainInfoVDSCommand] (org.ovirt.thread.pool-6-thread-10) [22fc14b7] FINISH, HSMGetStorageDomainInfoVDSCommand, return: <StorageDomainStatic:{name='test_storage', id='ad063d42-ba9e-4515-929d-3f6dd0c12628'}, dae2555d-f30e-4660-8581-63cc16474d5d>, log id: 79e9e79f

2018-02-05 07:26:42,366-05 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (org.ovirt.thread.pool-6-thread-10) [22fc14b7] EVENT_ID: FAILED_DETERMINE_STORAGE_DOMAIN_METADATA_DEVICES(1,219), Correlation ID: null, Call Stack: null, Custom ID: null, Custom Event ID: -1, Message: Failed to determine the metadata devices of Storage Domain test_storage.
===
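
To see which storage domains triggered the event, engine.log can be scanned for the event ID. The following Python sketch is only an illustration: the regular expression is derived from the single ERROR line quoted above, and the helper name `affected_domains` is hypothetical.

```python
import re

# Pattern based on the AuditLogDirector line quoted in this report; other
# engine versions may format the message differently.
EVENT_RE = re.compile(
    r"FAILED_DETERMINE_STORAGE_DOMAIN_METADATA_DEVICES"
    r".*Storage Domain (?P<name>.+?)\.\s*$"
)


def affected_domains(lines):
    """Return the sorted, de-duplicated storage domain names found in the log lines."""
    names = set()
    for line in lines:
        match = EVENT_RE.search(line)
        if match:
            names.add(match.group("name"))
    return sorted(names)


sample = [
    "2018-02-05 07:26:42,366-05 ERROR "
    "[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] "
    "(org.ovirt.thread.pool-6-thread-10) [22fc14b7] EVENT_ID: "
    "FAILED_DETERMINE_STORAGE_DOMAIN_METADATA_DEVICES(1,219), Correlation ID: null, "
    "Call Stack: null, Custom ID: null, Custom Event ID: -1, Message: "
    "Failed to determine the metadata devices of Storage Domain test_storage.",
]

print(affected_domains(sample))
```

In practice the same function can be fed an open file handle for engine.log instead of the sample list.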

However, the storage domain has the correct metadata.

===
pvs -o +pv_mda_free,pv_mda_count,pv_mda_used_count
  PV                                            VG                                   Fmt  Attr PSize  PFree  PMdaFree  #PMda #PMdaUse
  /dev/mapper/360014053f404fa44d844d9198cfee437 92227e03-01c8-4f41-999e-032175f20116 lvm2 a--  48.62g 42.38g        0      2        2
===
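
The pvs output above can also be checked programmatically. This is a small Python sketch that parses the text exactly as pasted in this report; it assumes every column is populated, and the helper name `parse_pvs` is hypothetical.

```python
def parse_pvs(text):
    """Parse whitespace-separated pvs output (header line plus rows) into dicts.

    Assumes no column is empty, which holds for the output in this report.
    """
    lines = [line.split() for line in text.splitlines() if line.strip()]
    header, rows = lines[0], lines[1:]
    return [dict(zip(header, row)) for row in rows]


# Output copied from the report.
PVS_OUTPUT = """\
  PV                                            VG                                   Fmt  Attr PSize  PFree  PMdaFree  #PMda #PMdaUse
  /dev/mapper/360014053f404fa44d844d9198cfee437 92227e03-01c8-4f41-999e-032175f20116 lvm2 a--  48.62g 42.38g        0      2        2
"""

for pv in parse_pvs(PVS_OUTPUT):
    print(pv["PV"], "metadata areas in use:", int(pv["#PMdaUse"]))
```

A non-zero #PMdaUse value means the PV holds metadata areas that are in use, so the metadata is present on storage even though the host does not report the device to the engine.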


Version-Release number of selected component (if applicable):

vdsm 4.18.x
rhevm-4.1.8.2-0.1.el7.noarch


How reproducible:

100%

Steps to Reproduce:

1. Add 4.0 hosts to a 4.1 environment. The error is shown every time you activate the storage domain or restart the ovirt-engine service.


Actual results:

An incorrect error is shown in the events panel of RHV-M when hosts run an older vdsm version.

(Originally by Nijin Ashok)

Comment 1 rhev-integ 2018-02-07 09:38:46 UTC
I assume it's a regression?

(Originally by Yaniv Kaul)

Comment 4 rhev-integ 2018-02-07 09:39:00 UTC
This error is in fact bogus and should be removed.
The reduce VG command was introduced in 4.1, and as a prerequisite the engine
needed to know which device is the metadata device; this info was refreshed on
every sync of the storage domain LUNs.
With an engine that supports reduce and a cluster version lower than 4.1, the
hosts will not report the metadata device, but the engine will still try to
parse this data and will issue an error since the data is not there.
For that reason I'm reducing the severity to medium.

(Originally by Tal Nisan)
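
The reasoning in this comment can be sketched as a version gate. The actual patch shipped in 4.1.10 is not shown in this report, and the engine is not written in Python, so the following is only an assumed illustration of one way to suppress the bogus event: skip the check when the cluster compatibility version is lower than 4.1. All names here (`parse_version`, `should_report_missing_metadata`, `REDUCE_MIN_VERSION`) are hypothetical.

```python
# Assumption: reduce VG, and therefore metadata device reporting, applies
# only to cluster compatibility version 4.1 and later (per the comment above).
REDUCE_MIN_VERSION = (4, 1)


def parse_version(text):
    """Turn a dotted version string such as '4.0' into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


def should_report_missing_metadata(cluster_version, info):
    """Decide whether a missing metadata device is worth an error event."""
    if parse_version(cluster_version) < REDUCE_MIN_VERSION:
        # Older hosts never report these fields, so their absence is expected.
        return False
    return (
        info.get("vgMetadataDevice") is None
        or info.get("metadataDevice") is None
    )


print(should_report_missing_metadata("4.0", {"name": "test_storage"}))
print(should_report_missing_metadata("4.1", {"name": "test_storage"}))
```

With a gate like this, a 4.0 or 3.6 cluster would no longer produce the event, while a 4.1 cluster whose host genuinely fails to report the devices still would.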

Comment 6 Elad 2018-02-20 15:15:01 UTC
Added a 3.6 host to a 3.6 cluster in a 4.1 setup with storage domains in the DC. 
"Failed to determine the metadata devices of Storage Domain" message is not shown in engine.log.


Used:
rhevm-4.1.10-0.1.el7.noarch
vdsm-4.17.44-2.el7ev.noarch

Comment 9 errata-xmlrpc 2018-03-20 16:37:08 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0562

Comment 10 Franta Kust 2019-05-16 13:04:14 UTC
BZ<2>Jira Resync