Bug 1803597

Summary: rhv-image-discrepancies should skip storage domains in maintenance mode and ISO/Export
Product: Red Hat Enterprise Virtualization Manager
Reporter: Germano Veit Michel <gveitmic>
Component: rhv-log-collector-analyzer
Assignee: Benny Zlotnik <bzlotnik>
Status: CLOSED ERRATA
QA Contact: Avihai <aefrat>
Severity: medium
Docs Contact:
Priority: unspecified
Version: 4.3.8
CC: aefrat, bzlotnik, gwatson, rdlugyhe
Target Milestone: ovirt-4.4.0
Keywords: FieldEngineering
Target Release: ---
Flags: bzlotnik: needinfo-
       bzlotnik: needinfo-
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version: rhv-log-collector-analyzer-1.0.0-1.el8ev
Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2020-08-04 13:21:50 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Integration
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Germano Veit Michel 2020-02-17 01:13:45 UTC
Description of problem:

The tool attempts to run vdsm-tool dump-volume-chains on all Storage Domains, not just the active ones, but dump-volume-chains only works for connected storage domains.

# rhv-image-discrepancies
IT IS HIGHLY RECOMMENDED YOU RUN THIS WITH RED HAT SUPPORT INVOLVED
Do you want to continue? [y/N]
y
Using host host1.kvm on DC df14f0d4-365b-11ea-83b7-52540019c104 to run the storage check
Collecting dump from SD c54f714f-17fa-48da-b38f-c08b1d9d69c2...
Traceback (most recent call last):
  File "/usr/bin/vdsm-tool", line 220, in main
    return tool_command[cmd]["command"](*args)
  File "/usr/lib/python2.7/site-packages/vdsm/tool/dump_volume_chains.py", line 92, in dump_chains
    volumes_info = _get_volumes_info(cli, parsed_args.sd_uuid)
  File "/usr/lib/python2.7/site-packages/vdsm/tool/dump_volume_chains.py", line 180, in _get_volumes_info
    images_uuids = cli.StorageDomain.getImages(storagedomainID=sd_uuid)
  File "/usr/lib/python2.7/site-packages/vdsm/client.py", line 303, in _call
    method, kwargs, resp.error.code, str(resp.error))
ServerError: Command StorageDomain.getImages with args {'storagedomainID': 'c54f714f-17fa-48da-b38f-c08b1d9d69c2'} failed:
(code=358, message=Storage domain does not exist: (u'c54f714f-17fa-48da-b38f-c08b1d9d69c2',))

Version-Release number of selected component (if applicable):
rhv-log-collector-analyzer-0.2.15-0.el7ev.noarch

How reproducible:
Always

Steps to Reproduce:
1. Have SD in maintenance mode
2. Run tool

Adding a status check to this query should work:
SDS_PER_DC = ("SELECT id "
              "FROM storage_domains "
              "WHERE storage_pool_id = '{storage_pool_id}'; ")
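A minimal sketch of how the query could be amended, assuming the `storage_domains` view exposes `status` and `storage_domain_type` columns; the enum values below are placeholders and must be checked against the engine schema before use:

```python
# Hypothetical sketch: restrict the dump to active data domains only.
# The numeric values are assumptions, not verified against the engine DB.
ACTIVE_STATUS = 3   # assumed value of the 'Active' storage domain status
ISO_TYPE = 2        # assumed value of the ISO domain type
EXPORT_TYPE = 3     # assumed value of the Export domain type

SDS_PER_DC = ("SELECT id "
              "FROM storage_domains "
              "WHERE storage_pool_id = '{storage_pool_id}' "
              "AND status = %d "
              "AND storage_domain_type NOT IN (%d, %d); "
              % (ACTIVE_STATUS, ISO_TYPE, EXPORT_TYPE))
```

This would make the tool skip maintenance-mode domains (the status filter) as well as ISO/Export domains (the type filter), covering both problems reported here in one place.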

Comment 1 Germano Veit Michel 2020-02-17 01:14:58 UTC
It is also getting ISO/Export, we should skip those.

Comment 2 Gordon Watson 2020-02-28 15:36:50 UTC
Just to elaborate on what Germano said in comment #1: the tool does check an ISO domain, but it doesn't report any anomalies, e.g.

 Checking storage domain 08a2be2a-f889-4c61-8838-556f73c9cee0
	Looking for missing images...
	No missing images found
	Checking discrepancies between SD/DB attributes...
	No discrepancies found


However, for an Export domain that contains exported VMs, the tool reports any volumes it finds there as missing from the DB, e.g.

 Checking storage domain c2eee4fc-d44d-483f-ada5-dea0ae15c2f6
	Looking for missing images...
	missing images in DB: 	
	54862871-47cf-4fb4-be3d-56856514aca8
	feedec8a-d2fe-46e4-9cb1-5209a9a0abea
	49ed25b1-dcaa-4316-a9a7-5502021c9768
	4391b4ef-c7fb-470c-844f-320b04a0e6a3
	06d818d0-b7bd-4d2f-8cfd-1c081fe59c5b
	a35dfd65-4b36-4384-abeb-fd20ad579ac0 


This is not a big deal and can obviously be easily explained, but it could lead to false alarms.

Comment 8 Avihai 2020-04-21 10:54:20 UTC
Verified on ovirt-engine 4.4.0-0.33.master.el8ev (rhv-4.4.0-31).

Details:
Checked with 11 Storage domains:

3X ISCSI
3X NFS
3X gluster
1X export 
1x ISO

Ran the tool from engine machine and checked that:
1) ONLY active storage domains were in the list (moved some SDs to maintenance and reran the tool)
2) Export and ISO storage domain were not in the list at any point.
3) No error was seen in the output of the tool like in the initial description.

Comment 9 Avihai 2020-04-21 10:58:45 UTC
(In reply to Avihai from comment #8)
> Verified on ovirt-engine 4.4.0-0.33.master.el8ev (rhv-4.4.0-31).
> 
> Details:
> Checked with 11 Storage domains:
> 
> 3X ISCSI
> 3X NFS
> 3X gluster
> 1X export 
> 1x ISO
> 
> Ran the tool from engine machine and checked that:
> 1) ONLY active storage domains were in the list (moved some SDs to
> maintenance and reran the tool)
> 2) Export and ISO storage domain were not in the list at any point.
> 3) No error was seen in the output of the tool like in the initial
> description.

Benny it looks like this bug was fixed, I verified it.
Can you please add the relevant fix patch to this bug?

Comment 13 errata-xmlrpc 2020-08-04 13:21:50 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: RHV Manager (ovirt-engine) 4.4 security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:3247