Bug 1870435 - StorageDomain.dump() can return {"key" : None} if metadata is missing
Summary: StorageDomain.dump() can return {"key" : None} if metadata is missing
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: vdsm
Version: 4.4.1
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: low
Target Milestone: ovirt-4.4.5
Target Release: ---
Assignee: Roman Bednář
QA Contact: Evelina Shames
URL:
Whiteboard:
Depends On:
Blocks: 1839444
 
Reported: 2020-08-20 05:24 UTC by Germano Veit Michel
Modified: 2021-11-04 19:28 UTC
CC: 8 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-04-14 11:38:43 UTC
oVirt Team: Storage
Target Upstream Version:
Embargoed:




Links
Red Hat Product Errata RHSA-2021:1184 (2021-04-14 11:39:25 UTC)
oVirt gerrit 113133 (master, ABANDONED): tests: convert vdsmdumpchains_test to py3 (2021-02-09 13:38:57 UTC)
oVirt gerrit 113134 (master, MERGED): tests: convert vdsmdumpchains_test to pytest (2021-02-09 13:38:57 UTC)
oVirt gerrit 113135 (master, ABANDONED): tests: convert vdsmdumpchains methods to functions (2021-02-09 13:38:57 UTC)
oVirt gerrit 113136 (master, MERGED): blockSD: omit missing keys when dumping sd (2021-02-19 17:21:02 UTC)
oVirt gerrit 113468 (master, MERGED): dump_volume_chains: remove workarounds (2021-02-22 09:47:51 UTC)
oVirt gerrit 113600 (master, ABANDONED): tests: add test for sd dump missing keys (2021-02-19 16:59:42 UTC)

Description Germano Veit Michel 2020-08-20 05:24:44 UTC
Description of problem:

See below: the keys 'parent' and 'image' are present but their value is null (Python None):

# vdsm-client StorageDomain dump sd_id=82e9c212-e4c5-4560-9ae0-bcb7b7521065
{
....
    "volumes": {
        "2ebcc4f9-5a1e-463e-9ee2-f478509c14a1": {
            "apparentsize": 1073741824,
            "image": null,        <-----
            "mdslot": 3,
            "parent": null,       <-----
            "status": "INVALID",
            "truesize": 1073741824
        },


According to our discussion in https://gerrit.ovirt.org/#/c/109325/, VDSM should not return None.
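
For context, the nulls come from Python None values being serialized into the JSON response. A minimal sketch of the effect (plain Python; the lookup function is a hypothetical stand-in, not vdsm's actual code):

import json

# Hypothetical stand-in for vdsm's volume info lookup: fields that
# cannot be resolved from the metadata LV or the LV tags come back
# as Python None.
def lookup_volume_info():
    return {
        "apparentsize": 1073741824,
        "image": None,     # lookup failed
        "mdslot": 3,
        "parent": None,    # lookup failed
        "status": "INVALID",
        "truesize": 1073741824,
    }

# json.dumps turns Python None into JSON null, which is exactly
# what vdsm-client prints above.
print(json.dumps(
    {"volumes": {"2ebcc4f9-5a1e-463e-9ee2-f478509c14a1": lookup_volume_info()}},
    indent=4, sort_keys=True))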

Metadata:

# lvs -o lv_name,tags | egrep 'LV|2ebcc'
  LV                                   LV Tags                                                                             
  2ebcc4f9-5a1e-463e-9ee2-f478509c14a1 MD_3  

# dd if=/dev/82e9c212-e4c5-4560-9ae0-bcb7b7521065/metadata bs=8k count=1 skip=131
CAP=10737418240
CTIME=1596691868
DESCRIPTION=
DISKTYPE=DATA
DOMAIN=82e9c212-e4c5-4560-9ae0-bcb7b7521065
FORMAT=COW
GEN=0
LEGALITY=LEGAL
PUUID=
TYPE=SPARSE
VOLTYPE=LEAF
EOF
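
(For reference: the skip=131 above follows from the MD_3 tag, assuming the v5 block-domain layout where 8 KiB volume metadata slots start at a 1 MiB offset into the metadata LV: 1 MiB / 8 KiB = 128 blocks, so slot 3 is block 128 + 3 = 131 with bs=8k.)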

Comment 1 Germano Veit Michel 2020-08-20 05:26:00 UTC
Version was: vdsm-4.40.24-1.gitd177ff577.el8.x86_64

Comment 2 Nir Soffer 2020-08-21 00:17:53 UTC
Did you hit this issue with real storage, or is it a result
of modifying good metadata?

Comment 3 Germano Veit Michel 2020-08-21 00:36:03 UTC
(In reply to Nir Soffer from comment #2)
> Did you hit this issue with real storage, or is it a result
> of modifying good metadata?

That's me modifying good metadata while testing dump-volume-chains for missing and corrupted keys.

Comment 5 Roman Bednář 2021-01-26 13:17:24 UTC
The reproducer is not obvious, so I am leaving a few notes here for future reference; they might come in handy when verifying.

When dumping information about a block storage domain, vdsm first attempts to read the parent, image, and metadata slot number from
the metadata LV. If some of those are missing, it falls back to the LVM tags on the data LV (prefixed with PU_, IU_, and MD_).
If both lookups fail, a 'null' value is shown where the value is missing.
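
A minimal sketch of that lookup order (hypothetical helper names, not vdsm's real code; the tag prefixes match the ones above):

# Hypothetical fallback lookup: md_record is the parsed slot from the
# metadata LV (or None if unreadable), lv_tags are the tags on the
# data LV (e.g. "MD_3", "PU_<uuid>", "IU_<uuid>").
def lookup_field(md_record, lv_tags, md_key, tag_prefix):
    # First try the key in the volume's metadata slot.
    if md_record is not None and md_key in md_record:
        return md_record[md_key]
    # Then fall back to the matching LV tag.
    for tag in lv_tags:
        if tag.startswith(tag_prefix):
            return tag[len(tag_prefix):]
    # Both lookups failed: before the fix this None ended up as
    # "null" in the dump output.
    return None

# Corrupted metadata LV (no record) and the PU_ tag removed:
print(lookup_field(None, ["MD_3"], "PUUID", "PU_"))  # -> None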

So to reproduce, the metadata LV has to be corrupted first (harsh, but it works):

# dd if=/dev/random of=/dev/<VG_NAME>/metadata

Then remove some of the tags (the parent UUID in this case), overriding the lvm filter:

# lvchange --config 'devices/filter=[ "a|.*|" ]' --deltag "PU_00000000-0000-0000-0000-000000000000" <VG_NAME>/<LV_NAME>

Finally, dump the domain info; you should see a parent value of 'null':

# vdsm-client StorageDomain dump sd_id=<VG_NAME>

Example output:
...
    "volumes": {
        "0bf8695a-0d18-44b0-a704-d84c664cf0f7": {
            "apparentsize": 1073741824,
            "image": "13d6169e-37d2-498c-9c3d-b752d7358861",
            "mdslot": 3,
            "parent": null,
            "status": "INVALID",
            "truesize": 1073741824
        },
...

Comment 6 Roman Bednář 2021-02-04 10:22:06 UTC
I found an easier and less destructive way of reproducing the issue that can be used to verify the fix. Instead of destroying the metadata LV with dd, create a new LV in the domain's VG that vdsm does not know about: its volume information can be found neither in the metadata LV nor in the LV tags of the new LV. The example below shows how to do this by simply creating an LVM snapshot:

BEFORE FIX:

# lvcreate -s -L128m -n metadata_backup1 1e475c8a-8d1b-4eb6-ba2a-a95b26d568d6/metadata --config='devices/filter = ["a|.*|"]'
  Logical volume "metadata_backup1" created.
[root@host-vm ~]# vdsm-client StorageDomain dump sd_id=1e475c8a-8d1b-4eb6-ba2a-a95b26d568d6
{
    "metadata": {
        "alignment": 1048576,
        "block_size": 512,
        "class": "Data",
        "metadataDevice": "36001405dfd35bbfd15a472089cecec0c",
        "name": "block_iscsi_domain",
        "pool": [
            "4fc1c8aa-6245-11eb-83ae-525400ea2c38"
        ],
        "role": "Master",
        "state": "OK",
        "type": "ISCSI",
        "uuid": "1e475c8a-8d1b-4eb6-ba2a-a95b26d568d6",
        "version": "5",
        "vgMetadataDevice": "36001405dfd35bbfd15a472089cecec0c",
        "vguuid": "7iZgSK-GtaD-ZX8P-H364-S4aq-dqSe-3IWmjF"
    },
    "volumes": {
        "metadata_backup1": {
            "image": null,
            "parent": null,
            "status": "INVALID"
        }
    }
}


AFTER FIX:

# Restart vdsmd after applying the patch:
[root@host-vm ~]# systemctl restart vdsmd

# Missing values (image and parent) no longer show up in the volumes dump:
[root@host-vm ~]# vdsm-client StorageDomain dump sd_id=1e475c8a-8d1b-4eb6-ba2a-a95b26d568d6
{
    "metadata": {
        "alignment": 1048576,
        "block_size": 512,
        "class": "Data",
        "metadataDevice": "36001405dfd35bbfd15a472089cecec0c",
        "name": "block_iscsi_domain",
        "pool": [
            "4fc1c8aa-6245-11eb-83ae-525400ea2c38"
        ],
        "role": "Master",
        "state": "OK",
        "type": "ISCSI",
        "uuid": "1e475c8a-8d1b-4eb6-ba2a-a95b26d568d6",
        "version": "5",
        "vgMetadataDevice": "36001405dfd35bbfd15a472089cecec0c",
        "vguuid": "7iZgSK-GtaD-ZX8P-H364-S4aq-dqSe-3IWmjF"
    },
    "volumes": {
        "metadata_backup1": {
            "status": "INVALID"
        }
    }
}
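
For reference, the merged fix ("blockSD: omit missing keys when dumping sd", gerrit 113136) drops unresolved keys instead of reporting null. Conceptually it works something like this sketch (an illustration, not the actual vdsm code):

def dump_volume(vol_info):
    # Omit keys whose value could not be resolved from the metadata
    # LV or the LV tags, instead of emitting null for them.
    return {k: v for k, v in vol_info.items() if v is not None}

# The unknown snapshot LV from the reproducer above: only the
# resolvable "status" field survives.
print(dump_volume({"image": None, "parent": None, "status": "INVALID"}))
# -> {'status': 'INVALID'}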

Comment 7 Roman Bednář 2021-02-22 11:34:40 UTC
Patches have been merged to master and should be available in the next vdsm build: 4.40.50.7

Comment 13 Evelina Shames 2021-03-03 12:05:34 UTC
(In reply to Roman Bednář from comment #6)
> I found an easier and less destructive way of reproducing the issue that can
> be used to verify the fix. [...]

Verified on vdsm-4.40.50.7-1.el8ev.x86_64 with the above steps.

Moving to 'VERIFIED'.

Comment 18 errata-xmlrpc 2021-04-14 11:38:43 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: RHV RHEL Host (ovirt-host) 4.4.z [ovirt-4.4.5] security, bug fix, enhancement), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:1184

