Bug 1372169 - [RFE] Include volUUID in storage domain metadata
Summary: [RFE] Include volUUID in storage domain metadata
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: RFEs
Version: 4.0.2
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Rob Young
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2016-09-01 06:04 UTC by Germano Veit Michel
Modified: 2019-05-16 13:07 UTC
CC List: 8 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-06-05 12:55:19 UTC
oVirt Team: Storage
Target Upstream Version:



Description Germano Veit Michel 2016-09-01 06:04:34 UTC
1. Proposed title of this feature request

In storage domain metadata, include volUUID for each volume metadata.

2. Who is the customer behind the request?

Red Hat - GSS 

3. What is the nature and description of the request?

Currently, the metadata information for each volume looks like this:

DOMAIN=419ffff2-d21d-48a5-bfe1-d90bbf9dc947 <--- storage domain
VOLTYPE=INTERNAL
CTIME=1460990328
FORMAT=COW
IMAGE=8b95ac24-d6ea-47fd-9f92-13d5703a4d77  <--- image group
DISKTYPE=2
PUUID=3d455a17-23b1-4c66-9a7c-0622885ea128  <--- parent in chain
LEGALITY=LEGAL
MTIME=0
POOL_UUID=
DESCRIPTION=
TYPE=SPARSE
SIZE=629145600

It does not contain the UUID of the volume the metadata belongs to. Of course, this is encoded in the <volUUID>.meta file name, or in the respective slot for the volume (MD_XXX) in the metadata LV when it's on block storage. But sometimes chains have 50-70 images. On file-based SDs it's a bit easier, as one can gzip all the .meta files; on block storage it's not that simple, as we also have to parse lvs output.
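
To illustrate what we do today (a rough Python sketch, not vdsm code, and the helper name is made up): on file storage the volume UUID has to be recovered from the .meta file name, because the metadata body itself has no such key.

    import os

    def read_meta(path):
        """Parse a KEY=VALUE .meta file into a dict (sketch only)."""
        md = {}
        with open(path) as f:
            for line in f:
                line = line.strip()
                if not line or line == "EOF" or "=" not in line:
                    continue
                key, value = line.split("=", 1)
                md[key] = value
        # The volume UUID exists only in the file name, e.g. <volUUID>.meta;
        # this RFE asks for it to be in the body as well.
        md["volUUID"] = os.path.basename(path)[:-len(".meta")]
        return md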

If we had the volUUID replicated in the metadata as well, it would make our life easier to troubleshoot related issues and a single set of scripts would work on both file and block based storage. Also, this great tool provided to GSS via RFE BZ 1188263 could be easily adapted to run on the metadata LV dd'd out of the SD, with no need to look for LVM tags for each volume. Currently it only runs "online".

This also looks like a very simple change in vdsm. I'm just not sure whether it would break compatibility with other versions, but as far as I can see it won't.

Comment 4 Nir Soffer 2016-09-04 15:56:58 UTC
(In reply to Germano Veit Michel from comment #0)
> If we had the volUUID replicated in the metadata as well, it would make our
> life easier to troubleshoot related issues and a single set of scripts would
> work on both file and block based storage. 

We cannot do this with the current storage format, since we have only about 20 bytes
left in the metadata block; the rest of the block is now used for the volume
description and alias.

To do this we have to split the user information (alias and description) from the
system information (format, parent uuid, etc.).

On block storage, we are using a single block (512 bytes) for storing volume
metadata. Since 3.4 we have been storing user info such as the disk alias and
description in JSON format. We allocated 210 bytes for the user info, so there
is no space left for a new uuid. 210 bytes is also not enough to hold the data
engine accepts, so we truncate the description and the alias if they are too long.

See https://github.com/oVirt/vdsm/blob/master/lib/vdsm/storage/constants.py#L129
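
As a rough illustration only (the field widths below are assumed for the example, they are not the vdsm constants), the arithmetic looks something like this:

    BLOCK_SIZE = 512          # one metadata slot on block storage
    DESCRIPTION_SIZE = 210    # bytes reserved for the JSON alias/description

    # Assumed system keys with typical value widths (illustration only).
    system_fields = {
        "DOMAIN": 36, "IMAGE": 36, "PUUID": 36, "POOL_UUID": 36,
        "VOLTYPE": 8, "FORMAT": 3, "DISKTYPE": 1, "LEGALITY": 7,
        "CTIME": 10, "MTIME": 1, "TYPE": 6, "SIZE": 10,
    }
    # KEY=VALUE plus a newline per field, plus a trailing EOF marker.
    used = sum(len(k) + 1 + w + 1 for k, w in system_fields.items()) + len("EOF\n")
    print(BLOCK_SIZE - DESCRIPTION_SIZE - used)   # only a handful of bytes free
    print(len("VUUID=") + 36 + 1)                 # ~43 bytes needed for a new uuid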

> Also, this great tool provided to
> GSS via RFE BZ 1188263 could be easily adapted to run on the metadata LV
> dd'd out of the SD, with no need to look for LVM tags for each volume.
> Currently it only runs "online".

We can adapt this tool to work offline without this change. Please open an RFE
and explain the use case if you need this.

Regarding command line tools working with vdsm storage - we will not support
them unless they are part of vdsm source. These scripts should be written in
Python and use vdsm storage modules instead of duplicating the logic in another
language.

Comment 5 Germano Veit Michel 2016-09-04 23:35:08 UTC
(In reply to Allon Mureinik from comment #3)
> I don't like the concept of having information held in two places (read: the
> name of the .meta file and its contents), but this request does seem to have
> its merits. Let's see what the storage VDSM maintainers have to say about
> this.

Hi Allon,

Indeed, I don't like this duplication either. But thinking about it, we already have many parts of this metadata replicated in several places. For example, on block storage we have the parent volume id in the qcow2 header, the LVM tags, the metadata LV and the DB. Sure, vdsm does not see the DB directly, but it does see the other three.
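
For example, the qcow2 copy of the parent can be read straight from the image header (sketch only, offsets per the qcow2 header layout; this is not how vdsm does it):

    import struct

    def qcow2_backing_file(path):
        """Return the backing file name stored in a qcow2 header, or None."""
        with open(path, "rb") as f:
            header = f.read(104)
            magic, version = struct.unpack(">4sI", header[:8])
            assert magic == b"QFI\xfb", "not a qcow2 image"
            backing_offset, backing_size = struct.unpack(">QI", header[8:20])
            if backing_offset == 0:
                return None               # no parent volume
            f.seek(backing_offset)
            return f.read(backing_size).decode()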

(In reply to Nir Soffer from comment #4)
> We cannot do this with the current storage format, since we have only about
> 20 bytes left in the metadata block; the rest of the block is now used for
> the volume description and alias.

This is what I was afraid of: it would not fit and would require bigger changes, probably not worth it.

> We can adapt this tool to work offline without this change. Please open an
> RFE and explain the use case if you need this.

This RFE would make life a bit easier for us in support, but I'm not sure it's worth investing too many hours in it.

But how would that tool work offline just with the metadata LV? There is no way to guess the volume ids.

> Regarding command line tools working with vdsm storage - we will not support
> them unless they are part of vdsm source. These scripts should be written in
> Python and use vdsm storage modules instead of duplicating the logic in
> another language.

The scripts we use are mostly a bunch of cat, dd, awk, cut and friends, tailored case by case to look into whatever the customer is facing, so they are not really generic enough to build into vdsm. We all use different scripts, but I know Gordon (gfw) has a git repo for his. I often use awk to parse qcow2, SD metadata, LVM metadata and DB info, then throw everything into a Calc spreadsheet to walk all the chains and compare them. And every case is a bit different.

Having the volUUID inside the metadata would simplify a few things in our workflow, especially the differences between file and block SDs, but overall the whole thing would still be a tedious task. Bigger chains of 40-50+ images might become the new standard, as some customers seem to be moving to automated backup solutions based on snapshots and the API.
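
For what it's worth, the parsing side of a dumped metadata LV is the easy part (sketch only, assuming the dump starts at the first metadata slot and each slot is 512 bytes); the tedious part is matching each slot back to a volume via the MD_<n> LVM tags, which is exactly what an in-band volUUID would remove:

    SLOT_SIZE = 512

    def iter_slots(dump_path):
        """Yield (slot_number, metadata_dict) for each used slot in a dump."""
        with open(dump_path, "rb") as f:
            slot = 0
            while True:
                block = f.read(SLOT_SIZE)
                if not block:
                    break
                text = block.split(b"EOF")[0].decode(errors="replace")
                md = dict(line.split("=", 1)
                          for line in text.splitlines() if "=" in line)
                if md:
                    yield slot, md    # still no way to tell which volume this is
                slot += 1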

In conclusion, if this is not an easy change then I believe we can drop it, as it would not dramatically make things easier and I am sure there are more important things to be done (for example BZ 1372163). Perhaps it can be reassessed when we eventually move to 4k blocks.

Comment 7 Nir Soffer 2017-09-12 17:46:07 UTC
This info is duplicated in the metadata files for every volume:

    DOMAIN=419ffff2-d21d-48a5-bfe1-d90bbf9dc947

This does not make sense, since we can get the domain uuid from the path of the
metadata file (NFS):

    /rhev/data-center/pool-id/domain-uuid/images/image-uuid/volume-uuid.meta

or from the path of the metadata LV (block):

    /dev/domain-uuid/metadata

Both engine and vdsm, or a user using vdsm-client, know the storage domain uuid,
since there is no way to get volume metadata without the storage domain id.
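
For example (a sketch of the point above, not existing vdsm code):

    import os

    def domain_uuid_from_meta_path(path):
        """Derive the storage domain uuid from a metadata path (sketch)."""
        if path.startswith("/dev/"):
            # block: /dev/<domain-uuid>/metadata
            return path.split("/")[2]
        # file: .../<domain-uuid>/images/<image-uuid>/<volume-uuid>.meta
        return os.path.normpath(path).split(os.sep)[-4]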

Removing the useless domain uuid makes room for the volume uuid:

    VUUID=5886406d-b48f-4978-b39f-20a25b5135fb

I have not checked yet whether vdsm uses the domain uuid in the metadata; if old
versions break when this uuid is missing, we cannot remove it.

Comment 8 Tal Nisan 2018-06-05 12:55:19 UTC
For now this seems like a risky change (breaking backwards compatibility) that would also take a lot of effort; closing as WONTFIX.

Comment 9 Franta Kust 2019-05-16 13:07:21 UTC
BZ<2>Jira Resync

