Bug 1635614 - Vdsm-tool config-lvm-filter uses /dev/sdXXX which may change after boot
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: vdsm
Classification: oVirt
Component: Tools
Version: 4.20.31
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ovirt-4.4.3
Target Release: 4.40.24
Assignee: Amit Bawer
QA Contact: Evelina Shames
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-10-03 11:43 UTC by Roman Hodain
Modified: 2020-11-11 06:45 UTC (History)

Fixed In Version: vdsm-4.40.24
Clone Of:
Environment:
Last Closed: 2020-10-25 08:01:16 UTC
oVirt Team: Storage
Embargoed:
sbonazzo: ovirt-4.4?
pm-rhel: planning_ack+
pm-rhel: devel_ack+
rhodain: testing_ack?


Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 110522 0 master MERGED lvmfilter: Use /dev/disk/by-id/lvm-pv-uuid devlinks for pv naming 2021-02-10 16:57:51 UTC

Description Roman Hodain 2018-10-03 11:43:43 UTC
Description of problem:
Since vdsm-4.20.39-1 the local disks are blacklisted from multipath due to
    
    Bug 1622700 - [downstream clone - 4.2.6] [RFE][Dalton] - Blacklist all local disk in multipath on RHEL / RHEV Host (RHEL 7.5)

This means that config-lvm-filter detects the local disks as /dev/sdXX instead of as multipath devices. Because /dev/sdXX names are not guaranteed to be stable, the name can change. Especially if the hosts an FC domain or pass through LUNs connected.

Version-Release number of selected component (if applicable):
vdsm-4.20.39-1

How reproducible:
Depends on the configuration. The more LUN devices are connected, the more likely the device name is to change.

Steps to Reproduce:
1. Check the local device name 
2. Reboot the system
3. Check the device name again
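
The check in steps 1 and 3 can be scripted. The following is a minimal sketch, assuming the stable links live under /dev/disk/by-id (the usual udev location). The helper names snapshot_links and changed_names are hypothetical and are not part of vdsm.

```python
import os


def snapshot_links(link_dir="/dev/disk/by-id"):
    """Map each stable link name in link_dir to the kernel device it points to."""
    snapshot = {}
    for name in sorted(os.listdir(link_dir)):
        path = os.path.join(link_dir, name)
        if os.path.islink(path):
            # realpath follows the symlink, e.g. ../../sda2 -> /dev/sda2
            snapshot[name] = os.path.realpath(path)
    return snapshot


def changed_names(before, after):
    """Return {link: (old_device, new_device)} for links whose target moved."""
    return {
        name: (before[name], after[name])
        for name in before
        if name in after and before[name] != after[name]
    }
```

Save the result of snapshot_links() before the reboot (for example as JSON), take a second snapshot afterwards, and pass both to changed_names(). Any entry in the result is a device whose /dev/sdXX name moved while its stable link stayed the same.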

Actual results:
The name may change, and the LVM filter no longer applies.

Expected results:
Use the SCSI ID or another static identifier instead.

Additional info:

Comment 1 Nir Soffer 2018-10-03 12:09:05 UTC
(In reply to Roman Hodain from comment #0)
> Description of problem:
> Since vdsm-4.20.39-1 the local disks are blacklisted from multipath due to
>     
>     Bug 1622700 - [downstream clone - 4.2.6] [RFE][Dalton] - Blacklist all
> local disk in multipath on RHEL / RHEV Host (RHEL 7.5)

The fix was reverted in 4.2.6.1 (async). Can you test again with the latest version?
 
> This means that the config-lvm-filter detects the local disks as /dev/sdXX.
> instead of a multipath device.

This is expected; local disks should never use multipath.

> As the /dev/sdXX names are not guaranteed the
> name can change. Especially if the hosts an FC domain or pass through LUNs
> connected.

Can you explain what is:
"Especially if the hosts an FC domain or pass through LUNs connected"
 
> Expected results:
> Use instead the SCSI ID or any other static identifier.

The LVM filter can work only with concrete device names. We can use /dev/disk/by-*
links, but people from the LVM team suggested we avoid these.

David, do we have a reliable way to use consistent device names in lvm filter?

Comment 2 Roman Hodain 2018-10-03 12:39:14 UTC
> Especially if the hosts an FC domain or pass through LUNs connected
If the hosts have an FC storage domain, or FC LUNs for pass-through disks, connected, the number of SCSI devices is higher, and so is the chance that a SCSI device name is taken by another device during boot.

Comment 3 David Teigland 2018-10-03 15:59:23 UTC
It's difficult to blacklist devices in the filter because udev has a habit of inventing new symlink paths to devices that you don't know about or expect (e.g. disk/by-*).  It's whack-a-mole trying to blacklist all of them, which is why whitelisting is more reliable.

I don't have any great suggestion for dealing with changing sd names for the same device.  You can try whitelisting disk/by paths, which should work, but I haven't seen this widely used or tested.

Comment 4 Nir Soffer 2018-10-03 16:16:25 UTC
Thanks David! We already use a whitelist approach.

Here is an example configuration run, to make sure we are on the same page:

# vdsm-tool config-lvm-filter
Analyzing host...
Found these mounted logical volumes on this host:

  logical volume:  /dev/mapper/fedora_voodoo1-root
  mountpoint:      /
  devices:         /dev/vda2

  logical volume:  /dev/mapper/fedora_voodoo1-swap
  mountpoint:      [SWAP]
  devices:         /dev/vda2

This is the recommended LVM filter for this host:

  filter = [ "a|^/dev/vda2$|", "r|.*|" ]

This filter allows LVM to access the local devices used by the
hypervisor, but not shared storage owned by Vdsm. If you add a new
device to the volume group, you will need to edit the filter manually.

Configure LVM filter? [yes,NO] yes
Configuration completed successfully!

Please reboot to verify the LVM configuration.
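
The recommended filter above follows a fixed pattern: one anchored accept rule per device, followed by a reject-all rule. Here is a minimal sketch of that formatting. It is an illustration only, not the vdsm implementation; the function name build_filter is hypothetical, and it assumes the device paths contain no regex metacharacters.

```python
def build_filter(devices):
    """Format an LVM whitelist filter accepting only the given device paths."""
    # Anchor each path with ^...$ so /dev/vda2 does not also match /dev/vda20.
    # Assumption: the paths contain no regex metacharacters.
    accepted = ['"a|^%s$|"' % dev for dev in sorted(set(devices))]
    # The final reject-all rule is what makes this a whitelist.
    items = accepted + ['"r|.*|"']
    return "filter = [ %s ]" % ", ".join(items)


print(build_filter(["/dev/vda2"]))
```

For the host above this prints the same line that vdsm-tool recommended. Because the accepted path is a kernel name such as /dev/vda2 or /dev/sdXX, the filter stops matching if that name changes after a reboot.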


Looking in /dev/disk/by-*, we have several ways to refer to /dev/vda2:

# ls -lh /dev/disk/by-*
/dev/disk/by-id:
...
lrwxrwxrwx. 1 root root 10 Oct  2 22:58 lvm-pv-uuid-RI8Wfl-wNyT-dzWR-dup5-63yG-KCOO-fIsN1B -> ../../vda2

/dev/disk/by-partuuid:
...
lrwxrwxrwx. 1 root root 10 Oct  2 22:58 85bf8591-02 -> ../../vda2

/dev/disk/by-path:
...
lrwxrwxrwx. 1 root root 10 Oct  2 22:58 pci-0000:03:00.0-part2 -> ../../vda2
lrwxrwxrwx. 1 root root 10 Oct  2 22:58 virtio-pci-0000:03:00.0-part2 -> ../../vda2


# vgs -o pv_name,pv_uuid
  PV         PV UUID                               
  /dev/vda2  RI8Wfl-wNyT-dzWR-dup5-63yG-KCOO-fIsN1B


# lsblk -o NAME,UUID
NAME                    UUID
sr0                     
vda                     
├─vda1                  29919819-1832-4b82-912f-8e231d0401bb
└─vda2                  RI8Wfl-wNyT-dzWR-dup5-63yG-KCOO-fIsN1B
  ├─fedora_voodoo1-root abfa8901-489a-4b4a-ad59-4f129a377637
  └─fedora_voodoo1-swap 34e5b47b-17a5-48f0-8206-b65b31b4502f


/dev/disk/by-id/lvm-pv-uuid-RI8Wfl-wNyT-dzWR-dup5-63yG-KCOO-fIsN1B looks like the
best option.

Can we use it in the LVM filter? Will it be available early enough during boot?
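
To make the idea concrete, here is a minimal sketch of a filter built from PV UUIDs instead of kernel device names. It assumes udev creates the /dev/disk/by-id/lvm-pv-uuid-<uuid> links shown in the listing above. The function names are hypothetical, and this is not the vdsm code.

```python
PV_UUID_LINK_PREFIX = "/dev/disk/by-id/lvm-pv-uuid-"


def pv_devlink(pv_uuid):
    """Return the udev devlink for a PV UUID as reported by `vgs -o pv_uuid`."""
    return PV_UUID_LINK_PREFIX + pv_uuid


def filter_from_pv_uuids(pv_uuids):
    """Format a whitelist filter that accepts PVs by their stable devlinks."""
    # PV UUIDs contain only letters, digits and hyphens, so no regex
    # escaping is needed inside the anchored accept rules.
    accepted = ['"a|^%s$|"' % pv_devlink(u) for u in sorted(set(pv_uuids))]
    return "filter = [ %s ]" % ", ".join(accepted + ['"r|.*|"'])


print(filter_from_pv_uuids(["RI8Wfl-wNyT-dzWR-dup5-63yG-KCOO-fIsN1B"]))
```

The resulting filter no longer mentions /dev/vda2, so it keeps matching the same physical volume even if the kernel assigns the disk a different name on the next boot.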

Comment 5 David Teigland 2018-10-03 17:03:27 UTC
I'm looking into whether there are some disk/by links that have been used in the past for solving this problem.

In general I don't take the disk/by links very seriously, and I'd treat them as optional/informational; useful in one-off situations.  They depend on udev which I don't have a great deal of faith in.

udev ultimately has to get the PV uuids from lvm in order to create the disk/by-id/lvm-pv-uuid links.  So, making lvm depend on those links doesn't seem like a good idea.

Also, in the case where a disk is copied, it results in two disks with the same pv uuid, in which case the disk/by-id behavior is undefined.
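
The copied-disk case can at least be detected before relying on a lvm-pv-uuid link. Here is a minimal sketch, assuming the (device, PV UUID) pairs have already been collected by some other means, for example by scanning the devices directly. The function name duplicate_pv_uuids is hypothetical.

```python
from collections import defaultdict


def duplicate_pv_uuids(pvs):
    """Return {pv_uuid: [devices]} for every UUID found on more than one device.

    pvs is an iterable of (device, pv_uuid) pairs.
    """
    by_uuid = defaultdict(set)
    for device, pv_uuid in pvs:
        by_uuid[pv_uuid].add(device)
    return {
        pv_uuid: sorted(devices)
        for pv_uuid, devices in by_uuid.items()
        if len(devices) > 1
    }
```

An empty result means every PV UUID maps to exactly one device. A non-empty result lists the UUIDs for which a by-id link would be ambiguous.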

Comment 6 Roman Hodain 2018-11-13 09:37:01 UTC
(In reply to David Teigland from comment #5)
> I'm looking into whether there are some disk/by links that have been used in
> the past for solving this problem.
> 
> In general I don't take the disk/by links very seriously, and I'd treat them
> as optional/informational; useful in one-off situations.  They depend on
> udev which I don't have a great deal of faith in.
> 
> udev ultimately has to get the PV uuids from lvm in order to create the
> disk/by-id/lvm-pv-uuid links.  So, making lvm depend on those links doesn't
> seem like a good idea.
> 
> Also, in the case where a disk is copied, it results in two disks with the
> same pv uuid, in which case the disk/by-id behavior is undefined.

But we do not have to use the LVM UUID. We could instead use the SCSI ID, the WWN, or any other identifier that is not an LVM ID. That should work, right?

Comment 7 David Teigland 2018-11-13 14:07:12 UTC
> But we do not have to use the lvm uuid. We should rather use scsi id or wwn
> or whatever that is not lvm ID. That should work, right?

Yes, that should work.

I also learned that the lvm-pv-uuid links are created without using lvm itself; there's some code in udev or blkid that can read and interpret the lvm metadata on the disk.

Comment 8 Sandro Bonazzola 2019-01-28 09:41:58 UTC
This bug has not been marked as blocker for oVirt 4.3.0.
Since we are releasing it tomorrow, January 29th, this bug has been re-targeted to 4.3.1.

Comment 10 Nir Soffer 2020-06-17 12:39:27 UTC
David, according to comment 5 we should not use /dev/disk/by-id, but
comment 7 gives me some hope that this can work.

Should we switch the lvm filter to use lvm-pv-uuid-*?

Comment 11 David Teigland 2020-06-17 13:56:13 UTC
You could certainly try using any of the /dev/disk links in the filter, including lvm-pv-uuid.  I don't know of any problems doing that, but I've not seen it done before either.  You will begin depending on udev (to process devs and set up those links).
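
Since a filter built on these links depends on udev having created them, it may help to check that each link resolves before writing it into the filter. Here is a minimal sketch; the function name unresolved_devlinks is hypothetical, and it only reports the state at the time of the call, not what will be available early in boot.

```python
import os
import stat


def unresolved_devlinks(devlinks):
    """Return the devlinks that do not currently resolve to a block device."""
    unresolved = []
    for link in devlinks:
        try:
            # os.stat follows symlinks, so this inspects the link target.
            mode = os.stat(link).st_mode
        except OSError:
            # Missing link or dangling symlink.
            unresolved.append(link)
            continue
        if not stat.S_ISBLK(mode):
            unresolved.append(link)
    return unresolved
```

An empty list means udev has already set up every link that the filter would reference.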

Comment 13 Nir Soffer 2020-10-19 11:35:13 UTC
We don't have example sdk scripts using the new functionality, but we have
examples in imageio source here:

- https://github.com/oVirt/ovirt-imageio/blob/master/examples/libvirt-stream
- https://github.com/oVirt/ovirt-imageio/blob/master/examples/sparse-stream

The examples contain documentation showing how to run them with a standalone
imageio server. You should be able to run them to verify the new functionality.

You can also run these examples using RHV, by starting a transfer using the
SDK image_transfer.py example, and running the examples with the transfer URL.

Comment 14 Nir Soffer 2020-10-19 11:38:47 UTC
Another way to test the new functionality is to modify code in upload/download
disk examples to use ImageioClient instead of client.upload() and client.download().

You can do this based on the examples mentioned in comment 13, showing how to use
the new API.

Comment 16 Sandro Bonazzola 2020-11-11 06:45:33 UTC
This bugzilla is included in oVirt 4.4.3 release, published on November 10th 2020.

Since the problem described in this bug report should be resolved in oVirt 4.4.3 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.

