Bug 1959997 - The device id changes in /dev/disk/by-id/ path after VM reboot
Summary: The device id changes in /dev/disk/by-id/ path after VM reboot
Keywords:
Status: CLOSED CANTFIX
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: unspecified
Hardware: x86_64
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: ---
Assignee: Eyal Shenitzky
QA Contact: meital avital
URL:
Whiteboard:
Depends On:
Blocks: 1954916
Reported: 2021-05-12 19:01 UTC by kelwhite
Modified: 2024-10-01 18:11 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-06-17 10:24:13 UTC
oVirt Team: Storage
Target Upstream Version:
Embargoed:


Links
System                            | ID      | Private | Priority | Status | Summary | Last Updated
Red Hat Knowledge Base (Solution) | 6076251 | 0       | None     | None   | None    | 2021-05-26 00:58:17 UTC

Comment 5 Germano Veit Michel 2021-05-13 04:40:09 UTC
So if I understand correctly, the problem on OCP was caused by a change of disk id.

From: scsi-0QEMU_QEMU_HARDDISK_d844e3a1-4fb0-4411-9
  to: scsi-0QEMU_QEMU_HARDDISK_d844e3a1-4fb0-4411-944b-7789f14ab435
  
I recall seeing these shorter serial numbers on older RHV...

And just reproduced it now, see below:

RHV 4.4, CL 4.3 - qemu-kvm-5.1.0-20.module+el8.3.1+9918+230f5c26.x86_64
-----------------------------------------------------------------------
    <type arch='x86_64' machine='pc-i440fx-rhel7.6.0'>hvm</type>
    ...
   <disk type='block' device='disk' snapshot='no'>
      <driver name='qemu' type='raw' cache='none' error_policy='stop' io='native'/>
      <source dev='/rhev/data-center/mnt/blockSD/5f01590d-b3e7-499b-a799-72f3589783e9/images/b78c4cb1-fa98-410e-9542-42c60bd28e02/3350aa21-d12c-4966-a3e6-cfac5b4245dd' index='1'>
        <seclabel model='dac' relabel='no'/>
      </source>
      <backingStore/>
      <target dev='sda' bus='scsi'/>
      <serial>b78c4cb1-fa98-410e-9542-42c60bd28e02</serial>   <--------
      <boot order='2'/>
      <alias name='ua-b78c4cb1-fa98-410e-9542-42c60bd28e02'/>
      <address type='drive' controller='0' bus='0' target='0' unit='0'/>
    </disk>
    
# udevadm info /dev/sda
E: ID_SERIAL=0QEMU_QEMU_HARDDISK_b78c4cb1-fa98-410e-9542-42c60bd28e02   <------ full serial

Link, full serial:
lrwxrwxrwx. 1 root root 9 May 12 23:50 /dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_b78c4cb1-fa98-410e-9542-42c60bd28e02 -> ../../sda

RHV 4.4, CL 4.3 - qemu-kvm-rhev-2.12.0-48.el7_9.2.x86_64
-----------------------------------------------
    <type arch='x86_64' machine='pc-i440fx-rhel7.6.0'>hvm</type>
    ...
    <disk type='block' device='disk' snapshot='no'>
      <driver name='qemu' type='raw' cache='none' error_policy='stop' io='native'/>
      <source dev='/rhev/data-center/mnt/blockSD/5f01590d-b3e7-499b-a799-72f3589783e9/images/b78c4cb1-fa98-410e-9542-42c60bd28e02/3350aa21-d12c-4966-a3e6-cfac5b4245dd'>
        <seclabel model='dac' relabel='no'/>
      </source>
      <backingStore/>
      <target dev='sda' bus='scsi'/>
      <serial>b78c4cb1-fa98-410e-9542-42c60bd28e02</serial>
      <boot order='1'/>
      <alias name='ua-b78c4cb1-fa98-410e-9542-42c60bd28e02'/>
      <address type='drive' controller='0' bus='0' target='0' unit='0'/>
    </disk>
    
# udevadm info /dev/sda | grep SERIAL
E: ID_SERIAL=0QEMU_QEMU_HARDDISK_b78c4cb1-fa98-410e-9   <------ truncated serial

Link, truncated serial:
lrwxrwxrwx. 1 root root 9 May 13 00:30 /dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_b78c4cb1-fa98-410e-9 -> ../../sda

So the XML given to libvirt is the same, and qemu-kvm command line contains the full serial.

-drive file=/rhev/data-center/mnt/blockSD/5f01590d-b3e7-499b-a799-72f3589783e9/images/b78c4cb1-fa98-410e-9542-42c60bd28e02/3350aa21-d12c-4966-a3e6-cfac5b4245dd,format=raw,if=none,id=drive-ua-b78c4cb1-fa98-410e-9542-42c60bd28e02,serial=b78c4cb1-fa98-410e-9542-42c60bd28e02,werror=stop,rerror=stop,cache=none,aio=native 
-device scsi-hd,bus=ua-9ca4a624-8c69-4860-bfac-2abea5c72381.0,channel=0,scsi-id=0,lun=0,drive=drive-ua-b78c4cb1-fa98-410e-9542-42c60bd28e02,id=ua-b78c4cb1-fa98-410e-9542-42c60bd28e02,bootindex=1,write-cache=o

I assume this is a difference between the old (el7, 2.12) and the newer (el8, 5.1) qemu-kvm: the older version does not pass the full serial to the guest, perhaps because it does not fit in some field.

So I don't think this is a RHV bug. And maybe not even a qemu-kvm/libvirt bug, as the behaviour of the new version seems more correct than the old (full serial).

Comment 7 Germano Veit Michel 2021-05-13 05:50:39 UTC
If this is production down, here is a *very* ugly hack if you want to truncate the disk serial number on newer RHV, so that it looks like the truncated serial on RHV 4.3/RHEL7 hypervisors.

Place this file on the RHV hypervisor, as executable:

# cat /usr/libexec/vdsm/hooks/before_vm_start/99_truncate_serial 
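#!/usr/bin/python3

import hooking


def main():
    domxml = hooking.read_domxml()
    for disk in domxml.getElementsByTagName('disk'):
        # Only touch regular disks, not cdrom/lun devices.
        if disk.getAttribute('device') != 'disk':
            continue
        # Disks without a <serial> element are left unchanged.
        for serial in disk.getElementsByTagName('serial'):
            if serial.firstChild is None:
                continue
            # Keep the first 20 characters, as seen on RHEL 7 hypervisors.
            serial.firstChild.data = serial.firstChild.data[:20]
    hooking.write_domxml(domxml)


if __name__ == "__main__":
    main()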

From now on, any VM started on this host will have its disk serial truncated to the first 20 characters of the UUID, so that it looks the same as on RHV 4.3/RHEL 7.

# ll /dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_b78c4cb1-fa98-410e-9
lrwxrwxrwx. 1 root root 9 May 13 01:43 /dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_b78c4cb1-fa98-410e-9 -> ../../sda

Note that it will affect all VMs started on the host, so use it with care. It could be improved to run only on selected VMs by using a custom property.
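
A minimal sketch of the per-VM variant mentioned above, assuming VDSM exposes VM custom properties to hook scripts as environment variables. The property name `truncate_serial` is hypothetical and would first have to be defined as a custom property on the engine. The XML handling is split out from the `hooking` calls so it can be tried outside a hypervisor.

```python
import os
from xml.dom import minidom

# Hypothetical custom property name; it is not defined by RHV or VDSM.
PROPERTY = "truncate_serial"
LIMIT = 20  # length of the truncated serial seen on RHEL 7 hosts


def truncate_serials(domxml, limit=LIMIT):
    """Shorten the <serial> of every regular disk in a parsed domain XML."""
    for disk in domxml.getElementsByTagName("disk"):
        if disk.getAttribute("device") != "disk":
            continue
        for serial in disk.getElementsByTagName("serial"):
            if serial.firstChild is not None:
                serial.firstChild.data = serial.firstChild.data[:limit]
    return domxml


def should_run(environ=os.environ):
    """True only when the VM carries the opt-in custom property."""
    return environ.get(PROPERTY, "").lower() == "true"


def main():
    if not should_run():
        return
    import hooking  # only importable on a VDSM host
    domxml = hooking.read_domxml()
    hooking.write_domxml(truncate_serials(domxml))


if __name__ == "__main__":
    main()
```

With this shape, VMs that do not set the property keep the full serial, which limits the blast radius of the hack to the VMs that actually depend on the old link name.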

Comment 11 Michal Skrivanek 2021-05-13 13:01:39 UTC
(In reply to Germano Veit Michel from comment #7)
> If this is production down, here is a *very* ugly hack if you want to
> truncate the disk serial number on newer RHV, so that it looks like the
> truncated serial on RHV 4.3/RHEL7 hypervisors.

Another possible workaround (I haven't tried it though) could be switching to virtio-block. It is likely that this issue only affects virtio-scsi disks (based on input from pkrempa and commit https://gitlab.com/libvirt/libvirt/-/commit/a1dce96236f6d35167924fa7e6a70f58f394b23c).

Comment 13 Germano Veit Michel 2021-05-13 23:49:36 UTC
(In reply to Michal Skrivanek from comment #11)
> (In reply to Germano Veit Michel from comment #7)
> > If this is production down, here is a *very* ugly hack if you want to
> > truncate the disk serial number on newer RHV, so that it looks like the
> > truncated serial on RHV 4.3/RHEL7 hypervisors.
> 
> another possible workaround (i haven't tried though) could be switching to
> virtio-block. It is likely that this issue is happening only to virtio-scsi
> disks. (based on input from pkrempa and commit
> https://gitlab.com/libvirt/libvirt/-/commit/
> a1dce96236f6d35167924fa7e6a70f58f394b23c)

Well, it does truncate the serial, but it changes the id anyway due to the interface change:

lrwxrwxrwx. 1 root root 9 May 13  01:43 /dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_b78c4cb1-fa98-410e-9 -> ../../sda
VS
lrwxrwxrwx. 1 root root  9 May 13 19:46 /dev/disk/by-id/virtio-b78c4cb1-fa98-410e-9 -> ../../vda

So I don't think this will help much...

Comment 14 Michal Skrivanek 2021-05-17 09:11:46 UTC
(In reply to Germano Veit Michel from comment #13)
> (In reply to Michal Skrivanek from comment #11)
> > (In reply to Germano Veit Michel from comment #7)
> > > If this is production down, here is a *very* ugly hack if you want to
> > > truncate the disk serial number on newer RHV, so that it looks like the
> > > truncated serial on RHV 4.3/RHEL7 hypervisors.
> > 
> > another possible workaround (i haven't tried though) could be switching to
> > virtio-block. It is likely that this issue is happening only to virtio-scsi
> > disks. (based on input from pkrempa and commit
> > https://gitlab.com/libvirt/libvirt/-/commit/
> > a1dce96236f6d35167924fa7e6a70f58f394b23c)
> 
> Well, it does truncate the serial, but it changes the id anyway due to the
> interface change:
> 
> lrwxrwxrwx. 1 root root 9 May 13  01:43
> /dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_b78c4cb1-fa98-410e-9 -> ../../sda
> VS
> lrwxrwxrwx. 1 root root  9 May 13 19:46
> /dev/disk/by-id/virtio-b78c4cb1-fa98-410e-9 -> ../../vda
> 
> So I don't think this will help much...

Sure, it's a completely different interface from the guest's point of view, but it should stay consistent between 4.3 and 4.4 hosts.

Comment 15 Michal Skrivanek 2021-05-17 11:15:37 UTC
This seems to be a libvirt/qemu change in behavior between RHEL 7 and RHEL 8 (possibly related to commit https://gitlab.com/libvirt/libvirt/-/commit/a1dce96236f6d35167924fa7e6a70f58f394b23c).
Moving to libvirt.

Comment 19 Hemant Kumar 2021-05-17 15:23:07 UTC
> Also, I have to reiterate my concern about using /dev/disk/by-* symlinks. Mission critical disk enumeration should be done by reaching to udev directly, e.g. via libudev or its bindings. That's the authoritative source of storage configuration, is able to handle duplicate identifiers and is a race-free access. The /dev/disk/by-* symlinks are being randomly overwritten in case a duplicate identifier (e.g. multipath) appear in the system.

@Tomas - I am the maintainer of local-storage-operator (LSO) in OCP and have some follow-up questions. There are two ways LSO can be configured: in one mode the user specifies individual devices (via /dev/sda or /dev/disk/by-id/wwwwwww), in the other they ask LSO to "claim" all available devices. I would like us to fix the device discovery process using libudev or something similar for the second option (because the user is not specifying devices anyway). Does calling udevadm via the following command:

 udevadm info --query=path --name=/dev/sda

work the same as using libudev? Do you have any more pointers?

For the first option, where the user/admin manually specifies individual device IDs via a LocalVolume object:

apiVersion: "local.storage.openshift.io/v1"
kind: "LocalVolume"
metadata:
  name: "example"
spec:
  storageClassDevices:
    - storageClassName: "foobar"
      volumeMode: Filesystem
      fsType: ext4
      devicePaths:
        - /dev/disk/by-id/wwwwwwww

This is a bit more tricky. I am not sure it is "fixable" unless LSO decides not to use the device that the user specified.

Comment 20 Han Han 2021-05-19 08:23:46 UTC
Reproduced:

Start a VM with a scsi disk like this:
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2'/>
      <source file='/var/lib/libvirt/images/jgao.qcow2'/>
      <backingStore/>
      <target dev='sda' bus='scsi'/>
      <serial>b78c4cb1-fa98-410e-9542-42c60bd28e02</serial>
      <alias name='ua-b78c4cb1-fa98-410e-9542-42c60bd28e02'/>
      <address type='drive' controller='0' bus='0' target='0' unit='0'/>
    </disk>


Then check the ID_SERIAL|ID_SCSI_SERIAL:

For VM on libvirt-4.5.0-36.el7_9.3.x86_64 qemu-kvm-rhev-2.12.0-48.el7_9.2.x86_64:
[root@fedora ~]# udevadm info --name sda|grep -E '(ID_SERIAL|ID_SCSI_SERIAL)'
E: ID_SERIAL=0QEMU_QEMU_HARDDISK_b78c4cb1-fa98-410e-9
E: ID_SERIAL_SHORT=b78c4cb1-fa98-410e-9
E: ID_SCSI_SERIAL=b78c4cb1-fa98-410e-9542-42c60bd28e02

For VM on libvirt-7.3.0-1.el8.x86_64 qemu-kvm-5.2.0-12.module+el8.4.0+10354+98272afe.x86_64:
E: ID_SERIAL=0QEMU_QEMU_HARDDISK_b78c4cb1-fa98-410e-9542-42c60bd28e02
E: ID_SERIAL_SHORT=b78c4cb1-fa98-410e-9542-42c60bd28e02
E: ID_SCSI_SERIAL=b78c4cb1-fa98-410e-9542-42c60bd28e02

The ID_SERIAL values differ between the two hosts. That is why the bug appears after the hypervisor OS upgrade.
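
The comparison above can be scripted. The following is a minimal sketch, assuming the `E: KEY=VALUE` line format shown in the `udevadm info` output in this comment; the function names are illustrative only. It flags a disk whose ID_SERIAL_SHORT is a shortened prefix of ID_SCSI_SERIAL.

```python
def parse_udev_properties(text):
    """Return a dict built from the 'E: KEY=VALUE' lines of udevadm output."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("E: "):
            continue
        key, sep, value = line[3:].partition("=")
        if sep:
            props[key] = value
    return props


def serial_is_truncated(props):
    """True when ID_SERIAL_SHORT is a shorter prefix of ID_SCSI_SERIAL."""
    short = props.get("ID_SERIAL_SHORT", "")
    full = props.get("ID_SCSI_SERIAL", "")
    return bool(short) and full.startswith(short) and len(short) < len(full)
```

Fed with the two outputs above, the el7 guest would be reported as truncated and the el8 guest would not.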

Comment 21 Peter Krempa 2021-06-03 14:57:39 UTC
Hi,

I've investigated the history of how we've ended up with this change. I've summarized it in an upstream mailing list post:

https://listman.redhat.com/archives/libvir-list/2021-June/msg00066.html

A short summary is that qemu silently truncates the serial in 3 different ways, depending on how it's configured and how it's queried. Additionally, the change was made more than 2 years ago, and reverting to silent truncation to fewer characters than we allow now would also be considered a regression, so I don't think we can revert to 20 characters for now.

Looking at the logs/sosreports above, I can see that even the old version, where the serial was truncated in one of the VPD pages, already contains a qemu patch (see the message above where it's mentioned) which raised the limit to 36 (to fit a UUID) on the second query. This can be seen in the following snippet from Comment #2:

  P: /devices/pci0000:00/0000:00:06.0/virtio2/host0/target0:0:0/0:0:0:4/block/sdb                                      |  P: /devices/pci0000:00/0000:00:06.0/virtio2/host0/target0:0:0/0:0:0:4/block/sdb
  N: sdb                                                                                                               |  N: sdb
  S: disk/by-id/scsi-0QEMU_QEMU_HARDDISK_cb66542c-10d4-4578-8                                                          |  S: disk/by-id/scsi-0QEMU_QEMU_HARDDISK_e1baf1ef-e053-45be-9a51-64dd2fe59c60                                         
  S: disk/by-id/scsi-SQEMU_QEMU_HARDDISK_cb66542c-10d4-4578-8955-92265791f1ed                                          |  S: disk/by-id/scsi-SQEMU_QEMU_HARDDISK_e1baf1ef-e053-45be-9a51-64dd2fe59c60                                  

The link starting with '/dev/disk/by-id/scsi-S' already contains the full UUID in both cases, thus if that link is used instead the result will be identical in both deployments.

I've also filed the following 2 qemu bugs to prevent silent truncation and to synchronise the reported serial lengths:

https://bugzilla.redhat.com/show_bug.cgi?id=1967666
https://bugzilla.redhat.com/show_bug.cgi?id=1967668

Another possible stop-gap is to limit the values used in <serial> to 20 characters, which behaves the same in both versions.
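
To illustrate why a serial of at most 20 characters is safe, here is a simplified model, not udev's actual rule set. It assumes the `scsi-0QEMU_QEMU_HARDDISK_` prefix seen in this report and treats the hypervisor as cutting the serial at one of the limits discussed in this bug (20, 36 or 247 characters).

```python
SCSI0_PREFIX = "scsi-0QEMU_QEMU_HARDDISK_"


def scsi0_link(serial, limit):
    """Model the by-id link name when the serial is cut to `limit` chars."""
    return SCSI0_PREFIX + serial[:limit]


def is_stable(serial, limits=(20, 36, 247)):
    """True when every truncation limit yields the same link name."""
    return len({scsi0_link(serial, n) for n in limits}) == 1
```

In this model a full 36-character UUID yields different link names under the 20 and 247 limits, while its first 20 characters yield the same name under all three.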

Comment 22 Peter Krempa 2021-06-07 15:17:27 UTC
After confirming with upstream that reverting to silent truncation of the serial to 20 characters would also be a regression from the upstream point of view, I've added the following notice to the documentation:

commit 2c8b341af83501730f3767c8738e909ef72f6a28
Author: Peter Krempa <pkrempa>
Date:   Fri Jun 4 14:08:40 2021 +0200

    docs: formatdomain: Document disk serial truncation status quo
    
    Disk serials are truncated arbitrarily and silently by qemu depending on
    the device type and how they are configured. Since changing the current
    state would lead to more regressions than we have now, document that the
    truncation is arbitrary.
    
    Signed-off-by: Peter Krempa <pkrempa>
    Reviewed-by: Michal Privoznik <mprivozn>

diff --git a/docs/formatdomain.rst b/docs/formatdomain.rst
index c1ee9fedda..da4d93a787 100644
--- a/docs/formatdomain.rst
+++ b/docs/formatdomain.rst
@@ -3146,6 +3146,16 @@ paravirtualized driver is specified via the ``disk`` element.
    may look like ``<serial>WD-WMAP9A966149</serial>``. Not supported for
    scsi-block devices, that is those using disk ``type`` 'block' using
    ``device`` 'lun' on ``bus`` 'scsi'. :since:`Since 0.7.1`
+
+   Note that depending on hypervisor and device type the serial number may be
+   truncated silently. IDE/SATA devices are commonly limited to 20 characters.
+   SCSI devices depending on hypervisor version are limited to 20, 36 or 247
+   characters.
+
+   Hypervisors may also start rejecting overly long serials instead of
+   truncating them in the future so it's advised to avoid the implicit
+   truncation by testing the desired serial length range with the desired device
+   and hypervisor combination.
 ``wwn``
    If present, this element specifies the WWN (World Wide Name) of a virtual
    hard disk or CD-ROM drive. It must be composed of 16 hexadecimal digits.

Unfortunately from libvirt's point of view the resolution would be CLOSED_CANTFIX.

Returning to RHV to investigate other possible solutions. In case there's none please close as CANTFIX.

Comment 23 Hemant Kumar 2021-06-07 16:00:22 UTC
Peter - can you provide a recommendation to OCP about which symlink to use (in case symlinks aren't user specified) if a device has multiple QEMU/libvirt generated symlinks? I would probably need to ensure that, whenever possible, local-storage-operator uses the longest (recommended?) version.

Comment 24 Peter Krempa 2021-06-08 10:46:10 UTC
The symlinks in /dev/disk/by-id/ are generated by udev based on the disk identification data.

Now based on what I've summarized in the upstream mailing list post the following applies:

There are two identification VPD pages of a SCSI disk:

0x80: - This is present only when a serial number is configured
      - results in symlinks with prefix of /dev/disk/by-id/scsi-S
      - length is limited to 36 chars (fits a full UUID including dashes)
      - the limit of 36 chars is since qemu-2.7
            - most of rhel-av-7 series
            - all of rhel-8 qemus

0x83: - if serial is not configured, it contains the device alias (limited to 247 chars)
      - if serial is configured, it contains the serial (limited to 20 chars, old versions, 247 chars new versions)
      - results in udev symlink with prefix of /dev/disk/by-id/scsi-0
      

Based on the above and the data reported from pre-upgrade and post-upgrade systems, I'd suggest using the symlinks starting with the prefix '/dev/disk/by-id/scsi-S', which in both cases fit a UUID fully.

In the future I hope that qemu will unify the handling of the serial numbers and also reject non-conformant serials rather than silently truncate them.
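
The suggestion to prefer the 'scsi-S' links can be sketched as follows. This is a minimal illustration that assumes the link names shown in this bug; the function names are hypothetical and do not describe how LSO is implemented. It lists the by-id symlinks that resolve to a device and picks the VPD 0x80 based one when present.

```python
import os


def links_for_device(device, by_id_dir="/dev/disk/by-id"):
    """List by-id symlink names that resolve to the given block device."""
    target = os.path.realpath(device)
    names = []
    for name in sorted(os.listdir(by_id_dir)):
        if os.path.realpath(os.path.join(by_id_dir, name)) == target:
            names.append(name)
    return names


def preferred_link(names):
    """Prefer the VPD 0x80 based 'scsi-S' link, else fall back to the first."""
    for name in names:
        if name.startswith("scsi-S"):
            return name
    return names[0] if names else None
```

For example, `preferred_link(links_for_device("/dev/sdb"))` would return the 'scsi-S' name on a guest like the one in the snippet from Comment #2. The fallback to the first name is only a placeholder choice for devices that have no such link.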

Comment 25 Michal Skrivanek 2021-06-10 15:42:52 UTC
Seems we now understand the reasons for this problem and also how to avoid it. Is there anything else to look at in RHV or OCP on RHV or is this now rather a documentation of OCP in general, or something LSO specific?

Comment 26 Tomáš Bžatek 2021-06-16 12:16:00 UTC
(In reply to Hemant Kumar from comment #19)
> @Tomas - I am the maintainer of local-storage-operator(LSO) in OCP and have
> some follow up questions. There are two ways LSO can be configured - in one
> mode user specifies individual devices via (/dev/sda or
> /dev/disk/by-id/wwwwwww) or they can ask LSO to "claim" all available
> devices. I would like us to fix device discovery process using libudev or
> something similar for second option (because user is not specifying devices
> anyways). Does calling udevadm via following command:
> 
>  udevadm info --query=path --name=/dev/sda
> 
> works same as using libudev? You have any more pointers?

Yes, it queries the same database. However, this way you get a composite path containing multiple layers of irrelevant and volatile information such as port numbers, the HBA bus ID and the assigned kernel block device name.

> For first option - where user/admin manually specifies individual device-ids
> via LocalVolume object:
> 
> apiVersion: "local.storage.openshift.io/v1"
> kind: "LocalVolume"
> metadata:
>   name: "example"
> spec:
>   storageClassDevices:
>     - storageClassName: "foobar"
>       volumeMode: Filesystem
>       fsType: ext4
>       devicePaths:
>         - /dev/disk/by-id/wwwwwwww

My point was to stop using device paths and use e.g. filesystem UUID, partition UUID and similar unique identifiers. Of course there's always a risk of duplicate identifiers present in the system and care must be taken in choosing the most unique one under given conditions.

Comment 27 Michal Skrivanek 2021-06-17 10:24:13 UTC
Thanks for all the updates. There's not much we can do about it in RHV or libvirt so I'm moving this back to the original OCP bug 1954916.
Closing with resolution from comment #22, please continue the discussion on potential LSO changes in bug 1954916

Comment 28 Hemant Kumar 2021-10-08 18:50:16 UTC
> My point was to stop using device paths and use e.g. filesystem UUID, partition UUID and similar unique identifiers. Of course there's always a risk of duplicate identifiers present in the system and care must be taken in choosing the most unique one under given conditions.

We can't use the filesystem UUID: that implies there must be a filesystem on the device before it can be used in a workload. There is a whole set of applications that consume raw block devices.

> Thanks for all the updates. There's not much we can do about it in RHV or libvirt so I'm moving this back to the original OCP.

These device IDs were specified by end users in their applications. Take an application that consumes a raw block device from /dev/disk/by-id/qemu-xxxx: after a minor upgrade and reboot, the symlink is no longer created. OCP is basically just an agent between the end user and RHV; on its own it is not picking anything (at least in this case).




> 0x83: - if serial is not configured, it contains the device alias (limited to 247 chars)
>       - if serial is configured, it contains the serial (limited to 20 chars, old versions, 247 chars new versions)
>       - results in udev symlink with prefix of /dev/disk/by-id/scsi-0

So the problem seems to be that the older version generated 20-char serials but the newer version generates 247-char serials? Is that accurate? What happens if both the 20-char and the 247-char serials are generated at the same time? Is that a breaking change?

Comment 29 Michal Skrivanek 2021-10-12 11:13:36 UTC
What do you mean by "at the same time"? It's just that when you boot the VM on a RHEL7-based hypervisor it will contain the truncated UUID, and on a RHEL8-based hypervisor it will contain the whole UUID.
Isn't it better to use /dev/disk/by-id/scsi-S as Petr describes in comment #24? That should remain the same.

Comment 30 Jan Safranek 2021-10-15 13:53:31 UTC
Thanks for the explanation.

> Isn't it better to use /dev/disk/by-id/scsi-S as Petr describes in comment #24? That should remain the same.

LSO currently takes the first symlink found in /dev/disk/by-id that points to the right disk, which seems to be the bad one. How can it determine which symlink is the right one without knowing anything about QEMU limitations in various versions? LSO runs on many platforms and so far we have avoided platform-specific hacks.

Comment 31 Tomáš Bžatek 2021-10-15 14:03:44 UTC
(In reply to Jan Safranek from comment #30)
> LSO currently takes the first symlink found in /dev/disk/by-id that points
> to the right disk, which seems to be the bad one. How can it determine which
> symlink is the right one, without knowing anything about QEMU limitations in
> various versions?

You might find inspiration in the SCSI identifier sorting that we implemented for lsscsi: https://github.com/doug-gilbert/lsscsi/blob/main/src/lsscsi.c#L1698
Related RHEL bugzilla: bug 1846566

Note that there are various interpretations of the SCSI VPD standard pages and thus the identifier preference might vary.

Comment 32 Hemant Kumar 2021-10-18 16:04:28 UTC
> what do you mean by "at the same time"? It's just that when you boot the VM on RHEL7-based hypervisor it will contain truncated UUID, and on RHEL8-based hypervisor it will contain the whole UUID. 

Thanks for the answers. What I was trying to ask is: would it be a breaking change if a RHEL8-based hypervisor (my apologies for simplifying who generates those IDs) generated both the 20-char and the 247-char UUIDs? I don't see how it could be a breaking change, since folks who depend on the full UUID will have it, and those who upgraded from RHEL-7 (or an older version) can keep using the shorter version of the UUID if they were using it.

> Isn't it better to use /dev/disk/by-id/scsi-S as Petr describes in comment #24? That should remain the same.

In our case, it is an end user who asked us to use a device in /dev/disk/by-id/scsi-xxx, and our own docs at https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/storage_administration_guide/persistent_naming state:

"Red Hat Enterprise Linux automatically maintains the proper mapping from the WWID-based device name to a current /dev/sd name on that system. Applications can use the /dev/disk/by-id/ name to reference the data on the disk, even if the path to the device changes, and even when accessing the device from different systems."

So we are saying that the names in /dev/disk/by-id/* are not actually persistent and may disappear on upgrade. This breaks the entire premise of persistent disk names. I don't know why that is hard to see.

To answer your question: yes, we could use the /dev/disk/by-id/scsi-Sxx name as Petr described, but that is not what the user asked us to use. We would be implementing a workaround and hardcoding a heuristic which is very much platform specific and breaks user expectations (the user asked us to use device-A and we are going behind the user's back and using the symlink device-B because reasons!).

Comment 33 Red Hat Bugzilla 2023-09-15 01:06:29 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days


Note You need to log in before you can comment on or make changes to this bug.