Bug 1809053

Summary: [Azure][RHEL7.6]Inconsistent creation of symlinks in /dev/disk/by-path in Azure VMs
Product: Red Hat Enterprise Linux 7 Reporter: Jose Castillo <jcastillo>
Component: systemd Assignee: David Tardon <dtardon>
Status: CLOSED ERRATA QA Contact: xuli <xuli>
Severity: low Docs Contact:
Priority: medium    
Version: 7.6 CC: cavery, dtardon, fsumsal, huzhao, jsynacek, mmorsy, ribarry, systemd-maint-list, vkuznets, xuli, yacao, yuxisun
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2020-09-29 20:32:27 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Jose Castillo 2020-03-02 10:45:33 UTC
Description of problem:
In Azure VMs, the symlinks in /dev/disk/by-path are created inconsistently. In the example below, devices on both scsi0 and scsi1 are mapped to '-scsi-0:0:0:0-'. As a result, the symlinks for the scsi1 disk /dev/sdc and its partition /dev/sdc1 hide or overwrite those for /dev/sda and /dev/sda1.

	$ find /dev/disk/azure -ls
	 14229    0 drwxr-xr-x   3 root     root          140 Feb 17 15:53 /dev/disk/azure
	 23455    0 lrwxrwxrwx   1 root     root           10 Feb 19 18:34 /dev/disk/azure/resource-part1 -> ../../sdb1
	 15470    0 drwxr-xr-x   2 root     root           80 Feb 17 15:52 /dev/disk/azure/scsi1
	 15489    0 lrwxrwxrwx   1 root     root           13 Feb 19 18:34 /dev/disk/azure/scsi1/lun0-part1 -> ../../../sdc1
	 15471    0 lrwxrwxrwx   1 root     root           12 Feb 19 18:34 /dev/disk/azure/scsi1/lun0 -> ../../../sdc
	 14245    0 lrwxrwxrwx   1 root     root           10 Feb 19 18:34 /dev/disk/azure/root-part1 -> ../../sda1
	 14932    0 lrwxrwxrwx   1 root     root            9 Feb 19 18:34 /dev/disk/azure/resource -> ../../sdb
	 14230    0 lrwxrwxrwx   1 root     root            9 Feb 19 18:34 /dev/disk/azure/root -> ../../sda

	$ find /dev/disk/by-path -ls
	 10544    0 drwxr-xr-x   2 root     root          120 Feb 19 18:34 /dev/disk/by-path
	4732282    0 lrwxrwxrwx   1 root     root           10 Feb 19 18:34 /dev/disk/by-path/acpi-VMBUS:01-scsi-0:0:0:0-part1 -> ../../sdc1
	4732960    0 lrwxrwxrwx   1 root     root            9 Feb 19 18:34 /dev/disk/by-path/acpi-VMBUS:01-scsi-0:0:0:0 -> ../../sdc
	 23464    0 lrwxrwxrwx   1 root     root           10 Feb 19 18:34 /dev/disk/by-path/acpi-VMBUS:01-scsi-0:0:1:0-part1 -> ../../sdb1
	 10545    0 lrwxrwxrwx   1 root     root            9 Feb 19 18:34 /dev/disk/by-path/acpi-VMBUS:01-scsi-0:0:1:0 -> ../../sdb

	$ cat sos_commands/block/lsblk
	NAME   MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
	sda      8:0    0  32G  0 disk 
	`-sda1   8:1    0  32G  0 part /
	sdb      8:16   0  32G  0 disk 
	`-sdb1   8:17   0  32G  0 part /mnt/resource
	sdc      8:32   0  10G  0 disk 
	`-sdc1   8:33   0  10G  0 part /lvuser1

	$ cat sos_commands/scsi/lsscsi 
	[2:0:0:0]    disk    Msft     Virtual Disk     1.0   /dev/sda 
	[3:0:1:0]    disk    Msft     Virtual Disk     1.0   /dev/sdb 
	[5:0:0:0]    disk    Msft     Virtual Disk     1.0   /dev/sdc 
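
The collision can be illustrated with a small sketch. It assumes (the report does not confirm this) that the affected path_id logic reports the SCSI host number as 0 for every VMBus controller, so only channel:target:lun distinguishes the disks. The function names old_style_names and colliding_names are hypothetical.

```shell
#!/bin/sh
# Sketch only: maps lsscsi-style lines read from stdin to the by-path suffix
# each disk would get if the host number were always 0 (an assumption).
old_style_names() {
    sed -n 's|^\[[0-9]*:\([0-9]*:[0-9]*:[0-9]*\)].*\(/dev/[a-z0-9]*\).*$|scsi-0:\1 \2|p'
}

# Prints every suffix that more than one device would claim.
colliding_names() {
    old_style_names | cut -d' ' -f1 | sort | uniq -d
}
```

Piping the lsscsi output above into colliding_names reports scsi-0:0:0:0, the name that both /dev/sda and /dev/sdc would claim under this assumption.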

Version-Release number of selected component (if applicable):
Red Hat Enterprise Linux 7
Reproduced so far on versions as early as systemd-219-62.el7.x86_64 and
as late as systemd-219-62.el7_6.11.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Add multiple virtual SCSI devices in an Azure VM.
2. Check the contents of /dev/disk/by-path.


Actual results:
Not all devices appear in /dev/disk/by-path

Expected results:
All devices' symlinks should appear inside /dev/disk/by-path
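
A quick way to check for the missing symlinks is sketched below. The helper name missing_by_path is hypothetical, and the sketch assumes a readlink that supports -f (as in GNU coreutils).

```shell
#!/bin/sh
# Sketch only: prints each device (arguments after the first) that no symlink
# in the given by-path directory resolves to.
missing_by_path() {
    bypath_dir=$1
    shift
    for dev in "$@"; do
        found=no
        for link in "$bypath_dir"/*; do
            # Skip anything that is not a symlink (including an unmatched glob).
            [ -L "$link" ] || continue
            if [ "$(readlink -f "$link")" = "$(readlink -f "$dev")" ]; then
                found=yes
                break
            fi
        done
        [ "$found" = yes ] || printf '%s\n' "$dev"
    done
}
```

For example, `missing_by_path /dev/disk/by-path /dev/sd?` would list the whole disks that have no by-path symlink.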

Comment 2 Jose Castillo 2020-03-03 08:22:21 UTC
I'm manipulating the systemd sources, specifically the functions:

sd_device_get_syspath()
udev_device_get_syspath()
builtin_path_id()
handle_scsi_hyperv()

I'm also reproducing with scsi_debug, because I don't have an Azure VM and haven't been able to find a procedure to get one. If you have one available where I can test, I can make more progress with my debug code.

Comment 3 xuli 2020-03-03 08:38:17 UTC
Yuxin and I tested on a Hyper-V 2019 host and on Azure with RHEL 7.8 (kernel 3.10.0-1127.el7.x86_64), and we can reproduce this issue.

Test steps:
1. Add 3 new SCSI disks: 2 on the same controller as the system disk, and the 3rd on the other SCSI controller.
2. After the VM boots, check the information below.
 
Step 1: # ll /dev/sd*
brw-rw----. 1 root disk 8,  0 Mar  3 15:19 /dev/sda
brw-rw----. 1 root disk 8,  1 Mar  3 15:19 /dev/sda1
brw-rw----. 1 root disk 8,  2 Mar  3 15:19 /dev/sda2
brw-rw----. 1 root disk 8,  3 Mar  3 15:19 /dev/sda3
brw-rw----. 1 root disk 8, 16 Mar  3 15:19 /dev/sdb
brw-rw----. 1 root disk 8, 17 Mar  3 15:19 /dev/sdb1
brw-rw----. 1 root disk 8, 32 Mar  3 15:19 /dev/sdc
brw-rw----. 1 root disk 8, 48 Mar  3 15:19 /dev/sdd

# ls -alh /dev/disk/by-path                 (no symlink points to ../../sdd)

lrwxrwxrwx. 1 root root   9 Mar  3 15:19 acpi-VMBUS:00-scsi-0:0:0:0 -> ../../sda
lrwxrwxrwx. 1 root root  10 Mar  3 15:19 acpi-VMBUS:00-scsi-0:0:0:0-part1 -> ../../sda1
lrwxrwxrwx. 1 root root  10 Mar  3 15:19 acpi-VMBUS:00-scsi-0:0:0:0-part2 -> ../../sda2
lrwxrwxrwx. 1 root root  10 Mar  3 15:19 acpi-VMBUS:00-scsi-0:0:0:0-part3 -> ../../sda3
lrwxrwxrwx. 1 root root   9 Mar  3 15:19 acpi-VMBUS:00-scsi-0:0:0:1 -> ../../sdb
lrwxrwxrwx. 1 root root  10 Mar  3 15:19 acpi-VMBUS:00-scsi-0:0:0:1-part1 -> ../../sdb1
lrwxrwxrwx. 1 root root   9 Mar  3 15:19 acpi-VMBUS:00-scsi-0:0:0:2 -> ../../sdc

Step 2: Run "fdisk /dev/sdd" to display information about /dev/sdd

Step 3: # ls -alh /dev/disk/by-path           (after fdisk, a symlink to ../../sdd appears, but the one to ../../sda is gone)

lrwxrwxrwx. 1 root root   9 Mar  3 15:34 acpi-VMBUS:00-scsi-0:0:0:0 -> ../../sdd
lrwxrwxrwx. 1 root root  10 Mar  3 15:19 acpi-VMBUS:00-scsi-0:0:0:0-part1 -> ../../sda1
lrwxrwxrwx. 1 root root  10 Mar  3 15:19 acpi-VMBUS:00-scsi-0:0:0:0-part2 -> ../../sda2
lrwxrwxrwx. 1 root root  10 Mar  3 15:19 acpi-VMBUS:00-scsi-0:0:0:0-part3 -> ../../sda3
lrwxrwxrwx. 1 root root   9 Mar  3 15:19 acpi-VMBUS:00-scsi-0:0:0:1 -> ../../sdb
lrwxrwxrwx. 1 root root  10 Mar  3 15:19 acpi-VMBUS:00-scsi-0:0:0:1-part1 -> ../../sdb1
lrwxrwxrwx. 1 root root   9 Mar  3 15:19 acpi-VMBUS:00-scsi-0:0:0:2 -> ../../sdc

Other information:
# rpm -qa | grep -i udev
python-gudev-147.2-7.el7.x86_64
system-config-printer-udev-1.4.1-23.el7.x86_64
libgudev1-219-73.el7.1.x86_64
python-pyudev-0.15-9.el7.noarch
# rpm -qa | grep -i systemd
systemd-python-219-73.el7.1.x86_64
systemd-libs-219-73.el7.1.x86_64
systemd-sysv-219-73.el7.1.x86_64
systemd-219-73.el7.1.x86_64

# ls -alh /dev/disk/by-id

lrwxrwxrwx. 1 root root  10 Mar  3 15:19 dm-name-rhel_bootp--73--199--15-root -> ../../dm-0
lrwxrwxrwx. 1 root root  10 Mar  3 15:19 dm-name-rhel_bootp--73--199--15-swap -> ../../dm-1
lrwxrwxrwx. 1 root root  10 Mar  3 15:19 dm-uuid-LVM-KAfrJASN3DD7gca3I0DuApYNE2Fqf9ScPv1DpKJoEBb6hnIld3w3FnuVc9MxXD0i -> ../../dm-1
lrwxrwxrwx. 1 root root  10 Mar  3 15:19 dm-uuid-LVM-KAfrJASN3DD7gca3I0DuApYNE2Fqf9ScsMVpuPuq9CbMDqREkfQXKxWZrpdfEzfq -> ../../dm-0
lrwxrwxrwx. 1 root root  10 Mar  3 15:19 lvm-pv-uuid-Jf3Sid-JxxL-wPT6-gtaE-11nZ-N87i-LJnXQQ -> ../../sda3
lrwxrwxrwx. 1 root root   9 Mar  3 15:19 scsi-360022480877f26dc71cc5975143d2769 -> ../../sdb
lrwxrwxrwx. 1 root root  10 Mar  3 15:19 scsi-360022480877f26dc71cc5975143d2769-part1 -> ../../sdb1
lrwxrwxrwx. 1 root root   9 Mar  3 15:19 scsi-360022480b448dc2d2dc9ded2a07b28b6 -> ../../sdc
lrwxrwxrwx. 1 root root   9 Mar  3 15:19 scsi-360022480e6d94fd911b8026ca7e1290b -> ../../sda
lrwxrwxrwx. 1 root root  10 Mar  3 15:19 scsi-360022480e6d94fd911b8026ca7e1290b-part1 -> ../../sda1
lrwxrwxrwx. 1 root root  10 Mar  3 15:19 scsi-360022480e6d94fd911b8026ca7e1290b-part2 -> ../../sda2
lrwxrwxrwx. 1 root root  10 Mar  3 15:19 scsi-360022480e6d94fd911b8026ca7e1290b-part3 -> ../../sda3
lrwxrwxrwx. 1 root root   9 Mar  3 15:34 scsi-360022480f57c4ce94a48ac7068d52a0f -> ../../sdd
lrwxrwxrwx. 1 root root   9 Mar  3 15:19 wwn-0x60022480877f26dc71cc5975143d2769 -> ../../sdb
lrwxrwxrwx. 1 root root  10 Mar  3 15:19 wwn-0x60022480877f26dc71cc5975143d2769-part1 -> ../../sdb1
lrwxrwxrwx. 1 root root   9 Mar  3 15:19 wwn-0x60022480b448dc2d2dc9ded2a07b28b6 -> ../../sdc
lrwxrwxrwx. 1 root root   9 Mar  3 15:19 wwn-0x60022480e6d94fd911b8026ca7e1290b -> ../../sda
lrwxrwxrwx. 1 root root  10 Mar  3 15:19 wwn-0x60022480e6d94fd911b8026ca7e1290b-part1 -> ../../sda1
lrwxrwxrwx. 1 root root  10 Mar  3 15:19 wwn-0x60022480e6d94fd911b8026ca7e1290b-part2 -> ../../sda2
lrwxrwxrwx. 1 root root  10 Mar  3 15:19 wwn-0x60022480e6d94fd911b8026ca7e1290b-part3 -> ../../sda3
lrwxrwxrwx. 1 root root   9 Mar  3 15:34 wwn-0x60022480f57c4ce94a48ac7068d52a0f -> ../../sdd


Actual results:

Inconsistent creation of symlinks in /dev/disk/by-path for disks.

We also tested RHEL 8 (kernel 4.18.0-184.el8.x86_64, systemd-udev-239-27.el8.x86_64) on both Azure and Hyper-V. There, /dev/disk/by-path is populated correctly.

[root@bootp-73-199-166 ~]# ls /dev/disk/by-path -alh

lrwxrwxrwx. 1 root root   9 Mar  3 16:13 acpi-VMBUS:00-vmbus-3234cbc19d98492eb4628c35a7d4785e-lun-0 -> ../../sda
lrwxrwxrwx. 1 root root  10 Mar  3 16:13 acpi-VMBUS:00-vmbus-3234cbc19d98492eb4628c35a7d4785e-lun-0-part1 -> ../../sda1
lrwxrwxrwx. 1 root root  10 Mar  3 16:13 acpi-VMBUS:00-vmbus-3234cbc19d98492eb4628c35a7d4785e-lun-0-part2 -> ../../sda2
lrwxrwxrwx. 1 root root  10 Mar  3 16:13 acpi-VMBUS:00-vmbus-3234cbc19d98492eb4628c35a7d4785e-lun-0-part3 -> ../../sda3
lrwxrwxrwx. 1 root root   9 Mar  3 16:13 acpi-VMBUS:00-vmbus-3234cbc19d98492eb4628c35a7d4785e-lun-1 -> ../../sdb
lrwxrwxrwx. 1 root root  10 Mar  3 16:13 acpi-VMBUS:00-vmbus-3234cbc19d98492eb4628c35a7d4785e-lun-1-part1 -> ../../sdb1
lrwxrwxrwx. 1 root root   9 Mar  3 16:13 acpi-VMBUS:00-vmbus-3234cbc19d98492eb4628c35a7d4785e-lun-2 -> ../../sdc
lrwxrwxrwx. 1 root root   9 Mar  3 16:13 acpi-VMBUS:00-vmbus-e6f6aa9e972b43c08087db6d9c0b0428-lun-0 -> ../../sdd
lrwxrwxrwx. 1 root root   9 Mar  3 16:13 acpi-VMBUS:00-vmbus-e6f6aa9e972b43c08087db6d9c0b0428-lun-1 -> ../../sde
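
The RHEL 8 names above appear to combine a per-controller identifier of 32 hex digits with the LUN, which would explain why disks on different controllers no longer collide. A minimal sketch of that naming follows. It assumes the identifier is a VMBus device GUID written with braces and dashes that are stripped before use; this is inferred from the listing and not stated in the report. The function name vmbus_path_suffix and the GUID formatting in the usage note are hypothetical.

```shell
#!/bin/sh
# Sketch only: builds a "vmbus-<guid>-lun-<N>" suffix from a brace-wrapped,
# dash-separated GUID (assumed input format) and a LUN number.
vmbus_path_suffix() {
    guid=$(printf '%s' "$1" | tr -d '{}-')
    printf 'vmbus-%s-lun-%s\n' "$guid" "$2"
}
```

With this scheme, `vmbus_path_suffix '{3234cbc1-9d98-492e-b462-8c35a7d4785e}' 0` and `vmbus_path_suffix '{e6f6aa9e-972b-43c0-8087-db6d9c0b0428}' 0` yield different names even though both disks are LUN 0.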

The documentation states that "The Path attribute is unreliable, and Red Hat does not recommend using it."
See https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html-single/managing_storage_devices/index#assembly_overview-of-persistent-naming-attributes_managing-storage-devices

In summary, this issue reproduces on RHEL 7.8 but not on RHEL 8.2. Should we fix it in RHEL 7.9, or only advise users to use /dev/disk/by-id or UUIDs instead of by-path?
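
For users who switch to by-id, a sketch of how to find the stable names of a given disk follows. The helper name stable_names_for is hypothetical, and the sketch assumes a readlink that supports -f.

```shell
#!/bin/sh
# Sketch only: prints the names in a by-id style directory whose symlinks
# resolve to the given device.
stable_names_for() {
    byid_dir=$1
    target=$(readlink -f "$2")
    for link in "$byid_dir"/*; do
        # Skip anything that is not a symlink (including an unmatched glob).
        [ -L "$link" ] || continue
        if [ "$(readlink -f "$link")" = "$target" ]; then
            basename "$link"
        fi
    done
}
```

For example, `stable_names_for /dev/disk/by-id /dev/sdd` lists the scsi-* and wwn-* names that point to /dev/sdd.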

Thank you.

Comment 4 xuli 2020-03-03 09:02:23 UTC
(In reply to Jose Castillo from comment #2)
> I'm manipulating the systemd sources, specifically the functions:
> 
> sd_device_get_syspath()
> udev_device_get_syspath()
> builtin_path_id()
> handle_scsi_hyperv()
> 
> And reproducing with scsi_debug, because I don't have an AZURE vm and I
> haven't been able to find a procedure to get one. But if you have one
> available where I can test, I can progress a bit more with my debug code.

Hi Jose,

I have sent you an email with the IP address of a Hyper-V RHEL 7 VM for your debugging.

If you need anything else, feel free to let us know.

Comment 11 Lukáš Nykrýn 2020-03-17 14:35:09 UTC
Fix merged to the GitHub master branch -> https://github.com/systemd-rhel/rhel-7/pull/93

Comment 30 errata-xmlrpc 2020-09-29 20:32:27 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Low: systemd security and bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:4007