Bug 2208039 - lvm handles trailing spaces in wwids differently in 9.2
Summary: lvm handles trailing spaces in wwids differently in 9.2
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 9
Classification: Red Hat
Component: lvm2
Version: 9.2
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: David Teigland
QA Contact: cluster-qe
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2023-05-17 18:42 UTC by c.stackpole
Modified: 2023-11-07 11:28 UTC (History)

Fixed In Version: lvm2-2.03.21-3.el9
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-11-07 08:53:33 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Attachments
First - Boot issue (17.97 KB, image/jpeg), 2023-05-17 18:42 UTC, c.stackpole
Second - no var/home mounts, but swap is there (71.13 KB, image/jpeg), 2023-05-17 18:43 UTC, c.stackpole
Third - diff between system.devices and vgscan (36.82 KB, image/jpeg), 2023-05-17 18:43 UTC, c.stackpole
Fourth - delete system.devices as workaround (23.92 KB, image/jpeg), 2023-05-17 18:44 UTC, c.stackpole
Fifth - Differences when regenerating the file (38.75 KB, image/jpeg), 2023-05-17 20:16 UTC, c.stackpole
9.1 hexdump (80.94 KB, image/jpeg), 2023-05-17 21:47 UTC, c.stackpole


Links
System                  ID               Private  Priority  Status  Summary  Last Updated
Red Hat Issue Tracker   CLUSTERQE-6755   0        None      None    None     2023-06-08 21:16:23 UTC
Red Hat Issue Tracker   RHELPLAN-157538  0        None      None    None     2023-05-17 18:42:56 UTC
Red Hat Product Errata  RHBA-2023:6633   0        None      None    None     2023-11-07 08:53:59 UTC

Description c.stackpole 2023-05-17 18:42:29 UTC
Created attachment 1965214 [details]
First - Boot issue

Description of problem:

After updating to 9.2, LVM partitions cannot be mounted due to bad information in /etc/lvm/devices/system.devices.

This has been challenging to replicate in some environments; however, VirtualBox seems to replicate this problem 100% of the time that I, or those I've asked to verify, have tried. It does not seem to matter what the OS under VirtualBox is, or its version (all variations are recent/updated, though). Replication on a physical machine HAS occurred, but not reliably in my own testing.

Version-Release number of selected component (if applicable):
9.1 -> 9.2

How reproducible:

Every time on devices that appear to have a sys_wwid, such as VirtualBox disks.

Steps to Reproduce:
For best replication:
Install VirtualBox.
Download RHEL 9.1 - click through the install and set a root password and manually adjust the partitions. While this isn't the only layout that has caused problems, it's the one that I've been able to trigger this issue on every single time. Size doesn't matter, just the LVM partition layout:
/boot 1GiB
/ 20 GiB
LVM /home 20 GiB
LVM swap 10 GiB
LVM /var 20 GiB

Let the installer reboot into the distro. Subscribe with auto-attach to get updates. [Might also want to snapshot. ;-)] 

Verify you still have 9.1
$ cat /etc/redhat-release
Red Hat Enterprise Linux release 9.1 (Plow)


$ dnf clean all && dnf update -y # should pull in 9.2

$ reboot

See first attached image for /var and /home having issues mounting.

Give password for maintenance. Then notice that /var and /home didn't mount, but swap did! ?? ?? *shrug* See second attached picture.

$ lsblk

Try to scan for LVM devices or mount the LVM volumes. You should get an error saying that the device last seen is not found. See the third attached picture [please ignore the different LVM IDs - I retested when I wanted more than one screenshot to share].

Now delete /etc/lvm/devices/system.devices and rescan. See fourth attached picture.

Now it is safe to reboot and continue using 9.2


Actual results:

Failed boot due to missing mount points

Expected results:

A successful boot with mount points where they should be.

Additional info:

Comment 1 c.stackpole 2023-05-17 18:43:02 UTC
Created attachment 1965215 [details]
Second - no var/home mounts, but swap is there

Comment 2 c.stackpole 2023-05-17 18:43:44 UTC
Created attachment 1965216 [details]
Third - diff between system.devices and vgscan

Comment 3 c.stackpole 2023-05-17 18:44:14 UTC
Created attachment 1965217 [details]
Fourth - delete system.devices as workaround

Comment 4 David Teigland 2023-05-17 20:09:44 UTC
Could you show the contents of /etc/lvm/devices/system.devices in the 9.1 system, then after upgrading to 9.2 recreate system.devices (using vgimportdevices -a) and show the new contents of the file created from 9.2?

This might be related to the repeating underscores in the device id.  In 9.2 lvm adopted the more traditional approach of condensing repeated underscores to a single underscore, while still working with the form of repeating underscores that had been used in 9.1.

Comment 5 c.stackpole 2023-05-17 20:16:00 UTC
Created attachment 1965242 [details]
Fifth - Differences when regenerating the file

https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/9/html/configuring_and_managing_logical_volumes/limiting-lvm-device-visibility-and-usage_configuring-and-managing-logical-volumes


Using the above as a guideline, I deleted the old, rebooted into 9.2, then regenerated the file with lvmdevices per the documentation. See the fifth screenshot for differences.

I reset the VM back to 9.1 and manually removed the excess _'s from the /etc/lvm/devices/system.devices file - reboot into 9.2 worked as expected.

(It's also weird that 9.1 has VERSION=1.1.2 but 9.2 has VERSION=1.1.1, but that makes no difference at all in my testing.)

I suspect that there's a script that removes the excess _'s but doesn't account for them if they are there from a previous version. I wonder if the VERSION change might be related to a regression? I don't know - maybe it's just weird, but it is a difference.

Comment 6 c.stackpole 2023-05-17 20:17:44 UTC
Thanks for the response David! Sorry I didn't see it until I posted above. Looks like we are both on the same path and I agree with your hunch after my recent testing.

Comment 7 David Teigland 2023-05-17 20:53:09 UTC
I suspect it's the underscore at the end of the 9.1 wwid causing the problem.  There is probably a space or control character in that position which 9.2 is ignoring and which 9.1 is converting to an underscore.  That would require a fix in 9.2 to cope with the value created in 9.1.  We could confirm this with the output of:

hexdump -C /sys/block/sda/device/wwid
hexdump -C /sys/block/sda/device/vpd_pg80
hexdump -C /sys/block/sda/device/vpd_pg83

from 9.1 and 9.2.

I'm interested in both 9.1 and 9.2 in case the upgraded kernel has made a slight change to the data it's reporting (which we've seen before.)  The vpd files may not exist, which is fine, but I'd like to collect that in case they are helpful in understanding the wwid value.
Thanks
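
A hexdump makes trailing bytes visible, which a plain cat does not. The following is a minimal Python sketch of the same check, for anyone who prefers a structured summary. The helper names describe_trailing_bytes and describe_wwid_file are hypothetical, and the path in the usage line is simply the one from the hexdump commands above; nothing here is part of lvm.

```python
from pathlib import Path


def describe_trailing_bytes(raw: bytes) -> dict:
    """Summarize what follows the last non-whitespace byte of a sysfs value."""
    # Strip spaces, tabs, CR, LF and NUL from the right-hand side only.
    stripped = raw.rstrip(b" \t\r\n\x00")
    tail = raw[len(stripped):]
    return {
        "value": stripped.decode("ascii", errors="replace"),
        "tail_hex": tail.hex(" "),          # e.g. "20 0a" for space + line feed
        "trailing_spaces": tail.count(b" "),
        "trailing_newlines": tail.count(b"\n"),
    }


def describe_wwid_file(path: str) -> dict:
    """Read a wwid (or vpd) file as bytes and describe its trailing bytes."""
    return describe_trailing_bytes(Path(path).read_bytes())


if __name__ == "__main__":
    # Assumed path, taken from the hexdump commands in this comment.
    wwid_path = "/sys/block/sda/device/wwid"
    if Path(wwid_path).exists():
        print(describe_wwid_file(wwid_path))
```

A non-zero trailing_spaces count would point at a trailing space rather than only the line feed that sysfs normally appends.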

Comment 8 c.stackpole 2023-05-17 21:47:04 UTC
Created attachment 1965271 [details]
9.1 hexdump

The hexdump output on 9.1 and 9.2 is exactly the same. Checking it again to see if I goofed something in this test.

Comment 9 c.stackpole 2023-05-17 21:51:59 UTC
Nope. Diffs are the same. Verified on the 9.1-fresh-install snapshot, the 9.1-post-update-but-before-reboot snapshot, and reboots into both the broken state and post-manual-edit-removing-_ 9.2.

Comment 10 David Teigland 2023-05-18 17:22:03 UTC
Thanks, I've written a test case to verify that the line feed character at the end of the wwid is the cause of the problem.  In 9.0/9.1 the line feed character was replaced by underscore, and in 9.2 the line feed character is ignored.  This causes the wwid string comparison to not match in 9.2, so the device is not found.  This will require a simple fix in 9.2.

Comment 11 David Teigland 2023-05-19 16:10:03 UTC
My diagnosis in comment 10 was incorrect.  The problem is the trailing space character at the end of the WWID.  In 9.0 and 9.1 lvm would replace that trailing space with _, but in 9.2 trailing spaces are ignored.  The resulting WWIDs do not match, which causes lvm to ignore the device.  The fix for 9.2 will be to ignore trailing _ in IDNAME values.
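
The mismatch described here can be modelled in a few lines of Python. This is only a sketch of the behaviour as stated in this bug (9.0/9.1 turn every space into an underscore, 9.2 drops trailing spaces and condenses repeats, and the fix ignores trailing underscores when comparing); it is not lvm's actual code, and the function names idname_91_style, idname_92_style and idnames_match are hypothetical.

```python
import re


def idname_91_style(wwid: str) -> str:
    """Model of 9.0/9.1 as described: each space becomes '_', trailing one included."""
    return wwid.rstrip("\n").replace(" ", "_")


def idname_92_style(wwid: str) -> str:
    """Model of 9.2 as described: trailing spaces ignored, repeats condensed."""
    trimmed = wwid.rstrip("\n").rstrip(" ")
    return re.sub(r"_+", "_", trimmed.replace(" ", "_"))


def idnames_match(stored: str, current: str) -> bool:
    """Model of the proposed fix: compare after condensing repeated
    underscores and ignoring trailing underscores in both values."""
    def canon(name: str) -> str:
        return re.sub(r"_+", "_", name).rstrip("_")
    return canon(stored) == canon(current)


if __name__ == "__main__":
    # A VirtualBox-like wwid with padding spaces and one trailing space.
    raw = "t10.ATA" + " " * 5 + "VBOX HARDDISK" + " " * 27 + "VB9c10d318-188d9ebc "
    old = idname_91_style(raw)   # what a 9.1 system.devices would hold
    new = idname_92_style(raw)   # what 9.2 derives from the same sysfs value
    print(old)
    print(new)
    print("plain string compare:", old == new)
    print("compare with fix:    ", idnames_match(old, new))
```

In this model the plain comparison fails only because of the trailing underscore left behind by the older format, which is why ignoring trailing underscores is enough to make the old entry match again.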

Comment 13 c.stackpole 2023-05-21 00:52:10 UTC
Thanks for your effort in figuring this out!

Comment 14 David Teigland 2023-05-22 16:23:34 UTC
The following test uses fake sysfs dirs containing the problematic wwid and pointing lvm at that.

1. In the lvm.conf devices section, set:
device_id_sysfs_dir = "/test/sys/"

2. Get major:minor of the device to use in testing.
$ ls -l /dev/sdh
brw-rw----. 1 root disk 8, 112 May 11 14:46 /dev/sdh

3. Set up sysfs files for major:minor.
$ mkdir -p /test/sys/dev/block/8:112/device
$ echo "t10.ATA     VBOX HARDDISK                           VB9c10d318-188d9ebc " > /test/sys/dev/block/8\:112/device/wwid

4. Check that the latest lvm version reduces repeated spaces and ignores trailing spaces.
$ pvcreate /dev/sdh
$ rm /etc/lvm/devices/system.devices
$ lvmdevices --adddev /dev/sdh
$ grep sdh /etc/lvm/devices/system.devices 
IDTYPE=sys_wwid IDNAME=t10.ATA_VBOX_HARDDISK_VB9c10d318-188d9ebc DEVNAME=/dev/sdh PVID=YZEDsWin3kJsoLrvlFj6f8vWNGS7hPdn
$ pvs -o name,deviceid /dev/sdh
  PV         DeviceID
  /dev/sdh           
$ vgcreate hh /dev/sdh
  Volume group "hh" successfully created with system ID dct-rhel9-cluster-n0
$ pvs -o name,deviceid /dev/sdh
  PV         DeviceID                                 
  /dev/sdh   t10.ATA_VBOX_HARDDISK_VB9c10d318-188d9ebc

5. Check that the latest lvm version recognizes previous IDNAME format with repeated underscores and trailing underscore (the repeated underscores are correctly recognized in 9.2, but not the trailing underscore.)
$ sed -e 's/t10.ATA_VBOX_HARDDISK_VB9c10d318-188d9ebc/t10.ATA_____VBOX_HARDDISK___________________________VB9c10d318-188d9ebc_/g' -i /etc/lvm/devices/system.devices
$ pvs /dev/sdh
  PV         VG Fmt  Attr PSize    PFree   
  /dev/sdh   hh lvm2 a--  1020.00m 1020.00m
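
Steps 2 and 3 above (looking up major:minor and populating a fake sysfs tree) could be scripted roughly as follows. This is a hedged Python sketch: the helper names major_minor and make_fake_wwid are hypothetical, the directory layout simply mirrors the mkdir/echo commands in this comment, and the script does not touch lvm.conf or run any lvm command.

```python
import os
import tempfile
from pathlib import Path


def major_minor(dev_path: str) -> tuple:
    """Return (major, minor) of a block device node, like `ls -l` shows."""
    st = os.stat(dev_path)
    return os.major(st.st_rdev), os.minor(st.st_rdev)


def make_fake_wwid(sysfs_root: str, major: int, minor: int, wwid: str) -> Path:
    """Create <root>/dev/block/<major>:<minor>/device/wwid with the given value.

    A line feed is appended, as `echo` does in step 3.
    """
    device_dir = Path(sysfs_root) / "dev" / "block" / f"{major}:{minor}" / "device"
    device_dir.mkdir(parents=True, exist_ok=True)
    wwid_file = device_dir / "wwid"
    wwid_file.write_text(wwid + "\n")
    return wwid_file


if __name__ == "__main__":
    # Uses a throwaway directory instead of /test/sys so it is safe to run.
    root = tempfile.mkdtemp(prefix="fake-sys-")
    wwid = "t10.ATA" + " " * 5 + "VBOX HARDDISK" + " " * 27 + "VB9c10d318-188d9ebc "
    # 8:112 is the major:minor shown for /dev/sdh in step 2; on a real
    # system it would come from major_minor("/dev/sdh").
    path = make_fake_wwid(root, 8, 112, wwid)
    print(path)
    print(repr(path.read_text()))
```

To reproduce the test itself, device_id_sysfs_dir in lvm.conf would still have to point at the generated root, as in step 1.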

Comment 15 Simrat Pal Singh Satia 2023-05-31 15:57:16 UTC
We are able to reproduce the issue for RHEL9.2 new installations as well. 

Steps to reproduce: 
1. Go to Azure Marketplace and deploy image of the following reference: 
"imageReference": {
                "publisher": "RedHat",
                "offer": "RHEL",
                "sku": "9_2",
                "version": "Latest",
                "exactVersion": "9.2.2023052501"
            },

2. Perform 'pvdisplay' command.

Error: [root@AYAN-2905-RHEL92 ~]# pvdisplay
  Devices file sys_wwid naa.6002248040317a85abbd3ecc0f15951e PVID qGGXfRI6eVu8m8W9n6g5UbCHyqhXy6NO last seen on /dev/sda4 not found.

ISO consumed to build the image: RHEL-9.2-RC-1.0


Comment 16 David Teigland 2023-05-31 16:04:18 UTC
(In reply to Simrat Pal Singh Satia from comment #15)
> We are able to reproduce the issue for RHEL9.2 new installations as well. 
> 
> Steps to reproduce: 
> 1. Go to Azure Marketplace and deploy image of the following reference: 
> "imageReference": {
>                 "publisher": "RedHat",
>                 "offer": "RHEL",
>                 "sku": "9_2",
>                 "version": "Latest",
>                 "exactVersion": "9.2.2023052501"
>             },




> 
> 2. Perform 'pvdisplay' command.
> 
> Error: [root@AYAN-2905-RHEL92 ~]# pvdisplay
>   Devices file sys_wwid naa.6002248040317a85abbd3ecc0f15951e PVID
> qGGXfRI6eVu8m8W9n6g5UbCHyqhXy6NO last seen on /dev/sda4 not found.
> 
> ISO consumed to build the image: RHEL-9.2-RC-1.0
> 
> @

Comment 17 David Teigland 2023-05-31 16:14:10 UTC
(In reply to Simrat Pal Singh Satia from comment #15)
> We are able to reproduce the issue for RHEL9.2 new installations as well. 

When this new 9.2 install is finished, what is the content of /etc/lvm/devices/system.devices?

> Steps to reproduce: 
> 1. Go to Azure Marketplace and deploy image of the following reference: 
> "imageReference": {
>                 "publisher": "RedHat",
>                 "offer": "RHEL",
>                 "sku": "9_2",
>                 "version": "Latest",
>                 "exactVersion": "9.2.2023052501"
>             },

I don't understand this step, it's not something I've ever seen before.

> # pvdisplay
>   Devices file sys_wwid naa.6002248040317a85abbd3ecc0f15951e PVID qGGXfRI6eVu8m8W9n6g5UbCHyqhXy6NO last seen on /dev/sda4 not found.

The device with this wwid must have been used during the 9.2 installation.  Can you check what the current wwid of /dev/sda is?

$ cat /sys/block/sda/device/wwid

Comment 18 David Teigland 2023-05-31 16:18:29 UTC
(In reply to Simrat Pal Singh Satia from comment #15)
> We are able to reproduce the issue for RHEL9.2 new installations as well. 

This should be tracked in a new bug, it's not related to the handling of spaces/underscores in wwid.

Comment 19 Simrat Pal Singh Satia 2023-06-05 14:15:52 UTC
[root@rhel92 azureuser]# cat  /etc/lvm/devices/system.devices
# LVM uses devices listed in this file.
# Created by LVM command lvmdevices pid 3043 at Mon May 22 14:26:33 2023
VERSION=1.1.4
IDTYPE=sys_wwid IDNAME=naa.6002248040317a85abbd3ecc0f15951e DEVNAME=/dev/sda4 PVID=qGGXfRI6eVu8m8W9n6g5UbCHyqhXy6NO PART=4

Please let me know if this requires a new bug or has a similar RCA to the existing bug.

Thanks,
Simrat

Comment 20 David Teigland 2023-06-05 14:56:42 UTC
(In reply to Simrat Pal Singh Satia from comment #19)
> [root@rhel92 azureuser]# cat  /etc/lvm/devices/system.devices
> # LVM uses devices listed in this file.
> # Created by LVM command lvmdevices pid 3043 at Mon May 22 14:26:33 2023
> VERSION=1.1.4
> IDTYPE=sys_wwid IDNAME=naa.6002248040317a85abbd3ecc0f15951e
> DEVNAME=/dev/sda4 PVID=qGGXfRI6eVu8m8W9n6g5UbCHyqhXy6NO PART=4

That looks normal.

> Please let me know if this required a new bug or has similar RCA to the
> existing bug. 

I don't think you're dealing with trailing spaces on wwids.  It sounds like your problem is somehow related to creating/unconfiguring/recreating OS images, so you should create a new bug with the details surrounding how these OS images are meant to be configured for the instances running them.  When OS images are copied/cloned, the system.devices file cannot be copied, and needs to be recreated by each OS instance for the device it's using.
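
When checking whether a copied system.devices still describes the local disk, it can help to pull the fields out of the file and compare them by hand with /sys/block/.../device/wwid. The following Python sketch parses the KEY=VALUE layout visible in the file contents quoted in this bug. It assumes values contain no spaces, which holds for the entries shown here but is not a documented guarantee of the format, and parse_devices_file is a hypothetical helper, not an lvm interface.

```python
def parse_devices_file(text: str):
    """Split system.devices text into (header_fields, device_entries).

    Lines starting with '#' are skipped. A line containing IDTYPE is
    treated as a device entry; any other KEY=VALUE line (such as VERSION)
    is collected into the header dict.
    """
    header = {}
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Assumption: fields are space-separated KEY=VALUE tokens.
        fields = dict(tok.split("=", 1) for tok in line.split() if "=" in tok)
        if "IDTYPE" in fields:
            entries.append(fields)
        else:
            header.update(fields)
    return header, entries


if __name__ == "__main__":
    sample = (
        "# LVM uses devices listed in this file.\n"
        "VERSION=1.1.4\n"
        "IDTYPE=sys_wwid IDNAME=naa.6002248040317a85abbd3ecc0f15951e "
        "DEVNAME=/dev/sda4 PVID=qGGXfRI6eVu8m8W9n6g5UbCHyqhXy6NO PART=4\n"
    )
    header, entries = parse_devices_file(sample)
    print(header)
    for entry in entries:
        print(entry["IDTYPE"], entry["IDNAME"], entry["DEVNAME"])
```

If the IDNAME recorded in the file differs from the wwid the running instance reports, the file was most likely created on a different device and should be regenerated.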

Comment 28 Corey Marthaler 2023-07-26 19:09:39 UTC
Marking Verified: Tested in the latest build. The trailing spaces are no longer present.


kernel-5.14.0-332.el9    BUILT: Mon Jun 26 06:16:51 PM CEST 2023
lvm2-2.03.21-3.el9    BUILT: Thu Jul 13 08:50:26 PM CEST 2023
lvm2-libs-2.03.21-3.el9    BUILT: Thu Jul 13 08:50:26 PM CEST 2023


[root@virt-009 ~]# grep device_id_sysfs_dir /etc/lvm/lvm.conf 
        device_id_sysfs_dir = "/test/sys/"


[root@virt-009 ~]# ls -l /dev/sdf
brw-rw----. 1 root disk 8, 80 Jul 25 21:36 /dev/sdf
[root@virt-009 ~]# mkdir -p /test/sys/dev/block/8:80/device
[root@virt-009 ~]# echo "t10.ATA     VBOX HARDDISK                           VB9c10d318-188d9ebc " > /test/sys/dev/block/8\:80/device/wwid


[root@virt-009 ~]# pvcreate /dev/sdf
  Physical volume "/dev/sdf" successfully created.
[root@virt-009 ~]# cat /etc/lvm/devices/system.devices
# LVM uses devices listed in this file.
# Created by LVM command pvcreate pid 2576021 at Wed Jul 26 21:04:10 2023
VERSION=1.1.13063
IDTYPE=devname IDNAME=/dev/vda2 DEVNAME=/dev/vda2 PVID=efoHgim4mazczPn658nUoSWQtlHz6kjn PART=2
IDTYPE=sys_wwid IDNAME=t10.ATA_VBOX_HARDDISK_VB9c10d318-188d9ebc DEVNAME=/dev/sdf PVID=buapmDqciGJPDooSQ5wYXCo9rXRSiQvw


[root@virt-009 ~]# rm /etc/lvm/devices/system.devices
rm: remove regular file '/etc/lvm/devices/system.devices'? y
[root@virt-009 ~]# lvmdevices --adddev /dev/sdf
[root@virt-009 ~]# cat /etc/lvm/devices/system.devices
# LVM uses devices listed in this file.
# Created by LVM command lvmdevices pid 2576042 at Wed Jul 26 21:04:36 2023
VERSION=1.1.1
IDTYPE=sys_wwid IDNAME=t10.ATA_VBOX_HARDDISK_VB9c10d318-188d9ebc DEVNAME=/dev/sdf PVID=buapmDqciGJPDooSQ5wYXCo9rXRSiQvw

Comment 32 Mike 2023-10-17 19:50:09 UTC
Any ETA on this?

Comment 33 David Teigland 2023-10-17 20:09:21 UTC
The change is in 9.3.

Comment 35 errata-xmlrpc 2023-11-07 08:53:33 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (lvm2 bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:6633

