+++ This bug was initially created as a clone of Bug #2052618 +++

Description of problem:

After creating a LocalVolume referencing devices at /dev/disk/by-path/ccw-0.0.0004, rebooting nodes causes duplicate PVs to be created if the device switches location in the kernel-assigned device names (/dev/vd*). The generated PV paths reference these kernel names, not the by-path link.

Version-Release number of selected component (if applicable):

Multiple versions, the latest being OCP 4.10.0-rc.1 and LSO version 4.10.0-202202071841.

How reproducible:

Any time the real device changes path.

Steps to Reproduce:
1. Create a LocalVolume referencing disks by /dev/disk/by-path.
2. Reboot the nodes, possibly multiple times, to force a /dev/vd* path change.

Actual results:

`oc get pv` shows more PVs than devices actually present.

Expected results:

When a disk is referenced by path, changes to its /dev/vd* location should not affect the persistent volumes.

localvolume:

apiVersion: "local.storage.openshift.io/v1"
kind: "LocalVolume"
metadata:
  name: "local-disks"
  namespace: "openshift-local-storage"
spec:
  nodeSelector:
    nodeSelectorTerms:
    - matchExpressions:
      - key: kubernetes.io/hostname
        operator: In
        values:
        - worker-0.pok-93.ocptest.pok.stglabs.ibm.com
        - worker-1.pok-93.ocptest.pok.stglabs.ibm.com
  storageClassDevices:
    - storageClassName: "lso-fs"
      volumeMode: Filesystem
      fsType: ext4
      devicePaths:
        - /dev/disk/by-path/ccw-0.0.0004

PV Dump:
```
apiVersion: v1
items:
- apiVersion: v1
  kind: PersistentVolume
  metadata:
    annotations:
      pv.kubernetes.io/provisioned-by: local-volume-provisioner-worker-0.pok-93.ocptest.pok.stglabs.ibm.com-e4947a81-cf42-48d7-9819-ed77c5759955
      storage.openshift.com/device-name: vdc
    creationTimestamp: "2022-02-09T16:15:10Z"
    finalizers:
    - kubernetes.io/pv-protection
    labels:
      kubernetes.io/hostname: worker-0.pok-93.ocptest.pok.stglabs.ibm.com
      storage.openshift.com/local-volume-owner-name: local-disks
      storage.openshift.com/local-volume-owner-namespace: openshift-local-storage
      storage.openshift.com/owner-kind: LocalVolume
      storage.openshift.com/owner-name: local-disks
      storage.openshift.com/owner-namespace: openshift-local-storage
    name: local-pv-7694dd7d
    ownerReferences:
    - apiVersion: v1
      kind: Node
      name: worker-0.pok-93.ocptest.pok.stglabs.ibm.com
      uid: e4947a81-cf42-48d7-9819-ed77c5759955
    resourceVersion: "978317"
    uid: 274b1b63-3040-4653-97e2-bb29d9dacac2
  spec:
    accessModes:
    - ReadWriteOnce
    capacity:
      storage: 20Gi
    local:
      fsType: ext4
      path: /mnt/local-storage/lso-fs/vdc
    nodeAffinity:
      required:
        nodeSelectorTerms:
        - matchExpressions:
          - key: kubernetes.io/hostname
            operator: In
            values:
            - worker-0.pok-93.ocptest.pok.stglabs.ibm.com
    persistentVolumeReclaimPolicy: Delete
    storageClassName: lso-fs
    volumeMode: Filesystem
  status:
    phase: Available
- apiVersion: v1
  kind: PersistentVolume
  metadata:
    annotations:
      pv.kubernetes.io/provisioned-by: local-volume-provisioner-worker-0.pok-93.ocptest.pok.stglabs.ibm.com-e4947a81-cf42-48d7-9819-ed77c5759955
      storage.openshift.com/device-name: vdb
    creationTimestamp: "2022-02-09T16:08:51Z"
    finalizers:
    - kubernetes.io/pv-protection
    labels:
      kubernetes.io/hostname: worker-0.pok-93.ocptest.pok.stglabs.ibm.com
      storage.openshift.com/local-volume-owner-name: local-disks
      storage.openshift.com/local-volume-owner-namespace: openshift-local-storage
      storage.openshift.com/owner-kind: LocalVolume
      storage.openshift.com/owner-name: local-disks
      storage.openshift.com/owner-namespace: openshift-local-storage
    name: local-pv-cfd12a48
    ownerReferences:
    - apiVersion: v1
      kind: Node
      name: worker-0.pok-93.ocptest.pok.stglabs.ibm.com
      uid: e4947a81-cf42-48d7-9819-ed77c5759955
    resourceVersion: "975044"
    uid: 5139320d-81de-4e5f-8e64-a48f80e0be98
  spec:
    accessModes:
    - ReadWriteOnce
    capacity:
      storage: 20Gi
    local:
      fsType: ext4
      path: /mnt/local-storage/lso-fs/vdb
    nodeAffinity:
      required:
        nodeSelectorTerms:
        - matchExpressions:
          - key: kubernetes.io/hostname
            operator: In
            values:
            - worker-0.pok-93.ocptest.pok.stglabs.ibm.com
    persistentVolumeReclaimPolicy: Delete
    storageClassName: lso-fs
    volumeMode: Filesystem
  status:
    phase: Available
- apiVersion: v1
  kind: PersistentVolume
  metadata:
    annotations:
      pv.kubernetes.io/provisioned-by: local-volume-provisioner-worker-1.pok-93.ocptest.pok.stglabs.ibm.com-a29a3b5b-cb66-4439-b107-ce1eb637eb17
      storage.openshift.com/device-name: vdb
    creationTimestamp: "2022-02-09T16:08:45Z"
    finalizers:
    - kubernetes.io/pv-protection
    labels:
      kubernetes.io/hostname: worker-1.pok-93.ocptest.pok.stglabs.ibm.com
      storage.openshift.com/local-volume-owner-name: local-disks
      storage.openshift.com/local-volume-owner-namespace: openshift-local-storage
      storage.openshift.com/owner-kind: LocalVolume
      storage.openshift.com/owner-name: local-disks
      storage.openshift.com/owner-namespace: openshift-local-storage
    name: local-pv-f8753489
    ownerReferences:
    - apiVersion: v1
      kind: Node
      name: worker-1.pok-93.ocptest.pok.stglabs.ibm.com
      uid: a29a3b5b-cb66-4439-b107-ce1eb637eb17
    resourceVersion: "975004"
    uid: 6f8e3d97-17ba-4f68-bdfc-c76f63a2dd99
  spec:
    accessModes:
    - ReadWriteOnce
    capacity:
      storage: 20Gi
    local:
      fsType: ext4
      path: /mnt/local-storage/lso-fs/vdb
    nodeAffinity:
      required:
        nodeSelectorTerms:
        - matchExpressions:
          - key: kubernetes.io/hostname
            operator: In
            values:
            - worker-1.pok-93.ocptest.pok.stglabs.ibm.com
    persistentVolumeReclaimPolicy: Delete
    storageClassName: lso-fs
    volumeMode: Filesystem
  status:
    phase: Available
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""
```

Additional info:

The persistent volume paths show the /vd* names while the LocalVolume path shows /dev/disk/by-path/ccw-0.0.0004:
```
oc get pv -o yaml | grep path
    path: /mnt/local-storage/lso-fs/vdc
    path: /mnt/local-storage/lso-fs/vdb
    path: /mnt/local-storage/lso-fs/vdb

oc get localvolume -o yaml | grep path
      {"apiVersion":"local.storage.openshift.io/v1","kind":"LocalVolume","metadata":{"annotations":{},"name":"local-disks","namespace":"openshift-local-storage"},"spec":{"nodeSelector":{"nodeSelectorTerms":[{"matchExpressions":[{"key":"kubernetes.io/hostname","operator":"In","values":["worker-0.pok-93.ocptest.pok.stglabs.ibm.com","worker-1.pok-93.ocptest.pok.stglabs.ibm.com"]}]}]},"storageClassDevices":[{"devicePaths":["/dev/disk/by-path/ccw-0.0.0004"],"fsType":"ext4","storageClassName":"lso-fs","volumeMode":"Filesystem"}]}}
      - /dev/disk/by-path/ccw-0.0.0004
```

--- Additional comment from Hemant Kumar on 2022-02-09 17:17:12 UTC ---

LSO prefers paths in `/dev/disk/by-id` rather than `/dev/disk/by-path`. There are historical reasons for that, and while I agree that in general it might be suitable to fall back to `/dev/disk/by-path` when `/dev/disk/by-id` is not available, we are currently not doing that. So, can you please try to reproduce this issue using `/dev/disk/by-id`? We can still fix this issue, but if it is possible to use a `by-id` path, it will at least unblock you.
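(For reference: a quick way to see which persistent symlinks udev actually exposes for a disk before picking a devicePath. This is a minimal sketch; /dev/vdb is an example device name, substitute your own.)
```
# Query every persistent symlink udev created for the device
# (by-id, by-path, by-uuid, ...). /dev/vdb is an example name.
udevadm info --query=symlink --name=/dev/vdb

# Or inspect the directories directly:
ls -l /dev/disk/by-id/ /dev/disk/by-path/
```
If nothing under /dev/disk/by-id points at the disk, only the kernel name and any by-path link are available, which is the situation described next.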
--- Additional comment from Tom Dale on 2022-02-09 17:55:59 UTC ---

In the environment I've been using (zKVM with a qcow2 disk), my disk is only mapped in /dev/disk/by-path, not by-id. I'll continue to look for other solutions.

```
udevadm info /dev/vdb
P: /devices/css0/0.0.0002/0.0.0004/virtio2/block/vdb
N: vdb
S: disk/by-path/ccw-0.0.0004
E: DEVLINKS=/dev/disk/by-path/ccw-0.0.0004
E: DEVNAME=/dev/vdb
E: DEVPATH=/devices/css0/0.0.0002/0.0.0004/virtio2/block/vdb
E: DEVTYPE=disk
E: ID_PATH=ccw-0.0.0004
E: ID_PATH_TAG=ccw-0_0_0004
E: MAJOR=252
E: MINOR=16
E: SUBSYSTEM=block
```

--- Additional comment from Tom Dale on 2022-02-11 16:17:32 UTC ---

I tested on zVM clusters that do have device links in /dev/disk/by-id, and those work as expected with LSO. On zKVM, using either qcow or virtual device passthrough, neither gives by-id links. I've also tried using /dev/disk/by-partuuid for the LocalVolume devicePath, but the same problem occurs as with by-path.

While we wait for a fix, could we add documentation that only "by-id" is supported? Currently, in https://docs.openshift.com/container-platform/4.9/storage/persistent_storage/persistent-storage-local.html#local-volume-cr_persistent-storage-local, the docs say "local disks filepath to the LocalVolume resource, such as /dev/disk/by-id/wwn". Can we add a note that only "by-id" works? Should I create a separate bugzilla against the documentation?

--- Additional comment from Jan Safranek on 2022-03-01 15:22:29 UTC ---

LSO should use the device name from /dev/disk/by-id if it's available. If not, then the device name from the LocalVolume CR should be used, not /dev/sdX.

--- Additional comment from Jan Safranek on 2022-03-04 10:19:51 UTC ---

--- Additional comment from OpenShift Automated Release Tooling on 2022-03-07 19:50:25 UTC ---

Elliott changed bug status from MODIFIED to ON_QA. This bug is expected to ship in the next 4.11 release.

--- Additional comment from Chao Yang on 2022-03-09 14:16:42 UTC ---

Hi @hekumar, I am trying to verify this bug with the steps below. I could not get device path links here, but using the device name may not be correct. Could you give some suggestions?

1. Run `udevadm control -s` to stop udev from executing events.
2. Attach a volume:
```
udevadm info /dev/nvme2n1
P: /devices/pci0000:00/0000:00:1e.0/nvme/nvme2/nvme2n1
N: nvme2n1
E: DEVNAME=/dev/nvme2n1
E: DEVPATH=/devices/pci0000:00/0000:00:1e.0/nvme/nvme2/nvme2n1
E: DEVTYPE=disk
E: MAJOR=259
E: MINOR=6
E: SUBSYSTEM=block
```
3. Create a LocalVolume with the device name:
```
oc get localvolume example -o json | jq .spec
{
  "logLevel": "Normal",
  "managementState": "Managed",
  "storageClassDevices": [
    {
      "devicePaths": [
        "/dev/nvme1n1"
      ],
      "fsType": "ext4",
      "storageClassName": "test1",
      "volumeMode": "Filesystem"
    },
    {
      "devicePaths": [
        "/dev/nvme2n1"
      ],
      "fsType": "ext4",
      "storageClassName": "test2",
      "volumeMode": "Filesystem"
    }
  ]
}
```
4. Check the PV path:
```
oc get pv/local-pv-f420c11b -o yaml | grep path
    path: /mnt/local-storage/test2/nvme2n1
```

--- Additional comment from Hemant Kumar on 2022-03-09 15:07:40 UTC ---

I tested this by creating LVM volumes and disabling the udev rule that creates disk IDs for LVM volumes. I think the rule in question is /lib/udev/rules.d/13-dm-disk.rules; you can copy it to the /etc/udev/rules.d folder and modify it.

Tom Dale - Can you verify the https://github.com/openshift/local-storage-operator/pull/328 fix in your environment, btw? You should be able to build an image using https://github.com/openshift/local-storage-operator/blob/master/hack/sync_bundle.
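(A sketch of that rule-masking approach, for anyone reproducing the test. The file name comes from the comment above; the exact lines to comment out depend on your udev version, so treat the edit step as illustrative.)
```
# Copy the rule so the version in /etc/udev/rules.d overrides the stock one,
# then edit the copy to disable the by-id symlink generation for DM devices.
cp /lib/udev/rules.d/13-dm-disk.rules /etc/udev/rules.d/13-dm-disk.rules
vi /etc/udev/rules.d/13-dm-disk.rules   # comment out the by-id SYMLINK lines

# Make udev pick up the change and re-process existing block devices.
udevadm control --reload-rules
udevadm trigger --subsystem-match=block
```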
--- Additional comment from Chao Yang on 2022-03-10 02:24:27 UTC ---

1. Attach volume nvme1n1.
2. Check /dev/disk/by-id:
```
lrwxrwxrwx. 1 root root 13 Mar  9 03:24 nvme-nvme.1d0f-766f6c3038396438336233646338656165393135-416d617a6f6e20456c617374696320426c6f636b2053746f7265-00000001 -> ../../nvme1n1
lrwxrwxrwx. 1 root root 13 Mar  9 03:24 nvme-Amazon_Elastic_Block_Store_vol089d83b3dc8eae915 -> ../../nvme1n1
```
3. Run `udevadm control -s`.
4. Delete the symlinks in /dev/disk/by-id for volume /dev/nvme1n1.
5. Check the by-path link:
```
ls -lrt /dev/disk/by-path | grep nvme1n1
lrwxrwxrwx. 1 root root 13 Mar  9 03:24 pci-0000:00:1f.0-nvme-1 -> ../../nvme1n1
```
6. Create a LocalVolume with the device path:
```
oc get localvolume example -o json | jq .spec
{
  "logLevel": "Normal",
  "managementState": "Managed",
  "storageClassDevices": [
    {
      "devicePaths": [
        "/dev/disk/by-path/pci-0000:00:1f.0-nvme-1"
      ],
      "fsType": "ext4",
      "storageClassName": "foobar",
      "volumeMode": "Filesystem"
    }
  ]
}
```
7. Check the PV path:
```
oc get pv -o yaml | grep path
    path: /mnt/local-storage/foobar/pci-0000:00:1f.0-nvme-1
```
8. Check the symlink on the node:
```
ls -lrt /mnt/local-storage/foobar/
total 0
lrwxrwxrwx. 1 root root 41 Mar 10 01:58 pci-0000:00:1f.0-nvme-1 -> /dev/disk/by-path/pci-0000:00:1f.0-nvme-1

udevadm info /dev/nvme1n1
P: /devices/pci0000:00/0000:00:1f.0/nvme/nvme1/nvme1n1
N: nvme1n1
S: disk/by-id/nvme-Amazon_Elastic_Block_Store_vol089d83b3dc8eae915
S: disk/by-id/nvme-nvme.1d0f-766f6c3038396438336233646338656165393135-416d617a6f6e20456c617374696320426c6f636b2053746f7265-00000001
S: disk/by-path/pci-0000:00:1f.0-nvme-1
E: DEVLINKS=/dev/disk/by-path/pci-0000:00:1f.0-nvme-1 /dev/disk/by-id/nvme-nvme.1d0f-766f6c3038396438336233646338656165393135-416d617a6f6e20456c617374696320426c6f636b2053746f7265-00000001 /dev/disk/by-id/nvme-Amazon_Elastic_Block_Store_vol089d83b3dc8eae915
E: DEVNAME=/dev/nvme1n1
E: DEVPATH=/devices/pci0000:00/0000:00:1f.0/nvme/nvme1/nvme1n1
E: DEVTYPE=disk
E: ID_MODEL=Amazon Elastic Block Store
E: ID_PATH=pci-0000:00:1f.0-nvme-1
E: ID_PATH_TAG=pci-0000_00_1f_0-nvme-1
E: ID_SERIAL=Amazon Elastic Block Store_vol089d83b3dc8eae915
E: ID_SERIAL_SHORT=vol089d83b3dc8eae915
E: ID_WWN=nvme.1d0f-766f6c3038396438336233646338656165393135-416d617a6f6e20456c617374696320426c6f636b2053746f7265-00000001
E: ID_WWN_WITH_EXTENSION=nvme.1d0f-766f6c3038396438336233646338656165393135-416d617a6f6e20456c617374696320426c6f636b2053746f7265-00000001
E: MAJOR=259
E: MINOR=5
E: SUBSYSTEM=block
E: TAGS=:systemd:
E: USEC_INITIALIZED=2743697481
```

Tested with local-storage-operator.4.11.0-202203071904.

@hekumar, can you help double-confirm whether this approach is correct?

--- Additional comment from Hemant Kumar on 2022-03-10 10:59:33 UTC ---

Thanks for testing that. This seems correct.
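(One extra sanity check worth running after step 8, sketched with the paths from the verification above; `readlink -f` just resolves each symlink chain to the real device node.)
```
# Both symlinks should resolve to the same kernel device node, showing
# the provisioned mount point tracks the stable by-path link.
readlink -f /mnt/local-storage/foobar/pci-0000:00:1f.0-nvme-1
readlink -f /dev/disk/by-path/pci-0000:00:1f.0-nvme-1
# Expected: both print /dev/nvme1n1 (the current kernel name).
```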
--- Additional comment from Red Hat Bugzilla on 2022-05-05 07:47:22 UTC ---

remove performed by PnT Account Manager <pnt-expunge>

--- Additional comment from errata-xmlrpc on 2022-06-15 17:49:34 UTC ---

This bug has been added to advisory RHEA-2022:5069 by OpenShift Release Team Bot (ocp-build/buildvm.openshift.eng.bos.redhat.com).

Verification passed with local-storage-operator.4.9.0-202207130116.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Important: OpenShift Container Platform 4.9.45 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5879