Description of problem: On many systems, especially in VMs, the /dev/sdX names can change across reboots. Once Heketi initializes the devices, it should also fetch a stable device-node name (i.e. one under /dev/disk/by-id/) and persist it in the database. All low-level operations that interact with device nodes directly should use the stable device-node names, to prevent issues when the friendly names are assigned dynamically.
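For illustration only, a minimal sketch of how a friendly name can be mapped to its stable aliases by following the symlinks under /dev/disk/by-id/. This is not the Heketi implementation; the function name stable_aliases and its link_dir parameter are hypothetical.

```python
import os


def stable_aliases(device, link_dir="/dev/disk/by-id"):
    """Return the symlinks in link_dir that resolve to the same node as device.

    Hypothetical helper for illustration; not part of Heketi.
    """
    target = os.path.realpath(device)
    try:
        entries = sorted(os.listdir(link_dir))
    except FileNotFoundError:
        # The directory is absent when udev created no links of this kind.
        return []
    aliases = []
    for name in entries:
        link = os.path.join(link_dir, name)
        # Keep only symlinks whose resolved target is the requested device.
        if os.path.islink(link) and os.path.realpath(link) == target:
            aliases.append(link)
    return aliases


if __name__ == "__main__":
    # /dev/sdb is only an example; the list is empty if the device has no alias.
    print(stable_aliases("/dev/sdb"))
```

Persisting one of the returned aliases rather than /dev/sdX is the idea behind this request: the alias keeps pointing at the same disk even when the kernel hands out a different sdX name on the next boot.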
RFE is out of scope for the short 3.11.0 timeframe. Moving out. Niels: feel free to disagree and propose to include, if you can/want to fix it.
Initial patches posted at https://github.com/heketi/heketi/pull/1568
We have bz 1569317 which is nearly a duplicate of this, and the same patches will apply to both. Any value in keeping both bzs around or should we simply mark it a duplicate?
*** Bug 1569317 has been marked as a duplicate of this bug. ***
Closed bug 1569317 as a duplicate of this bug.
Verification steps:
This needs to be verified on platforms that can easily change device paths across reboots. One approach would be to verify on Hyper-V or Azure cloud instances, as these VM platforms frequently change the paths of the devices.
1. Set up a typical cluster.
2. Ensure that at least two devices exist on the node you will be testing against, and that these devices are part of the heketi topology.
3. Record the "Vg id" for these devices along with the current paths.
4. Create as many volumes as needed such that there are at least 4-5 bricks on every device, but that the devices are at least half empty.
5. Reboot the node as many times as needed until the "vg id"/path combination is different from the initial boot.
6. Verify that 'heketi-cli topology info' lists "known paths" instead of "path". The actual paths may vary.
7. Verify that the device can be removed from heketi using the commands:
   'heketi-cli device disable <device_id>'
   'heketi-cli device remove <device_id>'
   'heketi-cli device delete <device_id>'
If possible, it would be preferable to test with both converged and independent mode setups. This will test that the feature's dependencies work in both types of deployment.
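The check for a changed "vg id"/path combination after a reboot can be sketched as a simple comparison of two recorded mappings. This is only a sketch for testers: the helper name changed_paths and the dictionaries are hypothetical, and the values would be recorded by hand before and after the reboot.

```python
def changed_paths(before, after):
    """Return {vg_id: (old_path, new_path)} for devices whose path changed.

    before/after are hand-recorded mappings of vg id to device path
    (hypothetical data, not produced by any heketi command here).
    """
    changed = {}
    for vg_id, old_path in before.items():
        new_path = after.get(vg_id)
        # Ignore devices missing after the reboot; report only real moves.
        if new_path is not None and new_path != old_path:
            changed[vg_id] = (old_path, new_path)
    return changed


if __name__ == "__main__":
    # Placeholder ids and paths, for illustration only.
    before = {"vg-id-A": "/dev/sdb", "vg-id-B": "/dev/sdc"}
    after = {"vg-id-A": "/dev/sdc", "vg-id-B": "/dev/sdb"}
    for vg_id, (old, new) in changed_paths(before, after).items():
        print(f"{vg_id}: {old} -> {new}")
```

An empty result means the node came back with the same paths and needs another reboot before the remaining verification steps are meaningful.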
Hello John, From the doc text I read that "The output of some commands have been updated to reflect the additional metadata when available." Can you please tell us which commands were updated, so that we can make sure validation covers those as well? Thanks, kasturi
1. Topology info: Looks something like:
    Devices:
        Id:88dcfbc3aac263e12cf0f3d5cdaeadd4   State:online   Size (GiB):499   Used (GiB):0   Free (GiB):499
            Known Paths: /dev/disk/by-path/virtio-pci-0000:00:04.0 /dev/vdb
            Bricks:
        Id:93458e796be5556943421e7f15c74916   State:online   Size (GiB):499   Used (GiB):0   Free (GiB):499
            Known Paths: /dev/disk/by-path/virtio-pci-0000:00:05.0 /dev/vdc
Instead of listing *the* name/path of the device we list the last known /dev/ paths we were able to auto-detect. Note that none of these paths are strictly necessary with the changes in place. They are just informational.
2. Device info:
    ./heketi-cli device info 88dcfbc3aac263e12cf0f3d5cdaeadd4
    Device Id: 88dcfbc3aac263e12cf0f3d5cdaeadd4
    State: online
    Size (GiB): 499
    Used (GiB): 0
    Free (GiB): 499
    Create Path: /dev/vdb
    Physical Volume UUID: uu6s4R-6hnS-YY45-OBRl-gPfr-v6Wi-rFQoJc
    Known Paths: /dev/disk/by-path/virtio-pci-0000:00:04.0 /dev/vdb
    Bricks:
Similar to topology info above, but since we have more screen space here we also list the PV UUID. FYI, older upgraded systems will not always have a UUID. The path used to create the device is also shown.
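For validation scripts, the "Known Paths" field can be pulled out of the text output with a few lines of Python. This is a rough sketch that assumes the CLI prints each field on its own line, with the paths separated by whitespace as in the sample in this comment; the helper name known_paths is hypothetical.

```python
def known_paths(cli_output):
    """Collect every path listed on 'Known Paths:' lines of the given text.

    Assumes one field per line, as in the sample output in this comment.
    """
    prefix = "Known Paths:"
    paths = []
    for line in cli_output.splitlines():
        stripped = line.strip()
        if stripped.startswith(prefix):
            # Paths contain colons, so cut off the prefix by length
            # instead of splitting the line on ':'.
            paths.extend(stripped[len(prefix):].split())
    return paths


if __name__ == "__main__":
    sample = (
        "Device Id: 88dcfbc3aac263e12cf0f3d5cdaeadd4\n"
        "Create Path: /dev/vdb\n"
        "Known Paths: /dev/disk/by-path/virtio-pci-0000:00:04.0 /dev/vdb\n"
    )
    print(known_paths(sample))
```

Since the paths are informational only, a test should check that the field is present and non-empty rather than compare against fixed path values.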
Hello John, I have asked Rachael to port the test cases to Polarion and share them with you, which would be clearer and more user-friendly. She is currently working on it. Rachael, can you please update the bug with the info once you are done? Thanks, kasturi
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2019:3255