Bug 1609553 - [RFE] Automatically map /dev paths to stable /dev/disk/by-* paths for long term use
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: heketi
Version: cns-3.9
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: OCS 3.11.z Batch Update 4
Assignee: John Mulligan
QA Contact: Rachael
URL:
Whiteboard:
Duplicates: 1569317
Depends On: 1707789 1725798
Blocks: 1622458
 
Reported: 2018-07-29 11:43 UTC by Niels de Vos
Modified: 2022-03-13 15:18 UTC (History)

Fixed In Version: heketi-9.0.0-2.el7rhgs
Doc Type: Enhancement
Doc Text:
With this update, Heketi tracks additional metadata associated with disk devices even if the path of the device changes. The outputs of some commands have been updated to reflect the additional metadata.
Clone Of:
Environment:
Last Closed: 2019-10-30 12:34:04 UTC
Embargoed:
knarra: needinfo-




Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | heketi heketi issues 1371 | 0 | None | closed | Identify block devices by a stable UUID, not boot dependent /dev/sdX | 2020-11-04 14:45:11 UTC
Red Hat Product Errata | RHSA-2019:3255 | 0 | None | None | None | 2019-10-30 12:34:25 UTC

Description Niels de Vos 2018-07-29 11:43:10 UTC
Description of problem:

On many systems, especially in VMs, the /dev/sdX names can change across reboots. Once Heketi initializes the devices, it should also fetch a stable device-node name (e.g. one under /dev/disk/by-id/) and persist it in the database.

All low-level operations that interact with device-nodes directly should use the stable device-node names in order to prevent issues when friendly-names get assigned more dynamically.
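
The lookup requested here can be illustrated with a minimal sketch. It is not Heketi's implementation (Heketi is written in Go, and the actual change is in the upstream pull request); it only assumes the usual udev layout, where each entry under /dev/disk/by-*/ is a symlink to the current kernel device node. The function name `stable_aliases` and its `by_root` parameter are hypothetical, chosen for illustration.

```python
import os
from pathlib import Path


def stable_aliases(device, by_root="/dev/disk"):
    """Return the sorted by-*/ symlinks under by_root that resolve to device.

    Hypothetical helper: assumes the udev convention that every entry in
    /dev/disk/by-id, by-path, by-uuid, ... is a symlink to a kernel name.
    """
    target = os.path.realpath(device)
    aliases = []
    root = Path(by_root)
    if not root.is_dir():
        return aliases
    for subdir in sorted(root.glob("by-*")):
        if not subdir.is_dir():
            continue
        for link in sorted(subdir.iterdir()):
            # Keep only symlinks whose resolved target is the requested device.
            if link.is_symlink() and os.path.realpath(link) == target:
                aliases.append(str(link))
    return aliases


if __name__ == "__main__":
    # /dev/sdb is only an example; the output depends on the host.
    for alias in stable_aliases("/dev/sdb"):
        print(alias)
```

Persisting one of the returned aliases next to the friendly name would let later operations re-resolve the device even after /dev/sdX has been reassigned.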

Comment 2 Michael Adam 2018-09-19 21:52:22 UTC
RFE is out of scope for the short 3.11.0 timeframe.

Moving out.

Niels: feel free to disagree and propose to include, if you can/want to fix it.

Comment 21 John Mulligan 2019-04-16 13:49:13 UTC
Initial patches posted at https://github.com/heketi/heketi/pull/1568

Comment 22 John Mulligan 2019-04-16 13:51:03 UTC
We have bz 1569317, which is nearly a duplicate of this one, and the same patches will apply to both. Is there any value in keeping both bzs around, or should we simply mark it as a duplicate?

Comment 23 Raghavendra Talur 2019-04-16 18:22:17 UTC
*** Bug 1569317 has been marked as a duplicate of this bug. ***

Comment 24 Raghavendra Talur 2019-04-16 18:23:10 UTC
Closed bug 1569317 as a duplicate of this bug.

Comment 28 John Mulligan 2019-06-21 18:49:54 UTC
Verification steps:

This needs to be verified on platforms that can easily change device paths across reboots. One approach would be to verify on Hyper-V or Azure cloud instances, as these VM platforms frequently change the paths of the devices.


Set up a typical cluster. Ensure that at least two devices exist on the node you will be testing against. Ensure these devices are part of the heketi topology. Record the "Vg id" for these devices along with the current paths.
Create as many volumes as needed so that there are at least 4-5 bricks on every device while the devices remain at least half empty.

Reboot the node as many times as needed until the "vg id" path combination is different from the initial boot.


Verify that 'heketi-cli topology info' lists "known paths" instead of "path". The actual paths may vary.


Verify that the device can be removed from heketi using the commands
'heketi-cli device disable <device_id>'
'heketi-cli device remove <device_id>'
'heketi-cli device delete <device_id>'


If possible it would be preferable to test with both converged and independent mode setups. This will test that the feature's dependencies work in both types of deployment.
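
The reboot check in the steps above ("vg id" and path combination differs from the initial boot) can be sketched as a comparison of two recorded mappings. This is only an illustration: the function name `changed_paths` and the VG names in the usage example are hypothetical placeholders, and how the mappings are collected on the node is left open.

```python
def changed_paths(before, after):
    """Return {vg_id: (old_path, new_path)} for devices whose path moved.

    Both arguments map a recorded VG id to the device path seen at that boot.
    VG ids missing from either mapping are ignored.
    """
    return {
        vg_id: (before[vg_id], after[vg_id])
        for vg_id in before
        if vg_id in after and before[vg_id] != after[vg_id]
    }


if __name__ == "__main__":
    # Placeholder values, not taken from a real cluster.
    first_boot = {"vg_aaaa": "/dev/sdb", "vg_bbbb": "/dev/sdc"}
    second_boot = {"vg_aaaa": "/dev/sdc", "vg_bbbb": "/dev/sdb"}
    moved = changed_paths(first_boot, second_boot)
    print("paths changed" if moved else "reboot again", moved)
```

An empty result means the node came back with the same paths and needs another reboot before the remaining verification steps are meaningful.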

Comment 34 RamaKasturi 2019-07-01 07:59:51 UTC
Hello John,

   From the doc text I read that "The output of some commands have been updated to reflect the additional metadata when available." Can you please tell us which commands were updated so that we can make sure that validation covers those as well?

Thanks
kasturi

Comment 35 John Mulligan 2019-07-03 15:05:17 UTC
1. Topology info:
Looks something like:
        Devices:
                Id:88dcfbc3aac263e12cf0f3d5cdaeadd4   State:online    Size (GiB):499     Used (GiB):0       Free (GiB):499     
                        Known Paths: /dev/disk/by-path/virtio-pci-0000:00:04.0 /dev/vdb

                        Bricks:
                Id:93458e796be5556943421e7f15c74916   State:online    Size (GiB):499     Used (GiB):0       Free (GiB):499     
                        Known Paths: /dev/disk/by-path/virtio-pci-0000:00:05.0 /dev/vdc

Instead of listing *the* name/path of the device, we list the last known /dev/ paths we were able to auto-detect. Note that none of these paths are strictly necessary with the changes in place. They are just informational.
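
For validation scripts, the "Known Paths" lines in the output above can be collected per device with a small parser. This is a sketch that assumes the text layout shown in this comment (a device line carrying both "Id:" and "State:", followed by a "Known Paths:" line); the full topology output has other sections that this does not attempt to handle, and the function name `known_paths` is hypothetical.

```python
import re

# Device lines in the sample carry both an Id and a State field.
DEVICE_RE = re.compile(r"^\s*Id:([0-9a-f]+)\s+State:(\S+)")
PATHS_RE = re.compile(r"^\s*Known Paths:\s*(.*)$")


def known_paths(topology_text):
    """Map each device id to the list of known paths printed beneath it."""
    result = {}
    current = None
    for line in topology_text.splitlines():
        match = DEVICE_RE.match(line)
        if match:
            current = match.group(1)
            result[current] = []
            continue
        match = PATHS_RE.match(line)
        if match and current is not None:
            # Paths are separated by whitespace on a single line.
            result[current] = match.group(1).split()
    return result
```

A test can then assert that each device reports at least one path without depending on which path it is, since the actual paths may vary between boots.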

2. Device info:
./heketi-cli device info 88dcfbc3aac263e12cf0f3d5cdaeadd4
Device Id: 88dcfbc3aac263e12cf0f3d5cdaeadd4
State: online
Size (GiB): 499
Used (GiB): 0
Free (GiB): 499
Create Path: /dev/vdb
Physical Volume UUID: uu6s4R-6hnS-YY45-OBRl-gPfr-v6Wi-rFQoJc
Known Paths: /dev/disk/by-path/virtio-pci-0000:00:04.0 /dev/vdb
Bricks:

This is similar to topology info above, but since we have more screen space here we also list the PV UUID. Note that older upgraded systems will not always have a UUID. The path used to create the device is also shown.
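
A check against the device info output can be sketched the same way. The parser below assumes the "Key: value" layout shown in this comment and treats the Physical Volume UUID as optional, since older upgraded systems may not report one. The function name `parse_device_info` is hypothetical.

```python
def parse_device_info(text):
    """Parse 'Key: value' lines from device info output into a dict.

    'Known Paths' is returned as a list; a missing 'Physical Volume UUID'
    is reported as None rather than treated as an error.
    """
    info = {}
    for line in text.splitlines():
        # Split on the first colon only, because paths may contain colons.
        key, sep, value = line.partition(":")
        if not sep:
            continue
        info[key.strip()] = value.strip()
    info["Known Paths"] = info.get("Known Paths", "").split()
    info["Physical Volume UUID"] = info.get("Physical Volume UUID") or None
    return info
```

Splitting on the first colon matters here because by-path names such as virtio-pci-0000:00:04.0 contain colons themselves.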

Comment 38 RamaKasturi 2019-07-12 06:37:47 UTC
Hello John,

   I have asked Rachael to port the test cases to Polarion and share them with you, which would be more user-friendly and clear. She is currently working on this.
Rachael, can you please update the bug with the info once you are done?

Thanks
kasturi

Comment 50 errata-xmlrpc 2019-10-30 12:34:04 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2019:3255

