Description of problem:

Additional NVMe disks are not present in CoreOS (COS) until the ODF node is rebooted.

Additional info:

When adding an additional NVMe disk to a deployed ODF storage node as part of a scale-up procedure using the local storage operator[0], the additional NVMe disk is not visible in the list of block devices until the node has been rebooted. This presents the following challenges:

1. The disk discovery feature does not detect newly added NVMe devices until the host is rebooted.
2. The reboot of an ODF node could be considered a disruptive operation in production environments without properly cordoning and draining the storage node prior to the reboot.

COS does have the rescan-scsi-bus.sh script from the sg3_utils RPM, but it does not work for NVMe devices[1], so we're opening this BZ to see if there is an undocumented way to rescan for additional NVMe disks on a running ODF COS node without requiring a reboot, or whether we should pursue an RFE to add the "nvme-cli" utility to COS.

Version-Release number of selected component (if applicable):

OCP/OCS/ODF 4.8

---
NAME="Red Hat Enterprise Linux CoreOS"
VERSION="48.84.202112212304-0"
ID="rhcos"
ID_LIKE="rhel fedora"
VERSION_ID="4.8"
PLATFORM_ID="platform:el8"
PRETTY_NAME="Red Hat Enterprise Linux CoreOS 48.84.202112212304-0 (Ootpa)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:redhat:enterprise_linux:8::coreos"
HOME_URL="https://www.redhat.com/"
DOCUMENTATION_URL="https://docs.openshift.com/container-platform/4.8/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"
REDHAT_BUGZILLA_PRODUCT="OpenShift Container Platform"
REDHAT_BUGZILLA_PRODUCT_VERSION="4.8"
REDHAT_SUPPORT_PRODUCT="OpenShift Container Platform"
REDHAT_SUPPORT_PRODUCT_VERSION="4.8"
OPENSHIFT_VERSION="4.8"
RHEL_VERSION="8.4"
OSTREE_VERSION='48.84.202112212304-0'
---

Infrastructure: VMware vSphere

How reproducible: Always

Steps to Reproduce:
1. Attach an NVMe device to an existing ODF node in vSphere
2. oc debug node/ or ssh to the node
3. Run lsblk

Actual results:
The NVMe device is not present in the OS block device list until the VM is rebooted.

Expected results:
The added NVMe device is visible to the OS without requiring a reboot.

[0] https://access.redhat.com/documentation/en-us/red_hat_openshift_container_storage/4.8/html/scaling_storage/scaling-up-storage-capacity_rhocs#scaling-up-storage-by-adding-capacity-to-your-openshift-container-storage-nodes-using-local-storage-devices_rhocs
[1] https://access.redhat.com/solutions/5317381
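For reference, a possible no-reboot workaround, under the assumption that vSphere surfaces the hot-added disk as a new PCI device: the mainline Linux kernel exposes the generic sysfs attributes /sys/bus/pci/rescan (rescan the PCI bus) and /sys/class/nvme/<ctrl>/rescan_controller (rescan a controller's namespaces, the same action nvme-cli performs with `nvme ns-rescan`), neither of which requires nvme-cli to be installed. This is an untested sketch, not a confirmed RHCOS procedure; by default it only prints the writes it would perform, and only performs them when invoked with "apply" as root.

```shell
#!/bin/sh
# Sketch: trigger an NVMe rescan via standard Linux sysfs attributes.
# Untested on RHCOS; whether it helps depends on how vSphere exposes
# the hot-added device. Run with "apply" (as root) to perform the writes.

apply=${1:-dry-run}

# Write $1 to sysfs path $2, or just report what would be written.
do_write() {
    if [ "$apply" = "apply" ]; then
        echo "$1" > "$2"
    else
        echo "would write $1 to $2"
    fi
}

# Rescan the PCI bus so a newly attached NVMe controller is enumerated.
do_write 1 /sys/bus/pci/rescan

# Ask each already-known NVMe controller to rescan its namespaces
# (covers the case where a namespace was added to an existing controller).
for ctrl in /sys/class/nvme/nvme*; do
    if [ -e "$ctrl/rescan_controller" ]; then
        do_write 1 "$ctrl/rescan_controller"
    fi
done
```

After a successful rescan, the new device should appear in `lsblk` output without a reboot.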
This does not seem like a bug, at least not on our end. That said, based on this line:

> COS does have the rescan-scsi-bus.sh script from the sg3_utils RPM, but it does not work for NVMe devices[1], so we're opening this BZ to see if there is an undocumented way to rescan for additional NVMe disks on a running ODF COS node without requiring a reboot, or whether we should pursue an RFE to add the "nvme-cli" utility to COS.

The answer to this would be "no". An RFE would make the most sense.

If there is a regression or known issue, it would probably be in RHCOS itself or the LSO. I don't know which component would be the appropriate target, so moving it out to ODF 4.12 now.
Hi All, Just checking in to see if anyone else has had a chance to take a look at this and what the status is.
As Jose mentioned earlier, this should come from COS or the LSO; we are not maintaining scripts that rescan devices. I am closing this BZ. Please create a Jira issue with the LSO or COS.