Bug 2107346
| Summary: | NVMe-FC Namespaces are reported under Local Standard Disks |
|---|---|
| Product: | Red Hat Enterprise Linux 9 |
| Component: | anaconda |
| Version: | 9.1 |
| Status: | CLOSED ERRATA |
| Severity: | unspecified |
| Priority: | unspecified |
| Reporter: | Marco Patalano <mpatalan> |
| Assignee: | Vendula Poncova <vponcova> |
| QA Contact: | Release Test Team <release-test-team-automation> |
| Docs Contact: | Sagar Dubewar <sdubewar> |
| CC: | emilne, gfialova, jkonecny, jmeneghi, jstodola, mlombard, nyewale, rvykydal, sbarcomb, sdubewar, tbzatek, vponcova, vslavik, vtrefny |
| Target Milestone: | rc |
| Target Release: | --- |
| Keywords: | Triaged |
| Flags: | sdubewar: needinfo-, pm-rhel: mirror+ |
| Hardware: | Unspecified |
| OS: | Unspecified |
| Whiteboard: | NVMe_Feature, NVMe_92_Feature, NVMe_P1 |
| Fixed In Version: | anaconda-34.25.2.6-1.el9 |
| Doc Type: | Technology Preview |
| Doc Text: | .NVMe over Fibre Channel devices are now available in the RHEL installation program as a Technology Preview<br>You can now add NVMe over Fibre Channel devices to your RHEL installation as a Technology Preview. In the RHEL installation program, you can select these devices under the NVMe Fabrics Devices section while adding disks on the Installation Destination screen. |
| Story Points: | --- |
| Clone Of: | |
| | 2123337 (view as bug list) |
| Environment: | |
| Last Closed: | 2023-05-09 07:35:38 UTC |
| Type: | Bug |
| Regression: | --- |
| Mount Type: | --- |
| Documentation: | --- |
| CRM: | |
| Verified Versions: | |
| Category: | --- |
| oVirt Team: | --- |
| RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- |
| Target Upstream Version: | |
| Embargoed: | |
| Bug Depends On: | 2123337 |
| Bug Blocks: | |
| Deadline: | 2023-01-10 |
| Attachments: | screenshot during install (attachment 1897189) |
Hi Marco, could you please give us the installation logs? You can find them in /tmp/*log in the installation environment.

Hello Jiri,

Please find the job below, which is a RHEL-9.1 installation on a system with attached NVMe-FC namespaces as well as FC LUNs.

https://beaker.engineering.redhat.com/jobs/6817253

Marco

Does this problem have anything to do with the blivet change that was made in bz 2073008?

Hi Vojta, could you please take a look at this issue and check whether it is a side effect of the blivet change?

Hi John, could you please write here how we can detect the NVMe-oF devices and what the impact of this issue is?

Based on my understanding, this issue is breaking the default auto partitioning when the machine has NVMe-oF devices, because we will install LVM2 over those disks but the current kernel drivers are not able to set up the devices in early boot. Am I correct?

(In reply to Jiri Konecny from comment #7)
> Hi John, could you please write here how we can detect the NVMe-oF devices
> and what is the impact of this issue?
>
> Based on my understanding, this issue is breaking the default auto
> partitioning when machine has a NVMe-oF devices, because we will install
> there LVM2 over the disks but the current kernel drivers are not able to
> set-up the devices early boot. Am I correct?

Sorry for my delayed response. Yes, the issue is: remote NVMe-oF devices are being incorrectly identified as local NVMe devices, so the default auto partitioning will include some or all of those devices in its LVM2 configuration. This results in a non-bootable LVM2 configuration, and the customer may not realize the mistake until the host is rebooted. Even when auto partitioning is not used, we don't want anything but local, direct-attached NVMe/PCI devices to appear in the list of local devices. See the screenshot in https://bugzilla.redhat.com/attachment.cgi?id=1897189.

Note: support for booting NVMe/FC and NVMe/TCP devices is being developed upstream and will eventually be enabled, but it is not supported in RHEL 9.1 yet.

As demonstrated during our meeting last week: all NVMe devices are listed under /sys/class/nvme, and NVMe-oF remote devices can be distinguished by looking in /sys/class/nvme-fabrics/ctl. For example, here's a host with only remote NVMe-oF fabric devices:

```
# ls /sys/class/nvme
nvme0  nvme1  nvme10  nvme11  nvme12  nvme13  nvme3  nvme8  nvme9
# ls /sys/class/nvme-fabrics/ctl
nvme0  nvme1  nvme10  nvme11  nvme12  nvme13  nvme3  nvme8  nvme9  power  subsystem  uevent
```

Here's another example of a host with only locally attached NVMe/PCI devices:

```
# ls /sys/class/nvme
nvme0  nvme1
# ls /sys/class/nvme-fabrics/
ls: cannot access '/sys/class/nvme-fabrics/': No such file or directory
```

Another way to look at this is to simply examine all of the symlinks in /sys/class/nvme. E.g.:

```
# ls -al /sys/class/nvme
total 0
drwxr-xr-x.  2 root root 0 Aug 15 04:23 .
drwxr-xr-x. 74 root root 0 Aug 15 04:23 ..
lrwxrwxrwx.  1 root root 0 Aug 15 08:23 nvme0 -> ../../devices/virtual/nvme-fabrics/ctl/nvme0
lrwxrwxrwx.  1 root root 0 Aug 15 08:23 nvme1 -> ../../devices/virtual/nvme-fabrics/ctl/nvme1
lrwxrwxrwx.  1 root root 0 Aug 15 08:23 nvme10 -> ../../devices/virtual/nvme-fabrics/ctl/nvme10
lrwxrwxrwx.  1 root root 0 Aug 15 08:23 nvme11 -> ../../devices/virtual/nvme-fabrics/ctl/nvme11
lrwxrwxrwx.  1 root root 0 Aug 15 08:23 nvme12 -> ../../devices/virtual/nvme-fabrics/ctl/nvme12
lrwxrwxrwx.  1 root root 0 Aug 15 08:23 nvme13 -> ../../devices/virtual/nvme-fabrics/ctl/nvme13
lrwxrwxrwx.  1 root root 0 Aug 15 08:47 nvme3 -> ../../devices/virtual/nvme-fabrics/ctl/nvme3
lrwxrwxrwx.  1 root root 0 Aug 15 08:23 nvme8 -> ../../devices/virtual/nvme-fabrics/ctl/nvme8
lrwxrwxrwx.  1 root root 0 Aug 15 08:23 nvme9 -> ../../devices/virtual/nvme-fabrics/ctl/nvme9

# ls -al /sys/class/nvme
total 0
drwxr-xr-x.  2 root root 0 Jul 27 20:23 .
drwxr-xr-x. 65 root root 0 Jul 27 20:23 ..
lrwxrwxrwx.  1 root root 0 Jul 28 00:23 nvme0 -> ../../devices/pci0000:00/0000:00:1b.0/0000:02:00.0/nvme/nvme0
lrwxrwxrwx.  1 root root 0 Jul 28 00:23 nvme1 -> ../../devices/pci0000:00/0000:00:1d.0/0000:55:00.0/nvme/nvme1
```

All remote NVMe devices are nvme-fabrics devices, while local NVMe devices are connected directly to the PCI bus.

Hope this helps.
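The check described above can be scripted directly against sysfs. Below is a minimal sketch (not the actual anaconda/blivet fix) that classifies each NVMe controller as local or fabric-attached by resolving its /sys/class/nvme symlink; the function name and output format are illustrative, and it assumes the symlink layout shown in the listings above.

```python
#!/usr/bin/python3
# Minimal sketch (not the actual anaconda/blivet fix): classify NVMe
# controllers as local (PCI-attached) or fabric-attached by resolving the
# /sys/class/nvme symlinks, per the listings in the comment above.
import os

NVME_CLASS = "/sys/class/nvme"


def classify_nvme_controllers():
    """Map controller names (nvme0, nvme1, ...) to 'local' or 'fabric'."""
    result = {}
    if not os.path.isdir(NVME_CLASS):
        return result  # no NVMe controllers at all
    for name in sorted(os.listdir(NVME_CLASS)):
        # Fabric controllers resolve under .../virtual/nvme-fabrics/ctl/,
        # local ones under a PCI device path.
        target = os.path.realpath(os.path.join(NVME_CLASS, name))
        result[name] = "fabric" if "/nvme-fabrics/" in target else "local"
    return result


if __name__ == "__main__":
    for ctrl, kind in sorted(classify_nvme_controllers().items()):
        print(f"{ctrl}: {kind}")
```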
Thanks a lot for all the info, John. In that case we will look into the possibility of hotfixing this issue in RHEL 9.1 to avoid confusion for customers.

The hotfix for 9.1 (until NVMe-oF devices are recognized completely and able to boot) will be to hide these devices from the installation. Users can still use these disks after the installation, which shouldn't be an issue because they can't be used for `/` or `/boot` anyway.

Everything is clear for classifying NVMe controller devices or determining the transport used. However, the filter needs to run on NVMe namespaces (the blivet device enumeration is a udev query for subsystem="block"). From the udev device hierarchy, there doesn't seem to be a direct link or an easy way to find the corresponding NVMe controllers providing the namespace in question.

For RHEL 9.2 we have working code that uses libnvme to crawl through the sysfs hierarchy and backtrack through the associated nvme-subsystem to the controllers. That is not ready for RHEL 9.1 due to complexity and related dependencies (libblockdev-3.0).

As a temporary workaround for RHEL 9.1 we could perhaps go through the "<namespace device sysfs path>/device/nvme*" links and read their targets:

```
[anaconda root@storageqe-04 ~]# udevadm info --query=path /dev/nvme2n1
/devices/virtual/nvme-subsystem/nvme-subsys2/nvme2n1
[anaconda root@storageqe-04 ~]# readlink /sys/`udevadm info --query=path /dev/nvme2n1`/device/nvme[0-9]*
../../nvme-fabrics/ctl/nvme3
../../nvme-fabrics/ctl/nvme4
../../nvme-fabrics/ctl/nvme5
../../nvme-fabrics/ctl/nvme6
```

However, the consensus of today's (internal) discussion was to postpone all of this to 9.2.
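For illustration only, the readlink idea above can be sketched in Python. This is a rough illustration of the proposed temporary workaround, not the code that eventually landed (which uses libnvme); the helper names are hypothetical, and it assumes the namespace sysfs path is obtained as in the udevadm example (the udev device path prefixed with /sys).

```python
#!/usr/bin/python3
# Rough sketch of the proposed RHEL 9.1 workaround (illustrative only; the
# fix that landed uses libnvme instead): given an NVMe namespace's sysfs
# path, resolve the "<namespace>/device/nvme*" controller links and check
# whether any of them points into the nvme-fabrics hierarchy.
import glob
import os


def controllers_for_namespace(ns_sysfs_path):
    """Return resolved controller paths for a namespace sysfs path,
    e.g. /sys/devices/virtual/nvme-subsystem/nvme-subsys2/nvme2n1."""
    pattern = os.path.join(ns_sysfs_path, "device", "nvme[0-9]*")
    # Only follow symlinks, mirroring the readlink example above; the
    # namespace's own directory also matches the glob but is not a link.
    return [os.path.realpath(p) for p in glob.glob(pattern) if os.path.islink(p)]


def namespace_is_fabric(ns_sysfs_path):
    """True if the namespace is reached through a fabric-attached controller."""
    return any("/nvme-fabrics/" in c for c in controllers_for_namespace(ns_sysfs_path))


if __name__ == "__main__":
    # Example path taken from the udevadm output above; in practice it would
    # come from the udev block-device enumeration ("/sys" + udev device path).
    path = "/sys/devices/virtual/nvme-subsystem/nvme-subsys2/nvme2n1"
    print(namespace_is_fabric(path))
```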
Draft PR (not moving to POST because it is not finished yet): https://github.com/rhinstaller/anaconda/pull/4423

Checked that anaconda-34.25.2.6-1.el9 is in nightly compose RHEL-9.2.0-20230201.12. Moving to VERIFIED.

*** Bug 2126300 has been marked as a duplicate of this bug. ***

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (anaconda bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:2223

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days.

Created attachment 1897189 [details]: screenshot during install

Description of problem:
Beginning in RHEL-9.1, NVMe over Fibre Channel disks are visible during the installation of the OS. What is confusing is that the NVMe-FC disks show up under Local Standard Disks in the GUI when selecting the Installation Destination. This is very different from what we see during installation on a system with attached Fibre Channel LUNs (non-NVMe); in that case they appear under Specialized and Network Disks, which makes more sense. Please see the attached screenshot, which is from a system with both NVMe-FC namespaces and FC LUNs attached.

Version-Release number of selected component (if applicable):
anaconda 34.25.1.8-1.el9

How reproducible:
Often

Steps to Reproduce:
1. Provision system with NVMe-FC namespaces attached using RHEL-9.1