Bug 1801432

Summary: NVMe Auto-connect Script fails to connect to NVMe namespaces at bootup
Product: Red Hat Enterprise Linux 8 Reporter: Marco Patalano <mpatalan>
Component: nvme-cli Assignee: David Milburn <dmilburn>
Status: CLOSED CURRENTRELEASE QA Contact: Marco Patalano <mpatalan>
Severity: high Docs Contact:
Priority: high    
Version: 8.2 CC: dmilburn, emilne, gcase, james.smart, jwboyer, lmiksik, marting, matt.schulte, ng-redhat-bugzilla, ricky.armas
Target Milestone: rc   
Target Release: 8.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: nvme-cli-1.9-5.el8 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2020-03-31 21:24:02 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Marco Patalano 2020-02-10 21:09:21 UTC
Description of problem: Using the following nightly compose, I deployed RHEL-8.2: 
RHEL-8.2.0-20200207.n.0

# uname -r
4.18.0-176.el8.x86_64

I then installed nvme-cli which includes the auto-connect scripts:

# rpm -qa nvme-cli
nvme-cli-1.9-2.el8.x86_64

After modifying the hostnqn file, I rebooted the system. Once the server was back up, I ran nvme list, which returned nothing. The status of nvmefc-boot-connections shows the following:

# systemctl status  nvmefc-boot-connections
● nvmefc-boot-connections.service - Auto-connect to subsystems on FC-NVME devices found during boot
   Loaded: loaded (/usr/lib/systemd/system/nvmefc-boot-connections.service; disabled; vendor preset: disabled)
   Active: inactive (dead)

In order to connect to the nvme namespaces, I need to do one of the following:

# systemctl -q enable nvmefc-boot-connections
# /bin/sh -c "echo add > /sys/class/fc/fc_udev_device/nvme_discovery"

or

# rmmod lpfc
# modprobe lpfc

Previously, when installing Broadcom's version of the auto-connect scripts, I did not have to do this - the NVMe namespaces were available upon bootup.
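
For anyone who needs the first workaround in script form, here is a minimal sketch. The sysfs path and the unit name are the ones quoted above; the function names and the DISCOVERY_NODE variable are made up for illustration and are not part of nvme-cli.

```shell
#!/bin/sh
# Sketch of the first workaround from the description.
# DISCOVERY_NODE defaults to the sysfs path quoted in this report; the
# variable and function names are illustrative only.
DISCOVERY_NODE="${DISCOVERY_NODE:-/sys/class/fc/fc_udev_device/nvme_discovery}"

# Write "add" to the discovery node to request FC-NVMe discovery.
# Returns non-zero if the node does not exist (for example, when the
# FC driver is not loaded).
trigger_fc_nvme_discovery() {
    node="${1:-$DISCOVERY_NODE}"
    if [ ! -e "$node" ]; then
        echo "discovery node not found: $node" >&2
        return 1
    fi
    echo add > "$node"
}

# Enable the boot-time unit, then request discovery right away.
apply_workaround() {
    systemctl -q enable nvmefc-boot-connections && trigger_fc_nvme_discovery
}
```

Running apply_workaround as root should be equivalent to the two commands listed above; nothing runs until one of the functions is called.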


Version-Release number of selected component (if applicable):
nvme-cli-1.9-2.el8.x86_64

How reproducible: Often


Steps to Reproduce:
1. Deploy RHEL-8.2
2. Install nvme-cli version 1.9-2
3. Reboot

Actual results: NVMe namespaces are not available after reboot


Expected results: NVMe namespaces should be present after reboot


Additional info:

Comment 1 ricky.armas 2020-02-11 20:53:26 UTC
The systemd unit nvmefc-boot-connections isn't enabled when the nvme-cli RPM is installed.

# systemctl status nvmefc-boot-connections
● nvmefc-boot-connections.service - Auto-connect to subsystems on FC-NVME devices found during boot
   Loaded: loaded (/usr/lib/systemd/system/nvmefc-boot-connections.service; disabled; vendor preset: disabled)
   Active: inactive (dead)

The solution is to enable nvmefc-boot-connections when nvme-cli is installed. This is an example of how to manually enable it (as posted earlier):

# systemctl enable nvmefc-boot-connections
Created symlink /etc/systemd/system/default.target.wants/nvmefc-boot-connections.service → /usr/lib/systemd/system/nvmefc-boot-connections.service.

Comment 2 ricky.armas 2020-02-11 20:59:25 UTC
There may be one more issue. I'm using a test build and I don't know if this is fixed in a more recent build.
The nvme-cli postinstall scriptlet uses systemctl:

# rpm -qp nvme-cli-1.9-2.TEST.el8.x86_64.rpm --scripts
postinstall scriptlet (using /bin/sh):
if [ $1 -eq 1 ]; then # 1 : This package is being installed for the first time
        if [ ! -s /etc/nvme/hostnqn ]; then
                echo $(nvme gen-hostnqn) > /etc/nvme/hostnqn
        fi
        if [ ! -s /etc/nvme/hostid ]; then
                uuidgen > /etc/nvme/hostid
        fi

        # apply udev and systemd changes that we did
        systemctl daemon-reload
        udevadm control --reload-rules && udevadm trigger
fi


Unfortunately, the nvme-cli RPM requirements do not include a reference to systemd:

# rpm -qp nvme-cli-1.9-2.TEST.el8.x86_64.rpm --requires
/bin/sh
libc.so.6()(64bit)
libc.so.6(GLIBC_2.14)(64bit)
libc.so.6(GLIBC_2.2.5)(64bit)
libc.so.6(GLIBC_2.3)(64bit)
libc.so.6(GLIBC_2.3.4)(64bit)
libc.so.6(GLIBC_2.4)(64bit)
libc.so.6(GLIBC_2.8)(64bit)
libuuid.so.1()(64bit)
libuuid.so.1(UUID_1.0)(64bit)
rpmlib(CompressedFileNames) <= 3.0.4-1
rpmlib(FileDigests) <= 4.6.0-1
rpmlib(PayloadFilesHavePrefix) <= 4.0-1
rpmlib(PayloadIsXz) <= 5.2-1
rtld(GNU_HASH)
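
The mismatch above (scriptlet calls systemctl, requires list has no systemd) can be checked mechanically. This is a rough sketch only: the function names are hypothetical, and the match on a bare "systemd" line is a simple heuristic that would miss a dependency expressed some other way.

```shell
#!/bin/sh
# Reads `rpm -q --scripts` output on stdin.
# Succeeds if any scriptlet line invokes systemctl.
scripts_call_systemctl() {
    grep -Eq '(^|[[:space:];&|])systemctl([[:space:]]|$)'
}

# Reads `rpm -q --requires` output on stdin.
# Succeeds if a requirement named "systemd" is listed (heuristic).
requires_list_systemd() {
    grep -Eq '^systemd([[:space:]]|$)'
}

# Hypothetical wrapper for an on-disk package file: prints a message and
# returns 1 when scriptlets call systemctl but systemd is not required.
check_pkg_file() {
    pkg="$1"
    if rpm -qp --scripts "$pkg" | scripts_call_systemctl &&
       ! rpm -qp --requires "$pkg" | requires_list_systemd; then
        echo "$pkg: scriptlets call systemctl but systemd is not required"
        return 1
    fi
}
```

Usage would be something like check_pkg_file nvme-cli-1.9-2.TEST.el8.x86_64.rpm, which for the output posted above would report the mismatch.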

Comment 3 Ewan D. Milne 2020-02-11 21:56:54 UTC
(In reply to ricky.armas from comment #1)
> The systemd unit nvmefc-boot-connections isn't enabled when the nvme-cli RPM
> is installed.
> 
> # systemctl status nvmefc-boot-connections
> ● nvmefc-boot-connections.service - Auto-connect to subsystems on FC-NVME
> devices found during boot
>    Loaded: loaded (/usr/lib/systemd/system/nvmefc-boot-connections.service;
> disabled; vendor preset: disabled)
>    Active: inactive (dead)
> 
> The solution is to enable nvmefc-boot-connections when nvme-cli is
> installed. This is an example of how to manually enable it (as posted
> earlier):
> 
> # systemctl enable nvmefc-boot-connections
> Created symlink
> /etc/systemd/system/default.target.wants/nvmefc-boot-connections.service →
> /usr/lib/systemd/system/nvmefc-boot-connections.service.

I had tried enabling it manually via systemctl, but it didn't seem to work.

[root@storageqe-01 system]# systemctl enable nvmefc-boot-connections
[root@storageqe-01 system]#  systemctl status nvmefc-boot-connections
● nvmefc-boot-connections.service - Auto-connect to subsystems on FC-NVME devices found during boot
   Loaded: loaded (/usr/lib/systemd/system/nvmefc-boot-connections.service; enabled; vendor preset: disabled)
   Active: inactive (dead)

Comment 4 ricky.armas 2020-02-11 22:08:46 UTC
(In reply to Ewan D. Milne from comment #3)
> (In reply to ricky.armas from comment #1)
> > The systemd unit nvmefc-boot-connections isn't enabled when the nvme-cli RPM
> > is installed.
> > 
> > # systemctl status nvmefc-boot-connections
> > ● nvmefc-boot-connections.service - Auto-connect to subsystems on FC-NVME
> > devices found during boot
> >    Loaded: loaded (/usr/lib/systemd/system/nvmefc-boot-connections.service;
> > disabled; vendor preset: disabled)
> >    Active: inactive (dead)
> > 
> > The solution is to enable nvmefc-boot-connections when nvme-cli is
> > installed. This is an example of how to manually enable it (as posted
> > earlier):
> > 
> > # systemctl enable nvmefc-boot-connections
> > Created symlink
> > /etc/systemd/system/default.target.wants/nvmefc-boot-connections.service →
> > /usr/lib/systemd/system/nvmefc-boot-connections.service.
> 
> I had tried enabling it manually via systemctl, but it didn't seem to work.
> 
> [root@storageqe-01 system]# systemctl enable nvmefc-boot-connections
> [root@storageqe-01 system]#  systemctl status nvmefc-boot-connections
> ● nvmefc-boot-connections.service - Auto-connect to subsystems on FC-NVME
> devices found during boot
>    Loaded: loaded (/usr/lib/systemd/system/nvmefc-boot-connections.service;
> enabled; vendor preset: disabled)
>    Active: inactive (dead)

If the unit is already enabled, systemctl won't provide any output.
From the output you posted, the nvmefc-boot-connections unit is enabled.

Example:

# systemctl enable nvmefc-boot-connections
# systemctl is-enabled nvmefc-boot-connections
enabled
[root@dhcp-10-231-44-97 os-config]# systemctl status nvmefc-boot-connections
● nvmefc-boot-connections.service - Auto-connect to subsystems on FC-NVME devices found during boot
   Loaded: loaded (/usr/lib/systemd/system/nvmefc-boot-connections.service; enabled; vendor preset: disabled)
   Active: inactive (dead)

I'm not familiar with how Red Hat sets the vendor preset. Other distros use a conf file that sets units to enabled.
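
To read the two states apart without eyeballing the status output, something like the following could be used. It is a sketch under the assumption that the "Loaded:" line has the format shown in this bug; loaded_state is a made-up helper name.

```shell
#!/bin/sh
# Reads `systemctl status` output on stdin and prints
# "<unit file state> <vendor preset>" taken from the "Loaded:" line.
# Prints nothing if no line in the expected format is found.
loaded_state() {
    sed -n 's/^ *Loaded: .*; \([a-z-]*\); vendor preset: \([a-z]*\)).*/\1 \2/p'
}
```

For example, systemctl status nvmefc-boot-connections | loaded_state would print the unit file state followed by the vendor preset. For the state alone, systemctl is-enabled (as used above) is simpler.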

Comment 5 ricky.armas 2020-02-11 23:03:55 UTC
Ewan,

I modified 90-default.preset, but Red Hat will decide which preset file to modify.

# cat /usr/lib/systemd/system-preset/90-default.preset | grep nvme
# nvme auto connect
enable nvmefc-boot-connections.service

# systemctl daemon-reload
# systemctl status nvmefc-boot-connections
● nvmefc-boot-connections.service - Auto-connect to subsystems on FC-NVME devices found during boot
   Loaded: loaded (/usr/lib/systemd/system/nvmefc-boot-connections.service; enabled; vendor preset: enabled)
   Active: inactive (dead)
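
A quick way to confirm that a preset file carries the entry shown above is a check along these lines. It is only a sketch: preset_enables is a hypothetical helper, and it ignores glob patterns, "disable" lines, and precedence between multiple preset files.

```shell
#!/bin/sh
# Succeeds if FILE contains an "enable" directive naming UNIT exactly.
# usage: preset_enables UNIT FILE
preset_enables() {
    unit="$1"
    file="$2"
    awk -v u="$unit" '$1 == "enable" && $2 == u { found = 1 } END { exit !found }' "$file"
}
```

Example: preset_enables nvmefc-boot-connections.service /usr/lib/systemd/system-preset/90-default.preset. Note that, as far as I know, a preset entry is only policy: it takes effect when systemctl preset is run for the unit (for example, systemctl preset nvmefc-boot-connections.service), so editing the file does not by itself change a unit that is already installed.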

Comment 6 Martin George 2020-02-18 09:03:22 UTC
Any updates on this? Do we have a fix identified for this issue yet? If so, could you please share a nvme-cli test package for the same? Thanks.

Comment 7 David Milburn 2020-02-21 20:49:17 UTC
Hi Martin,

(In reply to Martin George from comment #6)
> Any updates on this? Do we have a fix identified for this issue yet? If so,
> could you please share a nvme-cli test package for the same? Thanks.

In order to change /usr/lib/systemd/system-preset/90-default.preset I
had to open bug 1805466. I have added the Broadcom and NetApp groups,
so you should be able to access it. Thanks.

Comment 9 Matt Schulte 2020-03-04 15:07:21 UTC
Is there a test package that we could get for this fix?

Comment 10 David Milburn 2020-03-04 15:42:40 UTC
Hi Matt,

(In reply to Matt Schulte from comment #9)
> Is there a test package that we could get for this fix?

There is no test package for redhat-release; for now you would
have to make the changes manually:

https://bugzilla.redhat.com/show_bug.cgi?id=1805466#c0

And then install nvme-cli-1.9-2.el8.x86_64.rpm, which is scheduled
to release for RHEL8.2. Not sure if you have access so I put it
here. Thanks.

http://people.redhat.com/dmilburn/.bz1801432.56293428471882384578/

Comment 11 Marco Patalano 2020-03-04 16:18:59 UTC
Hi David,

Running with compose RHEL-8.2.0-20200227.0 and kernel-4.18.0-184.el8, the correct version of nvme-cli is installed:

# rpm -qa nvme-cli
nvme-cli-1.9-2.el8.x86_64

I then modify the 90-default.preset file as you described and reboot:

# cat /usr/lib/systemd/system-preset/90-default.preset |grep nvme
# nvme auto connect
enable nvmefc-boot-connections.service 

# reboot


The output of nvme list is still empty after the system boots. I then have to issue the following two commands for auto-connect to kick in:

# systemctl -q enable nvmefc-boot-connections
# /bin/sh -c 'echo add > /sys/class/fc/fc_udev_device/nvme_discovery'

# nvme list
Node             SN                   Model                                    Namespace Usage                      Format           FW Rev  
---------------- -------------------- ---------------------------------------- --------- -------------------------- ---------------- --------
/dev/nvme0n1     80BgLFM7xMJbAAAAAAAC NetApp ONTAP Controller                  1           0.00   B / 107.37  GB      4 KiB +  0 B   FFFFFFFF
/dev/nvme1n1     80BgLFM7xMJbAAAAAAAC NetApp ONTAP Controller                  1         107.37  GB / 107.37  GB      4 KiB +  0 B   FFFFFFFF
/dev/nvme2n1     80BgLFM7xMJbAAAAAAAC NetApp ONTAP Controller                  1         107.37  GB / 107.37  GB      4 KiB +  0 B   FFFFFFFF
/dev/nvme3n1     80BgLFM7xMJbAAAAAAAC NetApp ONTAP Controller                  1           0.00   B / 107.37  GB      4 KiB +  0 B   FFFFFFFF

Auto-connect now works post reboot. Could you let me know if this is expected or if I am doing something wrong? Thanks

Marco

Comment 12 David Milburn 2020-03-04 16:32:37 UTC
Hi Marco,

(In reply to Marco Patalano from comment #11)
> Hi David,
> 
> Running with compose RHEL-8.2.0-20200227.0 and kernel-4.18.0-184.el8, the
> correct version of nvme-cli is installed:
> 
> # rpm -qa nvme-cli
> nvme-cli-1.9-2.el8.x86_64
> 
> I then modify the 90-default.preset file as you described and reboot:
> 
> # cat /usr/lib/systemd/system-preset/90-default.preset |grep nvme
> # nvme auto connect
> enable nvmefc-boot-connections.service 
> 

Please do a clean install after making the above change. Once redhat-release
has been updated for RHEL8.2, those changes will be there before installing
nvme-cli. Thanks.

> # reboot
> 
> 
> output of nvme list is still empty after the system boots. I then have to
> issue the following 2 commands for auto-connect to kick in:
> 
> # systemctl -q enable nvmefc-boot-connections
> # /bin/sh -c 'echo add > /sys/class/fc/fc_udev_device/nvme_discovery'
> 
> # nvme list
> Node             SN                   Model                                 
> Namespace Usage                      Format           FW Rev  
> ---------------- --------------------
> ---------------------------------------- ---------
> -------------------------- ---------------- --------
> /dev/nvme0n1     80BgLFM7xMJbAAAAAAAC NetApp ONTAP Controller               
> 1           0.00   B / 107.37  GB      4 KiB +  0 B   FFFFFFFF
> /dev/nvme1n1     80BgLFM7xMJbAAAAAAAC NetApp ONTAP Controller               
> 1         107.37  GB / 107.37  GB      4 KiB +  0 B   FFFFFFFF
> /dev/nvme2n1     80BgLFM7xMJbAAAAAAAC NetApp ONTAP Controller               
> 1         107.37  GB / 107.37  GB      4 KiB +  0 B   FFFFFFFF
> /dev/nvme3n1     80BgLFM7xMJbAAAAAAAC NetApp ONTAP Controller               
> 1           0.00   B / 107.37  GB      4 KiB +  0 B   FFFFFFFF
> 
> Auto-connect now works post reboot. Could you let me know if this is
> expected or if I am doing something wrong. Thanks
> 
> Marco