
Bug 1846056

Summary: [NetApp RHEL 8.2 Bug]: nvme connect --reconnect-delay option is broken
Product: Red Hat Enterprise Linux 8
Component: nvme-cli
Version: 8.2
Hardware: All
OS: Linux
Status: CLOSED ERRATA
Severity: medium
Priority: medium
Reporter: Martin George <marting>
Assignee: David Milburn <dmilburn>
QA Contact: Zhang Yi <yizhan>
CC: dmilburn, emilne, mpatalan, ng-redhat-bugzilla, revers
Target Milestone: rc
Target Release: 8.3
Flags: pm-rhel: mirror+
Fixed In Version: nvme-cli-1.12-2.el8
Last Closed: 2020-11-04 01:43:18 UTC
Type: Bug

Description Martin George 2020-06-10 16:29:53 UTC
Description of problem:
The --reconnect-delay option of nvme connect is broken in the RHEL 8.2 nvme-cli. Even when a reconnect delay value is specified with this option, the value is not applied during an nvme connect.

Version-Release number of selected component (if applicable):
RHEL 8.2 GA

# uname -r
4.18.0-193.el8.x86_64

# rpm -qa|grep nvme-cli
nvme-cli-1.9-5.el8.x86_64

Comment 1 Martin George 2020-06-10 16:30:54 UTC
Addressed with the upstream patch at https://github.com/linux-nvme/nvme-cli/commit/b2a0aba1176aa26f2b5ce0c0360c4be67dff63d8.

Requesting RH to pull this into the RHEL nvme-cli.

Comment 2 Martin George 2020-06-22 07:35:14 UTC
Any updates on this, David? Is this targeted for RHEL 8.3 nvme-cli now?

Comment 3 David Milburn 2020-06-23 14:40:24 UTC
Hi Martin,

(In reply to Martin George from comment #2)
> Any updates on this, David? Is this targeted for RHEL 8.3 nvme-cli now?

Sorry for the delay. To re-spin and re-QE a new nvme-cli package we will
need to get an exception. If you need this fix for RHEL 8.3, would you
please state your business use case and how the bug negatively affects NetApp?
Thanks.

Comment 6 David Milburn 2020-06-24 20:34:39 UTC
Hi Martin,

(In reply to Martin George from comment #2)
> Any updates on this, David? Is this targeted for RHEL 8.3 nvme-cli now?

Even though this is a straightforward fix committed to v1.12, would
you do a sanity check on nvme-cli-1.10.1-3.el8 before we run it through
the build system? Thanks.

http://people.redhat.com/dmilburn/.bz1846056.56293428471882384578/

Comment 7 Martin George 2020-06-25 07:34:07 UTC
(In reply to David Milburn from comment #6)

It doesn't look like this nvme-cli package contains the above patch. I added the -c option (i.e. setting reconnect_delay) to the nvme connect-all command in /usr/lib/systemd/system/nvmf-connect@.service and rebooted the host, but the value was not applied when an nvme port went down.

Could you provide a v1.12 package containing this fix? And do provide the .src package as well. Will give it a try.

Comment 8 David Milburn 2020-06-25 16:44:57 UTC
(In reply to Martin George from comment #7)

Here you go, Martin. As it turns out, v1.12 is in Fedora, so I will try to
update RHEL 8.3 to v1.12. To preserve /etc/nvme, we need to upgrade
from nvme-cli-1.9-7.el8_2.x86_64 (RHEL 8.2.z) or from nvme-cli-1.10.1-2.el8.x86_64.rpm
(the current RHEL 8.3 package). Please let me know if the reconnect-delay option
works for you. Thanks.

http://people.redhat.com/dmilburn/.bz1846056.56293428471882384578/

Comment 9 Martin George 2020-06-26 15:26:36 UTC
(In reply to David Milburn from comment #8)

Yes, I see the reconnect-delay option working properly now with the nvme-cli-1.12-2.el8 package. As a simple experiment, I tried this v1.12 nvme-cli package on top of the test kernel from https://bugzilla.redhat.com/show_bug.cgi?id=1846049#c5. I passed '-c 2' and '-l 45' to the nvme connect-all command in /usr/lib/systemd/system/nvmf-connect@.service and rebooted the host. I then brought down an nvme rport and saw the following entry in /var/log/messages exactly 46 seconds later:

kernel: nvme nvme0: NVME-FC{0}: dev_loss_tmo (46) expired while waiting for remoteport connectivity.

i.e. the nvme-fc rport devloss_tmo value of 46 seconds printed here is derived from max_reconnects * reconnect_delay (i.e. 23 * 2 = 46 seconds).
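
The 46-second figure can be cross-checked with a small shell sketch. It rests on two assumptions that are not stated in this bug: that '-l 45' sets ctrl_loss_tmo to 45 seconds, and that max_reconnects is ctrl_loss_tmo divided by reconnect_delay, rounded up. The helper names ceil_div, effective_loss_tmo and parse_dev_loss_tmo are hypothetical and exist only for this illustration.

```shell
#!/bin/sh
# Sketch only: reproduce the 46-second value from '-c 2' and '-l 45'.
# Assumption: max_reconnects = ceil(ctrl_loss_tmo / reconnect_delay).

# Hypothetical helper: integer ceiling division of $1 by $2.
ceil_div() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

# Hypothetical helper: max_reconnects * reconnect_delay, in seconds,
# for ctrl_loss_tmo=$1 (assumed to be -l) and reconnect_delay=$2 (-c).
effective_loss_tmo() {
    max_reconnects=$(ceil_div "$1" "$2")
    echo $(( max_reconnects * $2 ))
}

# Hypothetical helper: extract N from a "dev_loss_tmo (N) expired" log line.
parse_dev_loss_tmo() {
    printf '%s\n' "$1" | sed -n 's/.*dev_loss_tmo (\([0-9][0-9]*\)) expired.*/\1/p'
}

# Log line copied from /var/log/messages as quoted in this comment.
log_line='kernel: nvme nvme0: NVME-FC{0}: dev_loss_tmo (46) expired while waiting for remoteport connectivity.'

expected=$(effective_loss_tmo 45 2)
observed=$(parse_dev_loss_tmo "$log_line")
echo "max_reconnects=$(ceil_div 45 2) expected=${expected}s observed=${observed}s"
```

With the values from this test, the script prints max_reconnects=23 and 46s for both the expected and the observed value, which matches the log entry.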

Comment 10 David Milburn 2020-06-26 15:54:26 UTC
Thanks Martin.

Comment 16 errata-xmlrpc 2020-11-04 01:43:18 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (nvme-cli bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4476
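
To tell whether a host is still on a package that predates the fix, the upstream version can be compared against 1.12, the version this bug lists under "Fixed In Version". The following is a rough sketch, assuming the fix was not backported to an older package version and that GNU sort -V is available. It is not RPM's own version comparison, and the helper names version_ge and nvme_cli_version are hypothetical.

```shell
#!/bin/sh
# Sketch only: compare the upstream nvme-cli version against 1.12.

# Hypothetical helper: succeed if version $1 is >= version $2.
# Relies on GNU sort -V, not on RPM's comparison algorithm.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical helper: print the upstream version of a package string,
# e.g. "nvme-cli-1.9-5.el8.x86_64" gives "1.9".
nvme_cli_version() {
    printf '%s\n' "$1" | sed -n 's/^nvme-cli-\([0-9][0-9.]*\)-.*/\1/p'
}

# Package strings taken from this bug report.
for pkg in nvme-cli-1.9-5.el8.x86_64 nvme-cli-1.12-2.el8; do
    ver=$(nvme_cli_version "$pkg")
    if version_ge "$ver" 1.12; then
        echo "$pkg: at or above 1.12"
    else
        echo "$pkg: older than 1.12"
    fi
done
```

On a live system the package string could come from the rpm query shown in the description instead of the hard-coded list.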