Bug 1846056
| Summary: | [NetApp RHEL 8.2 Bug]: nvme connect --reconnect-delay option is broken | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 8 | Reporter: | Martin George <marting> |
| Component: | nvme-cli | Assignee: | David Milburn <dmilburn> |
| Status: | CLOSED ERRATA | QA Contact: | Zhang Yi <yizhan> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | ||
| Version: | 8.2 | CC: | dmilburn, emilne, mpatalan, ng-redhat-bugzilla, revers |
| Target Milestone: | rc | Flags: | pm-rhel: mirror+ |
| Target Release: | 8.3 | ||
| Hardware: | All | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | nvme-cli-1.12-2.el8 | Doc Type: | If docs needed, set a value |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2020-11-04 01:43:18 UTC | Type: | Bug |
Description
Martin George
2020-06-10 16:29:53 UTC
Addressed with the upstream patch at https://github.com/linux-nvme/nvme-cli/commit/b2a0aba1176aa26f2b5ce0c0360c4be67dff63d8. Requesting RH to pull this into the RHEL nvme-cli.

Any updates on this, David? Is this targeted for RHEL 8.3 nvme-cli now?

Hi Martin,
(In reply to Martin George from comment #2)
> Any updates on this, David? Is this targeted for RHEL 8.3 nvme-cli now?
Sorry for the delay. To re-spin and re-QE a new nvme-cli package we will need to get an exception. If you need this fix for RHEL 8.3, would you please state your business use case and how the bug negatively affects NetApp? Thanks.

Hi Martin,
(In reply to Martin George from comment #2)
> Any updates on this, David? Is this targeted for RHEL 8.3 nvme-cli now?
Even though this is a straightforward fix committed to v1.12, would you do a sanity check on nvme-cli-1.10.1-3.el8 before we run it through the build system? Thanks.
http://people.redhat.com/dmilburn/.bz1846056.56293428471882384578/

(In reply to David Milburn from comment #6)
> Even though this is a straightforward fix committed to v1.12, would
> you do a sanity check on nvme-cli-1.10.1-3.el8 before we run it through
> the build system? Thanks.
>
> http://people.redhat.com/dmilburn/.bz1846056.56293428471882384578/
It doesn't look like this nvme-cli package contains the above patch. I added the -c option (i.e. setting reconnect_delay) to the nvme connect-all command in /usr/lib/systemd/system/nvmf-connect@.service and rebooted the host, but the setting was not applied when an nvme port went down.
Could you provide a v1.12 package containing this fix? Please provide the .src package as well. I will give it a try.

(In reply to Martin George from comment #7)
> It doesn't look like this nvme-cli package contains the above patch. I added
> the -c option (i.e. setting reconnect_delay) to the nvme connect-all command
> in /usr/lib/systemd/system/nvmf-connect@.service and rebooted the host, but
> the setting was not applied when an nvme port went down.
>
> Could you provide a v1.12 package containing this fix? Please provide the
> .src package as well. I will give it a try.
Here you go, Martin. As it turns out, v1.12 is in Fedora, so I will try to update RHEL 8.3 to v1.12. To preserve /etc/nvme we need to upgrade from nvme-cli-1.9-7.el8_2.x86_64 (RHEL 8.2.z) or from nvme-cli-1.10.1-2.el8.x86_64.rpm (the current RHEL 8.3 package). Please let me know if the reconnect-delay option works for you. Thanks.
http://people.redhat.com/dmilburn/.bz1846056.56293428471882384578/

(In reply to David Milburn from comment #8)
> Here you go, Martin. As it turns out, v1.12 is in Fedora, so I will try to
> update RHEL 8.3 to v1.12. To preserve /etc/nvme we need to upgrade
> from nvme-cli-1.9-7.el8_2.x86_64 (RHEL 8.2.z) or from
> nvme-cli-1.10.1-2.el8.x86_64.rpm
> (the current RHEL 8.3 package). Please let me know if the reconnect-delay option
> works for you. Thanks.
>
> http://people.redhat.com/dmilburn/.bz1846056.56293428471882384578/
Yes, I see the reconnect-delay option working properly now with the nvme-cli-1.12-2.el8 package.
As a simple experiment, I tried this v1.12 nvme-cli package on top of the test kernel from https://bugzilla.redhat.com/show_bug.cgi?id=1846049#c5. I passed '-c 2' and '-l 45' to the nvme connect-all command in /usr/lib/systemd/system/nvmf-connect@.service and rebooted the host. I then brought down an nvme rport and saw the following entry in /var/log/messages after exactly 46 seconds:
kernel: nvme nvme0: NVME-FC{0}: dev_loss_tmo (46) expired while waiting for remoteport connectivity.
That is, the nvme-fc rport devloss_tmo value of 46 seconds printed here is derived from max_reconnects * reconnect_delay (i.e. 23 * 2 = 46 seconds).

Thanks Martin.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.
For information on the advisory (nvme-cli bug fix and enhancement update), and where to find the updated files, follow the link below.
If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2020:4476
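
For anyone re-checking the 46-second figure from the verification comment, here is a minimal Python sketch of the arithmetic. The report itself only states that the printed dev_loss_tmo is max_reconnects * reconnect_delay; the round-up rule used to get max_reconnects from the '-l' (ctrl-loss-tmo) and '-c' (reconnect-delay) values is an assumption about how the kernel parses fabrics options, and the function names are hypothetical helpers, not part of nvme-cli.

```python
def max_reconnects(ctrl_loss_tmo: int, reconnect_delay: int) -> int:
    """Number of reconnect attempts before the controller is given up on.

    ASSUMPTION: ctrl_loss_tmo / reconnect_delay is rounded up to a whole
    number of attempts. This rule is not stated in the bug report.
    """
    if ctrl_loss_tmo < 0 or reconnect_delay <= 0:
        raise ValueError("expected ctrl_loss_tmo >= 0 and reconnect_delay > 0")
    # Integer ceiling division without going through floats.
    return (ctrl_loss_tmo + reconnect_delay - 1) // reconnect_delay


def effective_dev_loss_tmo(ctrl_loss_tmo: int, reconnect_delay: int) -> int:
    """dev_loss_tmo as described in the report: max_reconnects * reconnect_delay."""
    return max_reconnects(ctrl_loss_tmo, reconnect_delay) * reconnect_delay


if __name__ == "__main__":
    # Values from the verification comment: '-l 45' and '-c 2'.
    attempts = max_reconnects(45, 2)
    timeout = effective_dev_loss_tmo(45, 2)
    print(f"max_reconnects={attempts}, dev_loss_tmo={timeout}s")
```

Under that assumption, '-l 45' with '-c 2' gives 23 reconnect attempts and an effective timeout of 46 seconds, which matches the dev_loss_tmo (46) value in the kernel log line.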