Bug 2177411

Summary: nvme-cli - 2748 Segmentation fault (core dumped) nvme connect-all
Product: Red Hat Enterprise Linux 9
Component: nvme-cli
Version: 9.2
Hardware: Unspecified
OS: Unspecified
Status: CLOSED MIGRATED
Severity: medium
Priority: medium
Reporter: Marco Patalano <mpatalan>
Assignee: Maurizio Lombardi <mlombard>
QA Contact: Marco Patalano <mpatalan>
CC: tbzatek
Keywords: MigratedToJIRA, Triaged
Flags: pm-rhel: mirror+
Target Milestone: rc
Target Release: ---
Doc Type: If docs needed, set a value
Last Closed: 2023-09-23 12:58:19 UTC
Type: Bug

Description Marco Patalano 2023-03-11 16:23:52 UTC
Description of problem: When connecting to a Broadcom software target from the initiator using 'nvme connect-all', I am presented with the following output:

# nvme connect-all
Failed to open ctrl nvme6, errno 11
Failed to open ctrl nvme6, errno 11
Segmentation fault (core dumped)
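
For context, errno 11 on Linux is EAGAIN ("Resource temporarily unavailable"), so the two "Failed to open ctrl nvme6" messages mean that opening controller nvme6 failed with EAGAIN twice shortly before the crash. The errno mapping can be confirmed with a one-liner (illustrative only, not part of the reproducer):

# python3 -c 'import errno, os; print(errno.errorcode[11], os.strerror(11))'
EAGAIN Resource temporarily unavailable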

Below are the log messages when the issue occurred:

Mar 11 11:12:51 storageqe-28.sqe.lab.eng.bos.redhat.com kernel: nvme[2026]: segfault at 8 ip 000055ce445d7a41 sp 00007ffda0b65360 error 4 in nvme[55ce445d5000+88000]
Mar 11 11:12:51 storageqe-28.sqe.lab.eng.bos.redhat.com kernel: Code: f2 ff ff 48 8d 74 24 40 ba 0a 00 00 00 48 89 df 48 89 44 24 08 e8 2f f9 ff ff 41 89 c6 85 c0 0f 85 74 01 00 00 4c 8b 7c 24 40 <49> 8b 47 08 48 89 04 24 4d 85 e4 74 6a ba 80 01 00 00 be 42 02 00
Mar 11 11:12:51 storageqe-28.sqe.lab.eng.bos.redhat.com systemd[1]: Created slice Slice /system/systemd-coredump.
Mar 11 11:12:51 storageqe-28.sqe.lab.eng.bos.redhat.com systemd[1]: Started Process Core Dump (PID 2027/UID 0).
Mar 11 11:12:51 storageqe-28.sqe.lab.eng.bos.redhat.com systemd-coredump[2028]: Resource limits disable core dumping for process 2026 (nvme).
Mar 11 11:12:51 storageqe-28.sqe.lab.eng.bos.redhat.com systemd-coredump[2028]: Process 2026 (nvme) of user 0 dumped core.
Mar 11 11:12:51 storageqe-28.sqe.lab.eng.bos.redhat.com systemd[1]: systemd-coredump: Deactivated successfully.
Mar 11 11:13:07 storageqe-28.sqe.lab.eng.bos.redhat.com systemd[1]: nvmf-connect@--device\x3dnone\t--transport\x3dfc\t--traddr\x3dnn-0x20000090fac7b6c2:pn-0x10000090fac7b6c2\t--trsvcid\x3dnone\t--host-traddr\x3dnn-0x20000090fad17933:pn-0x10000090fad17933.service: Deactivated successfully.
Mar 11 11:13:07 storageqe-28.sqe.lab.eng.bos.redhat.com systemd[1]: nvmf-connect@--device\x3dnone\t--transport\x3dfc\t--traddr\x3dnn-0x20000090fac7b6c3:pn-0x10000090fac7b6c3\t--trsvcid\x3dnone\t--host-traddr\x3dnn-0x20000090fad17933:pn-0x10000090fad17933.service: Deactivated successfully.
Mar 11 11:13:07 storageqe-28.sqe.lab.eng.bos.redhat.com systemd[1]: nvmf-connect@--device\x3dnone\t--transport\x3dfc\t--traddr\x3dnn-0x20000090fac7b6c2:pn-0x10000090fac7b6c2\t--trsvcid\x3dnone\t--host-traddr\x3dnn-0x20000090fad17934:pn-0x10000090fad17934.service: Deactivated successfully.
Mar 11 11:13:07 storageqe-28.sqe.lab.eng.bos.redhat.com systemd[1]: nvmf-connect@--device\x3dnone\t--transport\x3dfc\t--traddr\x3dnn-0x20000090fac7b6c3:pn-0x10000090fac7b6c3\t--trsvcid\x3dnone\t--host-traddr\x3dnn-0x20000090fad17934:pn-0x10000090fad17934.service: Deactivated successfully.
Mar 11 11:13:07 storageqe-28.sqe.lab.eng.bos.redhat.com kernel: lpfc 0000:04:00.1: Disconnect LS failed: No Association
Mar 11 11:13:07 storageqe-28.sqe.lab.eng.bos.redhat.com kernel: lpfc 0000:04:00.1: Disconnect LS failed: No Association
Mar 11 11:13:07 storageqe-28.sqe.lab.eng.bos.redhat.com kernel: lpfc 0000:04:00.0: Disconnect LS failed: No Association
Mar 11 11:13:07 storageqe-28.sqe.lab.eng.bos.redhat.com kernel: lpfc 0000:04:00.0: Disconnect LS failed: No Association
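
Triage notes (inferred from the journal above, not independently verified): the faulting instruction bytes at the marked position, <49> 8b 47 08, decode to 'mov 0x8(%r15),%rax', which together with "segfault at 8" points to a read of a member at offset 8 through a NULL pointer. If a core was actually stored (the journal is ambiguous, reporting both that resource limits disable core dumping and that a core was dumped), the backtrace should be retrievable with coredumpctl:

# coredumpctl info nvme
# coredumpctl gdb nvme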

Despite the segfault, the connection to the NVMe namespace appears to be established correctly:

# nvme list
Node                  Generic               SN                   Model                                    Namespace Usage                      Format           FW Rev  
--------------------- --------------------- -------------------- ---------------------------------------- --------- -------------------------- ---------------- --------
/dev/nvme1n1          /dev/ng1n1            a79585895d0a700e     Linux                                    1           1.60  TB /   1.60  TB      4 KiB +  0 B   5.14.0-2
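
If useful, the controller and path state behind that namespace can also be cross-checked with 'nvme list-subsys' (illustrative; output was not captured for this report):

# nvme list-subsys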


Below is my discovery.conf in case it is helpful:

# cat /etc/nvme/discovery.conf 
--transport=fc --traddr=nn-0x20000090fac7b6c2:pn-0x10000090fac7b6c2 --host-traddr=nn-0x20000090fad17933:pn-0x10000090fad17933
--transport=fc --traddr=nn-0x20000090fac7b6c2:pn-0x10000090fac7b6c2 --host-traddr=nn-0x20000090fad17934:pn-0x10000090fad17934
--transport=fc --traddr=nn-0x20000090fac7b6c3:pn-0x10000090fac7b6c3 --host-traddr=nn-0x20000090fad17933:pn-0x10000090fad17933
--transport=fc --traddr=nn-0x20000090fac7b6c3:pn-0x10000090fac7b6c3 --host-traddr=nn-0x20000090fad17934:pn-0x10000090fad17934
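
Each line in discovery.conf supplies the arguments for one discovery controller, so the first entry above is equivalent to running the discovery by hand (shown only as an illustration):

# nvme discover --transport=fc --traddr=nn-0x20000090fac7b6c2:pn-0x10000090fac7b6c2 --host-traddr=nn-0x20000090fad17933:pn-0x10000090fad17933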


Version-Release number of selected component (if applicable):
# rpm -qa nvme-cli
nvme-cli-2.2.1-2.el9.x86_64

How reproducible: Often


Steps to Reproduce:
1. Populate /etc/nvme/discovery.conf with the FC discovery entries shown above.
2. On the initiator, run 'nvme connect-all'.


Additional info: Below is the job where the issue was first seen:

https://beaker.engineering.redhat.com/jobs/7613610

Comment 1 RHEL Program Management 2023-09-23 12:55:45 UTC
Issue migration from Bugzilla to Jira is in process at this time. This will be the last message in Jira copied from the Bugzilla bug.

Comment 2 RHEL Program Management 2023-09-23 12:58:19 UTC
This BZ has been automatically migrated to the issues.redhat.com Red Hat Issue Tracker. All future work related to this report will be managed there.

Due to differences in account names between systems, some fields were not replicated.  Be sure to add yourself to Jira issue's "Watchers" field to continue receiving updates and add others to the "Need Info From" field to continue requesting information.

To find the migrated issue, look in the "Links" section for a direct link to the new issue location. The issue key will have an icon of 2 footprints next to it, and begin with "RHEL-" followed by an integer.  You can also find this issue by visiting https://issues.redhat.com/issues/?jql= and searching the "Bugzilla Bug" field for this BZ's number, e.g. a search like:

"Bugzilla Bug" = 1234567

In the event you have trouble locating or viewing this issue, you can file an issue by sending mail to rh-issues. You can also visit https://access.redhat.com/articles/7032570 for general account information.