Bug 1793585 - [RHEL-7][Regression] ibacm[1221]: segfault at 0 ip 0000000000404f3e sp 00007ffe54e819a0 error 4 in ibacm[400000+e000]
Summary: [RHEL-7][Regression] ibacm[1221]: segfault at 0 ip 0000000000404f3e sp 00007ffe54e819a0 error 4 in ibacm[400000+e000]
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: rdma-core
Version: 7.8
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Honggang LI
QA Contact: zguo
URL:
Whiteboard:
Depends On:
Blocks: 1793736 1812444
 
Reported: 2020-01-21 15:46 UTC by zguo
Modified: 2020-09-29 19:25 UTC
CC List: 5 users

Fixed In Version: rdma-core-22.4-2.el7
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1793736 1812444
Environment:
Last Closed: 2020-09-29 19:25:25 UTC
Target Upstream Version:
Embargoed:


Attachments:


Links
System                   ID              Private  Priority  Status  Summary  Last Updated
Red Hat Product Errata   RHBA-2020:3870  0        None      None    None     2020-09-29 19:25:42 UTC

Description zguo 2020-01-21 15:46:38 UTC
Description of problem:

It works with ibacm-22.3-1.el7.x86_64 but fails with ibacm-22.4-1.el7.x86_64 on the same test bed (rdma-perf-00/01).
It might be caused by a recent change in rdma-core. I have hit this segfault only on non-IB hardware so far; rdma-perf-01 has both mlx4 IB and mlx4 RoCE devices.
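
A minimal reproduction sketch (not from the original report; it assumes the 22.3-1 and 22.4-1 builds are both reachable from the configured yum repositories on the affected test-bed host):

# confirm the currently installed versions
rpm -q ibacm rdma-core
# known-good baseline
yum -y downgrade ibacm-22.3-1.el7 rdma-core-22.3-1.el7
systemctl restart ibacm && systemctl status ibacm   # expected: active (running)
# move to the affected build
yum -y update ibacm-22.4-1.el7 rdma-core-22.4-1.el7
systemctl restart ibacm                             # expected: the failure described above
dmesg | tail -n 5                                   # look for the ibacm segfault line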

=============
[root@rdma-perf-01 ~]$ rpm -q ibacm rdma-core
ibacm-22.3-1.el7.x86_64
rdma-core-22.3-1.el7.x86_64
[root@rdma-perf-01 ~]$ systemctl restart ibacm
[root@rdma-perf-01 ~]$ systemctl status ibacm
● ibacm.service - InfiniBand Address Cache Manager Daemon
   Loaded: loaded (/usr/lib/systemd/system/ibacm.service; disabled; vendor preset: disabled)
   Active: active (running) since Tue 2020-01-21 10:34:51 EST; 8s ago
     Docs: man:ibacm
           file:/etc/rdma/ibacm_opts.cfg
 Main PID: 850 (ibacm)
    Tasks: 4
   CGroup: /system.slice/ibacm.service
           └─850 /usr/sbin/ibacm --systemd

Jan 21 10:34:51 rdma-perf-01.lab.bos.redhat.com systemd[1]: Starting InfiniBand Address Cache Manager Daemon...
Jan 21 10:34:51 rdma-perf-01.lab.bos.redhat.com systemd[1]: Started InfiniBand Address Cache Manager Daemon.

============

[root@rdma-perf-01 ~]$ rpm -q ibacm rdma-core
ibacm-22.4-1.el7.x86_64
rdma-core-22.4-1.el7.x86_64
[root@rdma-perf-01 ~]$ systemctl status ibacm
● ibacm.service - InfiniBand Address Cache Manager Daemon
   Loaded: loaded (/usr/lib/systemd/system/ibacm.service; disabled; vendor preset: disabled)
   Active: inactive (dead)
     Docs: man:ibacm
           file:/etc/rdma/ibacm_opts.cfg

Jan 21 10:29:55 rdma-perf-01.lab.bos.redhat.com systemd[1]: Unit ibacm.service entered failed state.
Jan 21 10:29:55 rdma-perf-01.lab.bos.redhat.com systemd[1]: ibacm.service failed.
Jan 21 10:33:48 rdma-perf-01.lab.bos.redhat.com systemd[1]: Starting InfiniBand Address Cache Manager Daemon...
Jan 21 10:33:48 rdma-perf-01.lab.bos.redhat.com systemd[1]: Started InfiniBand Address Cache Manager Daemon.
Jan 21 10:34:51 rdma-perf-01.lab.bos.redhat.com systemd[1]: Stopping InfiniBand Address Cache Manager Daemon...
Jan 21 10:34:51 rdma-perf-01.lab.bos.redhat.com systemd[1]: Stopped InfiniBand Address Cache Manager Daemon.
Jan 21 10:34:51 rdma-perf-01.lab.bos.redhat.com systemd[1]: Starting InfiniBand Address Cache Manager Daemon...
Jan 21 10:34:51 rdma-perf-01.lab.bos.redhat.com systemd[1]: Started InfiniBand Address Cache Manager Daemon.
Jan 21 10:35:47 rdma-perf-01.lab.bos.redhat.com systemd[1]: Stopping InfiniBand Address Cache Manager Daemon...
Jan 21 10:35:47 rdma-perf-01.lab.bos.redhat.com systemd[1]: Stopped InfiniBand Address Cache Manager Daemon.
[root@rdma-perf-01 ~]$ systemctl start ibacm
Job for ibacm.service failed because a fatal signal was delivered to the control process. See "systemctl status ibacm.service" and "journalctl -xe" for details.
[root@rdma-perf-01 ~]$ dmesg
[20013.737212] ibacm[1221]: segfault at 0 ip 0000000000404f3e sp 00007ffe54e819a0 error 4 in ibacm[400000+e000]
==========
Actual results:

ibacm[1221]: segfault at 0 ip 0000000000404f3e sp 00007ffe54e819a0 error 4 in ibacm[400000+e000] on a machine with non-IB hardware.

Expected results:

No segfault on non-IB hardware.

Additional info:
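
Not part of the original report, but a debugging sketch that may help map the faulting instruction pointer to a source line (it assumes the matching debuginfo packages are installable on the affected host; the dmesg line suggests ibacm is loaded at 0x400000):

# pull matching debuginfo (debuginfo-install comes from yum-utils)
debuginfo-install -y ibacm rdma-core
# resolve the faulting ip reported by the kernel
gdb -batch -ex 'info line *0x404f3e' /usr/sbin/ibacm
# or reproduce under gdb to get a full backtrace
gdb --args /usr/sbin/ibacm --systemd
(gdb) run
(gdb) bt   # after the SIGSEGV is reported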

Comment 2 zguo 2020-01-21 16:07:36 UTC
It works well on cxgb4 with ibacm-22.4-1.el7.x86_64, so the scenario in the bug description is the only case where I have hit this issue so far.

[root@rdma-dev-13 ~]$ rpm -q ibacm
ibacm-22.4-1.el7.x86_64
[root@rdma-dev-13 ~]$ systemctl status ibacm
● ibacm.service - InfiniBand Address Cache Manager Daemon
   Loaded: loaded (/usr/lib/systemd/system/ibacm.service; disabled; vendor preset: disabled)
   Active: failed (Result: exit-code) since Tue 2020-01-21 11:02:16 EST; 1min 32s ago
     Docs: man:ibacm
           file:/etc/rdma/ibacm_opts.cfg
  Process: 10487 ExecStart=/usr/sbin/ibacm --systemd (code=exited, status=255)
 Main PID: 10487 (code=exited, status=255)

Jan 21 11:02:16 rdma-dev-13.lab.bos.redhat.com systemd[1]: Starting InfiniBand Address Cache Manager Daemon...
Jan 21 11:02:16 rdma-dev-13.lab.bos.redhat.com systemd[1]: ibacm.service: main process exited, code=exited, status=255/n/a
Jan 21 11:02:16 rdma-dev-13.lab.bos.redhat.com systemd[1]: Failed to start InfiniBand Address Cache Manager Daemon.
Jan 21 11:02:16 rdma-dev-13.lab.bos.redhat.com systemd[1]: Unit ibacm.service entered failed state.
Jan 21 11:02:16 rdma-dev-13.lab.bos.redhat.com systemd[1]: ibacm.service failed.

Comment 3 zguo 2020-01-22 02:34:45 UTC
This issue does not occur on rdma-perf-03 with mlx5 IB and mlx5 RoCE.

[root@rdma-perf-03 ~]$ rpm -q  ibacm rdma-core
ibacm-22.4-1.el7.x86_64
rdma-core-22.4-1.el7.x86_64

[root@rdma-perf-03 ~]$ systemctl status ibacm
● ibacm.service - InfiniBand Address Cache Manager Daemon
   Loaded: loaded (/usr/lib/systemd/system/ibacm.service; disabled; vendor preset: disabled)
   Active: inactive (dead)
     Docs: man:ibacm
           file:/etc/rdma/ibacm_opts.cfg
[root@rdma-perf-03 ~]$ systemctl start ibacm
[root@rdma-perf-03 ~]$ dmesg
[root@rdma-perf-03 ~]$ dmesg
[root@rdma-perf-03 ~]$ systemctl status ibacm
● ibacm.service - InfiniBand Address Cache Manager Daemon
   Loaded: loaded (/usr/lib/systemd/system/ibacm.service; disabled; vendor preset: disabled)
   Active: active (running) since Tue 2020-01-21 21:26:55 EST; 8s ago
     Docs: man:ibacm
           file:/etc/rdma/ibacm_opts.cfg
 Main PID: 11676 (ibacm)
   CGroup: /system.slice/ibacm.service
           └─11676 /usr/sbin/ibacm --systemd

Jan 21 21:26:55 rdma-perf-03.lab.bos.redhat.com systemd[1]: Starting InfiniBand Address Cache Manager Daemon...
Jan 21 21:26:55 rdma-perf-03.lab.bos.redhat.com systemd[1]: Started InfiniBand Address Cache Manager Daemon.

[root@rdma-perf-03 ~]$ ip a | egrep 'roce|ib'
5: mlx5_roce: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mqprio state UP group default qlen 1000
    inet 172.31.40.183/24 brd 172.31.40.255 scope global noprefixroute dynamic mlx5_roce
7: mlx5_ib0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 4092 qdisc mq state UP group default qlen 256
    link/infiniband 00:00:10:8a:fe:80:00:00:00:00:00:00:24:8a:07:03:00:a3:19:6c brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff
    inet 172.31.0.183/24 brd 172.31.0.255 scope global noprefixroute dynamic mlx5_ib0
8: mlx5_roce.45@mlx5_roce: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc noqueue state UP group default qlen 1000
    inet 172.31.45.183/24 brd 172.31.45.255 scope global noprefixroute dynamic mlx5_roce.45
9: mlx5_roce.43@mlx5_roce: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc noqueue state UP group default qlen 1000
    inet 172.31.43.183/24 brd 172.31.43.255 scope global noprefixroute dynamic mlx5_roce.43
10: mlx5_ib0.8008@mlx5_ib0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 2044 qdisc mq state UP group default qlen 256
    link/infiniband 00:00:11:37:fe:80:00:00:00:00:00:00:24:8a:07:03:00:a3:19:6c brd 00:ff:ff:ff:ff:12:40:1b:80:08:00:00:00:00:00:00:ff:ff:ff:ff
    inet 172.31.8.183/24 brd 172.31.8.255 scope global noprefixroute dynamic mlx5_ib0.8008
11: mlx5_ib0.8002@mlx5_ib0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 4092 qdisc mq state UP group default qlen 256
    link/infiniband 00:00:11:b9:fe:80:00:00:00:00:00:00:24:8a:07:03:00:a3:19:6c brd 00:ff:ff:ff:ff:12:40:1b:80:02:00:00:00:00:00:00:ff:ff:ff:ff
    inet 172.31.2.183/24 brd 172.31.2.255 scope global noprefixroute dynamic mlx5_ib0.8002
12: mlx5_ib0.8006@mlx5_ib0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 2044 qdisc mq state UP group default qlen 256
    link/infiniband 00:00:12:3b:fe:80:00:00:00:00:00:00:24:8a:07:03:00:a3:19:6c brd 00:ff:ff:ff:ff:12:40:1b:80:06:00:00:00:00:00:00:ff:ff:ff:ff
    inet 172.31.6.183/24 brd 172.31.6.255 scope global noprefixroute dynamic mlx5_ib0.8006
13: mlx5_ib0.8004@mlx5_ib0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 4092 qdisc mq state UP group default qlen 256
    link/infiniband 00:00:12:bd:fe:80:00:00:00:00:00:00:24:8a:07:03:00:a3:19:6c brd 00:ff:ff:ff:ff:12:40:1b:80:04:00:00:00:00:00:00:ff:ff:ff:ff
    inet 172.31.4.183/24 brd 172.31.4.255 scope global noprefixroute dynamic mlx5_ib0.8004

Comment 4 zguo 2020-01-22 02:46:35 UTC
This issue does not occur on the test environment below:

[root@rdma-dev-15 ~]$ rpm -q rdma-core ibacm
rdma-core-22.4-1.el7.x86_64
ibacm-22.4-1.el7.x86_64
[root@rdma-dev-15 ~]$ ip a | egrep 'opa|roce'
3: bnxt_roce: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UP group default qlen 1000
14: hfi1_opa0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 65520 qdisc pfifo_fast state UP group default qlen 256
17: hfi1_opa0.8024@hfi1_opa0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 4092 qdisc pfifo_fast state LOWERLAYERDOWN group default qlen 256
18: hfi1_opa0.8022@hfi1_opa0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 4092 qdisc pfifo_fast state LOWERLAYERDOWN group default qlen 256
23: bnxt_roce.43@bnxt_roce: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc noqueue state UP group default qlen 1000
24: bnxt_roce.45@bnxt_roce: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc noqueue state UP group default qlen 1000
[root@rdma-dev-15 ~]$ systemctl status ibacm
● ibacm.service - InfiniBand Address Cache Manager Daemon
   Loaded: loaded (/usr/lib/systemd/system/ibacm.service; disabled; vendor preset: disabled)
   Active: active (running) since Tue 2020-01-21 21:44:42 EST; 1min 12s ago
     Docs: man:ibacm
           file:/etc/rdma/ibacm_opts.cfg
 Main PID: 10550 (ibacm)
   CGroup: /system.slice/ibacm.service
           └─10550 /usr/sbin/ibacm --systemd

Jan 21 21:44:41 rdma-dev-15.lab.bos.redhat.com systemd[1]: Starting InfiniBand Address Cache Manager Daemon...
Jan 21 21:44:42 rdma-dev-15.lab.bos.redhat.com systemd[1]: Started InfiniBand Address Cache Manager Daemon.

Comment 17 Whitney Chadwick 2020-03-05 13:59:23 UTC
We need a clone of this BZ, approved for the 7.8 0day.

Comment 18 John W. Linville 2020-03-05 18:26:49 UTC
OK -- the intent is to approve this for 7.8.z and get the Z-stream update approved for 0day. My interpretation of the required process is to remove the 0day indication from the developer whiteboard on this bug and set the 7.8.z flag. When that Z-stream bug is approved, we will then add 0day to the developer whiteboard on the Z-stream bug. If the above is incorrect, please describe the intended process to follow.

Comment 22 errata-xmlrpc 2020-09-29 19:25:25 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (rdma-core bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:3870
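
Not part of the advisory text, but a quick verification sketch for an affected RHEL 7 host (it assumes the system is subscribed to a channel carrying RHBA-2020:3870):

yum -y update rdma-core ibacm
rpm -q rdma-core ibacm                              # expect rdma-core-22.4-2.el7 or later (see Fixed In Version)
systemctl restart ibacm && systemctl status ibacm
dmesg | tail -n 5                                   # no new ibacm segfault expected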

