Bug 1391299

Summary: [LLNL 7.4 Bug] Crash in Infiniband rdmavt layer when kernel consumer exhausts queue pairs
Product: Red Hat Enterprise Linux 7
Reporter: Jim Foraker <foraker1>
Component: kernel
Assignee: Jonathan Toppins <jtoppins>
kernel sub component: Infiniband
QA Contact: Mike Stowell <mstowell>
Status: CLOSED ERRATA
Docs Contact:
Severity: high
Priority: high
CC: dhoward, jshortt, mstowell, rdma-dev-team, tgummels
Version: 7.3
Keywords: ZStream
Target Milestone:
Target Release: 7.4
Hardware: All
OS: Linux
Whiteboard:
Fixed In Version: kernel-3.10.0-549.el7
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Clones: 1417191
Environment:
Last Closed: 2017-08-02 04:25:56 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1353018, 1381646, 1417191, 1446211

Attachments:

Description	Flags
kernel module reproducer	none
enhanced kernel module to accept module params - makes automation easier	none

Description Jim Foraker 2016-11-03 00:57:52 UTC

Created attachment 1216823
kernel module reproducer

Description of problem:

Several of our nodes have experienced crashes similar to the following:

2016-10-31 14:48:04 [246684.429255] BUG: unable to handle kernel NULL pointer dereference at 0000000000000028
2016-10-31 14:48:04 [246684.438112] IP: [<ffffffffa09ac5cc>] rvt_create_qp+0x3fc/0xa60 [rdmavt]
2016-10-31 14:48:04 [246684.445605] PGD 1ffc7e4067 PUD 2021a63067 PMD 0 
2016-10-31 14:48:04 [246684.450883] Oops: 0002 [#1] SMP 
2016-10-31 14:48:04 [246684.454598] Modules linked in: lmv(OE) fld(OE) mgc(OE) lustre(OE) mdc(OE) lov(OE) osc(OE) fid(OE) ptlrpc(OE) obdclass(OE) rpcsec_gss_krb5 nfsv4 dns_resolver ko2iblnd(OE) lnet(OE) sha512_ssse3 sha512_generic crypto_null libcfs(OE) nfsv3 nfs fscache ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm intel_powerclamp coretemp intel_rapl iosf_mbi hfi1 kvm irqbypass iTCO_wdt mei_me rdmavt ipmi_devintf iTCO_vendor_support mei sb_edac sg lpc_ich pcspkr shpchp i2c_i801 edac_core ipmi_si ipmi_msghandler acpi_power_meter acpi_cpufreq binfmt_misc nfsd auth_rpcgss nfs_acl lockd grace ip_tables ext4 mbcache jbd2 dm_service_time sd_mod crc_t10dif crct10dif_generic be2iscsi bnx2i cnic uio cxgb4i iw_cxgb4 cxgb4 cxgb3i iw_cxgb3 ib_core cxgb3 mdio libcxgbi qla4xxx iscsi_boot_sysfs crct10dif_pclmul crct10dif_common crc32_pclmul 8021q crc32c_intel mgag200 garp ghash_clmulni_intel stp drm_kms_helper llc syscopyarea sysfillrect mrp dm_multipath sysimgblt aesni_intel fb_sys_fops lrw igb gf128mul glue_helper ttm ablk_helper ahci dca cryptd libahci ptp drm pps_core libata i2c_algo_bit i2c_core mxm_wmi fjes wmi sunrpc dm_mirror dm_region_hash dm_log dm_mod iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi zfs(POE) zunicode(POE) zavl(POE) zcommon(POE) znvpair(POE) spl(OE) zlib_deflate
2016-10-31 14:48:04 [246684.581974] CPU: 18 PID: 134712 Comm: kworker/u384:1 Tainted: P           OE  ------------   3.10.0-510.0.0.2chaos.ch6.x86_64 #1
2016-10-31 14:48:04 [246684.594977] Hardware name: Penguin Computing Relion OCP1930e/S2600KPR, BIOS SE5C610.86B.01.01.0016.033120161139 03/31/2016
2016-10-31 14:48:04 [246684.607403] Workqueue: rdma_cm cma_work_handler [rdma_cm]
2016-10-31 14:48:04 [246684.613531] task: ffff880e47a05e20 ti: ffff88201db14000 task.ti: ffff88201db14000
2016-10-31 14:48:04 [246684.621979] RIP: 0010:[<ffffffffa09ac5cc>]  [<ffffffffa09ac5cc>] rvt_create_qp+0x3fc/0xa60 [rdmavt]
2016-10-31 14:48:04 [246684.632182] RSP: 0018:ffff88201db17bf0  EFLAGS: 00010246
2016-10-31 14:48:04 [246684.638205] RAX: 0000000000000000 RBX: ffff88103a2c0000 RCX: 0000000000000000
2016-10-31 14:48:04 [246684.646263] RDX: 0000000000008dd4 RSI: 0000000000000000 RDI: 0000000000000028
2016-10-31 14:48:04 [246684.654320] RBP: ffff88201db17c80 R08: ffff880178ee8000 R09: ffff88031e961800
2016-10-31 14:48:04 [246684.662378] R10: 0000000000000000 R11: 0000000000000000 R12: fffffffffffffff4
2016-10-31 14:48:04 [246684.670436] R13: ffff88103a2c099c R14: ffff88031e9c1000 R15: 00000000000006e8
2016-10-31 14:48:04 [246684.678495] FS:  0000000000000000(0000) GS:ffff88203de00000(0000) knlGS:0000000000000000
2016-10-31 14:48:04 [246684.687619] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
2016-10-31 14:48:04 [246684.694127] CR2: 0000000000000028 CR3: 000000202faa2000 CR4: 00000000003407e0
2016-10-31 14:48:04 [246684.702185] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
2016-10-31 14:48:04 [246684.710243] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
2016-10-31 14:48:04 [246684.718300] Stack:
2016-10-31 14:48:04 [246684.720637]  8000000000000163 000080d21db17fd8 ffff88031e34d308 00000000b54cba81
2016-10-31 14:48:04 [246684.729026]  ffff88201db17c28 ffffffff811c4732 ffffc90200000102 ffff880178ee8030
2016-10-31 14:48:04 [246684.737415]  0000000000000020 ffffc90203907000 0000000000000000 00000000ffffffff
2016-10-31 14:48:04 [246684.745803] Call Trace:
2016-10-31 14:48:04 [246684.748630]  [<ffffffff811c4732>] ? find_vmap_area+0x42/0x70
2016-10-31 14:48:04 [246684.755041]  [<ffffffffa06a5a3f>] ib_create_qp+0x3f/0x250 [ib_core]
2016-10-31 14:48:04 [246684.762131]  [<ffffffffa09ef5a4>] rdma_create_qp+0x34/0xb0 [rdma_cm]
2016-10-31 14:48:04 [246684.769325]  [<ffffffffa0bb93d3>] kiblnd_create_conn+0xc83/0x1a70 [ko2iblnd]
2016-10-31 14:48:04 [246684.777290]  [<ffffffffa0bc9b39>] kiblnd_active_connect+0x79/0x540 [ko2iblnd]
2016-10-31 14:48:04 [246684.785343]  [<ffffffff810cb7f5>] ? sched_clock_cpu+0xa5/0xe0
2016-10-31 14:48:04 [246684.791852]  [<ffffffffa0bcb0e0>] kiblnd_cm_callback+0x10e0/0x1260 [ko2iblnd]
2016-10-31 14:48:04 [246684.799911]  [<ffffffffa09f346c>] cma_work_handler+0x6c/0xa0 [rdma_cm]
2016-10-31 14:48:04 [246684.807294]  [<ffffffff810aab2b>] process_one_work+0x18b/0x4d0
2016-10-31 14:48:04 [246684.813897]  [<ffffffff810aba66>] worker_thread+0x126/0x430
2016-10-31 14:48:04 [246684.820209]  [<ffffffff810ab940>] ? rescuer_thread+0x4b0/0x4b0
2016-10-31 14:48:04 [246684.826814]  [<ffffffff810b34cf>] kthread+0xcf/0xe0
2016-10-31 14:48:04 [246684.832353]  [<ffffffff810b3400>] ? kthread_create_on_node+0x140/0x140
2016-10-31 14:48:04 [246684.839735]  [<ffffffff816acfd8>] ret_from_fork+0x58/0x90
2016-10-31 14:48:04 [246684.845854]  [<ffffffff810b3400>] ? kthread_create_on_node+0x140/0x140
2016-10-31 14:48:04 [246684.853234] Code: 49 8d 75 20 ba 08 00 00 00 4c 89 ff e8 5e 67 98 e0 85 c0 0f 84 f9 01 00 00 49 c7 c4 f2 ff ff ff 49 8b 86 10 01 00 00 48 8d 78 28 <f0> 83 68 28 01 0f 94 c2 84 d2 74 05 e8 a3 d6 ff ff 41 8b 96 a0 
2016-10-31 14:48:04 [246684.875054] RIP  [<ffffffffa09ac5cc>] rvt_create_qp+0x3fc/0xa60 [rdmavt]
2016-10-31 14:48:04 [246684.882638]  RSP <ffff88201db17bf0>
2016-10-31 14:48:04 [246684.886623] CR2: 0000000000000028
2016-10-31 14:48:04 [246684.893660] ---[ end trace d73e3a2bbac48f14 ]---

When rvt_create_qp() runs out of queue pairs to allocate, it attempts to put a reference on qp->ip, but this pointer is NULL if the request comes from kernel space.  In our case, the exhaustion appears to be caused by Lustre (ko2iblnd) churning through queue pairs, but the crash should be triggerable by any in-kernel verbs consumer, including IPoIB on a sufficiently large fabric.

I have posted a patch to linux-rdma, which contains a simple workaround for the issue:

"IB/rdmavt: Only put mmap_info ref if it exists"
http://marc.info/?l=linux-rdma&m=147803367001588&w=2
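
As a rough illustration of the guard that the patch title describes, the following userspace sketch drops the reference only when the pointer is set. The names mmap_info_model, qp_model, and put_mmap_info_if_present are hypothetical stand-ins chosen for this example; they are not the rdmavt structures or the posted patch itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the mmap info object; not the rdmavt definition. */
struct mmap_info_model {
	int refcount;	/* models the reference count held on the object */
};

/* Hypothetical stand-in for a queue pair. */
struct qp_model {
	struct mmap_info_model *ip;	/* NULL when the QP was requested from kernel space */
};

/*
 * Cleanup step: drop the mmap info reference only if one exists.
 * Returns 1 if a reference was dropped, 0 if there was nothing to drop.
 */
int put_mmap_info_if_present(struct qp_model *qp)
{
	if (qp->ip == NULL)
		return 0;	/* kernel consumer: no mmap info, so nothing to put */

	assert(qp->ip->refcount > 0);
	qp->ip->refcount--;
	return 1;
}
```

Without the NULL check, the decrement would touch a field at a small offset from address zero, which is the kind of access the oops above reports.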

To verify the issue and fix, I created a reproducer that simply spawns thousands of queue pairs from within the kernel, which I've attached.

Version-Release number of selected component (if applicable):

3.10.0-510.el7

How reproducible:

With reproducer, always.

Steps to Reproduce:
1. Download the reproducer, manyqp.c
2. Set TEST_ADDR to an IP address on the IB device to be tested, and MAX_CONNS to a value larger than the number of queue pairs configured for the device.
3. Compile as a kernel module and install.

Actual results:

Generates a NULL pointer dereference with the same address and at the same offset into rvt_create_qp() as above.

Expected results:

The kernel does not crash with a NULL pointer dereference.  Once the QP pool is exhausted, subsequent attempts to create QPs should fail with ENOMEM (-12).
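
The expected behavior can be modeled in userspace as a fixed-size pool whose allocator keeps returning -ENOMEM once it is full. This is only an illustrative sketch: qp_pool_model, model_create_qp, and MODEL_MAX_QPS are hypothetical names for this example, not rdmavt code, and the pool size of 4 is arbitrary.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MODEL_MAX_QPS 4	/* hypothetical pool size, for illustration only */

/* Hypothetical model of a device's queue pair pool. */
struct qp_pool_model {
	int in_use;	/* number of QPs handed out so far */
};

/*
 * Returns a non-negative QP index on success, or -ENOMEM once the pool
 * is exhausted. A failed call leaves the pool state unchanged, so every
 * later call fails the same way instead of crashing.
 */
int model_create_qp(struct qp_pool_model *pool)
{
	assert(pool != NULL);

	if (pool->in_use >= MODEL_MAX_QPS)
		return -ENOMEM;

	return pool->in_use++;
}
```

A caller in this model would loop until it sees a negative return value and then stop creating connections, which mirrors what the reproducer is expected to observe on a fixed kernel.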

Comment 5 Jonathan Toppins 2017-01-06 18:45:55 UTC

Created attachment 1238084
enhanced kernel module to accept module params - makes automation easier

Comment 7 Jonathan Toppins 2017-01-19 14:23:17 UTC

Posted:
http://patchwork.usersys.redhat.com/patch/162825/

Comment 8 Rafael Aquini 2017-01-26 14:56:40 UTC

Patch(es) committed on kernel repository and an interim kernel build is undergoing testing

Comment 11 Rafael Aquini 2017-01-27 14:05:39 UTC

Patch(es) available on kernel-3.10.0-549.el7

Comment 15 errata-xmlrpc 2017-08-02 04:25:56 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:1842