Bug 1522651 - rdma transport may access an obsolete item in gf_rdma_device_t->all_mr, causing a glusterfsd/glusterfs process crash.
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: rdma
Version: mainline
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: ---
Assignee: Mohammed Rafi KC
QA Contact: Rahul Hinduja
URL:
Whiteboard:
Depends On:
Blocks: 1525850 1527699
 
Reported: 2017-12-06 07:48 UTC by Yi Wang
Modified: 2018-03-15 11:22 UTC (History)
3 users

Fixed In Version: glusterfs-4.0.0
Clone Of:
Clones: 1525850 1527699
Environment:
Last Closed: 2018-03-15 11:22:36 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Yi Wang 2017-12-06 07:48:18 UTC
Description of problem:
        In rdma.c, gf_rdma_device_t->all_mr is a list of __gf_rdma_arena_mr entries (each holding RDMA Memory Region (MR) content) in the rdma rpc-transport.
        The rdma rpc-transport adds and deletes items on gf_rdma_device_t->all_mr as MRs are registered, deregistered, and freed.
        Because gf_rdma_device_t->all_mr is accessed by different threads and is not mutex protected, the rdma transport may access obsolete items in it.

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:
        Under heavy load, an item in gf_rdma_device_t->all_mr can be released by one thread while another thread is still accessing it. As a result, the glusterfsd/glusterfs process crashes.

Expected results:
        Access to gf_rdma_device_t->all_mr must be mutex protected.

Additional info:
        None

Comment 2 Yi Wang 2017-12-06 08:48:49 UTC
correct product name to GlusterFS.

Comment 3 Yi Wang 2017-12-06 10:01:57 UTC
correct the hardware selection.

Comment 4 Worker Ant 2017-12-12 05:50:05 UTC
COMMIT: https://review.gluster.org/18943 committed in master by "Yi Wang" <wangyi> with a commit message- rpc-transport/rdma: Add a mutex for the list of RDMA Memory Region(MR) access

Problem: gf_rdma_device_t->all_mr is a list of __gf_rdma_arena_mr entries
	 (each holding MR content) in the rdma rpc-transport. The rdma
	 rpc-transport adds and deletes items on the list as MRs are
	 registered, deregistered, and freed. Because gf_rdma_device_t->all_mr
	 is accessed by different threads and is not mutex protected, the
	 rdma transport may access obsolete items in it.

Solution: Add a mutex protection for the gf_rdma_device_t->all_mr.

Change-Id: I2b7de0f7aa516b90bb6f3c6aae3aadd23b243900
BUG: 1522651
Signed-off-by: Yi Wang <wangyi>

Comment 5 Worker Ant 2017-12-14 03:05:02 UTC
REVIEW: https://review.gluster.org/19032 (rpc-transport/rdma: Add a mutex for the list of RDMA Memory Region(MR) access) posted (#1) for review on release-3.12 by Yi Wang

Comment 6 Worker Ant 2017-12-14 03:10:19 UTC
REVIEW: https://review.gluster.org/19033 (rpc-transport/rdma: Add a mutex for the list of RDMA Memory Region(MR) access) posted (#1) for review on release-3.13 by Yi Wang

Comment 7 Yi Wang 2017-12-14 03:34:51 UTC
The fix for this bug needs to be backported to release-3.11 through release-3.13, and the commits depend on the bug status. Please help me change the bug status.
>> BUG id 1522651 has an invalid status as MODIFIED. Acceptable status values are NEW, ASSIGNED or POST.

Comment 8 Worker Ant 2017-12-14 08:13:09 UTC
REVISION POSTED: https://review.gluster.org/19032 (rpc-transport/rdma: Add a mutex for the list of RDMA Memory Region(MR) access) posted (#2) for review on release-3.12 by Mohammed Rafi KC

Comment 9 Worker Ant 2017-12-14 08:42:07 UTC
REVIEW: https://review.gluster.org/19035 (rpc-transport/rdma: Add a mutex for the list of RDMA Memory Region(MR) access) posted (#1) for review on release-3.11 by Yi Wang

Comment 10 Worker Ant 2017-12-19 23:58:25 UTC
REVISION POSTED: https://review.gluster.org/19033 (rpc-transport/rdma: Add a mutex for the list of RDMA Memory Region(MR) access) posted (#2) for review on release-3.13 by Shyamsundar Ranganathan

Comment 11 Shyamsundar 2018-03-15 11:22:36 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-4.0.0, please open a new bug report.

glusterfs-4.0.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/announce/2018-March/000092.html
[2] https://www.gluster.org/pipermail/gluster-users/

