Bug 1153610 - libgfapi crashes in glfs_fini for RDMA type volumes
Summary: libgfapi crashes in glfs_fini for RDMA type volumes
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: libgfapi
Version: mainline
Hardware: x86_64
OS: Linux
medium
high
Target Milestone: ---
Assignee: bugs@gluster.org
QA Contact: storage-qa-internal@redhat.com
URL:
Whiteboard:
Depends On:
Blocks: 1171662
Reported: 2014-10-16 10:27 UTC by Anoop C S
Modified: 2015-05-14 17:44 UTC (History)

Fixed In Version: glusterfs-3.7.0
Doc Type: Bug Fix
Doc Text:
Clone Of:
: 1171662 (view as bug list)
Environment:
Last Closed: 2015-05-14 17:28:00 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:


Attachments
libgfapi test program (29.75 KB, text/x-csrc)
2014-10-16 10:27 UTC, Anoop C S

Description Anoop C S 2014-10-16 10:27:34 UTC
Created attachment 947560
libgfapi test program

Description of problem:

A C program that uses libgfapi against an RDMA volume crashes in glfs_fini().

Version-Release number of selected component (if applicable):

How reproducible:
Always

Steps to Reproduce:
1. Create a 1-brick RDMA volume.
2. Compile the attached C program that uses libgfapi:
gcc -pthread -g -O0 -Wall --pedantic -o gfapi_perf_test -I /usr/include/glusterfs/api gfapi_perf_test.c -lgfapi -lrt

3. Export the following environment variables:
export LD_LIBRARY_PATH=/usr/local/lib
export GFAPI_HOSTNAME=<server ip>
export GFAPI_TRANSPORT=rdma
export GFAPI_VOLNAME=<volume name>

4. Run the compiled binary:
GFAPI_FILES=1 GFAPI_RECSZ=1024 GFAPI_FSZ=1048576 ./gfapi_perf_test

Actual results:
Segmentation fault (core dumped)

Expected results:
Program executes successfully.


Additional info:

Comment 1 Anoop C S 2014-11-03 06:23:39 UTC
Root cause of the crash was identified as follows:

When main() returns in a C program, functions marked as destructors are called in priority order before the process terminates. The librdmacm library defines rdma_cma_fini() as such a destructor. The same function is also invoked as part of rdma_disconnect(), which is initiated through xlator_notify() inside glfs_fini().

Because rdma_disconnect() and its associated cleanup take a long time to complete, the main thread and the RDMA thread race to execute rdma_cma_fini(), which results in a segmentation fault.
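
The two paths into the same cleanup routine can be sketched in C as follows. This is an illustration only, not librdmacm or libgfapi code: every name in it (fake_lib_cleanup, fake_lib_fini, disconnect_thread, start_disconnect) is hypothetical, and the atomic counter is there only so that the demo itself stays well-defined. In a real library the cleanup would touch unsynchronized global state, which is where a crash could arise.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

/* Hypothetical stand-in for a library's global state (not librdmacm code). */
static atomic_int cleanup_entries;

/* Stand-in for a library cleanup routine that is reachable from two places. */
static void fake_lib_cleanup(const char *caller)
{
    int n = atomic_fetch_add(&cleanup_entries, 1) + 1;
    fprintf(stderr, "cleanup entered from %s (entry #%d)\n", caller, n);
}

/* Path 1: runs automatically after main() returns or exit() is called. */
static void __attribute__((destructor)) fake_lib_fini(void)
{
    fake_lib_cleanup("destructor");
}

/* Path 2: a background thread that performs the disconnect. */
static void *disconnect_thread(void *arg)
{
    (void)arg;
    fake_lib_cleanup("disconnect thread");
    return NULL;
}

/* Starts the disconnect and returns immediately, without waiting for it. */
static int start_disconnect(pthread_t *tid)
{
    assert(tid != NULL);
    return pthread_create(tid, NULL, disconnect_thread, NULL);
}
```

If a program calls start_disconnect() and then returns from main() without joining the thread, the destructor can run while the thread is still inside the cleanup routine. Joining the thread, or otherwise waiting for it to report completion, removes that overlap.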

Comment 2 Anand Avati 2014-11-06 04:45:26 UTC
REVIEW: http://review.gluster.org/9060 (libgfapi: Wait for GF_EVENT_CHILD_DOWN in glfs_fini()) posted (#1) for review on master by Anoop C S (achiraya)

Comment 3 Anand Avati 2014-11-06 17:14:58 UTC
REVIEW: http://review.gluster.org/9060 (libgfapi: Wait for GF_EVENT_CHILD_DOWN in glfs_fini()) posted (#2) for review on master by Anoop C S (achiraya)

Comment 4 Anand Avati 2014-11-07 04:35:12 UTC
REVIEW: http://review.gluster.org/9060 (libgfapi: Wait for GF_EVENT_CHILD_DOWN in glfs_fini()) posted (#3) for review on master by Anoop C S (achiraya)

Comment 5 Anand Avati 2014-11-27 09:28:57 UTC
REVIEW: http://review.gluster.org/9060 (libgfapi: Wait for GF_EVENT_CHILD_DOWN in glfs_fini()) posted (#4) for review on master by Anoop C S (achiraya)

Comment 6 Anand Avati 2014-11-27 13:30:46 UTC
REVIEW: http://review.gluster.org/9060 (libgfapi: Wait for GF_EVENT_CHILD_DOWN in glfs_fini()) posted (#5) for review on master by Anoop C S (achiraya)

Comment 7 Anand Avati 2014-11-28 10:23:47 UTC
REVIEW: http://review.gluster.org/9060 (libgfapi: Wait for GF_EVENT_CHILD_DOWN in glfs_fini()) posted (#6) for review on master by Anoop C S (achiraya)

Comment 8 Anand Avati 2014-12-01 07:47:28 UTC
REVIEW: http://review.gluster.org/9060 (libgfapi: Wait for GF_EVENT_CHILD_DOWN in glfs_fini()) posted (#7) for review on master by Poornima G (pgurusid)

Comment 9 Anand Avati 2014-12-02 06:23:21 UTC
REVIEW: http://review.gluster.org/9060 (libgfapi: Wait for GF_EVENT_CHILD_DOWN in glfs_fini()) posted (#8) for review on master by Anoop C S (achiraya)

Comment 10 Anand Avati 2014-12-02 09:23:54 UTC
REVIEW: http://review.gluster.org/9060 (libgfapi: Wait for GF_EVENT_CHILD_DOWN in glfs_fini()) posted (#9) for review on master by Anoop C S (achiraya)

Comment 11 Anand Avati 2014-12-03 11:55:55 UTC
REVIEW: http://review.gluster.org/9060 (libgfapi: Wait for GF_EVENT_CHILD_DOWN in glfs_fini()) posted (#10) for review on master by Anoop C S (achiraya)

Comment 12 Anand Avati 2014-12-03 14:32:38 UTC
REVIEW: http://review.gluster.org/9060 (libgfapi: Wait for GF_EVENT_CHILD_DOWN in glfs_fini()) posted (#11) for review on master by Anoop C S (achiraya)

Comment 13 Anand Avati 2014-12-03 17:09:47 UTC
REVIEW: http://review.gluster.org/9060 (libgfapi: Wait for GF_EVENT_CHILD_DOWN in glfs_fini()) posted (#12) for review on master by Anoop C S (achiraya)

Comment 14 Anand Avati 2014-12-04 12:40:34 UTC
REVIEW: http://review.gluster.org/9060 (libgfapi: Wait for GF_EVENT_CHILD_DOWN in glfs_fini()) posted (#13) for review on master by Anoop C S (achiraya)

Comment 15 Anand Avati 2014-12-04 14:58:54 UTC
REVIEW: http://review.gluster.org/9060 (libgfapi: Wait for GF_EVENT_CHILD_DOWN in glfs_fini()) posted (#14) for review on master by Anoop C S (achiraya)

Comment 16 Anand Avati 2014-12-05 08:39:42 UTC
REVIEW: http://review.gluster.org/9060 (libgfapi: Wait for GF_EVENT_CHILD_DOWN in glfs_fini()) posted (#15) for review on master by Anoop C S (achiraya)

Comment 17 Anand Avati 2014-12-05 13:03:28 UTC
REVIEW: http://review.gluster.org/9060 (libgfapi: Wait for GF_EVENT_CHILD_DOWN in glfs_fini()) posted (#16) for review on master by Anoop C S (achiraya)

Comment 18 Anand Avati 2014-12-05 13:23:56 UTC
REVIEW: http://review.gluster.org/9060 (libgfapi: Wait for GF_EVENT_CHILD_DOWN in glfs_fini()) posted (#17) for review on master by Anoop C S (achiraya)

Comment 19 Anand Avati 2014-12-05 17:05:58 UTC
REVIEW: http://review.gluster.org/9060 (libgfapi: Wait for GF_EVENT_CHILD_DOWN in glfs_fini()) posted (#18) for review on master by Anoop C S (achiraya)

Comment 20 Anand Avati 2014-12-08 09:54:52 UTC
COMMIT: http://review.gluster.org/9060 committed in master by Niels de Vos (ndevos) 
------
commit cd6ffa93dc2a3cb1fcc5438086aebc54f368c2e9
Author: Anoop C S <achiraya>
Date:   Wed Oct 29 09:12:46 2014 -0400

    libgfapi: Wait for GF_EVENT_CHILD_DOWN in glfs_fini()
    
    Whenever glfs_fini() is being called, currently no
    check is made inside the function to determine whether
    the child is already down or not. This patch will wait
    for GF_EVENT_CHILD_DOWN for the active subvol and
    then exits.
    
    TBD:
    Apart from the active subvol, wait for other CHILD_DOWN
    events generated through operations like volume set in
    future.
    
    Change-Id: I81c64ac07b463bfed48bf306f9e8f46ba0f0a76f
    BUG: 1153610
    Signed-off-by: Anoop C S <achiraya>
    Reviewed-on: http://review.gluster.org/9060
    Reviewed-by: Shyamsundar Ranganathan <srangana>
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Raghavendra G <rgowdapp>
    Reviewed-by: Niels de Vos <ndevos>
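
The idea described in the commit message, waiting for a "child down" notification before the teardown function returns, can be sketched with a mutex and a condition variable. This is a minimal sketch under stated assumptions, not the code of the committed patch (see the review URL above for that): struct demo_ctx and all demo_* functions are hypothetical names, and the transport thread is a stand-in for whatever delivers the event in libgfapi.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical context; the real libgfapi structures are not shown here. */
struct demo_ctx {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    bool            child_down;  /* set once the "child down" event arrives */
};

static void demo_ctx_init(struct demo_ctx *ctx)
{
    pthread_mutex_init(&ctx->lock, NULL);
    pthread_cond_init(&ctx->cond, NULL);
    ctx->child_down = false;
}

/* Notification path: record the event and wake any waiter. */
static void demo_notify_child_down(struct demo_ctx *ctx)
{
    pthread_mutex_lock(&ctx->lock);
    ctx->child_down = true;
    pthread_cond_broadcast(&ctx->cond);
    pthread_mutex_unlock(&ctx->lock);
}

/* Teardown path: block until the event has been recorded. */
static void demo_wait_child_down(struct demo_ctx *ctx)
{
    pthread_mutex_lock(&ctx->lock);
    while (!ctx->child_down)  /* loop guards against spurious wakeups */
        pthread_cond_wait(&ctx->cond, &ctx->lock);
    pthread_mutex_unlock(&ctx->lock);
}

/* Stand-in for the thread that completes the disconnect. */
static void *demo_transport_thread(void *arg)
{
    struct demo_ctx *ctx = arg;
    /* ... a slow disconnect and its cleanup would happen here ... */
    demo_notify_child_down(ctx);
    return NULL;
}

/* Teardown that does not return until the disconnect has been reported. */
static int demo_fini(struct demo_ctx *ctx)
{
    pthread_t tid;
    int ret;

    assert(ctx != NULL);
    ret = pthread_create(&tid, NULL, demo_transport_thread, ctx);
    if (ret != 0)
        return ret;
    demo_wait_child_down(ctx);
    return pthread_join(tid, NULL);
}
```

Because demo_fini() returns only after the flag is set, a caller that returns from main() right afterwards no longer overlaps with the disconnect work. A production version would probably also want a timeout on the wait; that is an assumption here, not something stated in this report.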

Comment 21 Niels de Vos 2015-05-14 17:28:00 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.7.0, please open a new bug report.

glusterfs-3.7.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://thread.gmane.org/gmane.comp.file-systems.gluster.devel/10939
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user
