Bug 1234408 - STACK_RESET may crash with concurrent statedump requests to a glusterfs process
Summary: STACK_RESET may crash with concurrent statedump requests to a glusterfs process
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: core
Version: 3.7.1
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: krishnan parthasarathi
QA Contact:
URL:
Whiteboard:
Depends On: 1229658
Blocks:
 
Reported: 2015-06-22 13:40 UTC by krishnan parthasarathi
Modified: 2015-11-03 23:06 UTC

Fixed In Version: glusterfs-3.7.3
Doc Type: Bug Fix
Doc Text:
Clone Of: 1229658
Environment:
Last Closed: 2015-07-30 09:51:33 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:



Description krishnan parthasarathi 2015-06-22 13:40:51 UTC
+++ This bug was initially created as a clone of Bug #1229658 +++

Description of problem:
statedump requests that traverse the call frames of all call stacks in execution may race with a STACK_RESET on one of those stacks. This can crash the corresponding glusterfs process. For example, we recently observed this in the regression test case tests/basic/afr/sparse-self-heal.t.

Version-Release number of selected component (if applicable):
N/A

How reproducible:
Intermittent

Steps to Reproduce:
1. Maintain constant I/O on a GlusterFS volume.
2. Concurrently issue statedump requests using kill -SIGUSR1 <process-pid>.

Actual results:
The glusterfs process may crash.

Expected results:
The glusterfs process should not crash, and the statedump should be generated successfully.

Additional info:

--- Additional comment from Anand Avati on 2015-06-09 07:43:20 EDT ---

REVIEW: http://review.gluster.org/11095 (stack: use list_head for managing frames) posted (#5) for review on master by Krishnan Parthasarathi (kparthas@redhat.com)

Comment 1 Anand Avati 2015-06-22 13:41:17 UTC
REVIEW: http://review.gluster.org/11352 (stack: use list_head for managing frames) posted (#1) for review on release-3.7 by Krishnan Parthasarathi (kparthas@redhat.com)

Comment 2 Anand Avati 2015-07-01 11:18:25 UTC
COMMIT: http://review.gluster.org/11352 committed in release-3.7 by Pranith Kumar Karampuri (pkarampu@redhat.com) 
------
commit 8ad92bbde3a17ce9aa44e32ae42df5db259fa2ce
Author: Krishnan Parthasarathi <kparthas@redhat.com>
Date:   Fri Jun 5 10:33:11 2015 +0530

    stack: use list_head for managing frames
    
    PROBLEM
    --------
    
    statedump requests that traverse call frames of all call stacks in
    execution may race with a STACK_RESET on a stack.  This could crash the
    corresponding glusterfs process. For e.g, recently we observed this in a
    regression test case tests/basic/afr/sparse-self-heal.t.
    
    FIX
    ---
    
    gf_proc_dump_pending_frames takes a (TRY_LOCK) call_pool->lock before
    iterating through call frames of all call stacks in progress.  With this
    fix, STACK_RESET removes its call frames under the same lock.
    
    Additional info
    ----------------
    
    This fix makes call_stack_t to use struct list_head in place of custom
    doubly-linked list implementation. This makes call_frame_t manipulation
    easier to maintain in the context of STACK_WIND et al.
    
    BUG: 1234408
    Change-Id: I7e43bccd3994cd9184ab982dba3dbc10618f0d94
    Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
    Reviewed-on: http://review.gluster.org/11095
    Reviewed-by: Niels de Vos <ndevos@redhat.com>
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
    Tested-by: NetBSD Build System <jenkins@build.gluster.org>
    (cherry picked from commit 79e4c7b2fad6db15863efb4e979525b1bd4862ea)
    Reviewed-on: http://review.gluster.org/11352

Comment 3 Kaushal 2015-07-30 09:51:33 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.7.3, please open a new bug report.

glusterfs-3.7.3 has been announced on the Gluster mailinglists [1], packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailinglist [2] and the update infrastructure for your distribution.

[1] http://thread.gmane.org/gmane.comp.file-systems.gluster.devel/12078
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user

