Bug 1234408

Summary: STACK_RESET may crash with concurrent statedump requests to a glusterfs process
Product: [Community] GlusterFS
Component: core
Version: 3.7.1
Reporter: krishnan parthasarathi <kparthas>
Assignee: krishnan parthasarathi <kparthas>
CC: bugs, gluster-bugs, nsathyan
Status: CLOSED CURRENTRELEASE
Severity: unspecified
Priority: unspecified
Hardware: Unspecified
OS: Unspecified
Type: Bug
Doc Type: Bug Fix
Fixed In Version: glusterfs-3.7.3
Clone Of: 1229658
Bug Depends On: 1229658
Last Closed: 2015-07-30 09:51:33 UTC

Description krishnan parthasarathi 2015-06-22 13:40:51 UTC
+++ This bug was initially created as a clone of Bug #1229658 +++

Description of problem:
Statedump requests that traverse the call frames of all call stacks in execution may race with a STACK_RESET on one of those stacks. This can crash the corresponding glusterfs process. For example, we recently observed this in the regression test tests/basic/afr/sparse-self-heal.t.
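
For illustration only, here is a minimal C sketch of the race with simplified stand-in types; the real call_pool_t, call_stack_t, and call_frame_t definitions live in libglusterfs/src/stack.h, and the function names below (dump_pending_frames, stack_reset_unlocked) are hypothetical:

/*
 * Hypothetical sketch of the race, NOT the actual glusterfs code.
 * A single mutex stands in for call_pool->lock, and a singly-linked
 * list stands in for the pre-fix frame chain.
 */
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

struct frame {
        struct frame *next;
        int           id;
};

struct stack {
        pthread_mutex_t *pool_lock;   /* stands in for call_pool->lock */
        struct frame    *frames;
};

/* Statedump side: walks the frame list under the pool lock, using a
 * trylock because a statedump is best-effort. */
static void
dump_pending_frames(struct stack *stk)
{
        if (pthread_mutex_trylock(stk->pool_lock) != 0)
                return;                        /* pool busy, skip dump */
        for (struct frame *f = stk->frames; f != NULL; f = f->next)
                printf("frame %d\n", f->id);   /* dereferences f->next */
        pthread_mutex_unlock(stk->pool_lock);
}

/* Reset side, pre-fix: frees the same frames WITHOUT taking the pool
 * lock.  If this runs concurrently with dump_pending_frames(), the
 * dumper can follow f->next into freed memory and crash. */
static void
stack_reset_unlocked(struct stack *stk)
{
        struct frame *f = stk->frames;

        stk->frames = NULL;
        while (f != NULL) {
                struct frame *next = f->next;
                free(f);
                f = next;
        }
}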

Version-Release number of selected component (if applicable):
N/A

How reproducible:
Intermittent

Steps to Reproduce:
1. Maintain constant I/O on a GlusterFS volume.
2. Concurrently issue statedump requests with kill -SIGUSR1 <process-pid>.

Actual results:
The glusterfs process may crash.

Expected results:
The glusterfs process should not crash, and the statedump should be generated successfully.

Additional info:

--- Additional comment from Anand Avati on 2015-06-09 07:43:20 EDT ---

REVIEW: http://review.gluster.org/11095 (stack: use list_head for managing frames) posted (#5) for review on master by Krishnan Parthasarathi (kparthas)

Comment 1 Anand Avati 2015-06-22 13:41:17 UTC
REVIEW: http://review.gluster.org/11352 (stack: use list_head for managing frames) posted (#1) for review on release-3.7 by Krishnan Parthasarathi (kparthas)

Comment 2 Anand Avati 2015-07-01 11:18:25 UTC
COMMIT: http://review.gluster.org/11352 committed in release-3.7 by Pranith Kumar Karampuri (pkarampu) 
------
commit 8ad92bbde3a17ce9aa44e32ae42df5db259fa2ce
Author: Krishnan Parthasarathi <kparthas>
Date:   Fri Jun 5 10:33:11 2015 +0530

    stack: use list_head for managing frames
    
    PROBLEM
    --------
    
    statedump requests that traverse call frames of all call stacks in
    execution may race with a STACK_RESET on a stack.  This could crash the
    corresponding glusterfs process. For example, we recently observed this in a
    regression test case tests/basic/afr/sparse-self-heal.t.
    
    FIX
    ---
    
    gf_proc_dump_pending_frames takes a (TRY_LOCK) call_pool->lock before
    iterating through call frames of all call stacks in progress.  With this
    fix, STACK_RESET removes its call frames under the same lock.
    
    Additional info
    ----------------
    
    This fix makes call_stack_t use struct list_head in place of the
    custom doubly-linked list implementation. This makes call_frame_t
    manipulation easier to maintain in the context of STACK_WIND et al.
    
    BUG: 1234408
    Change-Id: I7e43bccd3994cd9184ab982dba3dbc10618f0d94
    Signed-off-by: Krishnan Parthasarathi <kparthas>
    Reviewed-on: http://review.gluster.org/11095
    Reviewed-by: Niels de Vos <ndevos>
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Pranith Kumar Karampuri <pkarampu>
    Tested-by: NetBSD Build System <jenkins.org>
    (cherry picked from commit 79e4c7b2fad6db15863efb4e979525b1bd4862ea)
    Reviewed-on: http://review.gluster.org/11352
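
For context, a minimal sketch of the locking pattern the FIX section above describes, again with simplified stand-in types rather than the actual libglusterfs code: the reset detaches its frames onto a private list while holding the same pool lock the dumper takes, and the frames are chained on a kernel-style struct list_head. The names stack_reset_locked and the list helper below are hypothetical; list_splice_init is modeled on the Linux kernel's list.h:

/*
 * Hypothetical sketch of the fixed pattern, NOT the actual
 * libglusterfs code.  Frames hang off a struct list_head, and the
 * reset unlinks them under the same lock that the statedump
 * traversal (gf_proc_dump_pending_frames) takes.
 */
#include <pthread.h>

struct list_head {
        struct list_head *next, *prev;
};

static inline void
INIT_LIST_HEAD(struct list_head *h)
{
        h->next = h;
        h->prev = h;
}

/* Move every node on `list` onto `head`, leaving `list` empty. */
static inline void
list_splice_init(struct list_head *list, struct list_head *head)
{
        if (list->next == list)
                return;                       /* nothing to move */

        struct list_head *first = list->next;
        struct list_head *last  = list->prev;

        first->prev      = head;
        last->next       = head->next;
        head->next->prev = last;
        head->next       = first;
        INIT_LIST_HEAD(list);
}

struct call_pool {
        pthread_mutex_t  lock;        /* also taken by the statedump */
};

struct call_stack {
        struct call_pool *pool;
        struct list_head  frames;     /* this stack's call frames */
};

/* Reset: detach all frames while holding pool->lock, so a concurrent
 * dumper never observes a half-torn-down list; the detached frames
 * can then be destroyed without holding the lock. */
static void
stack_reset_locked(struct call_stack *stk)
{
        struct list_head detached;

        INIT_LIST_HEAD(&detached);

        pthread_mutex_lock(&stk->pool->lock);
        list_splice_init(&stk->frames, &detached);
        pthread_mutex_unlock(&stk->pool->lock);

        /* ... walk `detached` and free each frame here ... */
}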

Comment 3 Kaushal 2015-07-30 09:51:33 UTC
This bug is being closed because a release has been made available that should address the reported issue. If the problem is still not fixed with glusterfs-3.7.3, please open a new bug report.

glusterfs-3.7.3 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://thread.gmane.org/gmane.comp.file-systems.gluster.devel/12078
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user