Bug 919352

Summary: glusterd segfaults/core dumps on "gluster volume status ... detail"
Product: [Community] GlusterFS
Reporter: Lars Ellenberg <lars.ellenberg>
Component: glusterd
Assignee: krishnan parthasarathi <kparthas>
Status: CLOSED CURRENTRELEASE
Severity: unspecified
Priority: unspecified
Version: 3.3.1
CC: gluster-bugs, jdarcy, nsathyan, vbellur
Hardware: Unspecified
OS: Unspecified
Fixed In Version: glusterfs-3.4.0
Doc Type: Bug Fix
Last Closed: 2013-07-24 18:00:57 UTC
Type: Bug
Bug Blocks: 918917

Description Lars Ellenberg 2013-03-08 08:56:27 UTC
Description of problem:

glusterd segfaults/core dumps on "gluster volume status ... detail",
if stat() or similar returns errors on one of the bricks. 

Version-Release number of selected component (if applicable):

both release-3.3 and master, probably all of them.


How reproducible:

always :-)
patch already posted at http://review.gluster.org/#change,4646

Steps to Reproduce:

 * use two xfs backends, create a volume
 * mount that volume
 * cause one of the backends to "force shutdown"
   (e.g. by injecting I/O errors on the backing device, easy with dmsetup)
 * call sync
   (ok, maybe you need to mkdir or touch something in the mountpoint first;
    but in fact, it reproduced with only "sleep $sufficient_seconds",
    or simply sync)
 * ask for "volume status .... detail"
  
Actual results:

glusterd segfaults/core dumps, due to a double free,
as explained with the patch above.

Expected results:

Should be obvious ;-)


Additional info:

I have difficulties reproducing this with ext4 backends.
Apparently the "remounted ro" ext4 still allows for successful stat(),
whereas the "force-shutdown"ed xfs fails stat() with -EINVAL or -EIO.

Comment 1 Vijay Bellur 2013-03-08 17:18:06 UTC
CHANGE: http://review.gluster.org/4646 (glusterd: fix segfault on volume status detail) merged in master by Vijay Bellur (vbellur)

Comment 2 krishnan parthasarathi 2013-03-19 12:17:05 UTC
*** Bug 915329 has been marked as a duplicate of this bug. ***

Comment 3 Anand Avati 2013-04-16 14:15:21 UTC
REVIEW: http://review.gluster.org/4841 (glusterd: fix segfault on volume status detail) posted (#1) for review on release-3.4 by Vijay Bellur (vbellur)

Comment 4 Anand Avati 2013-04-16 16:43:55 UTC
COMMIT: http://review.gluster.org/4841 committed in release-3.4 by Vijay Bellur (vbellur) 
------
commit 4c8bb7c4b0471fe2a5095639f0fd44f50ba28dc8
Author: Lars Ellenberg <lars>
Date:   Sat Mar 2 00:59:15 2013 +0100

    glusterd: fix segfault on volume status detail
    
    If for some reason glusterd_get_brick_root() fails,
    it frees the gf_strdup'ed *mount_point in its own error path,
    and returns -1.
    
    Unfortunately it already had assigned that pointer value
    to the output argument, the caller function
    glusterd_add_brick_detail() sees a non-NULL pointer,
    and free() again: segfault.
    
    Could be fixed with a one-liner (*mount_point = NULL)
    in the error path, but I think glusterd_get_brick_root()
    should only assign to the output argument once all checks passed,
    so I use a local temporary pointer, which increases the patch a bit.
    
    Change-Id: I3f3035f01e80a5e9bdf2da895e4cf7baa3dfbd2f
    BUG: 919352
    Signed-off-by: Lars Ellenberg <lars>
    Reviewed-on: http://review.gluster.org/4646
    Reviewed-by: Krishnan Parthasarathi <kparthas>
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-on: http://review.gluster.org/4841
    Reviewed-by: Jeff Darcy <jdarcy>
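
For readers who want to see the pattern in isolation, here is a minimal sketch of the double free described in the commit message and of the "local temporary pointer" fix. This is not the glusterd source: get_root_buggy(), get_root_fixed(), add_detail(), dup_string() and path_is_usable() are hypothetical stand-ins, and the "path must be absolute" check merely plays the role of whatever check (stat() or similar) fails on the broken brick.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a check that can fail, e.g. stat() on a brick. */
int path_is_usable(const char *path)
{
    return path[0] == '/';
}

/* Small helper: heap-allocated copy of a NUL-terminated string. */
char *dup_string(const char *s)
{
    size_t len = strlen(s) + 1;
    char *copy = malloc(len);
    if (copy != NULL)
        memcpy(copy, s, len);
    return copy;
}

/*
 * Buggy shape: the output argument is assigned before the checks run.
 * On failure the buffer is freed, but *mount_point still holds the old
 * address, so a caller that frees "whatever is non-NULL" frees it twice.
 */
int get_root_buggy(const char *path, char **mount_point)
{
    *mount_point = dup_string(path);
    if (*mount_point == NULL)
        return -1;
    if (!path_is_usable(*mount_point)) {
        free(*mount_point);      /* dangling pointer is left in *mount_point */
        return -1;
    }
    return 0;
}

/*
 * Fixed shape: work on a local temporary and assign to the output
 * argument only once all checks have passed.
 */
int get_root_fixed(const char *path, char **mount_point)
{
    char *tmp = dup_string(path);
    if (tmp == NULL)
        return -1;
    if (!path_is_usable(tmp)) {
        free(tmp);               /* *mount_point is left untouched */
        return -1;
    }
    *mount_point = tmp;
    return 0;
}

/* Caller pattern assumed from the description: always free the result. */
int add_detail(const char *path)
{
    char *mount_point = NULL;
    int ret = get_root_fixed(path, &mount_point);
    free(mount_point);           /* free(NULL) is a no-op on the error path */
    return ret;
}
```

With get_root_fixed(), a caller that starts with mount_point = NULL can free it unconditionally on both the success and the error path. Swapping in get_root_buggy() inside add_detail() would free the same buffer twice whenever the check fails, which is the crash reported here.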