Bug 1215592

Summary: Crash in dht_getxattr_cbk
Product: [Community] GlusterFS Reporter: Susant Kumar Palai <spalai>
Component: distribute Assignee: Susant Kumar Palai <spalai>
Status: CLOSED CURRENTRELEASE QA Contact:
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: mainline CC: bugs, nbalacha
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: glusterfs-3.8rc2 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 1217386 1245565 (view as bug list) Environment:
Last Closed: 2016-06-16 12:55:58 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1217386, 1245565    

Description Susant Kumar Palai 2015-04-27 08:43:41 UTC
Description of problem:

 1. When two threads execute dht_getxattr_cbk in parallel, both may
find local->xattr to be NULL. As a result, dht_aggregate_xattr may
not get executed.

 2. In dht_getxattr_cbk, the following interleaving is possible:

             thread1                         thread2
T1      this_call_cnt = 2 - 1
T2                                this_call_cnt = 1 - 1
T3                                fills local_xattr
T4                                DHT_STACK_UNWIND -> local_wipe
T5      tries to dereference local
        which is already freed,
        leading to crash.
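
The timeline above can be replayed deterministically in a small model. This is only a sketch: `demo_local`, `demo_dec_call_cnt`, `demo_unwind` and `demo_replay_buggy_order` are hypothetical names, not GlusterFS code, and the "wipe" merely sets a flag instead of freeing memory so that the replay stays well-defined.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for dht_local_t; not the real GlusterFS type. */
struct demo_local {
    int  call_cnt;  /* callbacks still outstanding */
    bool wiped;     /* set where the real code would free local */
};

/* Models the call-count decrement done by each callback.
 * Returns the number of callbacks still outstanding. */
int demo_dec_call_cnt(struct demo_local *local)
{
    return --local->call_cnt;
}

/* Models DHT_STACK_UNWIND -> local_wipe. The model only marks the
 * structure as wiped; it does not free anything. */
void demo_unwind(struct demo_local *local)
{
    local->wiped = true;
}

/* Replays T1..T5 in the order shown in the timeline. Returns true if
 * thread1 would touch local after thread2 has already wiped it. */
bool demo_replay_buggy_order(void)
{
    struct demo_local local = { .call_cnt = 2, .wiped = false };

    int t1_cnt = demo_dec_call_cnt(&local); /* T1: thread1, 2 - 1 = 1 */
    int t2_cnt = demo_dec_call_cnt(&local); /* T2: thread2, 1 - 1 = 0 */

    /* T3: thread2 fills the xattr (omitted in this model). */

    if (t2_cnt == 0)
        demo_unwind(&local);                /* T4: thread2 unwinds */

    /* T5: thread1 is not the last caller (t1_cnt == 1), so it goes on
     * to use local, which thread2 has wiped by now. */
    assert(t1_cnt == 1);
    return local.wiped;
}
```

In this model the hazard comes purely from ordering: thread1 decrements the count early and keeps using the shared state afterwards, so nothing stops the last caller from tearing it down first.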

How reproducible:
 Got the crash from a Gerrit regression run: http://build.gluster.org/job/rackspace-regression-2GB-triggered/7345/consoleFull

Actual results:
The client process crashes.

Comment 1 Anand Avati 2015-04-27 08:56:56 UTC
REVIEW: http://review.gluster.org/10389 (dht: tackle thread race in dht_getxattr_cbk) posted (#3) for review on master by Susant Palai (spalai)

Comment 2 Anand Avati 2015-04-29 14:02:03 UTC
COMMIT: http://review.gluster.org/10389 committed in master by Shyamsundar Ranganathan (srangana) 
------
commit 6bde16f7dc4a43d85e488f25ad679abfd24e72d1
Author: Susant Palai <spalai>
Date:   Sun Apr 26 23:49:56 2015 +0530

    dht: tackle thread race in dht_getxattr_cbk
    
    problem:
     1. When two threads execute in parallel in dht_getxattr_cbk
    it may so happen that, both may find local->xattr to be NULL. As
    a result dht_aggregate_xattr may not get executed.
    
     2. In dht_getxattr_cbk,
    
                 thread1                         thread2
    T1      this_call_cnt = 2 -1
    T2                                this_call_cnt = 1 - 1
    T3                                fills local_xattr
    T4                                DHT_STACK_UNWIND -> local_wipe
    T5      tries to dereference local
            which is already freed,
            leading to crash.
    
    Solution:
     for problem1: Execute critical section inside frame lock
    to resolve race.
    
     for problem2: Calculate this_call_count just before out section.
    
    Change-Id: I9827ac8fafebb0c733a4e4f3c710b752f1cd45fa
    BUG: 1215592
    Signed-off-by: Susant Palai <spalai>
    Reviewed-on: http://review.gluster.org/10389
    Reviewed-by: Anuradha Talur <atalur>
    Reviewed-by: N Balachandran <nbalacha>
    Reviewed-by: Kotresh HR <khiremat>
    Tested-by: NetBSD Build System
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Shyamsundar Ranganathan <srangana>
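
The two-part solution in the commit message can be sketched with plain pthreads. This is an illustration under stated assumptions, not the committed patch: `demo_frame`, `demo_getxattr_cbk` and `demo_run` are hypothetical names, a `pthread_mutex_t` stands in for the frame lock, and two counters stand in for storing versus aggregating the xattr dictionary.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for the call frame plus its local state. */
struct demo_frame {
    pthread_mutex_t lock; /* plays the role of the frame lock */
    int call_cnt;         /* callbacks still outstanding */
    int have_xattr;       /* 0 until the first reply has been stored */
    int stored;           /* replies that took the "store" path */
    int aggregated;       /* replies that took the "aggregate" path */
    int ok;               /* set by the last caller at unwind time */
};

/* One callback invocation; arg points at the shared frame. */
void *demo_getxattr_cbk(void *arg)
{
    struct demo_frame *frame = arg;
    int remaining;

    /* Problem 1: the check-then-set on the shared xattr runs inside
     * the lock, so only one caller can observe "not set yet". */
    pthread_mutex_lock(&frame->lock);
    if (!frame->have_xattr) {
        frame->have_xattr = 1;
        frame->stored++;      /* first reply: store it */
    } else {
        frame->aggregated++;  /* later reply: merge into the stored one */
    }
    pthread_mutex_unlock(&frame->lock);

    /* Problem 2: take the call count only after all work on the shared
     * state is done, immediately before the exit path. */
    pthread_mutex_lock(&frame->lock);
    remaining = --frame->call_cnt;
    pthread_mutex_unlock(&frame->lock);

    if (remaining == 0) {
        /* Last caller: every other callback has finished with the
         * frame, so this is where the unwind would happen. */
        frame->ok = (frame->stored == 1 && frame->aggregated == 1);
    }
    /* A caller that is not last does not touch frame past this point. */
    return NULL;
}

/* Runs two callbacks in parallel; returns 1 if exactly one reply was
 * stored and the other was aggregated. */
int demo_run(void)
{
    struct demo_frame frame = { .call_cnt = 2 };
    pthread_t threads[2];

    pthread_mutex_init(&frame.lock, NULL);
    for (int i = 0; i < 2; i++) {
        if (pthread_create(&threads[i], NULL, demo_getxattr_cbk, &frame) != 0)
            abort();
    }
    for (int i = 0; i < 2; i++)
        pthread_join(threads[i], NULL);
    pthread_mutex_destroy(&frame.lock);

    return frame.ok;
}
```

Decrementing the count last means a callback that learns it is not the final one has nothing left to do with the shared state, so the final caller can release it safely.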

Comment 3 Kaleb KEITHLEY 2015-10-22 17:00:12 UTC
changing version to mainline in order to retire pre-release.

If you know the appropriate, correct version for this bug, please
set it.

Comment 6 Niels de Vos 2016-06-16 12:55:58 UTC
This bug is being closed because a release has been made available that should address the reported issue. If the problem is still not fixed with glusterfs-3.8.0, please open a new bug report.

glusterfs-3.8.0 has been announced on the Gluster mailing lists [1], and packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://blog.gluster.org/2016/06/glusterfs-3-8-released/
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user