Bug 1217386 - Crash in dht_getxattr_cbk
Summary: Crash in dht_getxattr_cbk
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: distribute
Version: 3.7.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Susant Kumar Palai
QA Contact:
URL:
Whiteboard:
Depends On: 1215592 1245565
Blocks:
Reported: 2015-04-30 09:29 UTC by Susant Kumar Palai
Modified: 2015-07-22 10:12 UTC (History)

Fixed In Version: glusterfs-3.7.0
Doc Type: Bug Fix
Doc Text:
Clone Of: 1215592
Environment:
Last Closed: 2015-05-14 17:29:31 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:


Description Susant Kumar Palai 2015-04-30 09:29:14 UTC
+++ This bug was initially created as a clone of Bug #1215592 +++

Description of problem:

 1. When two threads run dht_getxattr_cbk in parallel, both may find
local->xattr to be NULL. As a result, dht_aggregate_xattr may not get
executed.

 2. In dht_getxattr_cbk,

             thread1                         thread2
T1      this_call_cnt = 2 -1
T2                                this_call_cnt = 1 - 1
T3                                fills local_xattr
T4                                DHT_STACK_UNWIND -> local_wipe
T5      tries to dereference local
        which is already freed,
        leading to crash.
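
The first race above is a check-then-act pattern on local->xattr. As a rough illustration only, here is a minimal pthread model of making the NULL check and the update a single critical section. This is not GlusterFS code: model_local, reply, model_getxattr_cbk and run_two_replies are hypothetical names, the mutex stands in for the frame lock, and a plain integer sum stands in for the aggregation step.

```c
/* Simplified model, not GlusterFS code: every name here is hypothetical. */
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

struct model_local {
    pthread_mutex_t lock;    /* stands in for the frame lock */
    int            *xattr;   /* stands in for local->xattr; NULL until the first reply */
    int             storage; /* backing store for the aggregated value */
};

struct reply {
    struct model_local *local;
    int                 value; /* stands in for one subvolume's reply */
};

/* Callback body: the NULL check and the update form one critical section,
 * so exactly one reply is "first" and every later reply is aggregated. */
static void *model_getxattr_cbk(void *arg)
{
    struct reply *r = arg;
    struct model_local *local = r->local;

    pthread_mutex_lock(&local->lock);
    if (local->xattr == NULL) {
        /* First reply: adopt it as the result. */
        local->storage = r->value;
        local->xattr = &local->storage;
    } else {
        /* Later reply: aggregate into the existing result. */
        *local->xattr += r->value;
    }
    pthread_mutex_unlock(&local->lock);
    return NULL;
}

/* Delivers two replies in parallel and returns the aggregated value. */
int run_two_replies(int a, int b)
{
    struct model_local local = { .xattr = NULL, .storage = 0 };
    struct reply r1 = { &local, a };
    struct reply r2 = { &local, b };
    pthread_t t1, t2;
    int rc;

    pthread_mutex_init(&local.lock, NULL);
    rc = pthread_create(&t1, NULL, model_getxattr_cbk, &r1);
    assert(rc == 0);
    rc = pthread_create(&t2, NULL, model_getxattr_cbk, &r2);
    assert(rc == 0);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    pthread_mutex_destroy(&local.lock);

    return local.xattr ? *local.xattr : 0;
}
```

Calling run_two_replies(3, 4) from a small main (built with -pthread) should aggregate both replies regardless of which thread runs first. Without the lock, both threads could take the "first reply" branch and one value would be lost.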

Version-Release number of selected component (if applicable):


How reproducible:
 Got the crash from gerrit: http://build.gluster.org/job/rackspace-regression-2GB-triggered/7345/consoleFull

Steps to Reproduce:
1.
2.
3.

Actual results:
client process crashes 

Expected results:


Additional info:

--- Additional comment from Anand Avati on 2015-04-27 13:56:56 MVT ---

REVIEW: http://review.gluster.org/10389 (dht: tackle thread race in dht_getxattr_cbk) posted (#3) for review on master by Susant Palai (spalai)

--- Additional comment from Anand Avati on 2015-04-29 19:02:03 MVT ---

COMMIT: http://review.gluster.org/10389 committed in master by Shyamsundar Ranganathan (srangana) 
------
commit 6bde16f7dc4a43d85e488f25ad679abfd24e72d1
Author: Susant Palai <spalai>
Date:   Sun Apr 26 23:49:56 2015 +0530

    dht: tackle thread race in dht_getxattr_cbk
    
    problem:
     1. When two threads execute in parallel in dht_getxattr_cbk
    it may so happen that, both may find local->xattr to be NULL. As
    a result dht_aggregate_xattr may not get executed.
    
     2. In dht_getxattr_cbk,
    
                 thread1                         thread2
    T1      this_call_cnt = 2 -1
    T2                                this_call_cnt = 1 - 1
    T3                                fills local_xattr
    T4                                DHT_STACK_UNWIND -> local_wipe
    T5      tries to dereference local
            which is already freed,
            leading to crash.
    
    Solution:
     for problem1: Execute critical section inside frame lock
    to resolve race.
    
     for problem2: Calculate this_call_count just before out section.
    
    Change-Id: I9827ac8fafebb0c733a4e4f3c710b752f1cd45fa
    BUG: 1215592
    Signed-off-by: Susant Palai <spalai>
    Reviewed-on: http://review.gluster.org/10389
    Reviewed-by: Anuradha Talur <atalur>
    Reviewed-by: N Balachandran <nbalacha>
    Reviewed-by: Kotresh HR <khiremat>
    Tested-by: NetBSD Build System
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Shyamsundar Ranganathan <srangana>
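
The ordering described for problem 2 can be modelled roughly as follows. This is a sketch under stated assumptions, not the actual patch: model_local, model_frame_return, model_cbk and run_callbacks are hypothetical names, and freeing the structure stands in for the unwind and local wipe. The point it shows is that each callback finishes all work on the shared state first and decrements the outstanding-call counter as its very last access, so only the thread that observes zero ever touches the state again.

```c
/* Simplified model, not GlusterFS code: every name here is hypothetical. */
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

struct model_local {
    pthread_mutex_t lock;  /* stands in for the frame lock */
    int call_cnt;          /* replies still outstanding */
    int sum;               /* shared state every callback updates */
};

static int unwind_count;   /* how many times the shared state was torn down */
static int final_sum;      /* value seen by the thread that unwinds */

/* Decrement the outstanding-call counter under the lock and return the
 * new value. For every caller except the last, this is the final access. */
static int model_frame_return(struct model_local *local)
{
    int remaining;

    pthread_mutex_lock(&local->lock);
    remaining = --local->call_cnt;
    pthread_mutex_unlock(&local->lock);
    return remaining;
}

static void *model_cbk(void *arg)
{
    struct model_local *local = arg;
    int this_call_cnt;

    /* 1. Do all work that touches local, inside the lock. */
    pthread_mutex_lock(&local->lock);
    local->sum += 1;
    pthread_mutex_unlock(&local->lock);

    /* 2. Only now compute this_call_cnt, just before the exit path. */
    this_call_cnt = model_frame_return(local);

    /* 3. Only the last caller may touch local again; it tears it down. */
    if (this_call_cnt == 0) {
        final_sum = local->sum;
        unwind_count += 1;
        pthread_mutex_destroy(&local->lock);
        free(local);
    }
    return NULL;
}

/* Runs n parallel callbacks; returns how many times teardown happened
 * and stores the sum seen at teardown in *sum_out. */
int run_callbacks(int n, int *sum_out)
{
    pthread_t tids[16];
    struct model_local *local;
    int i, rc;

    assert(n > 0 && n <= 16);
    local = malloc(sizeof(*local));
    assert(local != NULL);
    pthread_mutex_init(&local->lock, NULL);
    local->call_cnt = n;
    local->sum = 0;
    unwind_count = 0;
    final_sum = 0;

    for (i = 0; i < n; i++) {
        rc = pthread_create(&tids[i], NULL, model_cbk, local);
        assert(rc == 0);
    }
    for (i = 0; i < n; i++)
        pthread_join(tids[i], NULL);

    *sum_out = final_sum;
    return unwind_count;
}
```

In this model, decrementing the counter at the top of the callback (as in the T1 to T5 timeline) would let the last thread free the structure while an earlier thread is still about to update it. Moving the decrement to the end removes that window.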

Comment 1 Anand Avati 2015-04-30 09:32:29 UTC
REVIEW: http://review.gluster.org/10467 (dht: tackle thread race in dht_getxattr_cbk) posted (#1) for review on release-3.7 by Susant Palai (spalai)

Comment 2 Anand Avati 2015-05-01 14:48:28 UTC
COMMIT: http://review.gluster.org/10467 committed in release-3.7 by Vijay Bellur (vbellur) 
------
commit 85a0fb6d304babb9d7f35e26a45677d8210da8eb
Author: Susant Palai <spalai>
Date:   Sun Apr 26 23:49:56 2015 +0530

    dht: tackle thread race in dht_getxattr_cbk
    
    problem:
     1. When two threads execute in parallel in dht_getxattr_cbk
    it may so happen that, both may find local->xattr to be NULL. As
    a result dht_aggregate_xattr may not get executed.
    
     2. In dht_getxattr_cbk,
    
                 thread1                         thread2
    T1      this_call_cnt = 2 -1
    T2                                this_call_cnt = 1 - 1
    T3                                fills local_xattr
    T4                                DHT_STACK_UNWIND -> local_wipe
    T5      tries to dereference local
            which is already freed,
            leading to crash.
    
    Solution:
     for problem1: Execute critical section inside frame lock
    to resolve race.
    
     for problem2: Calculate this_call_count just before out section.
    
    BUG: 1217386
    Change-Id: I14fdb0cb1825896721670d71f48c93053448be7b
    Signed-off-by: Susant Palai <spalai>
    Reviewed-on: http://review.gluster.org/10389
    Reviewed-by: Anuradha Talur <atalur>
    Reviewed-by: N Balachandran <nbalacha>
    Reviewed-by: Kotresh HR <khiremat>
    Reviewed-by: Shyamsundar Ranganathan <srangana>
    Signed-off-by: Susant Palai <spalai>
    Reviewed-on: http://review.gluster.org/10467
    Tested-by: Gluster Build System <jenkins.com>

Comment 3 Niels de Vos 2015-05-14 17:29:31 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.7.0, please open a new bug report.

glusterfs-3.7.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://thread.gmane.org/gmane.comp.file-systems.gluster.devel/10939
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user
