Bug 1346158 - Possible crash due to a timer cancellation race
Summary: Possible crash due to a timer cancellation race
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: disperse
Version: 3.8.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Xavi Hernandez
QA Contact:
URL:
Whiteboard:
Depends On: 1345855
Blocks:
Reported: 2016-06-14 06:48 UTC by Xavi Hernandez
Modified: 2016-07-08 14:42 UTC

Fixed In Version: glusterfs-3.8.1
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1345855
Environment:
Last Closed: 2016-07-08 14:42:55 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Xavi Hernandez 2016-06-14 06:48:22 UTC
+++ This bug was initially created as a clone of Bug #1345855 +++

Description of problem:

Incorrect handling of timers whose cancellation failed could lead to crashes: the timer callback may run after the cancelling thread has already released resources that the callback needs.

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

--- Additional comment from Vijay Bellur on 2016-06-13 12:47:26 CEST ---

REVIEW: http://review.gluster.org/14712 (cluster/ec: Fix race in timer cancellation) posted (#1) for review on master by Xavier Hernandez (xhernandez)

--- Additional comment from Vijay Bellur on 2016-06-13 12:49:57 CEST ---

REVIEW: http://review.gluster.org/14712 (cluster/ec: Fix race in timer cancellation) posted (#2) for review on master by Xavier Hernandez (xhernandez)

--- Additional comment from Vijay Bellur on 2016-06-13 13:40:39 CEST ---

REVIEW: http://review.gluster.org/14712 (cluster/ec: Fix race in timer cancellation) posted (#3) for review on master by Xavier Hernandez (xhernandez)

--- Additional comment from Vijay Bellur on 2016-06-14 03:03:24 CEST ---

COMMIT: http://review.gluster.org/14712 committed in master by Pranith Kumar Karampuri (pkarampu) 
------
commit fb013a9db2cc019d36b07644f24e6c15ed39725c
Author: Xavier Hernandez <xhernandez>
Date:   Mon Jun 13 12:42:47 2016 +0200

    cluster/ec: Fix race in timer cancellation
    
    A race in timer cancellation for delayed unlock could cause a crash
    if the cancelling thread fails to cancel the timer because it has
    already been fired but not executed, and the callback is scheduled
    out of the CPU, delaying it until the thread has released important
    resources needed by the callback.
    
    This patch improves the handling of this case to make it robust.
    
    Change-Id: I5c8a8c6610c5136f71b938aa78b5878ba05238d4
    BUG: 1345855
    Signed-off-by: Xavier Hernandez <xhernandez>
    Reviewed-on: http://review.gluster.org/14712
    Smoke: Gluster Build System <jenkins.com>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.com>
    Reviewed-by: Pranith Kumar Karampuri <pkarampu>
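
The race described in the commit message above can be illustrated with a minimal C sketch. This is not the cluster/ec patch itself: the names delayed_lock_t, delayed_lock_arm, delayed_lock_cancel and delayed_lock_timer_cbk are hypothetical, and the result of the underlying timer cancel call is passed in as a flag so that the interleaving can be reproduced deterministically. The assumed approach is that the timer holds its own reference on the object, and that both the canceller and the callback decide what to do under the same mutex.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of an object with a delayed-unlock timer. */
typedef struct {
    pthread_mutex_t mutex;
    bool timer_armed; /* a delayed unlock is still wanted */
    int  refs;        /* the object may be released only when this is 0 */
    int  unlocks;     /* how many times the delayed unlock really ran */
} delayed_lock_t;

void delayed_lock_init(delayed_lock_t *l)
{
    pthread_mutex_init(&l->mutex, NULL);
    l->timer_armed = false;
    l->refs = 1; /* the owner's reference */
    l->unlocks = 0;
}

/* Arm the delayed unlock. The timer takes its own reference so the
 * object cannot disappear while a callback may still be pending. */
void delayed_lock_arm(delayed_lock_t *l)
{
    pthread_mutex_lock(&l->mutex);
    l->timer_armed = true;
    l->refs++;
    pthread_mutex_unlock(&l->mutex);
}

/* Timer callback. It performs the unlock only if nobody cancelled it,
 * and always drops the timer's reference. Returns true if it unlocked. */
bool delayed_lock_timer_cbk(delayed_lock_t *l)
{
    bool do_unlock;

    pthread_mutex_lock(&l->mutex);
    do_unlock = l->timer_armed;
    l->timer_armed = false;
    if (do_unlock)
        l->unlocks++;
    l->refs--;
    pthread_mutex_unlock(&l->mutex);

    return do_unlock;
}

/* Cancel the delayed unlock. timer_cancel_ok stands in for the result of
 * the real timer API (an assumption of this sketch). If the cancel failed,
 * the callback has already been fired: its reference is left in place and
 * the callback will find timer_armed == false and do nothing harmful.
 * Returns true if a delayed unlock was pending. */
bool delayed_lock_cancel(delayed_lock_t *l, bool timer_cancel_ok)
{
    bool was_armed;

    pthread_mutex_lock(&l->mutex);
    was_armed = l->timer_armed;
    l->timer_armed = false;
    if (was_armed && timer_cancel_ok)
        l->refs--; /* the callback will never run; drop its reference */
    pthread_mutex_unlock(&l->mutex);

    return was_armed;
}
```

In the failing-cancel case, the object still has two references after delayed_lock_cancel(&l, false) returns, so the cancelling thread cannot release it before the late callback has run and dropped the timer's reference.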

Comment 1 Vijay Bellur 2016-06-14 06:56:16 UTC
REVIEW: http://review.gluster.org/14723 (cluster/ec: Fix race in timer cancellation) posted (#1) for review on release-3.8 by Xavier Hernandez (xhernandez)

Comment 2 Vijay Bellur 2016-07-04 13:28:11 UTC
COMMIT: http://review.gluster.org/14723 committed in release-3.8 by Niels de Vos (ndevos) 
------
commit 6484ac71abbc183b31767f6ba761f870be37de76
Author: Xavier Hernandez <xhernandez>
Date:   Mon Jun 13 12:42:47 2016 +0200

    cluster/ec: Fix race in timer cancellation
    
    A race in timer cancellation for delayed unlock could cause a crash
    if the cancelling thread fails to cancel the timer because it has
    already been fired but not executed, and the callback is scheduled
    out of the CPU, delaying it until the thread has released important
    resources needed by the callback.
    
    This patch improves the handling of this case to make it robust.
    
    Backport of:
    > Change-Id: I5c8a8c6610c5136f71b938aa78b5878ba05238d4
    > BUG: 1345855
    > Signed-off-by: Xavier Hernandez <xhernandez>
    > Reviewed-on: http://review.gluster.org/14712
    > Smoke: Gluster Build System <jenkins.com>
    > NetBSD-regression: NetBSD Build System <jenkins.org>
    > CentOS-regression: Gluster Build System <jenkins.com>
    > Reviewed-by: Pranith Kumar Karampuri <pkarampu>
    
    Change-Id: I5c8a8c6610c5136f71b938aa78b5878ba05238d4
    BUG: 1346158
    Signed-off-by: Xavier Hernandez <xhernandez>
    Reviewed-on: http://review.gluster.org/14723
    Smoke: Gluster Build System <jenkins.org>
    Reviewed-by: Pranith Kumar Karampuri <pkarampu>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>

Comment 3 Niels de Vos 2016-07-08 14:42:55 UTC
This bug is being closed because a release that should address the reported issue is now available. If the problem is still not fixed in glusterfs-3.8.1, please open a new bug report.

glusterfs-3.8.1 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://thread.gmane.org/gmane.comp.file-systems.gluster.packaging/156
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user

