Bug 1002698 - AFR : change-logs becoming "0xffffffff0000000000000000" when a brick goes offline and comes back online
Status: CLOSED CURRENTRELEASE
Product: GlusterFS
Classification: Community
Component: replicate
Version: mainline
Hardware: Unspecified Unspecified
Priority: high, Severity: urgent
Assigned To: Pranith Kumar K
Depends On: 1002069
Blocks:
Reported: 2013-08-29 14:05 EDT by Pranith Kumar K
Modified: 2014-04-17 07:46 EDT

See Also:
Fixed In Version: glusterfs-3.5.0
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 1002069
Environment:
Last Closed: 2014-04-17 07:46:48 EDT
Type: Bug


Comment 1 Anand Avati 2013-08-29 14:10:59 EDT
REVIEW: http://review.gluster.org/5736 (cluster/afr: Reset attempted count before attempting blocking lock) posted (#1) for review on master by Pranith Kumar Karampuri (pkarampu@redhat.com)
Comment 2 Anand Avati 2013-08-29 15:29:54 EDT
COMMIT: http://review.gluster.org/5736 committed in master by Anand Avati (avati@redhat.com) 
------
commit 7dd4be82b1a346077673fde9218ae7c8ad8e11e0
Author: Pranith Kumar K <pkarampu@redhat.com>
Date:   Thu Aug 29 22:42:43 2013 +0530

    cluster/afr: Reset attempted count before attempting blocking lock
    
    Problem:
    internal_lock->lk_attempted_count keeps track of the number of blocking
    locks attempted. lk_expected_count keeps track of the number of locks expected.
    Here is the sequence of steps that leads to the illusion that
    a full file lock is achieved, even without attempting any lock.
    
    2 mounts are doing dd on the same file. Both of them witness a brick going
    down and coming back up again. Both of the mounts issue self-heal.
    1) Both mount-1 and mount-2 attempt full file locks in the self-heal domain.
    Let's say mount-1 gets the lock and mount-2 attempts a blocking lock.
    
    2) mount-1 attempts a full file lock in the data domain. It goes into blocking
    mode because some other writes are in progress. Eventually it gets the lock.
    But this leaves lk_attempted_count at 2, and it is not reset.
    mount-1 completes syncing the data.
    
    3) Before releasing the final small range lock, mount-1 attempts a full file lock in
    the data domain to figure out the source/sink. This is put into blocking mode
    again because some other writes are in progress. But this time, seeing the
    stale value of lk_attempted_count equal to lk_expected_count, the blocking_lock
    phase thinks it completed locking without acquiring a single lock :-O.
    
    4) mount-1 reads xattrs without any lock, but since it does not modify the xattrs,
    no harm is done by this phase. It tries to unlock, and the unlocks fail
    because the locks were never taken in the data domain. mount-1 also releases
    the self-heal domain locks.
    
    Our beloved mount-2 now gets the chance to cause horror :-(.
    
    5) mount-2 gets the full range blocking lock in self-heal domain.
    Please note that this sets lk_attempted_count to 2.
    
    6) mount-2 attempts a full range lock in the data domain. Since there are still
    writes ongoing, it switches to blocking mode. But since lk_attempted_count is 2,
    which is the same as lk_expected_count, the blocking lock phase thinks it actually got
    the full range locks even though not a single lock request went out on the wire.
    
    7) mount-2 reads the change-log xattrs, which give the number of operations
    in progress (let's call this 'X'). It does the syncing and at the end of the sync
    decrements the changelog by 'X'. But since that 'X' was introduced by the 'X'
    transactions that are in progress, they also decrement the changelog by 'X'.
    Effectively, for 'X' operations 'X' pre-ops are done but 2 times 'X'
    post-ops are done, resulting in negative changelog numbers.
    
    Fix:
    Reset lk_attempted_count and the inode locks array that is used to remember the
    locks that are granted, before attempting blocking locks.
    
    Change-Id: Ic0a79cd16f32392ea7c790511343c73592bbe6bd
    BUG: 1002698
    Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
    Reviewed-on: http://review.gluster.org/5736
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Anand Avati <avati@redhat.com>
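
The counter problem described in the commit message above can be modeled in a few lines of C. This is only a rough sketch, not the AFR code: the struct, the function names and the `reset` parameter are hypothetical, and only the two counter names come from the commit message. It also assumes each changelog counter is an unsigned 32-bit value, which would be consistent with the "0xffffffff" prefix in the bug title.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-in for AFR's internal lock state.
 * Only the two counter names are taken from the commit message. */
struct internal_lock {
    int lk_expected_count;   /* locks the phase needs, e.g. one per brick */
    int lk_attempted_count;  /* blocking locks attempted so far */
};

/* Hypothetical model of one blocking-lock phase. It "sends" lock requests
 * until attempted == expected and returns how many requests were sent.
 * reset == 0 reuses whatever count the previous phase left behind (the bug);
 * reset != 0 clears the count first (the idea behind the fix). */
static int blocking_lock_phase(struct internal_lock *lk, int reset)
{
    int sent = 0;

    if (reset)
        lk->lk_attempted_count = 0;

    while (lk->lk_attempted_count < lk->lk_expected_count) {
        lk->lk_attempted_count++;
        sent++;  /* a real implementation would issue the lock request here */
    }

    /* The caller treats attempted == expected as "all locks held",
     * even when sent == 0. */
    return sent;
}

/* Changelog counter after `preops` increments and `postops` decrements.
 * Assumption: the counter is an unsigned 32-bit value, so going below
 * zero wraps around instead of becoming negative. */
static uint32_t changelog_after(uint32_t preops, uint32_t postops)
{
    return preops - postops;
}
```

With lk_expected_count set to 2, a first phase sends 2 requests, a second phase without the reset sends 0 requests yet still looks complete, and a phase with the reset sends 2 again. In the same model, changelog_after(1, 2) yields 0xffffffff: one pre-op followed by two post-ops.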
Comment 3 Niels de Vos 2014-04-17 07:46:48 EDT
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.5.0, please reopen this bug report.

glusterfs-3.5.0 has been announced on the Gluster Developers mailing list [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://thread.gmane.org/gmane.comp.file-systems.gluster.devel/6137
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user
