Created attachment 731831 [details] This test script needs to be run in the regression tests framework (ref: http://www.gluster.org/community/documentation/index.php/Development_Work_Flow#Test_cases) Description of problem: Glusterd could end up incorrectly updating its in-memory knowledge about the cluster and/or volumes, when certain kind of network/node failures happen in conjunction with volume/peer operations. Users don't see the effects of this (latent) issue 'often', because the chances for a race'y update depend on the relative latency between backend-disks and network. Version-Release number of selected component (if applicable): How reproducible: Intermittent. Steps to Reproduce: 1. Run the regression test script added. 2. 3. Actual results: Sometimes, glusterd may crash and at other times, it could lead to inconsistent view of the cluster among glusterds. Expected results: Glusterd must continue to function as expected. Additional info:
REVIEW: http://review.gluster.org/4784 (glusterd: big lock - a coarse-grained locking to prevent races) posted (#1) for review on master by Krishnan Parthasarathi (kparthas)
REVIEW: http://review.gluster.org/4784 (glusterd: big lock - a coarse-grained locking to prevent races) posted (#2) for review on master by Krishnan Parthasarathi (kparthas)
REVIEW: http://review.gluster.org/4795 (syncenv: be robust against spurious wake()s) posted (#1) for review on master by Anand Avati (avati)
COMMIT: http://review.gluster.org/4795 committed in master by Anand Avati (avati) ------ commit ce111f472796d027796b0cc3a4a6f78689f1172d Author: Anand Avati <avati> Date: Fri Apr 5 02:18:06 2013 -0700 syncenv: be robust against spurious wake()s In the current implementation, when the callers of synctasks perform a spurious wake() of a sleeping synctask (i.e, an extra wake() soon after a wake() which already woke up a yielded synctask), there is now a possibility of two sync threacs picking up the same synctask. This can result in a crash. The fix is to change ->slept = 0|1 and membership of synctask in runqueue atomically. Today we dequeue a task from the runqueue in syncenv_task(), but reset ->slept = 0 much later in synctask_switchto() in an unlocked manner -- which is safe, when there are no spurious wake()s. However, this opens a race window where, if a second wake() happens after the dequeue, but before setting ->slept = 0, it results in queueing the same synctask in the runqueue once again, and get picked up by a different synctask. This is has been diagnosed to be the crashes in the regression tests of http://review.gluster.org/4784. However that patch still has a spurious wake() [the trigger for this bug] which is yet to be fixed. Change-Id: I9b4b9dd5115d6e62ba45162ae90dd5e917a4f83d BUG: 948686 Signed-off-by: Anand Avati <avati> Reviewed-on: http://review.gluster.org/4795 Reviewed-by: Krishnan Parthasarathi <kparthas> Tested-by: Gluster Build System <jenkins.com>
REVIEW: http://review.gluster.org/4784 (glusterd: big lock - a coarse-grained locking to prevent races) posted (#3) for review on master by Krishnan Parthasarathi (kparthas)
REVIEW: http://review.gluster.org/4784 (glusterd: big lock - a coarse-grained locking to prevent races) posted (#4) for review on master by Anand Avati (avati)
REVIEW: http://review.gluster.org/4784 (glusterd: big lock - a coarse-grained locking to prevent races) posted (#5) for review on master by Krishnan Parthasarathi (kparthas)
REVIEW: http://review.gluster.org/4802 (glusterd: Fixed spurious wakeups in glusterd syncops) posted (#1) for review on master by Krishnan Parthasarathi (kparthas)
REVIEW: http://review.gluster.org/4802 (glusterd: Fixed spurious wakeups in glusterd syncops) posted (#2) for review on master by Krishnan Parthasarathi (kparthas)
REVIEW: http://review.gluster.org/4784 (glusterd: big lock - a coarse-grained locking to prevent races) posted (#6) for review on master by Krishnan Parthasarathi (kparthas)
REVIEW: http://review.gluster.org/4802 (glusterd: Fixed spurious wakeups in glusterd syncops) posted (#3) for review on master by Krishnan Parthasarathi (kparthas)
REVIEW: http://review.gluster.org/4784 (glusterd: big lock - a coarse-grained locking to prevent races) posted (#7) for review on master by Krishnan Parthasarathi (kparthas)
COMMIT: http://review.gluster.org/4802 committed in master by Anand Avati (avati) ------ commit 77a02c4dd8467e2a78b3ab3cdef95178ef4b1898 Author: Kaleb S. KEITHLEY <kkeithle> Date: Thu Apr 11 09:36:35 2013 -0400 object-storage: rebase Swift to 1.8.0 (grizzly) Two minor tweaks found while packaging 3.4.0-0.1.alpha2 for Fedora 19 BUG: 948039 Change-Id: I97175636164702cf4042bc4a18ffead76ad386cb Signed-off-by: Kaleb S. KEITHLEY <kkeithle> Reviewed-on: http://review.gluster.org/4807 Reviewed-by: Jeff Darcy <jdarcy> Tested-by: Gluster Build System <jenkins.com>
REVIEW: http://review.gluster.org/4784 (glusterd: big lock - a coarse-grained locking to prevent races) posted (#8) for review on master by Krishnan Parthasarathi (kparthas)
REVIEW: http://review.gluster.org/4820 (glusterd: big lock - a coarse-grained locking to prevent races) posted (#1) for review on release-3.4 by Jeff Darcy (jdarcy)
COMMIT: http://review.gluster.org/4784 committed in master by Anand Avati (avati) ------ commit f34343d3751cd73e8eabe6d5544fb1f58b316595 Author: Krishnan Parthasarathi <kparthas> Date: Tue Apr 2 07:56:25 2013 +0530 glusterd: big lock - a coarse-grained locking to prevent races There are primarily three lists that are part of glusterd process, that are concurrently accessed. Namely, priv->volumes, priv->peers and volinfo->bricks_list. Big-lock approach ----------------- WHAT IS IT? Big lock is a coarse-grained lock which protects all three lists, mentioned above, from racy access. HOW DOES IT WORK? At any given point in time, glusterd's thread(s) are in execution _iff_ there is a preceding, inbound network event. Of course, the sigwaiter thread and timer thread are exceptions. A network event is an external trigger to glusterd, via the epoll thread, in the form of POLLIN and POLLERR. As long as we take the big-lock at all such entry points and yield it when we are done, we are guaranteed that all the network events, accessing the global lists, are serialised. This amounts to holding the big lock at - all the handlers of all the actors in glusterd. (POLLIN) - all the cbks in glusterd. (POLLIN) - rpc_notify (DISCONNECT event), if we access/modify one of the three lists. (POLLERR) In the case of synctask'ized volume operations, we must remember that, if we held the big lock for the entire duration of the handler, we may block other non-synctask rpc actors from executing. For eg, volume-start would block in PMAP SIGNIN, if done incorrectly. To prevent this, we need to yield the big lock, when we yield the synctask, and reacquire on waking up of the synctask. Change-Id: Ib929f9905b55fb6c3fc27fefb497a26dba058e4f BUG: 948686 Signed-off-by: Krishnan Parthasarathi <kparthas> Reviewed-on: http://review.gluster.org/4784 Reviewed-by: Jeff Darcy <jdarcy> Tested-by: Gluster Build System <jenkins.com>
REVIEW: http://review.gluster.org/4833 (syncenv: be robust against spurious wake()s) posted (#1) for review on release-3.4 by Krishnan Parthasarathi (kparthas)
REVIEW: http://review.gluster.org/4834 (glusterd: Fixed spurious wakeups in glusterd syncops) posted (#1) for review on release-3.4 by Krishnan Parthasarathi (kparthas)
REVIEW: http://review.gluster.org/4835 (glusterd: big lock - a coarse-grained locking to prevent races) posted (#1) for review on release-3.4 by Krishnan Parthasarathi (kparthas)
REVIEW: http://review.gluster.org/4837 (glusterd: Avoided deadlock in single node cluster, glusterd restart) posted (#1) for review on master by Krishnan Parthasarathi (kparthas)
REVIEW: http://review.gluster.org/4837 (glusterd: Avoided deadlock in single node cluster, glusterd restart) posted (#2) for review on master by Krishnan Parthasarathi (kparthas)
REVIEW: http://review.gluster.org/4834 (glusterd: Fixed spurious wakeups in glusterd syncops) posted (#2) for review on release-3.4 by Krishnan Parthasarathi (kparthas)
REVIEW: http://review.gluster.org/4835 (glusterd: big lock - a coarse-grained locking to prevent races) posted (#2) for review on release-3.4 by Krishnan Parthasarathi (kparthas)
REVIEW: http://review.gluster.org/4833 (syncenv: be robust against spurious wake()s) posted (#2) for review on release-3.4 by Krishnan Parthasarathi (kparthas)
COMMIT: http://review.gluster.org/4837 committed in master by Vijay Bellur (vbellur) ------ commit dcebed550ebfc878b0b3bd02ab7fe15db6764f81 Author: Krishnan Parthasarathi <kparthas> Date: Mon Apr 15 22:07:21 2013 +0530 glusterd: Avoided deadlock in single node cluster, glusterd restart In a single node cluster, it is possible to deadlock on the "big lock", while restarting bricks. In glusterd_restart_bricks, we perform a glusterd_brick_connect, where we release the big lock in anticipation that glusterd_brick_rpc_notify could run in the same C stack (and deadlocking). So, in the restart code path, we could unlock before we have performed a lock on the big lock. To fix this, we need to take the big lock in the glusterd_launch_synctask 'thread' as well. Change-Id: I1abea1ca82b55c784b8a810a8194f254b32b1dcc BUG: 948686 Signed-off-by: Krishnan Parthasarathi <kparthas> Reviewed-on: http://review.gluster.org/4837 Tested-by: Gluster Build System <jenkins.com> Reviewed-by: Jeff Darcy <jdarcy>
REVIEW: http://review.gluster.org/4834 (glusterd: Fixed spurious wakeups in glusterd syncops) posted (#3) for review on release-3.4 by Krishnan Parthasarathi (kparthas)
REVIEW: http://review.gluster.org/4835 (glusterd: big lock - a coarse-grained locking to prevent races) posted (#3) for review on release-3.4 by Krishnan Parthasarathi (kparthas)
REVIEW: http://review.gluster.org/4833 (syncenv: be robust against spurious wake()s) posted (#3) for review on release-3.4 by Krishnan Parthasarathi (kparthas)
COMMIT: http://review.gluster.org/4833 committed in release-3.4 by Vijay Bellur (vbellur) ------ commit 1787debc1b6640e15a02ccac4699b92affb2bb14 Author: Krishnan Parthasarathi <kparthas> Date: Mon Apr 15 15:55:28 2013 +0530 syncenv: be robust against spurious wake()s In the current implementation, when the callers of synctasks perform a spurious wake() of a sleeping synctask (i.e, an extra wake() soon after a wake() which already woke up a yielded synctask), there is now a possibility of two sync threacs picking up the same synctask. This can result in a crash. The fix is to change ->slept = 0|1 and membership of synctask in runqueue atomically. Today we dequeue a task from the runqueue in syncenv_task(), but reset ->slept = 0 much later in synctask_switchto() in an unlocked manner -- which is safe, when there are no spurious wake()s. However, this opens a race window where, if a second wake() happens after the dequeue, but before setting ->slept = 0, it results in queueing the same synctask in the runqueue once again, and get picked up by a different synctask. This is has been diagnosed to be the crashes in the regression tests of http://review.gluster.org/4784. However that patch still has a spurious wake() [the trigger for this bug] which is yet to be fixed. BUG: 948686 Change-Id: I51858e887cad2680e46fb973629f8465f4429363 Original-author: Anand Avati <avati> Signed-off-by: Krishnan Parthasarathi <kparthas> Reviewed-on: http://review.gluster.org/4833 Reviewed-by: Vijay Bellur <vbellur> Tested-by: Gluster Build System <jenkins.com>
COMMIT: http://review.gluster.org/4834 committed in release-3.4 by Vijay Bellur (vbellur) ------ commit 47c118e22d9d6fb6662fe96841ed4fe3089739b5 Author: Krishnan Parthasarathi <kparthas> Date: Wed Apr 10 17:12:01 2013 +0530 glusterd: Fixed spurious wakeups in glusterd syncops glusterd syncops perform a barrier_wake whenever rpc_clnt_submit returned -1. This is based on the wrong assumption that the cbkfn wasn't called. This would result in one more wakeup than there ought to be. BUG: 948686 Change-Id: I839fd218a81255fe50c2047d67461d45360e894d Signed-off-by: Krishnan Parthasarathi <kparthas> Reviewed-on: http://review.gluster.org/4834 Tested-by: Gluster Build System <jenkins.com> Reviewed-by: Vijay Bellur <vbellur>
COMMIT: http://review.gluster.org/4835 committed in release-3.4 by Vijay Bellur (vbellur) ------ commit 92729add67e2e7b8c7589c2dfab0bde071a7faf2 Author: Krishnan Parthasarathi <kparthas> Date: Mon Apr 15 15:56:57 2013 +0530 glusterd: big lock - a coarse-grained locking to prevent races There are primarily three lists that are part of glusterd process, that are concurrently accessed. Namely, priv->volumes, priv->peers and volinfo->bricks_list. Big-lock approach ----------------- WHAT IS IT? Big lock is a coarse-grained lock which protects all three lists, mentioned above, from racy access. HOW DOES IT WORK? At any given point in time, glusterd's thread(s) are in execution _iff_ there is a preceding, inbound network event. Of course, the sigwaiter thread and timer thread are exceptions. A network event is an external trigger to glusterd, via the epoll thread, in the form of POLLIN and POLLERR. As long as we take the big-lock at all such entry points and yield it when we are done, we are guaranteed that all the network events, accessing the global lists, are serialised. This amounts to holding the big lock at - all the handlers of all the actors in glusterd. (POLLIN) - all the cbks in glusterd. (POLLIN) - rpc_notify (DISCONNECT event), if we access/modify one of the three lists. (POLLERR) In the case of synctask'ized volume operations, we must remember that, if we held the big lock for the entire duration of the handler, we may block other non-synctask rpc actors from executing. For eg, volume-start would block in PMAP SIGNIN, if done incorrectly. To prevent this, we need to yield the big lock, when we yield the synctask, and reacquire on waking up of the synctask. BUG: 948686 Change-Id: I429832f1fed67bcac0813403d58346558a403ce9 Signed-off-by: Krishnan Parthasarathi <kparthas> Reviewed-on: http://review.gluster.org/4835 Tested-by: Gluster Build System <jenkins.com> Reviewed-by: Vijay Bellur <vbellur>
REVIEW: http://review.gluster.org/4921 (synctask: implement barriers around yield, not the other way) posted (#1) for review on master by Anand Avati (avati)
REVIEW: http://review.gluster.org/4921 (synctask: implement barriers around yield, not the other way) posted (#2) for review on master by Anand Avati (avati)
REVIEW: http://review.gluster.org/4922 (Revert "glusterd: Fixed spurious wakeups in glusterd syncops") posted (#1) for review on master by Krishnan Parthasarathi (kparthas)
REVIEW: http://review.gluster.org/4938 (glusterd: Syncop callbks should take big lock too) posted (#1) for review on master by Krishnan Parthasarathi (kparthas)
COMMIT: http://review.gluster.org/4938 committed in master by Anand Avati (avati) ------ commit b224d7ffb88c274ef0a65d4b5d30b2ce320c6200 Author: Krishnan Parthasarathi <kparthas> Date: Thu May 2 16:13:59 2013 +0530 glusterd: Syncop callbks should take big lock too Change-Id: I5ae71ab98f9a336dc9bbf0e7b2ec50a6ed42b0f5 BUG: 948686 Signed-off-by: Krishnan Parthasarathi <kparthas> Reviewed-on: http://review.gluster.org/4938 Reviewed-by: Amar Tumballi <amarts> Tested-by: Gluster Build System <jenkins.com> Reviewed-by: Anand Avati <avati>
REVIEW: http://review.gluster.org/4922 (Revert "glusterd: Fixed spurious wakeups in glusterd syncops") posted (#2) for review on master by Krishnan Parthasarathi (kparthas)
REVIEW: http://review.gluster.org/4944 (glusterd: Removed spurious barrier_wake) posted (#1) for review on master by Krishnan Parthasarathi (kparthas)
REVIEW: http://review.gluster.org/4922 (Revert "glusterd: Fix spurious wakeups in glusterd syncops") posted (#3) for review on master by Krishnan Parthasarathi (kparthas)
REVIEW: http://review.gluster.org/4921 (synctask: implement barriers around yield, not the other way) posted (#3) for review on master by Anand Avati (avati)
COMMIT: http://review.gluster.org/4921 committed in master by Anand Avati (avati) ------ commit 83cedcd9be2676e63b1be72ecaf3316a781773cb Author: Anand Avati <avati> Date: Mon Apr 22 04:35:03 2013 -0700 synctask: implement barriers around yield, not the other way In the current implementation, barriers are in the core of the syncprocessors. Wake()s are treated as syncbarrier wake. This is however delicate, as spurious wake()s of the synctask can mess up the accounting of the barrier and waking it prematurely. The fix is to keep yield() and wake() as the basic primitives, and implement barriers as an object impelemented on top of these primitives. This way, only an explicit barrier_wake() gets counted towards the barrier accounting, and spurious wakes will be truly safe. Change-Id: I8087f0f446113e5b2d0853431c0354335ccda076 BUG: 948686 Signed-off-by: Anand Avati <avati> Reviewed-on: http://review.gluster.org/4921 Tested-by: Gluster Build System <jenkins.com> Reviewed-by: Krishnan Parthasarathi <kparthas>
REVIEW: http://review.gluster.org/4922 (Revert "glusterd: Fix spurious wakeups in glusterd syncops") posted (#4) for review on master by Krishnan Parthasarathi (kparthas)
COMMIT: http://review.gluster.org/4922 committed in master by Anand Avati (avati) ------ commit fc39ee2ea3a22704ebacd0607cf6fd4eae9ec66a Author: Krishnan Parthasarathi <kparthas> Date: Thu May 2 09:11:58 2013 +0530 Revert "glusterd: Fix spurious wakeups in glusterd syncops" This reverts commit efa154bb0a4cac34d5a9610ec25d38eebe495f22. -- Following is Avati's analysis (edited) from gerrit -- The claim of the patch (being reverted) is that it in some cases cbkfn is missed. This is wrong analysis. cbk_fn is _always_ called. The patch treats ret > 0 as a "missed cbk". ret > 0 only means socket submission was not complete, and is queued to submit asynchronously when POLLOUT is raised. This is sufficient to guarantee that cbkfn is going to be called (either the socket errors or submission succeeds and reply eventually arrives). This commit also removes spurious barrier_wake(s). call backs are guaranteed to be called even if the transport is disconnected. This means, a 'wake' would be called if rpc_clnt_submit is called. Also, we count both successful and failed operations in a particular batch of operations for the synctask_barrier_wait. So, calling synctask_barrier_wake on failure of rpc_clnt_submit (say, due to network failure) would result in a spurious wake. Change-Id: I7d508c2a54b74a65b82f097742206bc777afc53a BUG: 948686 Signed-off-by: Krishnan Parthasarathi <kparthas> Reviewed-on: http://review.gluster.org/4922 Tested-by: Gluster Build System <jenkins.com> Reviewed-by: Anand Avati <avati>
REVIEW: http://review.gluster.org/4985 (syncop: Update synctask state appropriately) posted (#2) for review on master by Krishnan Parthasarathi (kparthas)
REVIEW: http://review.gluster.org/4987 (syncop: Remove task from synclock's waitq before 'wake') posted (#2) for review on master by Krishnan Parthasarathi (kparthas)
COMMIT: http://review.gluster.org/4987 committed in master by Vijay Bellur (vbellur) ------ commit 4c0b149d8e7c574186a1ccefd9c74b79f8a06267 Author: Krishnan Parthasarathi <kparthas> Date: Sat May 11 14:36:38 2013 +0530 syncop: Remove task from synclock's waitq before 'wake' Removing task from synclock's waitq after wake could result in a subsequent unlock, wake'ing up the already running task. This fix makes the removal from waitq and wake 'atomic'. Change-Id: Ie51ccf9d38f2cee84471097644aab95496f488b1 BUG: 948686 Signed-off-by: Krishnan Parthasarathi <kparthas> Reviewed-on: http://review.gluster.org/4987 Reviewed-by: Jeff Darcy <jdarcy> Tested-by: Gluster Build System <jenkins.com>
REVIEW: http://review.gluster.org/5021 (glusterd: Refresh glusterd-syncop fixes from master) posted (#1) for review on release-3.4 by Krishnan Parthasarathi (kparthas)
COMMIT: http://review.gluster.org/5021 committed in release-3.4 by Vijay Bellur (vbellur) ------ commit 8505eadb3032132a1b936951687ac643731c29ec Author: Krishnan Parthasarathi <kparthas> Date: Thu May 2 16:13:59 2013 +0530 glusterd: Refresh glusterd-syncop fixes from master Following commits were cherry-picked from master, 044f8ce syncop: Remove task from synclock's waitq before 'wake' cb6aeed glusterd: Give up big lock before performing any RPC 46572fe Revert "glusterd: Fix spurious wakeups in glusterd syncops" 5021e04 synctask: implement barriers around yield, not the other way 4843937 glusterd: Syncop callbks should take big lock too Change-Id: I5ae71ab98f9a336dc9bbf0e7b2ec50a6ed42b0f5 BUG: 948686 Signed-off-by: Krishnan Parthasarathi <kparthas> Reviewed-on: http://review.gluster.org/4938 Reviewed-by: Amar Tumballi <amarts> Tested-by: Gluster Build System <jenkins.com> Reviewed-by: Anand Avati <avati> Reviewed-on: http://review.gluster.org/5021 Reviewed-by: Vijay Bellur <vbellur>
REVIEW: http://review.gluster.org/5047 (syncop: Remove task from syncbarrier's waitq before 'wake') posted (#3) for review on master by Krishnan Parthasarathi (kparthas)
COMMIT: http://review.gluster.org/5047 committed in master by Anand Avati (avati) ------ commit 277fabf577f95b20c61d65b28f8269e6abca6fee Author: Krishnan Parthasarathi <kparthas> Date: Mon May 20 17:17:05 2013 +0530 syncop: Remove task from syncbarrier's waitq before 'wake' Removing task from syncbarrier's waitq after wake could result in a subsequent syncbarrier_wake, wake'ing up the already running task. This fix makes the removal from waitq and wake 'atomic' The root cause and the fix are similar in spirit to what was observed in synclock's waitq implementation. Change-Id: I7dd9e6ad5945742bcda20eb5a06a9376bb18528e BUG: 948686 Signed-off-by: Krishnan Parthasarathi <kparthas> Reviewed-on: http://review.gluster.org/5047 Tested-by: Gluster Build System <jenkins.com> Reviewed-by: Anand Avati <avati>
REVIEW: http://review.gluster.org/5053 (syncop: Update synctask state appropriately) posted (#1) for review on release-3.4 by Krishnan Parthasarathi (kparthas)
REVIEW: http://review.gluster.org/5054 (syncop: Remove task from syncbarrier's waitq before 'wake') posted (#1) for review on release-3.4 by Krishnan Parthasarathi (kparthas)
COMMIT: http://review.gluster.org/5053 committed in release-3.4 by Vijay Bellur (vbellur) ------ commit f8e19c6b6f21142aadabd2f30dc3b960a6ec1fa2 Author: Krishnan Parthasarathi <kparthas> Date: Sat May 11 14:29:29 2013 +0530 syncop: Update synctask state appropriately Backport of http://review.gluster.org/4985 * Earlier, SYNCOP macro, the only consumer of synctask_yield, would set the task->state to SYNCTASK_SUSPEND. Today, we have glusterd having its own wrapper macros which don't set task's state. There is also the syncbarrier and synclock framework, which also participate in a synctask's scheduling (and need to keep a task's state up to date). It only makes more sense to leave a synctask's state to the synctask library, since its an internal affair. * Need to 'yawn' before 'yield' to avoid re-running tasks to set task->woken appropriately. Change-Id: Ic7a59e6ebcc46f03e53223ca237668d45a3cba40 BUG: 948686 Signed-off-by: Krishnan Parthasarathi <kparthas> Reviewed-on: http://review.gluster.org/5053 Tested-by: Gluster Build System <jenkins.com> Reviewed-by: Vijay Bellur <vbellur>
COMMIT: http://review.gluster.org/5054 committed in release-3.4 by Vijay Bellur (vbellur) ------ commit 418f763303ec4003f4b805867ff306a2e43bca76 Author: Krishnan Parthasarathi <kparthas> Date: Mon May 20 17:17:05 2013 +0530 syncop: Remove task from syncbarrier's waitq before 'wake' Backport of http://review.gluster.org/5047 Removing task from syncbarrier's waitq after wake could result in a subsequent syncbarrier_wake, wake'ing up the already running task. This fix makes the removal from waitq and wake 'atomic' The root cause and the fix are similar in spirit to what was observed in synclock's waitq implementation. Change-Id: I7dd9e6ad5945742bcda20eb5a06a9376bb18528e BUG: 948686 Signed-off-by: Krishnan Parthasarathi <kparthas> Reviewed-on: http://review.gluster.org/5054 Tested-by: Gluster Build System <jenkins.com> Reviewed-by: Vijay Bellur <vbellur>
REVIEW: http://review.gluster.org/5058 (syncop: synctask shouldn't yawn, it could miss a 'wake') posted (#1) for review on master by Krishnan Parthasarathi (kparthas)
COMMIT: http://review.gluster.org/5058 committed in master by Anand Avati (avati) ------ commit 16b5ec67120e198fb320e13ade9e31d3761b0932 Author: Krishnan Parthasarathi <kparthas> Date: Wed May 22 00:18:04 2013 +0530 syncop: synctask shouldn't yawn, it could miss a 'wake' Change-Id: I7731fd33ca0c925cc52f8d105275b44fc625a1e2 BUG: 948686 Signed-off-by: Krishnan Parthasarathi <kparthas> Reviewed-on: http://review.gluster.org/5058 Tested-by: Gluster Build System <jenkins.com> Reviewed-by: Anand Avati <avati>
REVIEW: http://review.gluster.org/5071 (syncop: synctask shouldn't yawn, it could miss a 'wake') posted (#1) for review on release-3.4 by Krishnan Parthasarathi (kparthas)
COMMIT: http://review.gluster.org/5071 committed in release-3.4 by Vijay Bellur (vbellur) ------ commit 3d68ed8fbb41be06c222aa7754b36e77edced92a Author: Krishnan Parthasarathi <kparthas> Date: Wed May 22 00:18:04 2013 +0530 syncop: synctask shouldn't yawn, it could miss a 'wake' Backport of http://review.gluster.org/5058 Change-Id: I7731fd33ca0c925cc52f8d105275b44fc625a1e2 BUG: 948686 Signed-off-by: Krishnan Parthasarathi <kparthas> Reviewed-on: http://review.gluster.org/5071 Tested-by: Gluster Build System <jenkins.com> Reviewed-by: Jeff Darcy <jdarcy>
REVIEW: http://review.gluster.org/5087 (glusterd-syncop: Fix unlocking and collating errors) posted (#1) for review on master by Kaushal M (kaushal)
REVIEW: http://review.gluster.org/5087 (glusterd-syncop: Fix unlocking and collating errors) posted (#2) for review on master by Kaushal M (kaushal)
COMMIT: http://review.gluster.org/5087 committed in master by Anand Avati (avati) ------ commit 0c1916f3e1eb81aa81dc2d62e97f46880390838c Author: Kaushal M <kaushal> Date: Thu May 23 12:19:33 2013 +0530 glusterd-syncop: Fix unlocking and collating errors * Only those peers which were locked need to be unlocked. * Fix location of collating errors in callbacks. The callback functions could miss collating errors if there was an rpc error. Change-Id: Ie27c2f1ec197da4f5077a4d6e032127954ce87cd BUG: 948686 Signed-off-by: Kaushal M <kaushal> Reviewed-on: http://review.gluster.org/5087 Reviewed-by: Krishnan Parthasarathi <kparthas> Tested-by: Gluster Build System <jenkins.com>
*** Bug 924296 has been marked as a duplicate of this bug. ***