The current ordering guarantees of write-behind are too strict and cause a significant drop in performance for VM workloads.
If I'm reading the current patch correctly, it's now "less strict" in the sense that ordering between non-overlapping writes is not maintained even if the two are completely separate in time. Is that correct? Is it possible for a user to choose the stricter behavior via an option, or have the code structures changed so much that that would be infeasible? Also, how big is the performance difference, and how is it being measured?
The performance difference is very significant; Ben has the latest results. I will explore whether it is easy to provide an option that strictly preserves ordering. It might come as an additional patch.
http://review.gluster.org/#change,3947,patchset=9 implements a strict write ordering option for the paranoid. Note that the previous version of write-behind was _not_ maintaining strict write ordering (for adjacent writes). This option is "stricter" than the previous write-behind with respect to writes, but more relaxed with respect to reads: it holds back only those reads that have overlapping pending writes (which is why the option is called "strict-write-ordering" and not "strict-ordering").
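To make the semantics described above concrete, here is a minimal sketch in Python. It is not the write-behind code from the patch; the names `Request`, `overlaps` and `must_wait` are hypothetical, and the rule that an overlapping write always waits in the relaxed mode is an assumption made for illustration. It only models the decision of whether a new request has to be held back behind pending writes.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Request:
    """Hypothetical model of a read or write on a byte range."""
    offset: int     # starting byte offset
    size: int       # length in bytes
    is_write: bool  # True for a write, False for a read


def overlaps(a: Request, b: Request) -> bool:
    """True if the half-open ranges [offset, offset + size) intersect."""
    return a.offset < b.offset + b.size and b.offset < a.offset + a.size


def must_wait(req: Request, pending_writes: List[Request],
              strict_write_ordering: bool = False) -> bool:
    """Decide whether req must be held back behind the pending writes.

    Assumed rules (a sketch, not the real implementation):
    - with strict_write_ordering, a write waits behind any pending write;
    - otherwise a request waits only if it overlaps a pending write,
      so reads are held back only by overlapping pending writes.
    """
    if strict_write_ordering and req.is_write and pending_writes:
        return True
    return any(overlaps(req, w) for w in pending_writes)
```

For example, with one pending write covering bytes 0 to 4095, an adjacent write at offset 4096 is not held back in the relaxed mode but is held back when the strict option is set, while a non-overlapping read is let through in both modes.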
*** Bug 845213 has been marked as a duplicate of this bug. ***
CHANGE: http://review.gluster.org/4079 (write-behind: use uint64_t for overlap comparison) merged in master by Anand Avati (avati)
CHANGE: http://review.gluster.org/4551 (performance/write-behind: guarantee non-overlapping concurrent writes) merged in master by Anand Avati (avati)
CHANGE: http://review.gluster.org/4642 (performance/write-behind: guarantee non-overlapping concurrent writes) merged in release-3.4 by Anand Avati (avati)