Description of problem: As part of metadata self-heal, ec sets the versions of the sinks to those of the sources so that ongoing transactions also succeed on the healing subvolume. It then starts rebuilding the data onto the sinks. If data rebuilding fails for some reason, the files are left with matching versions even though the rebuild is incomplete, which can lead to data corruption. Fix: If the version numbers do not match, writes are performed only on the bricks that have the same version, and at least N-R such bricks are required. To heal files that are constantly being modified, however, writes must also be allowed on subvolumes that are undergoing heal. Data heal sets the 62nd bit while the heal is in progress. When a data transaction sees that this bit is set, it performs the fop on that subvolume regardless of whether the versions match. The fop is considered successful only if N-R non-healing bricks succeed.
REVIEW: http://review.gluster.org/10372 (cluster/ec: Perform inode-write on healing subvols) posted (#1) for review on master by Pranith Kumar Karampuri (pkarampu)
Data self-heal needs to be changed so that it sets SELFHEAL_BIT, i.e. the 62nd bit. I will post the new data self-heal code to go with this patch.
COMMIT: http://review.gluster.org/10372 committed in master by Vijay Bellur (vbellur) ------ commit 7efa7e2116856b4cf37797218612a41bdd237e77 Author: Pranith Kumar K <pkarampu> Date: Thu Apr 23 08:30:11 2015 +0530 cluster/ec: Perform inode-write on healing subvols If the version numbers do not match, then writes are performed only on at least N-R bricks which have same version. But if we want to do healing of files which are constantly modified we need to allow writes on subvols that are undergoing heal. Data healing will mark 62nd bit while the heal is going on. When the data transaction sees that this bit is set it needs to perform the fop on that subvol irrespective of whether the versions match or do not match. Fop is considered successful only if N-R non-healing bricks succeed. Change-Id: I69a17582df397aaf6e8ca4b5e746c7ca802cbbde BUG: 1215265 Signed-off-by: Pranith Kumar K <pkarampu> Reviewed-on: http://review.gluster.org/10372 Tested-by: NetBSD Build System Tested-by: Gluster Build System <jenkins.com> Reviewed-by: Vijay Bellur <vbellur>
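The write rule described in this commit message can be sketched roughly as follows. This is an illustrative model, not the ec translator's code: it assumes N is the total number of bricks and R the redundancy count, that the healing flag is stored as zero-based bit 62 of a 64-bit version value, and the function names `write_targets` and `fop_succeeded` are hypothetical.

```python
# Assumption: the healing flag is bit 62 (zero-based) of a 64-bit version value.
SELFHEAL_BIT = 1 << 62


def is_healing(version):
    """True if the brick's version value carries the healing flag."""
    return bool(version & SELFHEAL_BIT)


def write_targets(versions, good_version):
    """Pick the bricks a write is sent to.

    Bricks whose version matches good_version receive the write, and so do
    bricks flagged as healing, whatever their version is. Stale bricks that
    are not being healed are skipped.
    Returns (targets, healing) as sets of brick indexes.
    """
    healing = {i for i, v in enumerate(versions) if is_healing(v)}
    matching = {
        i for i, v in enumerate(versions)
        if not is_healing(v) and v == good_version
    }
    return matching | healing, healing


def fop_succeeded(succeeded, healing, n, r):
    """The fop counts as successful only if N-R non-healing bricks succeeded."""
    return len(set(succeeded) - set(healing)) >= n - r


# Hypothetical 4+2 volume: bricks 0-3 are good, brick 4 is healing, brick 5 is stale.
versions = [10, 10, 10, 10, 7 | SELFHEAL_BIT, 7]
targets, healing = write_targets(versions, good_version=10)
print(sorted(targets), sorted(healing))
print(fop_succeeded({0, 1, 2, 3}, healing, n=6, r=2))
print(fop_succeeded({0, 1, 2, 4}, healing, n=6, r=2))
```

In this model a success on the healing brick never counts toward the N-R quorum, which is why the last call reports failure even though four bricks answered.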
REVIEW: http://review.gluster.org/10382 (syncop: Implement syncop_fxattrop) posted (#1) for review on master by Pranith Kumar Karampuri (pkarampu)
REVIEW: http://review.gluster.org/10383 (storage/posix: prevent NULL dereference) posted (#1) for review on master by Pranith Kumar Karampuri (pkarampu)
REVIEW: http://review.gluster.org/10382 (syncop: Implement syncop_fxattrop) posted (#2) for review on master by Pranith Kumar Karampuri (pkarampu)
REVIEW: http://review.gluster.org/10312 (Adding 64 bits in "version" key of extended attributes. First 64 bits (Left) represents Data version. Last 64 bits (right) represents Meta Data version.) posted (#2) for review on master by Pranith Kumar Karampuri (pkarampu)
REVIEW: http://review.gluster.org/10384 (data-heal) posted (#1) for review on master by Pranith Kumar Karampuri (pkarampu)
REVIEW: http://review.gluster.org/10385 (cluster/ec: Change meaning of trusted.ec.dirty) posted (#1) for review on master by Pranith Kumar Karampuri (pkarampu)
REVIEW: http://review.gluster.org/10386 (cluster/ec: Link new heal implementation everywhere) posted (#1) for review on master by Pranith Kumar Karampuri (pkarampu)
REVIEW: http://review.gluster.org/10382 (syncop: Implement syncop_fxattrop) posted (#3) for review on master by Pranith Kumar Karampuri (pkarampu)
REVIEW: http://review.gluster.org/10386 (cluster/ec: Link new heal implementation everywhere) posted (#2) for review on master by Pranith Kumar Karampuri (pkarampu)
REVIEW: http://review.gluster.org/10384 (data-heal) posted (#2) for review on master by Pranith Kumar Karampuri (pkarampu)
REVIEW: http://review.gluster.org/10385 (cluster/ec: Change meaning of trusted.ec.dirty) posted (#2) for review on master by Pranith Kumar Karampuri (pkarampu)
REVIEW: http://review.gluster.org/10312 (Adding 64 bits in "version" key of extended attributes. First 64 bits (Left) represents Data version. Last 64 bits (right) represents Meta Data version.) posted (#3) for review on master by Pranith Kumar Karampuri (pkarampu)
REVIEW: http://review.gluster.org/10390 (cluster/ec: Handle unhandled states) posted (#1) for review on master by Pranith Kumar Karampuri (pkarampu)
REVIEW: http://review.gluster.org/10391 (libglusterfs: Fix cluster_entrylk retry) posted (#1) for review on master by Pranith Kumar Karampuri (pkarampu)
COMMIT: http://review.gluster.org/10382 committed in master by Vijay Bellur (vbellur) ------ commit 585b1f0d9e485674268cb90bd8f3fdb143bab06b Author: Pranith Kumar K <pkarampu> Date: Sun Apr 26 10:40:18 2015 +0530 syncop: Implement syncop_fxattrop Change-Id: Ifc7937ceb451f6e11e40a9513017226fd0f115b0 BUG: 1215265 Signed-off-by: Pranith Kumar K <pkarampu> Reviewed-on: http://review.gluster.org/10382 Tested-by: NetBSD Build System Tested-by: Gluster Build System <jenkins.com> Reviewed-by: Krutika Dhananjay <kdhananj> Reviewed-by: Vijay Bellur <vbellur>
REVIEW: http://review.gluster.org/10386 (cluster/ec: Link new heal implementation everywhere) posted (#3) for review on master by Pranith Kumar Karampuri (pkarampu)
REVIEW: http://review.gluster.org/10384 (data-heal) posted (#3) for review on master by Pranith Kumar Karampuri (pkarampu)
REVIEW: http://review.gluster.org/10385 (cluster/ec: Change meaning of trusted.ec.dirty) posted (#3) for review on master by Pranith Kumar Karampuri (pkarampu)
REVIEW: http://review.gluster.org/10390 (cluster/ec: Handle unhandled states) posted (#2) for review on master by Pranith Kumar Karampuri (pkarampu)
REVIEW: http://review.gluster.org/10391 (libglusterfs: Fix cluster_entrylk retry) posted (#2) for review on master by Pranith Kumar Karampuri (pkarampu)
REVIEW: http://review.gluster.org/10312 (Adding 64 bits in "version" key of extended attributes. First 64 bits (Left) represents Data version. Last 64 bits (right) represents Meta Data version.) posted (#4) for review on master by Pranith Kumar Karampuri (pkarampu)
COMMIT: http://review.gluster.org/10383 committed in master by Vijay Bellur (vbellur) ------ commit 472d5c67013913ca8646f32ece214a767a955ef9 Author: Pranith Kumar K <pkarampu> Date: Sun Apr 26 17:59:49 2015 +0530 storage/posix: prevent NULL dereference filler->fd is never set but used. Change-Id: Icf21c439b37c9faa3751658a9e63a74570ed153c BUG: 1215265 Signed-off-by: Pranith Kumar K <pkarampu> Reviewed-on: http://review.gluster.org/10383 Tested-by: Gluster Build System <jenkins.com> Reviewed-by: Krutika Dhananjay <kdhananj> Tested-by: NetBSD Build System Reviewed-by: Vijay Bellur <vbellur>
COMMIT: http://review.gluster.org/10390 committed in master by Vijay Bellur (vbellur) ------ commit 315364b78cd152835cf6d30e32fd145a942e1d7a Author: Pranith Kumar K <pkarampu> Date: Mon Apr 27 00:00:08 2015 +0530 cluster/ec: Handle unhandled states This was leading to hangs when get_size_and_version fails Change-Id: Iad9408c2dacc9a74594b8d2f94c95f402533b0f1 BUG: 1215265 Signed-off-by: Pranith Kumar K <pkarampu> Reviewed-on: http://review.gluster.org/10390 Tested-by: NetBSD Build System Tested-by: Gluster Build System <jenkins.com> Reviewed-by: Xavier Hernandez <xhernandez>
REVIEW: http://review.gluster.org/10386 (cluster/ec: Link new heal implementation everywhere) posted (#4) for review on master by Pranith Kumar Karampuri (pkarampu)
REVIEW: http://review.gluster.org/10384 (data-heal) posted (#4) for review on master by Pranith Kumar Karampuri (pkarampu)
REVIEW: http://review.gluster.org/10385 (cluster/ec: Change meaning of trusted.ec.dirty) posted (#4) for review on master by Pranith Kumar Karampuri (pkarampu)
REVIEW: http://review.gluster.org/10391 (libglusterfs: Fix cluster_entrylk retry) posted (#3) for review on master by Pranith Kumar Karampuri (pkarampu)
REVIEW: http://review.gluster.org/10312 (Adding 64 bits in "version" key of extended attributes. First 64 bits (Left) represents Data version. Last 64 bits (right) represents Meta Data version.) posted (#5) for review on master by Pranith Kumar Karampuri (pkarampu)
REVIEW: http://review.gluster.org/10457 (cluster/ec: Do not do non-blocking heal for heal command) posted (#1) for review on master by Pranith Kumar Karampuri (pkarampu)
REVIEW: http://review.gluster.org/10391 (libglusterfs: Fix cluster_entrylk retry) posted (#4) for review on master by Pranith Kumar Karampuri (pkarampu)
REVIEW: http://review.gluster.org/10391 (libglusterfs: Fix cluster_entrylk retry) posted (#5) for review on master by Pranith Kumar Karampuri (pkarampu)
COMMIT: http://review.gluster.org/10391 committed in master by Pranith Kumar Karampuri (pkarampu) ------ commit e6f2472d2434ab43a30720ef4de2e0abc0a3f4ac Author: Pranith Kumar K <pkarampu> Date: Mon Apr 27 01:20:02 2015 +0530 libglusterfs: Fix cluster_entrylk retry Change-Id: I92ff46bae36d39a449d4bbaedc88a322992f65eb BUG: 1215265 Signed-off-by: Pranith Kumar K <pkarampu> Reviewed-on: http://review.gluster.org/10391 Tested-by: Gluster Build System <jenkins.com> Reviewed-by: Krutika Dhananjay <kdhananj>
REVIEW: http://review.gluster.org/10312 (cluster/ec: add separate versions for data/entry, metadata) posted (#8) for review on master by Pranith Kumar Karampuri (pkarampu)
REVIEW: http://review.gluster.org/10597 (libglusterfs: Fix cluster_entrylk retry) posted (#1) for review on release-3.7 by Gaurav Kumar Garg (ggarg)
COMMIT: http://review.gluster.org/10312 committed in master by Pranith Kumar Karampuri (pkarampu) ------ commit 50063ea7f4182ed30b86f38a716d03464e07b8c6 Author: Ashish Pandey <aspandey> Date: Tue Apr 21 17:22:40 2015 +0530 cluster/ec: add separate versions for data/entry, metadata Adding 64 bits in "version" key of extended attributes. First 64 bits (Left) represents Data version. Last 64 bits (right) represents Meta Data version. Note: 3.7 and 3.6 version ec can't co-exist with this change because xattrop in 3.6 will fail with ERANGE as the buffer passed to it will be '8' bytes where as the value will be 16 bytes in 3.7. Where as 3.7 version clients can work with old version files. For upgrades we need to tell users to complete heals and then upgrade BUG: 1215265 Change-Id: Ib85114680cb7e75b8371c984d9f7b6401c1ffb93 Signed-off-by: Ashish Pandey <aspandey> Reviewed-on: http://review.gluster.org/10312 Tested-by: Gluster Build System <jenkins.com> Tested-by: NetBSD Build System Reviewed-by: Pranith Kumar Karampuri <pkarampu>
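A minimal sketch of the 16-byte layout described in this commit message, and of why an 8-byte reader fails on it. The big-endian byte order, the helper names, and the choice to report the metadata version of an old 8-byte value as `None` are assumptions made for illustration; `read_xattr` only mimics the ERANGE behaviour of a too-small buffer and does not call any GlusterFS API.

```python
import errno
import os
import struct

# Assumption: both counters are unsigned 64-bit integers in big-endian order.
NEW_FORMAT = ">QQ"  # data version (left), metadata version (right): 16 bytes
OLD_FORMAT = ">Q"   # single version: 8 bytes


def pack_version(data_version, metadata_version):
    """Encode the two versions as one 16-byte value."""
    return struct.pack(NEW_FORMAT, data_version, metadata_version)


def unpack_version(raw):
    """Decode a version value, accepting the old 8-byte form as well."""
    if len(raw) == struct.calcsize(OLD_FORMAT):
        (data_version,) = struct.unpack(OLD_FORMAT, raw)
        return data_version, None  # old files carry no metadata version
    return struct.unpack(NEW_FORMAT, raw)


def read_xattr(value, buffer_size):
    """Model a reader with a fixed-size buffer: too small a buffer is ERANGE."""
    if len(value) > buffer_size:
        raise OSError(errno.ERANGE, os.strerror(errno.ERANGE))
    return value


raw = pack_version(5, 2)
print(len(raw), unpack_version(raw))
print(unpack_version(struct.pack(OLD_FORMAT, 5)))
try:
    read_xattr(raw, buffer_size=8)  # what a reader expecting 8 bytes would do
except OSError as exc:
    print(errno.errorcode[exc.errno])
```

The asymmetry matches the note in the commit message: a reader that knows both sizes can still decode old values, while a reader with an 8-byte buffer cannot accept the new 16-byte value.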
REVIEW: http://review.gluster.org/10384 (cluster/ec: data heal implementation for ec) posted (#5) for review on master by Pranith Kumar Karampuri (pkarampu)
REVIEW: http://review.gluster.org/10385 (cluster/ec: Change meaning of trusted.ec.dirty) posted (#5) for review on master by Pranith Kumar Karampuri (pkarampu)
COMMIT: http://review.gluster.org/10597 committed in release-3.7 by Pranith Kumar Karampuri (pkarampu) ------ commit 35e77e239aaa1abafe45727f76aaa61ba41cc484 Author: Pranith Kumar K <pkarampu> Date: Mon Apr 27 01:20:02 2015 +0530 libglusterfs: Fix cluster_entrylk retry Change-Id: I92ff46bae36d39a449d4bbaedc88a322992f65eb BUG: 1215265 Signed-off-by: Gaurav Kumar Garg <ggarg> Reviewed-on: http://review.gluster.org/10391 Tested-by: Gluster Build System <jenkins.com> Reviewed-by: Krutika Dhananjay <kdhananj> (cherry picked from commit e6f2472d2434ab43a30720ef4de2e0abc0a3f4ac) Reviewed-on: http://review.gluster.org/10597 Reviewed-by: Pranith Kumar Karampuri <pkarampu>
REVIEW: http://review.gluster.org/10384 (cluster/ec: data heal implementation for ec) posted (#6) for review on master by Pranith Kumar Karampuri (pkarampu)
REVIEW: http://review.gluster.org/10385 (cluster/ec: Change meaning of trusted.ec.dirty) posted (#6) for review on master by Pranith Kumar Karampuri (pkarampu)
REVIEW: http://review.gluster.org/10626 (cluster/ec: add separate versions for data/entry, metadata) posted (#2) for review on release-3.7 by Ashish Pandey (aspandey)
REVIEW: http://review.gluster.org/10626 (cluster/ec: add separate versions for data/entry, metadata) posted (#3) for review on release-3.7 by Ashish Pandey (aspandey)
COMMIT: http://review.gluster.org/10384 committed in master by Vijay Bellur (vbellur) ------ commit fc199e1b6f9423b4dc0c9b34bf894ad66ffafabf Author: Pranith Kumar K <pkarampu> Date: Sat Apr 25 15:58:09 2015 +0530 cluster/ec: data heal implementation for ec
Data self-heal:
1) Take inode lock in domain 'this->name:self-heal' on 0-0 range (full file), So that no other processes try to do self-heal at the same time.
2) Take inode lock in domain 'this->name' on 0-0 range (full file),
3) perform fxattrop+fstat and get the xattrs on all the bricks
3) Choose the brick with ec->fragment number of same version as source
4) Truncate sinks
5) Unlock lock taken in 2)
5) For each block take full file lock, Read from sources write to the sinks, Unlock
6) Take full file lock and see if the file is still sane copy i.e. File didn't become unusable while the bricks are offline. Update mtime to before healing
7) xattrop with -ve values of 'dirty' and difference of highest and its own version values for version xattr
8) unlock lock acquired in 6)
9) unlock lock acquired in 1)
Change-Id: I6f4d42cd5423c767262c9d7bb5ca7767adb3e5fd BUG: 1215265 Signed-off-by: Pranith Kumar K <pkarampu> Reviewed-on: http://review.gluster.org/10384 Tested-by: Gluster Build System <jenkins.com> Reviewed-by: Vijay Bellur <vbellur>
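Two of the steps in this commit message, choosing the sources and the final xattrop, can be sketched as below. This is a simplified model under stated assumptions: "ec->fragment number" is read as the number of data fragments needed to decode, xattrop is modelled as adding a delta to the stored value, and the function names are hypothetical. Locking, truncation and the block-by-block copy are left out.

```python
from collections import Counter


def choose_sources(versions, fragments):
    """Split bricks into sources and sinks.

    A version qualifies only if at least `fragments` bricks agree on it;
    the bricks holding that version are sources, all others are sinks.
    Returns None when no version has enough copies to heal from.
    """
    counts = Counter(versions)
    usable = [v for v, c in counts.items() if c >= fragments]
    if not usable:
        return None
    good = max(usable)
    sources = [i for i, v in enumerate(versions) if v == good]
    sinks = [i for i, v in enumerate(versions) if v != good]
    return good, sources, sinks


def heal_xattrop_deltas(versions, dirty, sinks):
    """Deltas for the final xattrop on each sink.

    Version delta: highest version minus the sink's own version.
    Dirty delta: the negative of the sink's current dirty count.
    """
    highest = max(versions)
    return {i: (highest - versions[i], -dirty[i]) for i in sinks}


# Hypothetical 4+2 volume where bricks 4 and 5 missed some writes.
versions = [12, 12, 12, 12, 9, 7]
dirty = [1, 1, 1, 1, 1, 1]
good, sources, sinks = choose_sources(versions, fragments=4)
deltas = heal_xattrop_deltas(versions, dirty, sinks)
for brick, (dv, dd) in deltas.items():
    versions[brick] += dv  # additive update brings the sink up to date
    dirty[brick] += dd     # and clears its dirty count
print(good, sources, sinks, deltas)
print(versions, dirty)
```

Expressing the update as a delta rather than an absolute value fits an additive xattrop: each sink ends at the highest version with a dirty count of zero, whatever value it started from.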
COMMIT: http://review.gluster.org/10385 committed in master by Vijay Bellur (vbellur) ------ commit 02f9835d24aa07bd4e9fcb39cb7ace343f31924f Author: Pranith Kumar K <pkarampu> Date: Sun Apr 26 14:28:00 2015 +0530 cluster/ec: Change meaning of trusted.ec.dirty
- With this change, the xattr will represent if the file needs to be healed or not. It will have different values for data/entry and metadata changes.
- inode ref leaks and dict_set_dynstr related leaks fixed
- Added support for trylock/lock based on heal-cmd execution or not in data heal.
- Made fixes to pass regression runs
Change-Id: I9d8def4c2badde18a76b7898816fecfac113737a BUG: 1215265 Signed-off-by: Pranith Kumar K <pkarampu> Reviewed-on: http://review.gluster.org/10385 Tested-by: Gluster Build System <jenkins.com> Reviewed-by: Vijay Bellur <vbellur>
COMMIT: http://review.gluster.org/10626 committed in release-3.7 by Pranith Kumar Karampuri (pkarampu) ------ commit 599726fec2e5c59b16a5aeb947342d65c1fc967f Author: Ashish Pandey <aspandey> Date: Tue Apr 21 17:22:40 2015 +0530 cluster/ec: add separate versions for data/entry, metadata Adding 64 bits in "version" key of extended attributes. First 64 bits (Left) represents Data version. Last 64 bits (right) represents Meta Data version. Note: 3.7 and 3.6 version ec can't co-exist with this change because xattrop in 3.6 will fail with ERANGE as the buffer passed to it will be '8' bytes where as the value will be 16 bytes in 3.7. Where as 3.7 version clients can work with old version files. For upgrades we need to tell users to complete heals and then upgrade BUG: 1215265 Change-Id: Ib85114680cb7e75b8371c984d9f7b6401c1ffb93 Signed-off-by: Ashish Pandey <aspandey> Reviewed-on: http://review.gluster.org/10312 Reviewed-on: http://review.gluster.org/10626 Tested-by: Gluster Build System <jenkins.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu>
This bug is being closed because a release that should address the reported issue is now available. If the problem is still not fixed with glusterfs-3.7.0, please open a new bug report. glusterfs-3.7.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution. [1] http://thread.gmane.org/gmane.comp.file-systems.gluster.devel/10939 [2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user
This bug is being closed because a release that should address the reported issue is now available. If the problem is still not fixed with glusterfs-3.8.0, please open a new bug report. glusterfs-3.8.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution. [1] http://blog.gluster.org/2016/06/glusterfs-3-8-released/ [2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user