REVIEW: https://review.gluster.org/17703 (cluster/ec: Non-disruptive upgrade on EC volume fails) posted (#3) for review on master by Sunil Kumar Acharya (sheggodu)
REVIEW: https://review.gluster.org/17703 (cluster/ec: Non-disruptive upgrade on EC volume fails) posted (#4) for review on master by Sunil Kumar Acharya (sheggodu)
REVIEW: https://review.gluster.org/17703 (cluster/ec: Non-disruptive upgrade on EC volume fails) posted (#5) for review on master by Sunil Kumar Acharya (sheggodu)
REVIEW: https://review.gluster.org/17703 (cluster/ec: Non-disruptive upgrade on EC volume fails) posted (#6) for review on master by Sunil Kumar Acharya (sheggodu)
REVIEW: https://review.gluster.org/17703 (cluster/ec: Non-disruptive upgrade on EC volume fails) posted (#7) for review on master by Sunil Kumar Acharya (sheggodu)
Description of problem:
====================
The EC non-disruptive upgrade fails due to a regression.

Client IO:
tar: linux-4.11.7/drivers/staging/lustre/lustre/include/lustre_handles.h: Cannot open: Input/output error
linux-4.11.7/drivers/staging/lustre/lustre/include/lustre_import.h
tar: linux-4.11.7/drivers/staging/lustre/lustre/include/lustre_import.h: Cannot open: Input/output error
linux-4.11.7/drivers/staging/lustre/lustre/include/lustre_intent.h
tar: linux-4.11.7/drivers/staging/lustre/lustre/include/lustre_intent.h: Cannot open: Input/output error
linux-4.11.7/drivers/staging/lustre/lustre/include/lustre_kernelcomm.h
tar: linux-4.11.7/drivers/staging/lustre/lustre/include/lustre_kernelcomm.h: Cannot open: Input/output error
linux-4.11.7/drivers/staging/lustre/lustre/include/lustre_lib.h
tar: linux-4.11.7/drivers/staging/lustre/lustre/include/lustre_lib.h: Cannot open: Input/output error
linux-4.11.7/drivers/staging/lustre/lustre/include/lustre_linkea.h
tar: linux-4.11.7/drivers/staging/lustre/lustre/include/lustre_linkea.h: Cannot open: Input/output error
linux-4.11.7/drivers/staging/lustre/lustre/include/lustre_lmv.h
tar: linux-4.11.7/drivers/staging/lustre/lustre/include/lustre_lmv.h: Cannot open: Input/output error
linux-4.11.7/drivers/staging/lustre/lustre/include/lustre_log.h
tar: linux-4.11.7/drivers/staging/lustre/lustre/include/lustre_log.h: Cannot open: Input/output error
linux-4.11.7/drivers/staging/lustre/lustre/include/lustre_mdc.h
tar: linux-4.11.7/drivers/staging/lustre/lustre/include/lustre_mdc.h: Cannot open: Input/output error
linux-4.11.7/drivers/staging/lustre/lustre/include/lustre_mds.h
tar: linux-4.11.7/drivers/staging/lustre/lustre/include/lustre_mds.h: Cannot open: Input/output error
linux-4.11.7/drivers/staging/lustre/lustre/include/lustre_net.h

Client fuse logs:
[2017-06-27 06:31:41.488462] W [MSGID: 122035] [ec-common.c:464:ec_child_select] 0-ecv-disperse-0: Executing operation with some subvolumes unavailable (4)
[2017-06-27 06:31:41.492350] W [MSGID: 122040] [ec-common.c:990:ec_prepare_update_cbk] 0-ecv-disperse-0: Failed to get size and version [Input/output error]
[2017-06-27 06:31:41.495012] W [MSGID: 122035] [ec-common.c:464:ec_child_select] 0-ecv-disperse-0: Executing operation with some subvolumes unavailable (4)
[2017-06-27 06:31:41.498939] W [MSGID: 122040] [ec-common.c:990:ec_prepare_update_cbk] 0-ecv-disperse-0: Failed to get size and version [Input/output error]
[2017-06-27 06:31:41.500037] W [MSGID: 122035] [ec-common.c:464:ec_child_select] 0-ecv-disperse-0: Executing operation with some subvolumes unavailable (4)
[2017-06-27 06:31:41.501771] W [MSGID: 122040] [ec-common.c:990:ec_prepare_update_cbk] 0-ecv-disperse-0: Failed to get size and version [Input/output error]
[2017-06-27 06:31:41.502741] W [MSGID: 122035] [ec-common.c:464:ec_child_select] 0-ecv-disperse-0: Executing operation with some subvolumes unavailable (4)
[2017-06-27 06:31:41.510185] W [MSGID: 122040] [ec-common.c:990:ec_prepare_update_cbk] 0-ecv-disperse-0: Failed to get size and version [Input/output error]
[2017-06-27 06:31:41.512205] W [MSGID: 122035] [ec-common.c:464:ec_child_select] 0-ecv-disperse-0: Executing operation with some subvolumes unavailable (4)
[2017-06-27 06:31:41.517462] W [MSGID: 122040] [ec-common.c:990:ec_prepare_update_cbk] 0-ecv-disperse-0: Failed to get size and version [Input/output error]
[2017-06-27 06:31:41.520244] W [MSGID: 122035] [ec-common.c:464:ec_child_select] 0-ecv-disperse-0: Executing operation with some subvolumes unavailable (4)
[2017-06-27 06:31:41.522030] W [MSGID: 122040] [ec-common.c:990:ec_prepare_update_cbk] 0-ecv-disperse-0: Failed to get size and version [Input/output error]
[2017-06-27 06:31:41.530202] W [MSGID: 122035] [ec-common.c:464:ec_child_select] 0-ecv-disperse-0: Executing operation with some subvolumes unavailable (4)
[2017-06-27 06:31:41.533945] W [MSGID: 122040] [ec-common.c:990:ec_prepare_update_cbk] 0-ecv-disperse-0: Failed to get size and version [Input/output error]
[2017-06-27 06:31:41.536465] W [MSGID: 122035] [ec-common.c:464:ec_child_select] 0-ecv-disperse-0: Executing operation with some subvolumes unavailable (4)
[2017-06-27 06:31:41.539042] W [MSGID: 122040] [ec-common.c:990:ec_prepare_update_cbk] 0-ecv-disperse-0: Failed to get size and version [Input/output error]
[2017-06-27 06:31:41.540564] W [MSGID: 122035] [ec-common.c:464:ec_child_select] 0-ecv-disperse-0: Executing operation with some subvolumes unavailable (4)
[2017-06-27 06:31:41.544238] W [MSGID: 122040] [ec-common.c:990:ec_prepare_update_cbk] 0-ecv-disperse-0: Failed to get size and version [Input/output error]
[2017-06-27 06:31:41.545663] W [MSGID: 122035] [ec-common.c:464:ec_child_select] 0-ecv-disperse-0: Executing operation with some subvolumes unavailable (4)
[2017-06-27 06:31:41.550015] W [MSGID: 122040] [ec-common.c:990:ec_prepare_update_cbk] 0-ecv-disperse-0: Failed to get size and version [Input/output error]
[2017-06-27 06:31:41.552186] W [MSGID: 122035] [ec-common.c:464:ec_child_select] 0-ecv-disperse-0:

Version-Release number of selected component (if applicable):
============
3.8.4.28 --> 3.8.4.29
3.8.4.29 --> 3.8.4.31

How reproducible:
======
2/2

Steps to Reproduce:
1. Have a 4+2 EC volume on 6 nodes.
2. Keep a Linux kernel untar running throughout the upgrade procedure.
3. Upgrade node #1 and node #2 (kill glusterfsd and glusterfs, stop glusterd, and start glusterd after the rpm upgrade).
4. Wait for healing to complete.
5. After the heal has completed, with the kernel untar still running,
6. upgrade node #3 (kill glusterfsd and glusterfs, stop glusterd).

At this step the client IO fails with input/output errors.
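The steps above amount to a rolling upgrade of a 4+2 disperse volume with continuous client IO. A sketch of the procedure follows; it assumes a running 6-node trusted pool, and the volume name "ecv", the node names, the brick path, and the package spec are all illustrative, not taken from the report:

```shell
#!/bin/sh
# Sketch only: requires an existing 6-node gluster pool; names are illustrative.

# 1. Create and start a 4+2 disperse (EC) volume across 6 nodes.
gluster volume create ecv disperse 6 redundancy 2 \
    node1:/bricks/ecv node2:/bricks/ecv node3:/bricks/ecv \
    node4:/bricks/ecv node5:/bricks/ecv node6:/bricks/ecv force
gluster volume start ecv

# 2. On a client, keep a kernel untar running throughout, e.g.:
#    tar xf linux-4.11.7.tar.xz -C /mnt/ecv &

# 3. Upgrade node #1 and node #2: kill the brick and client processes,
#    stop glusterd, upgrade the rpms, then start glusterd again.
for node in node1 node2; do
    ssh "$node" 'pkill glusterfsd; pkill glusterfs;
                 systemctl stop glusterd;
                 yum -y update "glusterfs*";
                 systemctl start glusterd'
done

# 4. Wait for self-heal to finish before touching the next node
#    (repeat until no entries are pending).
gluster volume heal ecv info

# 5./6. With the untar still running, take node #3 down the same way;
#       on the affected builds the client IO now fails with EIO.
ssh node3 'pkill glusterfsd; pkill glusterfs; systemctl stop glusterd'
```

With 4+2 erasure coding the volume should tolerate two bricks down, so the untar is expected to keep running at step 6; the EIO failures are what make this a bug.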
REVIEW: https://review.gluster.org/17703 (cluster/ec: Non-disruptive upgrade on EC volume fails) posted (#8) for review on master by Sunil Kumar Acharya (sheggodu)
REVIEW: https://review.gluster.org/17703 (cluster/ec: Non-disruptive upgrade on EC volume fails) posted (#9) for review on master by Sunil Kumar Acharya (sheggodu)
COMMIT: https://review.gluster.org/17703 committed in master by Pranith Kumar Karampuri (pkarampu)
------
commit d2650feb4bfadf3fb0cdb90236bc78c33b5ea451
Author: Sunil Kumar Acharya <sheggodu>
Date:   Wed Jul 5 16:41:38 2017 +0530

    cluster/ec: Non-disruptive upgrade on EC volume fails

    Problem:
    Enabling optimistic changelog on EC volume was not handling
    node down scenarios appropriately resulting in volume data
    inaccessibility.

    Solution:
    Update dirty xattr appropriately on good bricks whenever nodes
    are down. This would fix the metadata information as part of
    heal and thus ensures data accessibility.

    BUG: 1468261
    Change-Id: I08b0d28df386d9b2b49c3de84b4aac1c729ac057
    Signed-off-by: Sunil Kumar Acharya <sheggodu>
    Reviewed-on: https://review.gluster.org/17703
    Smoke: Gluster Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
    Reviewed-by: Pranith Kumar Karampuri <pkarampu>
This bug is being closed because a release that should address the reported issue has been made available. If the problem is still not fixed with glusterfs-3.12.0, please open a new bug report. glusterfs-3.12.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/announce/2017-September/000082.html
[2] https://www.gluster.org/pipermail/gluster-users/