Bug 1347686
Summary: IO error seen with rolling or non-disruptive upgrade of a distributed-disperse (EC) volume from 3.7.5 to 3.7.9

| Field | Value | Field | Value |
| --- | --- | --- | --- |
| Product | [Community] GlusterFS | Reporter | Ashish Pandey <aspandey> |
| Component | disperse | Assignee | Ashish Pandey <aspandey> |
| Status | CLOSED CURRENTRELEASE | QA Contact | |
| Severity | urgent | Docs Contact | |
| Priority | unspecified | | |
| Version | mainline | CC | aspandey, bugs, mzywusko, nchilaka, pkarampu |
| Target Milestone | --- | Keywords | ZStream |
| Target Release | --- | | |
| Hardware | Unspecified | | |
| OS | Unspecified | | |
| Whiteboard | | | |
| Fixed In Version | glusterfs-3.9.0 | Doc Type | If docs needed, set a value |
| Doc Text | | Story Points | --- |
| Clone Of | 1347251 | | |
| Clones | 1360152, 1360174 | Environment | |
| Last Closed | 2017-03-27 18:28:06 UTC | Type | Bug |
| Regression | --- | Mount Type | --- |
| Documentation | --- | CRM | |
| Verified Versions | | Category | --- |
| oVirt Team | --- | RHEL 7.3 requirements from Atomic Host | |
| Cloudforms Team | --- | Target Upstream Version | |
| Embargoed | | | |
| Bug Depends On | 1347251 | | |
| Bug Blocks | 1360152, 1360174 | | |
Comment 1
Ashish Pandey
2016-06-17 12:10:17 UTC
In glusterfs 3.7.5, features/locks did not return the lock count in xdata that ec requested. To solve a hang issue, the code was modified so that if the inodelk count is requested in xdata, features/locks returns it in xdata. As a result, with glusterfs 3.7.9, ec receives the inodelk count in xdata from features/locks.

This issue arises when we do a rolling update from 3.7.5 to 3.7.9. For a 4+2 volume running 3.7.5, if we update 2 nodes and, after heal completion, kill 2 of the older nodes, this problem can be seen. After the update and killing of bricks, the 2 updated nodes will return the inodelk count while the 2 older nodes will not contain it. During the dictionary match (ec_dict_compare) this leads to a mismatch of answers, and the file operation on the mount point fails with an IO error.

REVIEW: http://review.gluster.org/14761 (cluster/ec: Match xdata key if present in both dicts) posted (#1) for review on master by Ashish Pandey (aspandey)

REVIEW: http://review.gluster.org/14761 (cluster/ec: Match xdata key if present in both dicts) posted (#2) for review on master by Ashish Pandey (aspandey)

REVIEW: http://review.gluster.org/14761 (cluster/ec: Handle absence of keys in some callback dict) posted (#3) for review on master by Ashish Pandey (aspandey)

COMMIT: http://review.gluster.org/14761 committed in master by Pranith Kumar Karampuri (pkarampu)

------

commit 558a45fa527b01ec81904150532a8b661c06ae8a
Author: Ashish Pandey <aspandey>
Date:   Fri Jun 17 17:52:56 2016 +0530

    cluster/ec: Handle absence of keys in some callback dict

    Problem:
    This issue arises when we do a rolling update from 3.7.5 to 3.7.9.
    For a 4+2 volume running 3.7.5, if we update 2 nodes and, after heal
    completion, kill 2 of the older nodes, this problem can be seen.
    After the update and killing of bricks, 2 nodes will return the
    inodelk count key in the dict while the other 2 nodes will not have
    it. This is also true for get-link-count. During the dictionary
    match, ec_dict_compare, this leads to a mismatch of answers and the
    file operation on the mount point fails with an IO error.

    Solution:
    Don't match the inode, entry and link count keys while comparing two
    dictionaries. However, while combining the data in ec_dict_combine,
    go through all the dictionaries and select the maximum values
    received in different dicts for these keys.

    Change-Id: I33546e3619fe8f909286ee48fb0df2009cd3d22f
    BUG: 1347686
    Signed-off-by: Ashish Pandey <aspandey>
    Reviewed-on: http://review.gluster.org/14761
    Reviewed-by: Xavier Hernandez <xhernandez>
    Smoke: Gluster Build System <jenkins.org>
    Reviewed-by: Pranith Kumar Karampuri <pkarampu>
    CentOS-regression: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>

This bug is being closed because a release has been made available that should address the reported issue. If the problem is still not fixed with glusterfs-3.9.0, please open a new bug report.

glusterfs-3.9.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/gluster-users/2016-November/029281.html
[2] https://www.gluster.org/pipermail/gluster-users/
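For illustration, below is a minimal standalone sketch of the compare/combine behavior the commit describes. This is not the actual Gluster implementation: the real fix operates on dict_t inside ec_dict_compare()/ec_dict_combine() in the ec xlator, and the mini_dict_t type and key names here ("inodelk-count", "entrylk-count", "link-count") are simplified stand-ins for the real dictionary API and xdata keys.

```c
/* Sketch only: a "dict" reduced to a small key/value array so the
 * compare/combine logic is visible in isolation. */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define MAX_KEYS 8

typedef struct {
    const char *keys[MAX_KEYS];
    uint64_t    vals[MAX_KEYS];
    int         count;
} mini_dict_t;

/* Keys whose presence may legitimately differ across bricks of mixed
 * versions: 3.7.5 bricks never send them, 3.7.9 bricks do. Names are
 * illustrative stand-ins for the real xdata keys. */
static bool is_count_key(const char *key)
{
    return strcmp(key, "inodelk-count") == 0 ||
           strcmp(key, "entrylk-count") == 0 ||
           strcmp(key, "link-count") == 0;
}

static bool dict_get(const mini_dict_t *d, const char *key, uint64_t *val)
{
    for (int i = 0; i < d->count; i++) {
        if (strcmp(d->keys[i], key) == 0) {
            *val = d->vals[i];
            return true;
        }
    }
    return false;
}

/* Every non-count key of 'a' must exist in 'b' with the same value. */
static bool keys_subset_match(const mini_dict_t *a, const mini_dict_t *b)
{
    for (int i = 0; i < a->count; i++) {
        if (is_count_key(a->keys[i]))
            continue; /* the fix: don't require count keys on both sides */
        uint64_t v;
        if (!dict_get(b, a->keys[i], &v) || v != a->vals[i])
            return false;
    }
    return true;
}

/* Before the fix, a key present in one reply but absent from the other
 * made the answers mismatch, so the fop failed with EIO. */
static bool answers_match(const mini_dict_t *a, const mini_dict_t *b)
{
    return keys_subset_match(a, b) && keys_subset_match(b, a);
}

/* While combining matched answers, keep the maximum value seen for each
 * count key across replies, so bricks that did send it still contribute. */
static void combine_max(mini_dict_t *dst, const mini_dict_t *src)
{
    for (int i = 0; i < src->count; i++) {
        if (!is_count_key(src->keys[i]))
            continue;
        uint64_t cur;
        if (dict_get(dst, src->keys[i], &cur)) {
            if (src->vals[i] > cur)
                for (int j = 0; j < dst->count; j++)
                    if (strcmp(dst->keys[j], src->keys[i]) == 0)
                        dst->vals[j] = src->vals[i];
        } else if (dst->count < MAX_KEYS) {
            dst->keys[dst->count] = src->keys[i];
            dst->vals[dst->count] = src->vals[i];
            dst->count++;
        }
    }
}

int main(void)
{
    /* Reply from an updated (3.7.9) brick: carries the count key. */
    mini_dict_t new_brick = { .keys = { "trusted.ec.size", "inodelk-count" },
                              .vals = { 4096, 1 }, .count = 2 };
    /* Reply from an old (3.7.5) brick: same data, no count key. */
    mini_dict_t old_brick = { .keys = { "trusted.ec.size" },
                              .vals = { 4096 }, .count = 1 };

    printf("match: %s\n", answers_match(&new_brick, &old_brick) ? "yes" : "no");

    combine_max(&old_brick, &new_brick);
    uint64_t c = 0;
    dict_get(&old_brick, "inodelk-count", &c);
    printf("combined inodelk-count: %llu\n", (unsigned long long)c);
    return 0;
}
```

With the count keys excluded from matching, the two replies above agree ("match: yes") instead of producing the EIO, and the combined answer still reports inodelk-count = 1 because the maximum across replies is kept.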