Bug 1239044
Summary: | [geo-rep]: killing brick from replica pair makes geo-rep session faulty with Traceback "ChangelogException" | | |
---|---|---|---|
Product: | [Community] GlusterFS | Reporter: | Kotresh HR <khiremat> |
Component: | geo-replication | Assignee: | Kotresh HR <khiremat> |
Status: | CLOSED CURRENTRELEASE | QA Contact: | |
Severity: | urgent | Docs Contact: | |
Priority: | unspecified | | |
Version: | mainline | CC: | bugs, chrisw, csaba, gluster-bugs, nlevinki, nsathyan, rcyriac, rhinduja |
Target Milestone: | --- | Keywords: | Reopened, ZStream |
Target Release: | --- | | |
Hardware: | x86_64 | | |
OS: | Linux | | |
Whiteboard: | | | |
Fixed In Version: | glusterfs-3.8rc2 | Doc Type: | Bug Fix |
Doc Text: | | Story Points: | --- |
Clone Of: | 1236546 | | |
: | 1247882 | Environment: | |
Last Closed: | 2016-06-16 13:19:46 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | | | |
Bug Depends On: | 1236546 | | |
Bug Blocks: | 1236554, 1247882 | | |
Description
Kotresh HR
2015-07-03 11:01:33 UTC
I found the reason for the first-time failure. The register time is the end time we pass to the history API. The PASSIVE worker registers much earlier, at the same time as the ACTIVE worker, and the start time it passes is the stime, so register time < stime. The history API therefore receives start time > end time, which obviously fails. When the worker registers a second time, register time > stime, and hence the call passes. There are no side effects with respect to DATA sync; the worker just goes down and comes back. We will fix this, but it is definitely not a BLOCKER.

REVIEW: http://review.gluster.org/11524 (geo-rep: Fix history failure) posted (#1) for review on master by Kotresh HR (khiremat)

REVIEW: http://review.gluster.org/11524 (geo-rep: Fix history failure) posted (#2) for review on master by Kotresh HR (khiremat)

COMMIT: http://review.gluster.org/11524 committed in master by Venky Shankar (vshankar)
------
commit 62c2e7f8b9211ba149368d26f772f175fe51b43b
Author: Kotresh HR <khiremat>
Date: Fri Jul 3 16:32:56 2015 +0530

    geo-rep: Fix history failure

    Both ACTIVE and PASSIVE workers register to changelog at almost
    the same time. When a PASSIVE worker becomes ACTIVE, the start and
    end times for the history API would be the current stime and
    register_time respectively. Hence register_time would be less than
    stime, for which history obviously fails. But it will be successful
    on the next restart, as the new register_time > stime.

    The fix is to pass the current time as the end time to the history
    call instead of the register_time.

    Also improved the logging for ACTIVE/PASSIVE switching.

    Change-Id: Idc08b4b55c7a4c575ba44918a98389164ccbee8f
    BUG: 1239044
    Signed-off-by: Kotresh HR <khiremat>
    Reviewed-on: http://review.gluster.org/11524
    Tested-by: Gluster Build System <jenkins.com>
    Tested-by: NetBSD Build System <jenkins.org>
    Reviewed-by: Aravinda VK <avishwan>
    Reviewed-by: Venky Shankar <vshankar>

The fix for this BZ is already present in a GlusterFS release. You can find a clone of this BZ that was fixed in a GlusterFS release and closed.
Hence closing this mainline BZ as well.

This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.8.0, please open a new bug report.

glusterfs-3.8.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://blog.gluster.org/2016/06/glusterfs-3-8-released/
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user
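
The failure described in the first comment and in the commit message comes down to an inverted time window. The following is a minimal sketch of that reasoning only, not the actual gsyncd or changelog code: `history_window`, `end_time_before_fix`, `end_time_after_fix`, and `HistoryRangeError` are hypothetical names introduced for illustration, and the timestamps are made-up values.

```python
import time


class HistoryRangeError(Exception):
    """Hypothetical error: the history window has start > end."""


def history_window(start, end):
    # Stand-in for the history call: a window whose start lies after
    # its end cannot be served, which is the failure in the report.
    if start > end:
        raise HistoryRangeError("start %d > end %d" % (start, end))
    return (start, end)


def end_time_before_fix(register_time):
    # Old behaviour per the report: the end time is the register time,
    # captured when the worker registered (while it was still PASSIVE).
    return register_time


def end_time_after_fix(now=None):
    # Fixed behaviour per the commit message: the end time is the
    # current time, which cannot be older than an stime already synced.
    return int(time.time()) if now is None else now


if __name__ == "__main__":
    register_time = 1000  # PASSIVE worker registered early (assumed value)
    stime = 1500          # ACTIVE peer advanced stime meanwhile (assumed value)
    now = 2000            # moment the PASSIVE worker becomes ACTIVE (assumed value)

    try:
        history_window(stime, end_time_before_fix(register_time))
    except HistoryRangeError as exc:
        print("before fix:", exc)

    print("after fix:", history_window(stime, end_time_after_fix(now)))
```

With these assumed values the pre-fix call is rejected because 1500 > 1000, while the post-fix call yields a valid window. This also matches the observation that a second registration succeeds, since the new register time is then later than stime.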