+++ This bug was initially created as a clone of Bug #1575490 +++

Description of problem:
=======================
While upgrading from gluster version 3.8 to 3.12, the geo-replication session went FAULTY, with only one worker ACTIVE.

[root@dhcp42-53 master]# gluster volume geo-replication master 10.70.42.164::slave status

MASTER NODE     MASTER VOL    MASTER BRICK      SLAVE USER    SLAVE                  SLAVE NODE      STATUS    CRAWL STATUS     LAST_SYNCED
-------------------------------------------------------------------------------------------------------------------------------------------
10.70.42.53     master        /rhs/brick1/b1    root          10.70.42.164::slave    N/A             Faulty    N/A              N/A
10.70.42.53     master        /rhs/brick2/b4    root          10.70.42.164::slave    N/A             Faulty    N/A              N/A
10.70.42.138    master        /rhs/brick1/b3    root          10.70.42.164::slave    10.70.42.164    Active    History Crawl    N/A
10.70.42.138    master        /rhs/brick2/b6    root          10.70.42.164::slave    N/A             Faulty    N/A              N/A
10.70.42.160    master        /rhs/brick1/b2    root          10.70.42.164::slave    N/A             Faulty    N/A              N/A
10.70.42.160    master        /rhs/brick2/b5    root          10.70.42.164::slave    N/A             Faulty    N/A              N/A

Traceback in geo-rep logs:
--------------------------------
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 210, in main
    main_i()
  File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 802, in main_i
    local.service_loop(*[r for r in [remote] if r])
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 1676, in service_loop
    g3.crawlwrap(oneshot=True)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 597, in crawlwrap
    self.crawl()
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1470, in crawl
    self.changelogs_batch_process(changes)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1370, in changelogs_batch_process
    self.process(batch)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1204, in process
    self.process_change(change, done, retry)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1123, in process_change
    entry_stime_to_update[0])
  File "/usr/libexec/glusterfs/python/syncdaemon/gsyncdstatus.py", line 200, in set_field
    return self._update(merger)
  File "/usr/libexec/glusterfs/python/syncdaemon/gsyncdstatus.py", line 161, in _update
    data = mergerfunc(data)
  File "/usr/libexec/glusterfs/python/syncdaemon/gsyncdstatus.py", line 194, in merger
    if data[key] == value:
KeyError: 'last_synced_entry'

Version-Release number of selected component (if applicable):
=============================================================

How reproducible:
=================
1/1

Actual results:
===============
Session is FAULTY.

Expected results:
=================
Session should not be FAULTY.

--- Additional comment from Worker Ant on 2018-05-07 02:06:22 EDT ---

REVIEW: https://review.gluster.org/19969 (geo-rep: Fix upgrade issue) posted (#1) for review on master by Kotresh HR

--- Additional comment from Worker Ant on 2018-05-07 06:17:41 EDT ---

COMMIT: https://review.gluster.org/19969 committed in master by "Aravinda VK" <avishwan> with a commit message- geo-rep: Fix upgrade issue

Cause and Analysis:
The last synced changelog for entry operations is marked in current version to avoid re-processing of already processed entry operations in a batch during crash/restart of geo-rep. This was not present in previous versions. The marker is maintained in the dictionary with the key 'last_synced_entry' and dictionary is persisted into status file. So upgrading to current version in which the marker is present was failing with KeyError.

Solution:
Load the dictionary with default keys first which contains all the keys including latest ones and then load the values from status file instead of doing otherwise.
fixes: bz#1575490
Change-Id: Ic654e6f9a3c97f616761f1362f890352a2186fb4
Signed-off-by: Kotresh HR <khiremat>

--- Additional comment from Worker Ant on 2018-05-14 23:03:58 EDT ---

REVISION POSTED: https://review.gluster.org/20018 (geo-rep: Fix upgrade issue) posted (#2) for review on release-3.12 by Kotresh HR
REVIEW: https://review.gluster.org/20606 (geo-rep: Fix upgrade issue) posted (#2) for review on release-4.1 by Kotresh HR
COMMIT: https://review.gluster.org/20606 committed in release-4.1 by "Shyamsundar Ranganathan" <srangana> with a commit message- geo-rep: Fix upgrade issue

Cause and Analysis:
The last synced changelog for entry operations is marked in current version to avoid re-processing of already processed entry operations in a batch during crash/restart of geo-rep. This was not present in previous versions. The marker is maintained in the dictionary with the key 'last_synced_entry' and dictionary is persisted into status file. So upgrading to current version in which the marker is present was failing with KeyError.

Solution:
Load the dictionary with default keys first which contains all the keys including latest ones and then load the values from status file instead of doing otherwise.

Backport of:
> BUG: 1575490
> Change-Id: Ic654e6f9a3c97f616761f1362f890352a2186fb4
> Signed-off-by: Kotresh HR <khiremat>
(cherry picked from commit 23c1385b5f6f6103e820d15ecfe1df31940fdb45)

fixes: bz#1611104
Change-Id: Ic654e6f9a3c97f616761f1362f890352a2186fb4
Signed-off-by: Kotresh HR <khiremat>
(cherry picked from commit 23c1385b5f6f6103e820d15ecfe1df31940fdb45)
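For readers following the fix, here is a minimal Python sketch of the load-order change the commit message describes. It is not the actual gsyncdstatus.py code. The JSON status format, the DEFAULT_STATUS contents, and the helper names (load_status_old, load_status_fixed) are assumptions made for illustration. Only the key 'last_synced_entry' and the comparison `data[key] == value` come from the report above.

```python
import json

# Hypothetical defaults. Only 'last_synced_entry' is named in the report;
# the other key is a stand-in for fields that older versions already wrote.
DEFAULT_STATUS = {
    "last_synced": 0,
    "last_synced_entry": 0,
}


def load_status_old(raw):
    """Pre-fix ordering (assumed): use only what the status file contains."""
    return json.loads(raw)


def load_status_fixed(raw):
    """Post-fix ordering: start from the defaults, then overlay file values."""
    data = dict(DEFAULT_STATUS)
    data.update(json.loads(raw))
    return data


def set_field(data, key, value):
    """Mimics the merger in the traceback: compare first, then update."""
    if data[key] == value:  # raises KeyError if the key was never loaded
        return False
    data[key] = value
    return True


# A status file persisted by an older version has no 'last_synced_entry'.
old_file = json.dumps({"last_synced": 1525673182})

try:
    set_field(load_status_old(old_file), "last_synced_entry", 5)
except KeyError as err:
    print("old load order fails with KeyError:", err)

status = load_status_fixed(old_file)
set_field(status, "last_synced_entry", 5)
print("fixed load order:", status)
```

Overlaying persisted values on top of the defaults means any key added in a later version is present after an upgrade, and values already written by the older version are preserved.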
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-4.1.3, please open a new bug report.

glusterfs-4.1.3 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] https://lists.gluster.org/pipermail/announce/2018-August/000111.html
[2] https://www.gluster.org/pipermail/gluster-users/