+++ This bug was initially created as a clone of Bug #1448386 +++ Description of problem: ======================= While running geo-replication sanity check which does following fop's (create,chmod,chown,chgrp,hardlink,symlink,truncate,rename,remove) observed the following worker crash. Eventually the checksum matches at master and slave, and hence do not know after which fop or crawl this is observed. The crash is only seen once and worker became online post that. [2017-05-04 17:24:29.679775] I [gsyncdstatus(/bricks/brick0/master_brick1):276:set_passive] GeorepStatus: Worker Status: Passive [2017-05-04 17:24:30.635793] I [master(/bricks/brick1/master_brick4):1195:crawl] _GMaster: slave's time: (1493917690, 0) [2017-05-04 17:24:35.496570] E [syncdutils(/bricks/brick1/master_brick4):296:log_raise_exception] <top>: FAIL: Traceback (most recent call last): File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 326, in twrap tf(*aa) File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1575, in syncjob po = self.sync_engine(pb, self.log_err) File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 1702, in rsync log_err=log_err) File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 56, in sup sys._getframe(1).f_code.co_name)(*a, **kw) File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 1025, in rsync "log_rsync_performance", default_value=False)) File "/usr/libexec/glusterfs/python/syncdaemon/configinterface.py", line 264, in get_realtime return self.get(opt, printValue=False, default_value=default_value) File "/usr/libexec/glusterfs/python/syncdaemon/configinterface.py", line 369, in get self.update_to(d, allow_unresolved=True) File "/usr/libexec/glusterfs/python/syncdaemon/configinterface.py", line 359, in update_to update_from_sect(sect, MultiDict(dct, mad, *self.auxdicts)) File "/usr/libexec/glusterfs/python/syncdaemon/configinterface.py", line 343, in update_from_sect dct[k] = Template(v).safe_substitute(mud) File "/usr/lib64/python2.7/string.py", line 205, in safe_substitute return self.pattern.sub(convert, self.template) TypeError: expected string or buffer [2017-05-04 17:24:35.563095] I [syncdutils(/bricks/brick1/master_brick4):237:finalize] <top>: exiting. [2017-05-04 17:24:35.572370] I [repce(/bricks/brick1/master_brick4):92:service_loop] RepceServer: terminating on reaching EOF. [2017-05-04 17:24:35.573046] I [syncdutils(/bricks/brick1/master_brick4):237:finalize] <top>: exiting. Version-Release number of selected component (if applicable): ============================================================= glusterfs-geo-replication-3.8.4-24.el7rhgs.x86_64 Steps to Reproduce: ================== Do not know the exact steps since it was seen in the automation run. Will work to find out the specific steps and update this space later. Actual results: ============== Worker crashed and then came online. Arequal checksum between master and slave matches. Expected results: ================= Worker should not crash. --- Additional comment from Aravinda VK on 2017-06-06 08:48:33 EDT --- Easy reproducer: cd /usr/libexec/glusterfs/python/syncdaemon/ python from configinterface import GConffile conf = GConffile("/var/lib/glusterd/geo-replication/gsyncd_template.conf", ["master", "slave"], {}) print conf.get() conf.set("log-rsync-performance", 10) print conf.get("log-rsync-performance") Above script fails with the same traceback. RCA: We are not restarting the worker for Some of the configuration changes(For example, log-rsync-performance), this causes non string value passed as template. Worker restart fixes this since it reads config values from file as string instead of actual type.
REVIEW: https://review.gluster.org/17489 (geo-rep: Fix ConfigInterface Template issue) posted (#1) for review on master by Aravinda VK (avishwan)
COMMIT: https://review.gluster.org/17489 committed in master by Aravinda VK (avishwan) ------ commit 513984ad90531c53fcb7d6f0d581f198a6afcf93 Author: Aravinda VK <avishwan> Date: Tue Jun 6 17:59:59 2017 +0530 geo-rep: Fix ConfigInterface Template issue ConfigParser uses string Template to substitute the dynamic values for config. For some of the configurations, Geo-rep worker will not restart. Due to this conf object may have non string values. If val is not string in Template(val), then it fails with "TypeError: expected string or buffer" BUG: 1459620 Change-Id: I25b8bbc1df42f6f29e9563a55b3e27a228321c44 Signed-off-by: Aravinda VK <avishwan> Reviewed-on: https://review.gluster.org/17489 Smoke: Gluster Build System <jenkins.org> NetBSD-regression: NetBSD Build System <jenkins.org> CentOS-regression: Gluster Build System <jenkins.org> Reviewed-by: Kotresh HR <khiremat>
REVIEW: https://review.gluster.org/17503 (geo-rep: Fix string format issue caused due to #17489) posted (#1) for review on master by Aravinda VK (avishwan)
COMMIT: https://review.gluster.org/17503 committed in master by Aravinda VK (avishwan) ------ commit 778ad0e2bbfe60db32df460590e0c3596fdf1aa5 Author: Aravinda VK <avishwan> Date: Mon Jun 12 11:05:27 2017 +0530 geo-rep: Fix string format issue caused due to #17489 With Patch #17489, values from Geo-rep config always represented as Unicode string, which is not compatible with rest of the code. Changed the format with this patch to fix the issue. BUG: 1459620 Change-Id: I935fca0d24f02e90757f688f92ef73fad9f9b8e1 Signed-off-by: Aravinda VK <avishwan> Reviewed-on: https://review.gluster.org/17503 NetBSD-regression: NetBSD Build System <jenkins.org> CentOS-regression: Gluster Build System <jenkins.org> Smoke: Gluster Build System <jenkins.org> Reviewed-by: Kotresh HR <khiremat>
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.12.0, please open a new bug report. glusterfs-3.12.0 has been announced on the Gluster mailinglists [1], packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailinglist [2] and the update infrastructure for your distribution. [1] http://lists.gluster.org/pipermail/announce/2017-September/000082.html [2] https://www.gluster.org/pipermail/gluster-users/