Description of problem:

If the log-rsync-performance config is set using the following command, the workers restart, causing Changelogs that were already processed before the config change to be reprocessed:

gluster volume geo-replication <MASTER> <SLAVEHOST>::<SLAVEVOL> config log-rsync-performance true
REVIEW: http://review.gluster.org/15816 (geo-rep: Do not restart workers when log-rsync-performance config change) posted (#1) for review on master by Aravinda VK (avishwan)
REVIEW: http://review.gluster.org/15816 (geo-rep: Do not restart workers when log-rsync-performance config change) posted (#2) for review on master by Aravinda VK (avishwan)
REVIEW: http://review.gluster.org/15816 (geo-rep: Do not restart workers when log-rsync-performance config change) posted (#3) for review on master by Aravinda VK (avishwan)
COMMIT: http://review.gluster.org/15816 committed in master by Aravinda VK (avishwan)

------

commit a268e2865c21ec8d2b4fed26715e986cfcc66fad
Author: Aravinda VK <avishwan>
Date:   Thu Nov 10 12:35:30 2016 +0530

    geo-rep: Do not restart workers when log-rsync-performance config change

    Geo-rep restarts workers when any of the configurations changed. We
    don't need to restart workers if tunables like log-rsync-performance
    is modified.

    With this patch, Geo-rep workers will get new "log-rsync-performance"
    config automatically without restart.

    BUG: 1393678
    Change-Id: I40ec253892ea7e70c727fa5d3c540a11e891897b
    Signed-off-by: Aravinda VK <avishwan>
    Reviewed-on: http://review.gluster.org/15816
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
    Smoke: Gluster Build System <jenkins.org>
    Reviewed-by: Kotresh HR <khiremat>
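The no-restart behavior in this commit depends on reading the tunable at use time instead of caching it when the worker starts. A minimal standalone sketch of that pattern, not the actual gsyncd code: `get_realtime` and the ConfigParser-based file layout here are hypothetical stand-ins for gsyncd's config interface.

```python
import configparser
import os
import tempfile

def get_realtime(path, name, default=None):
    # Hypothetical helper: re-parse the config file on every call, so a
    # value changed on disk is seen immediately, without a worker restart.
    cp = configparser.ConfigParser()
    cp.read(path)
    return cp.get("global", name, fallback=default)

fd, path = tempfile.mkstemp()
os.close(fd)
with open(path, "w") as f:
    f.write("[global]\n")

# Tunable unset: the default is used, the "worker" keeps running.
assert get_realtime(path, "log_rsync_performance", default="false") == "false"

# An operator changes the config while the worker is running...
with open(path, "a") as f:
    f.write("log_rsync_performance = true\n")

# ...and the very next call picks up the new value: no restart needed.
assert get_realtime(path, "log_rsync_performance", default="false") == "true"
os.remove(path)
```

The design trade-off is a config-file re-read per use, which is cheap next to an rsync invocation and avoids tearing down and re-registering workers for a logging tunable.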
REVIEW: http://review.gluster.org/16102 (geo-rep: Fix log-rsync-performance config issue) posted (#1) for review on master by Aravinda VK (avishwan)
REVIEW: http://review.gluster.org/16102 (geo-rep: Fix log-rsync-performance config issue) posted (#2) for review on master by Aravinda VK (avishwan)
COMMIT: http://review.gluster.org/16102 committed in master by Aravinda VK (avishwan)

------

commit ff2a58d784bc20ccafab8183d82787ceb8ac471b
Author: Aravinda VK <avishwan>
Date:   Mon Dec 12 13:06:15 2016 +0530

    geo-rep: Fix log-rsync-performance config issue

    If the log-rsync-performance config is not set, gconf.get_realtime
    will return None. Added a default value of False if the config file
    doesn't have this option set.

    BUG: 1393678
    Change-Id: I89016ab480a16179db59913d635d8553beb7e14f
    Signed-off-by: Aravinda VK <avishwan>
    Reviewed-on: http://review.gluster.org/16102
    Smoke: Gluster Build System <jenkins.org>
    Tested-by: Kotresh HR <khiremat>
    Reviewed-by: Kotresh HR <khiremat>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
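The failure mode this commit fixes can be sketched in isolation. This is not the actual gsyncd source; `boolify` and `ConfigInterface` below only mirror the shape of syncdaemon's `syncdutils.boolify` and `gconf.configinterface` to show why an unset option needs `default_value=False` before it reaches boolify.

```python
def boolify(s):
    # Simplified stand-in for syncdutils.boolify: bools pass through,
    # anything else is treated as a string. Passing None raises
    # AttributeError ('NoneType' object has no attribute 'lower').
    if isinstance(s, bool):
        return s
    return s.lower() in ("1", "on", "yes", "true")

class ConfigInterface:
    """Hypothetical stand-in for gconf.configinterface."""
    def __init__(self, options):
        self.options = options

    def get_realtime(self, name, default_value=None):
        # Returns default_value when the option was never set.
        return self.options.get(name, default_value)

conf = ConfigInterface({})  # log_rsync_performance was never set

# Broken pattern (no default): boolify(None) crashes.
try:
    boolify(conf.get_realtime("log_rsync_performance"))
    crashed = False
except AttributeError:
    crashed = True
assert crashed

# Fixed pattern (this commit): the False default short-circuits boolify.
assert boolify(conf.get_realtime("log_rsync_performance",
                                 default_value=False)) is False
```

Setting the option explicitly also avoids the crash, since boolify then receives the string "true" or "false" rather than None.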
I believe this fix has been incorrectly backported to 3.9, or at least to the Ubuntu PPA of 3.9. Compare:

https://launchpadlibrarian.net/302065598/glusterfs_3.8.7-ubuntu1~xenial1_3.8.8-ubuntu1~xenial1.diff.gz
https://launchpadlibrarian.net/302850916/glusterfs_3.9.0-ubuntu1~xenial6_3.9.1-ubuntu1~xenial1.diff.gz

The former contains:

-        if gconf.log_rsync_performance:
+        log_rsync_performance = boolify(gconf.configinterface.get_realtime(
+            "log_rsync_performance", default_value=False))

but the latter doesn't have `default_value=False`:

-        if gconf.log_rsync_performance:
+        if boolify(gconf.configinterface.get_realtime(
+            "log_rsync_performance")):

So I'm running `glusterfs 3.9.1` (per `gluster --version`) from that PPA (3.9.1-ubuntu1~xenial1 to be precise), and I get this error (attached so that people can Google it):

[2017-01-19 03:44:47.201340] I [monitor(monitor):273:monitor] Monitor: starting gsyncd worker(/gluster-brick/brick1/gv0). Slave node: ssh://root@mymachine:gluster://localhost:gv0-geo-sfo2
[2017-01-19 03:44:47.460453] I [changelogagent(/gluster-brick/brick1/gv0):73:__init__] ChangelogAgent: Agent listining...
[2017-01-19 03:44:54.361351] I [master(/gluster-brick/brick1/gv0):1323:register] _GMaster: Working dir: /var/lib/misc/glusterfsd/gv0/ssh%3A%2F%2Froot%40mymachine%3Agluster%3A%2F%2F127.0.0.1%3Agv0-geo-sfo2/e989bdc037f1478d9b2cc6e6ae3d3d0d
[2017-01-19 03:44:54.361701] I [resource(/gluster-brick/brick1/gv0):1584:service_loop] GLUSTER: Register time: 1484797494
[2017-01-19 03:44:54.376507] I [gsyncdstatus(/gluster-brick/brick1/gv0):264:set_active] GeorepStatus: Worker Status: Active
[2017-01-19 03:44:54.377803] I [gsyncdstatus(/gluster-brick/brick1/gv0):237:set_worker_crawl_status] GeorepStatus: Crawl Status: History Crawl
[2017-01-19 03:44:54.378166] I [master(/gluster-brick/brick1/gv0):1239:crawl] _GMaster: starting history crawl... turns: 1, stime: None, etime: 1484797494, entry_stime: None
[2017-01-19 03:44:54.378293] I [resource(/gluster-brick/brick1/gv0):1599:service_loop] GLUSTER: No stime available, using xsync crawl
[2017-01-19 03:44:54.385798] I [master(/gluster-brick/brick1/gv0):1348:crawl] _GMaster: starting hybrid crawl..., stime: None
[2017-01-19 03:44:54.387316] I [gsyncdstatus(/gluster-brick/brick1/gv0):237:set_worker_crawl_status] GeorepStatus: Crawl Status: Hybrid Crawl
[2017-01-19 03:44:55.388740] I [master(/gluster-brick/brick1/gv0):1358:crawl] _GMaster: processing xsync changelog /var/lib/misc/glusterfsd/gv0/ssh%3A%2F%2Froot%40mymachine%3Agluster%3A%2F%2F127.0.0.1%3Agv0-geo-sfo2/e989bdc037f1478d9b2cc6e6ae3d3d0d/xsync/XSYNC-CHANGELOG.1484797494
[2017-01-19 03:44:55.852421] E [syncdutils(/gluster-brick/brick1/gv0):296:log_raise_exception] <top>: FAIL:
Traceback (most recent call last):
  File "/usr/lib/x86_64-linux-gnu/glusterfs/python/syncdaemon/syncdutils.py", line 326, in twrap
    tf(*aa)
  File "/usr/lib/x86_64-linux-gnu/glusterfs/python/syncdaemon/master.py", line 1649, in syncjob
    po = self.sync_engine(pb, self.log_err)
  File "/usr/lib/x86_64-linux-gnu/glusterfs/python/syncdaemon/resource.py", line 1730, in rsync
    log_err=log_err)
  File "/usr/lib/x86_64-linux-gnu/glusterfs/python/syncdaemon/resource.py", line 56, in sup
    sys._getframe(1).f_code.co_name)(*a, **kw)
  File "/usr/lib/x86_64-linux-gnu/glusterfs/python/syncdaemon/resource.py", line 1041, in rsync
    "log_rsync_performance")):
  File "/usr/lib/x86_64-linux-gnu/glusterfs/python/syncdaemon/syncdutils.py", line 368, in boolify
    lstr = s.lower()
AttributeError: 'NoneType' object has no attribute 'lower'
[2017-01-19 03:44:55.854578] I [syncdutils(/gluster-brick/brick1/gv0):237:finalize] <top>: exiting.
[2017-01-19 03:44:55.860104] I [repce(/gluster-brick/brick1/gv0):92:service_loop] RepceServer: terminating on reaching EOF.
[2017-01-19 03:44:55.860390] I [syncdutils(/gluster-brick/brick1/gv0):237:finalize] <top>: exiting.
[2017-01-19 03:44:56.351736] I [monitor(monitor):349:monitor] Monitor: worker(/gluster-brick/brick1/gv0) died in startup phase
[2017-01-19 03:44:56.358608] I [gsyncdstatus(monitor):233:set_worker_status] GeorepStatus: Worker Status: Faulty

I can work around this bug by setting:

gluster volume geo-replication gv0 root@mymachine::gv0-geo config log_rsync_performance true

Could you check whether the fix for this issue was correctly applied to 3.9, and then release a new version? (I assume you are the ones maintaining that PPA.) Thanks!
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.10.0, please open a new bug report.

glusterfs-3.10.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/gluster-users/2017-February/030119.html
[2] https://www.gluster.org/pipermail/gluster-users/