+++ This bug was initially created as a clone of Bug #1054154 +++

Description of problem:
gsyncd crashed in syncdutils.py while removing a file. I have observed this crash many times, while removing different files.

Python backtrace:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
[2014-01-16 15:20:54.420363] I [master(/bricks/master_brick1):451:crawlwrap] _GMaster: 20 crawls, 0 turns
[2014-01-16 15:21:37.910284] E [syncdutils(/bricks/master_brick1):240:log_raise_exception] <top>: FAIL:
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 150, in main
    main_i()
  File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 540, in main_i
    local.service_loop(*[r for r in [remote] if r])
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 1157, in service_loop
    g2.crawlwrap()
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 476, in crawlwrap
    time.sleep(self.sleep_interval)
  File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 331, in <lambda>
    def set_term_handler(hook=lambda *a: finalize(*a, **{'exval': 1})):
  File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 184, in finalize
    shutil.rmtree(gconf.ssh_ctl_dir)
  File "/usr/lib64/python2.6/shutil.py", line 217, in rmtree
    onerror(os.remove, fullname, sys.exc_info())
  File "/usr/lib64/python2.6/shutil.py", line 215, in rmtree
    os.remove(fullname)
OSError: [Errno 2] No such file or directory: '/tmp/gsyncd-aux-ssh-8CWIhl/061fc87d252b63093ab9bfb765588973.sock'
[2014-01-16 15:21:37.911117] E [syncdutils(/bricks/master_brick1):223:log_raise_exception] <top>: connection to peer is broken
[2014-01-16 15:21:37.917700] E [resource(/bricks/master_brick1):204:errlog] Popen: command "ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem -oControlMaster=auto -S /tmp/gsyncd-aux-ssh-8CWIhl/061fc87d252b63093ab9bfb765588973.sock root.43.174 /nonexistent/gsyncd --session-owner 47fa81ef-44a3-4fb6-b58e-cb4a81fa5b44 -N --listen --timeout 120 gluster://localhost:slave" returned with 255, saying:
[2014-01-16 15:21:37.918075] E [resource(/bricks/master_brick1):207:logerr] Popen: ssh> [2014-01-15 12:33:49.858181] I [socket.c:3505:socket_init] 0-glusterfs: SSL support is NOT enabled
[2014-01-16 15:21:37.918354] E [resource(/bricks/master_brick1):207:logerr] Popen: ssh> [2014-01-15 12:33:49.858259] I [socket.c:3520:socket_init] 0-glusterfs: using system polling thread
[2014-01-16 15:21:37.918692] E [resource(/bricks/master_brick1):207:logerr] Popen: ssh> [2014-01-15 12:33:49.859676] I [socket.c:3505:socket_init] 0-glusterfs: SSL support is NOT enabled
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

How reproducible:
Doesn't happen every time.

Steps to Reproduce:
Exact steps are not known.
1. Create and start a geo-rep session between master and slave.
2. Start creating files on master and slave.
3. Check the geo-rep logs.

Actual results:
gsyncd crashed while removing a file.

Expected results:
gsyncd should never crash.
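The traceback shows the teardown path: finalize() calls shutil.rmtree on the ssh control directory, but the ssh ControlMaster socket has already disappeared by the time os.remove reaches it, and rmtree's default error handling re-raises the OSError, taking gsyncd down. Below is a minimal sketch of the core failure, reduced to the failing system call; the temp directory and "example.sock" name are illustrative stand-ins for gconf.ssh_ctl_dir and the aux-ssh socket, not the gsyncd code itself:

    import errno
    import os
    import tempfile

    # Stand-in for gconf.ssh_ctl_dir; in the real crash, the socket entry
    # vanishes between rmtree listing the directory and unlinking it.
    ctl_dir = tempfile.mkdtemp(prefix="gsyncd-aux-ssh-")
    sock = os.path.join(ctl_dir, "example.sock")  # hypothetical socket path

    try:
        os.remove(sock)  # the socket is already gone, as in the traceback
    except OSError as e:
        # This is the error rmtree's default onerror re-raises in finalize().
        print(e.errno == errno.ENOENT)  # True: [Errno 2] No such file or directory

    os.rmdir(ctl_dir)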
REVIEW: http://review.gluster.org/9792 (geo-rep: Handle ENOENT during cleanup) posted (#1) for review on master by Aravinda VK (avishwan)
REVIEW: http://review.gluster.org/9792 (geo-rep: Handle ENOENT during cleanup) posted (#2) for review on master by Aravinda VK (avishwan)
COMMIT: http://review.gluster.org/9792 committed in master by Vijay Bellur (vbellur)
------
commit 2452d284b38061981d7fbd7e5a7bd15808d13c21
Author: Aravinda VK <avishwan>
Date:   Tue Mar 3 17:22:30 2015 +0530

    geo-rep: Handle ENOENT during cleanup

    shutil.rmtree was failing when a file to be removed no longer
    existed. Added an error-handling function that ignores ENOENT
    if a file/dir is not present.

    BUG: 1198101
    Change-Id: I1796db2642f81d9e2b5e52c6be34b4ad6f1c9786
    Signed-off-by: Aravinda VK <avishwan>
    Reviewed-on: http://review.gluster.org/9792
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Prashanth Pai <ppai>
    Reviewed-by: Venky Shankar <vshankar>
    Reviewed-by: Kotresh HR <khiremat>
    Reviewed-by: Saravanakumar Arumugam <sarumuga>
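For reference, the approach the commit message describes can be sketched as an onerror callback passed to shutil.rmtree that swallows only ENOENT. This is an illustrative reduction, not the actual patch: handle_rm_error and the temp-directory path are made-up names standing in for gsyncd's helper and gconf.ssh_ctl_dir.

    import errno
    import shutil
    import tempfile

    def handle_rm_error(func, path, exc_info):
        # Hypothetical onerror callback: entries that vanished between the
        # directory listing and removal (ENOENT) are ignored; any other
        # error is re-raised so genuine failures still surface.
        if exc_info[1].errno != errno.ENOENT:
            raise exc_info[1]

    ctl_dir = tempfile.mkdtemp(prefix="gsyncd-aux-ssh-")  # stand-in for gconf.ssh_ctl_dir
    shutil.rmtree(ctl_dir, onerror=handle_rm_error)

With this handler, a control socket that ssh has already removed no longer aborts the cleanup, while permission errors and the like still propagate.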
This bug is being closed because a release that should address the reported issue has been made available. If the problem is still not fixed with glusterfs-3.7.0, please open a new bug report. glusterfs-3.7.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://thread.gmane.org/gmane.comp.file-systems.gluster.devel/10939
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user