Description of problem:
=======================

"OSError: [Errno 107] Transport endpoint is not connected" and "OSError: [Errno 117] Structure needs cleaning" tracebacks appear in the slave logs when "rm -rf" is issued from the master:

[2015-09-08 19:01:09.604484] E [repce(slave):117:worker] <top>: call failed:
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 113, in worker
    res = getattr(self.obj, rmeth)(*in_data[2:])
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 685, in entry_ops
    [], [ENOTEMPTY, ESTALE, ENODATA])
  File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 475, in errno_wrap
    return call(*arg)
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 662, in recursive_rmdir
    recursive_rmdir(gfid, entry, fullname)
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 667, in recursive_rmdir
    errno_wrap(os.rmdir, [path], [ENOENT, ESTALE])
  File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 475, in errno_wrap
    return call(*arg)
OSError: [Errno 107] Transport endpoint is not connected: '.gfid/4752e6d2-4bef-4319-a535-2fbf1fea55e7/level04/hardlink_to_files'
[2015-09-08 19:01:09.621318] I [repce(slave):92:service_loop] RepceServer: terminating on reaching EOF.
[2015-09-08 19:01:09.621871] I [syncdutils(slave):220:finalize] <top>: exiting.
[2015-09-08 19:01:20.188294] I [gsyncd(slave):649:main_i] <top>: syncing: gluster://localhost:slave
[2015-09-08 19:01:21.213108] I [gsyncd(slave):649:main_i] <top>: syncing: gluster://localhost:slave
[2015-09-08 19:01:21.355303] I [resource(slave):844:service_loop] GLUSTER: slave listening
[2015-09-08 19:01:22.369762] I [resource(slave):844:service_loop] GLUSTER: slave listening
[2015-09-08 19:01:36.427643] W [syncdutils(slave):486:errno_wrap] <top>: reached maximum retries (['b7e28dc5-f77a-45ff-b5ed-38fc596359a7', '.gfid/7fad9921-77c4-4806-8ed1-048df17c2fb7/level44', '.gfid/7fad9921-77c4-4806-8ed1-048df17c2fb7/level44'])...[Errno 39] Directory not empty: '.gfid/7fad9921-77c4-4806-8ed1-048df17c2fb7/level44'
[2015-09-08 19:01:36.428356] W [resource(slave):692:entry_ops] <top>: Recursive remove b7e28dc5-f77a-45ff-b5ed-38fc596359a7 => .gfid/7fad9921-77c4-4806-8ed1-048df17c2fb7/level44failed: Directory not empty
[2015-09-08 19:01:37.557725] E [repce(slave):117:worker] <top>: call failed:
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 113, in worker
    res = getattr(self.obj, rmeth)(*in_data[2:])
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 685, in entry_ops
    [], [ENOTEMPTY, ESTALE, ENODATA])
  File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 475, in errno_wrap
    return call(*arg)
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 659, in recursive_rmdir
    EISDIR])
  File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 475, in errno_wrap
    return call(*arg)
OSError: [Errno 117] Structure needs cleaning: '.gfid/b5a0188e-f6fa-4c67-81dd-66b4f68a330b/level14/symlink_to_files'
[2015-09-08 19:01:37.568291] I [repce(slave):92:service_loop] RepceServer: terminating on reaching EOF.

Eventually, after retries, all the files are removed from the slave.
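
The tracebacks in this report all pass through errno_wrap, which is called with lists of errno values (for example [ENOENT, ESTALE] for os.rmdir, and an empty list plus [ENOTEMPTY, ESTALE, ENODATA] for the recursive remove), and the log shows a "reached maximum retries" warning for ENOTEMPTY. The following is a minimal sketch, not the gsyncd code, of a wrapper with that shape. It assumes the first list holds errnos to ignore and the second holds errnos to retry; the name call_with_errno_policy, the retry count and the delay are hypothetical. Under that assumption, errno 107 and errno 117 are in neither list, so they propagate as an OSError, which would match the tracebacks and worker exits seen here.

```python
import errno
import os
import time


def call_with_errno_policy(call, args=(), ignore=(), retry=(),
                           max_tries=3, delay=0.0):
    """Hypothetical stand-in for an errno_wrap-style helper.

    - errno in `ignore`: swallow the error and return the errno value
    - errno in `retry`: try again, give up after `max_tries` attempts
    - anything else: re-raise, so the caller sees a traceback
    """
    tries = 0
    while True:
        try:
            return call(*args)
        except OSError as ex:
            if ex.errno in ignore:
                return ex.errno
            if ex.errno not in retry:
                raise
            tries += 1
            if tries >= max_tries:
                # Comparable to the "reached maximum retries" warning.
                return ex.errno
            time.sleep(delay)


def always_fails(code):
    """Build a callable that always raises OSError with the given errno."""
    def _fail(path):
        raise OSError(code, os.strerror(code), path)
    return _fail


if __name__ == "__main__":
    ignore = [errno.ENOENT, errno.ESTALE]
    # ENOENT is in the ignore list: no exception, errno is returned.
    print(call_with_errno_policy(always_fails(errno.ENOENT), ["p"], ignore))
    # ENOTCONN (errno 107 on Linux) is in neither list: it propagates.
    try:
        call_with_errno_policy(always_fails(errno.ENOTCONN), ["p"], ignore)
    except OSError as ex:
        print("propagated:", ex.errno)
```

If this reading is right, adding an errno to one of the lists would only hide the symptom on the geo-rep side; the underlying error would still be returned by the slave mount.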
Version-Release number of selected component (if applicable):
=============================================================
glusterfs-3.7.1-14.el7rhgs.x86_64

How reproducible:
=================
Always

Steps to Reproduce:
===================
1. Create the master and slave clusters.
2. Create a geo-rep session between the master and slave volumes.
3. Execute the following fops on the master and verify the sync on the slave: create, chmod, chown, chgrp, symlink, hardlink, truncate, rename, followed by remove (rm -rf).
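
The fop sequence in step 3 can be sketched as a small script. This is only an illustration under stated assumptions: the function name run_fops, the directory layout (levelNN directories with a few files each) and the file names are hypothetical, and the root argument is assumed to be a directory on the master volume mount. It does not check the slave; in a real run the sync would be verified on the slave after each fop and before the final remove.

```python
import os
import shutil
import tempfile


def run_fops(root, levels=3, files_per_level=2):
    """Apply the listed fops under `root`, then remove everything below it.

    Returns the directory listing of `root` after the remove.
    """
    uid, gid = os.getuid(), os.getgid()
    current = root
    for lvl in range(levels):
        current = os.path.join(current, "level%02d" % lvl)
        os.mkdir(current)
        for n in range(files_per_level):
            path = os.path.join(current, "file%d" % n)
            with open(path, "w") as fh:         # create
                fh.write("data" * 64)
            os.chmod(path, 0o640)               # chmod
            os.chown(path, uid, gid)            # chown/chgrp (to self, no root needed)
            os.symlink(path, path + ".sym")     # symlink
            os.link(path, path + ".hard")       # hardlink
            os.truncate(path, 16)               # truncate
            os.rename(path, path + ".renamed")  # rename

    # Equivalent of "rm -rf *" run inside root.
    for entry in os.listdir(root):
        full = os.path.join(root, entry)
        if os.path.isdir(full) and not os.path.islink(full):
            shutil.rmtree(full)
        else:
            os.unlink(full)
    return os.listdir(root)


if __name__ == "__main__":
    # A temporary directory stands in for the master mount point here.
    mount = tempfile.mkdtemp()
    print(run_fops(mount))
    os.rmdir(mount)
```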
The "Structure needs cleaning" error occurs because lookup self-heal in DHT does not set errors properly in a few places. A fix has been posted here: http://review.gluster.org/#/c/12165/
For the record, this is hit consistently with rm -rf * on build glusterfs-3.7.5-13.el7rhgs.x86_64:

[2016-01-04 11:06:34.928106] I [resource(slave):844:service_loop] GLUSTER: slave listening
[2016-01-04 11:06:47.999924] E [repce(slave):117:worker] <top>: call failed:
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 113, in worker
    res = getattr(self.obj, rmeth)(*in_data[2:])
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 685, in entry_ops
    [], [ENOTEMPTY, ESTALE, ENODATA])
  File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 475, in errno_wrap
    return call(*arg)
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 662, in recursive_rmdir
    recursive_rmdir(gfid, entry, fullname)
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 667, in recursive_rmdir
    errno_wrap(os.rmdir, [path], [ENOENT, ESTALE])
  File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 475, in errno_wrap
    return call(*arg)
OSError: [Errno 107] Transport endpoint is not connected: '.gfid/1d9fc177-fc1e-411f-8ea9-b6e1234575af/level11/symlink_to_files'
For the record, this is hit consistently with rm -rf * on build glusterfs-3.7.9-1.el7rhgs.x86_64:

[2016-04-14 10:34:21.478831] W [resource(slave):721:entry_ops] <top>: Recursive remove ec3be6a9-d725-4a09-8a4c-ef98a84ee39f => .gfid/00000000-0000-0000-0000-000000000001/thread1failed: Directory not empty
[2016-04-14 10:34:26.432831] W [syncdutils(slave):486:errno_wrap] <top>: reached maximum retries (['ec3be6a9-d725-4a09-8a4c-ef98a84ee39f', '.gfid/00000000-0000-0000-0000-000000000001/thread1', '.gfid/00000000-0000-0000-0000-000000000001/thread1'])...[Errno 39] Directory not empty: '.gfid/00000000-0000-0000-0000-000000000001/thread1/level07/level17/level27/level37'
[2016-04-14 10:34:26.433208] W [resource(slave):721:entry_ops] <top>: Recursive remove ec3be6a9-d725-4a09-8a4c-ef98a84ee39f => .gfid/00000000-0000-0000-0000-000000000001/thread1failed: Directory not empty
[2016-04-14 10:35:03.264699] E [repce(slave):117:worker] <top>: call failed:
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 113, in worker
    res = getattr(self.obj, rmeth)(*in_data[2:])
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 714, in entry_ops
    [], [ENOTEMPTY, ESTALE, ENODATA])
  File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 475, in errno_wrap
    return call(*arg)
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 666, in recursive_rmdir
    recursive_rmdir(gfid, entry, fullname)
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 671, in recursive_rmdir
    errno_wrap(os.rmdir, [path], [ENOENT, ESTALE])
  File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 475, in errno_wrap
    return call(*arg)
OSError: [Errno 107] Transport endpoint is not connected: '.gfid/cd8f7393-5078-4776-a24f-3e8eb69e953f/level37/symlink_to_files'
[2016-04-14 10:35:03.274899] I [repce(slave):92:service_loop] RepceServer: terminating on reaching EOF.
[2016-04-14 10:35:03.275094] I [syncdutils(slave):220:finalize] <top>: exiting.
[2016-04-14 10:35:08.326251] E [repce(slave):117:worker] <top>: call failed:
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 113, in worker
    res = getattr(self.obj, rmeth)(*in_data[2:])
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 714, in entry_ops
    [], [ENOTEMPTY, ESTALE, ENODATA])
  File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 475, in errno_wrap
    return call(*arg)
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 671, in recursive_rmdir
    errno_wrap(os.rmdir, [path], [ENOENT, ESTALE])
  File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 475, in errno_wrap
    return call(*arg)
OSError: [Errno 107] Transport endpoint is not connected: '.gfid/af8c23fb-dfd2-4c76-8c74-45813518acb0/symlink_to_files'
[2016-04-14 10:35:08.335688] I [repce(slave):92:service_loop] RepceServer: terminating on reaching EOF.
[2016-04-14 10:35:08.335922] I [syncdutils(slave):220:finalize] <top>: exiting.
[2016-04-14 10:35:14.135530] I [gsyncd(slave):653:main_i] <top>: syncing: gluster://localhost:slave
[2016-04-14 10:35:15.292807] I [resource(slave):902:service_loop] GLUSTER: slave listening
[2016-04-14 10:35:19.146552] I [gsyncd(slave):653:main_i] <top>: syncing: gluster://localhost:slave
[2016-04-14 10:35:20.289783] I [resource(slave):902:service_loop] GLUSTER: slave listening
[2016-04-14 10:36:16.786941] E [repce(slave):117:worker] <top>: call failed:
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 113, in worker
    res = getattr(self.obj, rmeth)(*in_data[2:])
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 714, in entry_ops
    [], [ENOTEMPTY, ESTALE, ENODATA])
  File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 475, in errno_wrap
    return call(*arg)
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 671, in recursive_rmdir
    errno_wrap(os.rmdir, [path], [ENOENT, ESTALE])
  File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 475, in errno_wrap
    return call(*arg)
OSError: [Errno 107] Transport endpoint is not connected: '.gfid/8ca37ba9-5d4f-40a1-94df-aed09a2bd36a/symlink_to_files'