Bug 1577796
| Summary: | [Geo-rep]: Worker crashes with OSError: [Errno 116] Stale file handle | | |
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat Gluster Storage | Reporter: | Rochelle <rallan> |
| Component: | geo-replication | Assignee: | Aravinda VK <avishwan> |
| Status: | CLOSED DUPLICATE | QA Contact: | Rahul Hinduja <rhinduja> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | rhgs-3.4 | CC: | csaba, khiremat, rallan, rgowdapp, rhs-bugs, sankarshan, storage-qa-internal |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2018-05-24 06:39:42 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
This looks to be a duplicate of bz 1546717, which is fixed in glusterfs-3.12.2-9. Can you try recreating this bug on glusterfs-3.12.2-9 or higher? Alternatively, you can try with performance.stat-prefetch turned off, since bz 1546717 was found to be an issue with stat-prefetch (see the sketch after this comment).

Also, how serious is the issue? I hear from Kotresh that if this is a transient error which does not affect syncing of data to the slaves, it is not serious enough to be targeted for rhgs-3.4.0. @Kotresh/Rochelle, can you please let me know whether this bug is serious enough to be considered for rhgs-3.4.0?

*** This bug has been marked as a duplicate of bug 1546717 ***
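For reference, a minimal sketch of the suggested stat-prefetch workaround, assuming a hypothetical volume name `mastervol` (substitute the actual volume; depending on where the bz 1546717 issue manifests, the same setting may need to be applied on the slave volume as well):

```sh
# Disable stat-prefetch on the volume (hypothetical volume name "mastervol"):
gluster volume set mastervol performance.stat-prefetch off

# Confirm the option is now off:
gluster volume get mastervol performance.stat-prefetch
```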
Description of problem:
========================
Not all of the files were removed from the slave when an rmdir was happening on the slave. Ran automation case (34) -- Rsync + Fuse -- and encountered the following traceback:

Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 210, in main
    main_i()
  File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 802, in main_i
    local.service_loop(*[r for r in [remote] if r])
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 1676, in service_loop
    g3.crawlwrap(oneshot=True)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 597, in crawlwrap
    self.crawl()
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1470, in crawl
    self.changelogs_batch_process(changes)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1370, in changelogs_batch_process
    self.process(batch)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1204, in process
    self.process_change(change, done, retry)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1114, in process_change
    failures = self.slave.server.entry_ops(entries)
  File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 228, in __call__
    return self.ins(self.meth, *a)
  File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 210, in __call__
    raise res
OSError: [Errno 116] Stale file handle

Version-Release number of selected component (if applicable):
==============================================================
[root@dhcp41-226 tmp]# rpm -qa | grep gluster
glusterfs-api-3.12.2-8.el7rhgs.x86_64
glusterfs-3.12.2-8.el7rhgs.x86_64
glusterfs-debuginfo-3.8.4-54.8.el7rhgs.x86_64
gluster-nagios-common-0.2.4-1.el7rhgs.noarch
libvirt-daemon-driver-storage-gluster-3.9.0-14.el7_5.2.x86_64
glusterfs-client-xlators-3.12.2-8.el7rhgs.x86_64
glusterfs-cli-3.12.2-8.el7rhgs.x86_64
glusterfs-geo-replication-3.12.2-8.el7rhgs.x86_64
vdsm-gluster-4.19.43-2.3.el7rhgs.noarch
python2-gluster-3.12.2-8.el7rhgs.x86_64
glusterfs-server-3.12.2-8.el7rhgs.x86_64
glusterfs-events-3.12.2-8.el7rhgs.x86_64
gluster-nagios-addons-0.2.10-2.el7rhgs.x86_64
glusterfs-libs-3.12.2-8.el7rhgs.x86_64
glusterfs-rdma-3.12.2-8.el7rhgs.x86_64
glusterfs-fuse-3.12.2-8.el7rhgs.x86_64

How reproducible:
=================
1/1

Actual results:
===============
The worker crashed and the rmdir was not synced.

Expected results:
=================
The worker should not crash.
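As a triage aid (not part of the original report), the session state can be checked to see whether the worker went Faulty after the crash and whether syncing resumes once the monitor restarts it; `mastervol`, `slavehost`, and `slavevol` below are placeholder names:

```sh
# Check the geo-replication session; a crashed worker typically shows up as Faulty
# until the monitor restarts it (mastervol/slavehost/slavevol are placeholders):
gluster volume geo-replication mastervol slavehost::slavevol status detail

# Confirm the installed geo-replication build when comparing against the fix
# shipped in glusterfs-3.12.2-9:
rpm -q glusterfs-geo-replication
```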