Bug 1342785
| Summary: | [geo-rep]: Worker crashes with permission denied during hybrid crawl caused via replace brick | | |
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat Gluster Storage | Reporter: | Rahul Hinduja <rhinduja> |
| Component: | geo-replication | Assignee: | Kotresh HR <khiremat> |
| Status: | CLOSED ERRATA | QA Contact: | Rochelle <rallan> |
| Severity: | urgent | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | rhgs-3.1 | CC: | csaba, khiremat, nchilaka, sheggodu, srmukher |
| Target Milestone: | --- | Keywords: | ZStream |
| Target Release: | RHGS 3.4.0 | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | rebase | | |
| Fixed In Version: | glusterfs-3.12.2-1 | Doc Type: | If docs needed, set a value |
| Doc Text: | Previously, a metadata change such as an ownership change on a symlink file caused the geo-replication worker to crash with a "Permission Denied" error. With this fix, geo-replication syncs the metadata of symlink files correctly, and ownership changes on symlink files are replicated without crashing the worker. | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2018-09-04 06:29:40 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 1503134 | | |
Upstream Patch: https://review.gluster.org/17389

The same patch fixes both this bug and https://bugzilla.redhat.com/show_bug.cgi?id=1299740

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:2607
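For context on the fix described in the Doc Text, the sketch below illustrates why ownership changes on symlinks need link-local system calls: `os.chown()`/`os.chmod()` follow symlinks and act on the target, which can raise the `OSError: [Errno 13] Permission denied` seen in the traceback in the description. This is a minimal illustration of the idea only, not the actual patch (see the upstream review for the real change); `apply_metadata` and its parameters are hypothetical names.

```python
import os

def apply_metadata(path, uid, gid, mode):
    """Sketch: apply ownership/mode on the slave without dereferencing symlinks."""
    if os.path.islink(path):
        # chown()/chmod() would follow the link and operate on its target,
        # which may be unreachable or owned by another user -> EACCES/EPERM.
        # lchown() changes ownership of the link itself.
        os.lchown(path, uid, gid)
        # Linux has no lchmod(), and symlink permission bits are ignored
        # by the kernel anyway, so the mode update is skipped for symlinks.
    else:
        os.chown(path, uid, gid)
        os.chmod(path, mode)
```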
Description of problem:
=======================
When files have already been synced to the slave and a replace brick is issued on the master, the new brick can trigger a hybrid crawl. If the replaced brick's worker becomes ACTIVE, it crashes with "Permission denied" and becomes PASSIVE.

```
[2016-06-05 09:39:57.549393] E [repce(/rhs/brick3/b9):207:__call__] RepceClient: call 5532:140024108808000:1465119596.96 (meta_ops) failed on peer with OSError
[2016-06-05 09:39:57.549735] E [syncdutils(/rhs/brick3/b9):276:log_raise_exception] <top>: FAIL:
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 201, in main
    main_i()
  File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 720, in main_i
    local.service_loop(*[r for r in [remote] if r])
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 1510, in service_loop
    g2.crawlwrap()
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 571, in crawlwrap
    self.crawl()
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1132, in crawl
    self.changelogs_batch_process(changes)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1107, in changelogs_batch_process
    self.process(batch)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 992, in process
    self.process_change(change, done, retry)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 952, in process_change
    failures = self.slave.server.meta_ops(meta_entries)
  File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 226, in __call__
    return self.ins(self.meth, *a)
  File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 208, in __call__
    raise res
OSError: [Errno 13] Permission denied: '.gfid/86bf33db-52e8-49b7-a7d9-1d00d82b88ef'
[2016-06-05 09:39:57.552012] I [syncdutils(/rhs/brick3/b9):220:finalize] <top>: exiting.
[2016-06-05 09:39:57.561508] I [repce(agent):92:service_loop] RepceServer: terminating on reaching EOF.
[2016-06-05 09:39:57.561933] I [syncdutils(agent):220:finalize] <top>: exiting.
```

Version-Release number of selected component (if applicable):
=============================================================
glusterfs-3.7.9-8

How reproducible:
=================
Yet to try a second time.

Steps to Reproduce:
===================
1. Create the master and slave volumes and create a geo-rep session between them.
2. Create data on the master and let it sync to the slave.
3. Stop the existing session.
4. Initiate a replace-brick commit for one of the bricks.
5. Start geo-rep immediately.

(One way to run these steps with the gluster CLI is sketched below.)

Actual results:
===============
The replaced brick triggers a hybrid crawl and the worker crashes.

Expected results:
=================
The worker should not crash.
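As a rough illustration of steps 3-5 above, the commands below use standard gluster CLI syntax. The volume names, host names, and brick paths (`master`, `slavehost`, `slave`, `node1:/bricks/...`) are hypothetical placeholders, not values taken from this report.

```sh
# Steps 1-2 assume a geo-rep session between volumes "master" and "slave"
# already exists and the initial data has synced.

# Step 3: stop the existing geo-rep session.
gluster volume geo-replication master slavehost::slave stop

# Step 4: replace one of the master bricks ("commit force" swaps it in
# immediately; the new brick has no changelog history).
gluster volume replace-brick master node1:/bricks/b1 node1:/bricks/b1_new \
    commit force

# Step 5: restart geo-rep right away; the worker for the fresh brick
# falls back to a hybrid (xsync) crawl.
gluster volume geo-replication master slavehost::slave start

# Observe the worker state (ACTIVE -> Faulty/Passive on crash).
gluster volume geo-replication master slavehost::slave status
```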