Bug 1393362
| Summary: | [geo-rep]: Sync stuck in Hybrid crawl since 2 days | | |
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat Gluster Storage | Reporter: | Rahul Hinduja <rhinduja> |
| Component: | geo-replication | Assignee: | Bug Updates Notification Mailing List <rhs-bugs> |
| Status: | CLOSED DEFERRED | QA Contact: | Rahul Hinduja <rhinduja> |
| Severity: | urgent | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | rhgs-3.2 | CC: | amukherj, atumball, avishwan, bmohanra, csaba, rcyriac, rhs-bugs, storage-qa-internal, vnosov |
| Target Milestone: | --- | Keywords: | ZStream |
| Target Release: | --- | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Known Issue |
| Doc Text: | If a geo-replication session is created while a gluster volume rebalance is in progress, geo-replication may fail to sync some files and directories to the slave volume, because rebalance moves files around internally on the master. Workaround: do not create a geo-replication session while a rebalance of the master volume is in progress (a hedged pre-check is sketched after this table). | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2018-10-11 10:10:19 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 1351530 | | |
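As a purely illustrative aid to the documented workaround (this is not part of the original report), the following is a minimal Python sketch that refuses to create a geo-replication session while the master volume still has a rebalance in progress. The `gluster volume rebalance <VOLNAME> status` and `gluster volume geo-replication ... create push-pem` commands are the standard CLI calls; the volume/slave names and the output-matching heuristic are assumptions for illustration only.

```python
#!/usr/bin/env python3
"""Hedged sketch: only create a geo-rep session once master rebalance is done.

Assumptions (not from the bug report): the volume/slave names below are
examples, and the check naively scans the human-readable CLI output for
the words "in progress".
"""
import subprocess
import sys

MASTER_VOL = "master"           # hypothetical master volume name
SLAVE = "slavehost::slavevol"   # hypothetical slave host::volume

def rebalance_in_progress(volume):
    # 'gluster volume rebalance <VOL> status' is the standard status command;
    # matching "in progress" in its text output is a simplistic heuristic.
    out = subprocess.run(
        ["gluster", "volume", "rebalance", volume, "status"],
        capture_output=True, text=True)
    return "in progress" in out.stdout.lower()

def main():
    if rebalance_in_progress(MASTER_VOL):
        sys.exit("Rebalance still running on %s; per the known issue, "
                 "do not create the geo-rep session yet." % MASTER_VOL)
    # Safe to create the session once rebalance has completed.
    subprocess.run(
        ["gluster", "volume", "geo-replication", MASTER_VOL, SLAVE,
         "create", "push-pem"],
        check=True)

if __name__ == "__main__":
    main()
```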
Description
Rahul Hinduja
2016-11-09 12:11:58 UTC
Seeing a lot of rsync errors about creating directories, which means mkdir failed for some reason and all files from those directories failed to sync (checked geo-rep logs from the *.121 node). The reason for the mkdir failure has not been traced yet. "Operation not permitted" is expected from rsync here, since rsync expects the directory to already exist before syncing files into it. The reason for the entry failure in entry_ops is still to be found.

[2016-11-09 10:09:24.794229] E [resource(/rhs/brick1/b14):1003:rsync] SSH: SYNC Error(Rsync): rsync: recv_generator: mkdir "/proc/9429/cwd/.gfid/00201812-179b-400e-a6f2-b332bf72c7de" failed: Operation not permitted (1)
[2016-11-09 10:09:24.794754] E [resource(/rhs/brick1/b14):1003:rsync] SSH: SYNC Error(Rsync): *** Skipping any contents from this failed directory ***
[2016-11-09 10:09:24.794974] E [resource(/rhs/brick1/b14):1003:rsync] SSH: SYNC Error(Rsync): rsync: recv_generator: mkdir "/proc/9429/cwd/.gfid/01c699d5-8955-4e7e-85ae-b8c2f6dcb9da" failed: Operation not permitted (1)
[2016-11-09 10:09:24.795208] E [resource(/rhs/brick1/b14):1003:rsync] SSH: SYNC Error(Rsync): *** Skipping any contents from this failed directory ***
[2016-11-09 10:09:24.795418] E [resource(/rhs/brick1/b14):1003:rsync] SSH: SYNC Error(Rsync): rsync: recv_generator: mkdir "/proc/9429/cwd/.gfid/01d22973-d93b-435b-9d25-f7ec2ebc262a" failed: Operation not permitted (1)
[2016-11-09 10:09:24.795630] E [resource(/rhs/brick1/b14):1003:rsync] SSH: SYNC Error(Rsync): *** Skipping any contents from this failed directory ***
[2016-11-09 10:09:24.795819] E [resource(/rhs/brick1/b14):1003:rsync] SSH: SYNC Error(Rsync): rsync: recv_generator: mkdir "/proc/9429/cwd/.gfid/022988cd-b59c-4200-85d1-010343ce23a7" failed: Operation not permitted (1)
[2016-11-09 10:09:24.796001] E [resource(/rhs/brick1/b14):1003:rsync] SSH: SYNC Error(Rsync): *** Skipping any contents from this failed directory ***
[2016-11-09 10:09:24.796231] E [resource(/rhs/brick1/b14):1003:rsync] SSH: SYNC Error(Rsync): rsync: recv_generator: mkdir "/proc/9429/cwd/.gfid/0234671b-0b85-48c4-bdd6-e2108805f631" failed: Operation not permitted (1)

The logs mentioned in comment 4 are for a different issue (possibly related, but perceived as different), where the crawl moved to Changelog after the geo-rep session was restarted but data is still missed (log: 2016-11-09). The original issue was the Hybrid crawl being stuck (log: 2016-11-07).

This has not been worked on in the last 2 years and is not planned for any immediate release. An RCA is available, it was agreed long ago that this is not a product issue, and it has been documented as a Known Issue. Please reopen the bug if it is found to be important for any release.
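For context only (not part of the original report), here is a minimal sketch of how one might tally the directories rsync skipped, by scanning a geo-replication log for the mkdir failures quoted above. The log path and the regular expression are assumptions for illustration.

```python
#!/usr/bin/env python3
"""Hedged sketch: count directories rsync failed to create on the slave.

The log path below is only an example; the pattern matches the
'recv_generator: mkdir ".../.gfid/<GFID>" failed: Operation not permitted'
lines quoted in this bug.
"""
import re
import sys

# Example path; adjust to the actual geo-replication log file.
LOG_FILE = sys.argv[1] if len(sys.argv) > 1 else "geo-rep.log"

MKDIR_FAIL = re.compile(
    r'recv_generator: mkdir ".*/\.gfid/([0-9a-f-]+)" failed: Operation not permitted')

def failed_dir_gfids(path):
    gfids = set()
    with open(path, errors="replace") as fh:
        for line in fh:
            m = MKDIR_FAIL.search(line)
            if m:
                # GFID of a directory whose contents were then skipped by rsync.
                gfids.add(m.group(1))
    return gfids

if __name__ == "__main__":
    gfids = failed_dir_gfids(LOG_FILE)
    print("%d directories failed mkdir on the slave:" % len(gfids))
    for gfid in sorted(gfids):
        print(" ", gfid)
```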