Bug 1141862
| Summary: | dist-geo-rep: Deleting files on master volume is not propagated to slave volume and geo-rep session goes faulty after a rebalance on slave volume. | | |
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat Gluster Storage | Reporter: | shilpa <smanjara> |
| Component: | geo-replication | Assignee: | Bug Updates Notification Mailing List <rhs-bugs> |
| Status: | CLOSED WONTFIX | QA Contact: | storage-qa-internal <storage-qa-internal> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | high | | |
| Version: | rhgs-3.0 | CC: | avishwan, chrisw, csaba, khiremat, mzywusko, nlevinki, smohan |
| Target Milestone: | --- | Keywords: | ZStream |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2018-04-16 15:57:40 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Attachments: | Slave logs (attachment 939150), Master log (attachment 939153) | | |
Created attachment 939150 [details]
Slave logs
Created attachment 939153 [details]
Master log
Description of problem:
After running add-brick/rebalance on the slave volume, deleting files on the master volume did not propagate the deletes to the slave, and the geo-rep session went faulty.

Version-Release number of selected component (if applicable):
glusterfs-3.6.0.28-1.el6rhs.x86_64

How reproducible:
Tried once

Steps to Reproduce:
1. Set up a geo-rep session with 6x2 master and slave volumes.
2. With the changelog change_detector and a FUSE mount, start the geo-rep session and create files:
   crefi -T 10 -n 10 --multi -d 10 -b 10 --random --max=5K --min=1k /mnt/master
3. While geo-rep is syncing, run add-brick/rebalance on the slave volume.
4. Once all the files are synced and the rebalance is complete, check the arequal-checksum of master and slave; they should be equal.
5. Run "rm -rf /mnt/master/*" on a master node so that all the files on the master volume are deleted.
6. Check the slave mount point to see whether all the files have been deleted on the slave as well.

Actual results:
The deletes were never propagated to the slave volume. Geo-rep status showed faulty.

Expected results:
Deletes should have been propagated.

Additional info:

```
[root@Tim master]# find /mnt/master -type f | wc -l
30000
[root@Tim master]# find /mnt/slave -type f | wc -l
30000
```

Found traceback errors in the logs, such as:

```
[2014-09-15 20:29:50.241961] E [repce(/bricks/brick3/mastervol_b9):207:__call__] RepceClient: call 5055:140136763217664:1410793189.84 (entry_ops) failed on peer with OSError
[2014-09-15 20:29:50.242613] E [syncdutils(/bricks/brick3/mastervol_b9):270:log_raise_exception] <top>: FAIL:
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 164, in main
    main_i()
  File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 643, in main_i
    local.service_loop(*[r for r in [remote] if r])
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 1343, in service_loop
    g2.crawlwrap()
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 524, in crawlwrap
    self.crawl(no_stime_update=no_stime_update)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1174, in crawl
    self.process(changes)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 927, in process
    self.process_change(change, done, retry)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 891, in process_change
    self.slave.server.entry_ops(entries)
  File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 226, in __call__
    return self.ins(self.meth, *a)
  File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 208, in __call__
    raise res
OSError: [Errno 39] Directory not empty: '.gfid/b3c8c3cb-07a0-41fb-ae61-2a27b43eb876/level70'
[2014-09-15 20:29:50.245551] I [syncdutils(/bricks/brick3/mastervol_b9):214:finalize] <top>: exiting.
[2014-09-15 20:29:50.249872] I [repce(agent):92:service_loop] RepceServer: terminating on reaching EOF.
```
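A condensed sketch of the Steps to Reproduce above as shell commands. The volume names (mastervol/slavevol), slave host names, and new brick paths are illustrative placeholders, not the actual values from this setup; the crefi invocation is the one from step 2.

```sh
# Placeholders: mastervol, slavevol, slave1/slave2, and the new brick paths are assumptions.

# Steps 1-2: start the geo-rep session (changelog is the default change detector)
# and generate data on the FUSE-mounted master volume.
gluster volume geo-replication mastervol slave1::slavevol start
crefi -T 10 -n 10 --multi -d 10 -b 10 --random --max=5K --min=1k /mnt/master

# Step 3: while syncing is in progress, expand and rebalance the slave volume.
gluster volume add-brick slavevol slave1:/bricks/newbrick1 slave2:/bricks/newbrick2
gluster volume rebalance slavevol start
gluster volume rebalance slavevol status

# Step 4: compare arequal-checksum output taken on the master and slave mounts.

# Steps 5-6: delete everything on the master, then check the slave mount and session status.
rm -rf /mnt/master/*
gluster volume geo-replication mastervol slave1::slavevol status detail
```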
```
# gluster v geo mastervol 10.70.42.197::slavevol status detail

MASTER NODE              MASTER VOL    MASTER BRICK                    SLAVE                     STATUS     CHECKPOINT STATUS    CRAWL STATUS    FILES SYNCD    FILES PENDING    BYTES PENDING    DELETES PENDING    FILES SKIPPED
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Tim.blr.redhat.com       mastervol     /bricks/brick1/mastervol_b1     10.70.42.254::slavevol    faulty     N/A                  N/A             2472           0                0                532                0
Tim.blr.redhat.com       mastervol     /bricks/brick2/mastervol_b5     10.70.42.254::slavevol    faulty     N/A                  N/A             2191           0                0                560                0
Tim.blr.redhat.com       mastervol     /bricks/brick3/mastervol_b9     10.70.42.254::slavevol    faulty     N/A                  N/A             2417           0                0                568                0
green                    mastervol     /bricks/brick1/mastervol_b4     10.70.42.151::slavevol    Passive    N/A                  N/A             0              0                0                0                  0
green                    mastervol     /bricks/brick2/mastervol_b8     10.70.42.151::slavevol    Passive    N/A                  N/A             0              0                0                0                  0
green                    mastervol     /bricks/brick3/mastervol_b12    10.70.42.151::slavevol    Passive    N/A                  N/A             0              0                0                0                  0
purple                   mastervol     /bricks/brick1/mastervol_b3     10.70.42.197::slavevol    faulty     N/A                  N/A             1988           0                0                494                0
purple                   mastervol     /bricks/brick2/mastervol_b7     10.70.42.197::slavevol    faulty     N/A                  N/A             2081           0                0                530                0
purple                   mastervol     /bricks/brick3/mastervol_b11    10.70.42.197::slavevol    faulty     N/A                  N/A             2106           0                0                468                0
Javier.blr.redhat.com    mastervol     /bricks/brick1/mastervol_b2     10.70.42.97::slavevol     Passive    N/A                  N/A             0              0                0                0                  0
Javier.blr.redhat.com    mastervol     /bricks/brick2/mastervol_b6     10.70.42.97::slavevol     Passive    N/A                  N/A             0              0                0                0                  0
Javier.blr.redhat.com    mastervol     /bricks/brick3/mastervol_b10    10.70.42.97::slavevol     Passive    N/A                  N/A             0              0                0                0                  0

# gluster v status
Status of volume: mastervol
Gluster process                                   Port     Online    Pid
------------------------------------------------------------------------------
Brick 10.70.42.190:/bricks/brick1/mastervol_b1    49155    Y         12359
Brick 10.70.43.88:/bricks/brick1/mastervol_b2     49155    Y         16855
Brick 10.70.42.29:/bricks/brick1/mastervol_b3     49155    Y         24731
Brick 10.70.42.88:/bricks/brick1/mastervol_b4     49155    Y         24650
Brick 10.70.42.190:/bricks/brick2/mastervol_b5    49156    Y         12370
Brick 10.70.43.88:/bricks/brick2/mastervol_b6     49156    Y         16866
Brick 10.70.42.29:/bricks/brick2/mastervol_b7     49156    Y         24742
Brick 10.70.42.88:/bricks/brick2/mastervol_b8     49156    Y         24661
Brick 10.70.42.190:/bricks/brick3/mastervol_b9    49157    Y         12381
Brick 10.70.43.88:/bricks/brick3/mastervol_b10    49157    Y         16877
Brick 10.70.42.29:/bricks/brick3/mastervol_b11    49157    Y         24753
Brick 10.70.42.88:/bricks/brick3/mastervol_b12    49157    Y         24672
```
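For reference, the OSError in the traceback above (errno 39) is the standard "Directory not empty" failure that rmdir returns when a directory still contains entries on the slave. A minimal illustration, with hypothetical paths that are not from this setup:

```sh
# Hypothetical paths for illustration only.
mkdir -p /tmp/enotempty-demo/level70
touch /tmp/enotempty-demo/level70/leftover

# rmdir fails with errno 39 (ENOTEMPTY), the same error the slave-side entry_ops hit:
#   rmdir: failed to remove '/tmp/enotempty-demo/level70': Directory not empty
rmdir /tmp/enotempty-demo/level70
```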