Bug 1299740 - [geo-rep]: On cascaded setup, for every entry there is a setattr recorded in changelogs of slave
Status: ON_QA
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: geo-replication
Hardware: x86_64 Linux
Priority: unspecified Severity: medium
Target Milestone: ---
Target Release: RHGS 3.4.0
Assigned To: Bug Updates Notification Mailing List
Keywords: ZStream
Depends On:
Blocks: 1503134
Reported: 2016-01-19 02:29 EST by Rahul Hinduja
Modified: 2018-02-26 05:44 EST
CC List: 7 users

See Also:
Fixed In Version: glusterfs-3.12.2-1
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Last Closed:
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---

Attachments: None
Description Rahul Hinduja 2016-01-19 02:29:53 EST
Description of problem:

In a cascaded setup of three volumes (Master 1, Slave 1 and Slave 2), Slave 1 acts as the master for Slave 2, so changelogs are enabled on Slave 1. Every entry that is synced from Master 1 to Slave 1 is then recorded as both an Entry (E) and a Setattr (M) in the changelogs of Slave 1.

[root@dhcp47-94 changelogs]# cat CHANGELOG.1453187547
GlusterFS Changelog | version: v1.2 | encoding : 2
E84476bb1-8af5-46d5-8d03-7812d79a0ec7700000000-0000-0000-0000-000000000001/mtabM84476bb1-8af5-46d5-8d03-7812d79a0ec738
[root@dhcp47-94 changelogs]# 

As a result, the sync fails with "Permission denied" when the synced file is a symlink to a file that cannot be modified, for example:

lrwxrwxrwx. 1 root root 17 Jan  8 16:17 /etc/mtab -> /proc/self/mounts

[2016-01-18 08:56:05.512773] I [master(/rhs/brick2/b5):1202:crawl] _GMaster: slave's time: (1453100978, 0)
[2016-01-18 08:56:15.938399] E [repce(/rhs/brick2/b5):207:__call__] RepceClient: call 12009:140230416361280:1453107374.25 (meta_ops) failed on peer with OSError
[2016-01-18 08:56:15.938977] E [syncdutils(/rhs/brick2/b5):276:log_raise_exception] <top>: FAIL: 
Traceback (most recent call last):
   File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 165, in main
   File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 662, in main_i
     local.service_loop(*[r for r in [remote] if r])
   File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 1439, in service_loop
   File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 610, in crawlwrap
   File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1211, in crawl
   File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1117, in changelogs_batch_process
   File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 995, in process
     self.process_change(change, done, retry)
   File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 968, in process_change
     failures = self.slave.server.meta_ops(meta_entries)
   File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 226, in __call__
     return self.ins(self.meth, *a)
   File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 208, in __call__
     raise res
OSError: [Errno 13] Permission denied: '.gfid/cd028294-7c61-4414-9dbb-18801f2a2922'
[2016-01-18 08:56:15.940839] I [syncdutils(/rhs/brick2/b5):220:finalize] <top>: exiting.

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:

Case 1: Cascading

1. Set up a cascaded geo-rep session M1=>S1=>S2
2. cp -rf /mnt/mtab on M1 
3. Check the changelogs of S1 and S2. They should record only ENTRY (E), not SETATTR (M).
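The changelog check in step 3 of Case 1 could be scripted roughly as follows. This is only a sketch, not part of geo-rep: it assumes that every record starts with a type letter (E, M or D) immediately followed by a 36-character GFID, which is what the sample changelog in the description suggests. The function names are hypothetical.

```python
import re
from collections import defaultdict

# Assumption: a record begins with a type letter (E, M or D) directly
# followed by a lowercase 36-character GFID, as in the sample above.
GFID = r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"
RECORD_START = re.compile(r"([EMD])(" + GFID + r")")


def record_types_by_gfid(changelog_text):
    """Map each GFID to the set of record types seen for it."""
    types = defaultdict(set)
    for rec_type, gfid in RECORD_START.findall(changelog_text):
        types[gfid].add(rec_type)
    return dict(types)


def gfids_with_entry_and_metadata(changelog_text):
    """Return GFIDs that have both an E and an M record in this changelog."""
    found = record_types_by_gfid(changelog_text)
    return sorted(g for g, t in found.items() if {"E", "M"} <= t)


if __name__ == "__main__":
    # Record text copied from the changelog shown in the description.
    sample = (
        "GlusterFS Changelog | version: v1.2 | encoding : 2\n"
        "E84476bb1-8af5-46d5-8d03-7812d79a0ec7"
        "700000000-0000-0000-0000-000000000001/mtab"
        "M84476bb1-8af5-46d5-8d03-7812d79a0ec738"
    )
    for gfid in gfids_with_entry_and_metadata(sample):
        print("E and M both recorded for", gfid)
```

An empty result for the slave changelogs would match the expected behaviour. Note that the sketch flags any M record for the GFID, not only SETATTR, so a hit still needs a manual look.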

Case 2: Failover/Failback

1. Promote the slave to master, enable indexing and changelog
2. cp -rf /etc a.1 on the slave
3. Stop the original geo-rep session from the original master
4. Create and start a new geo-rep session from the slave to the original master
5. Once the sync is confirmed via arequal, stop and delete the new geo-rep session
6. Disable changelog and indexing on the slave
7. Start the original geo-rep session
8. Create new data on the original master and verify that it gets synced to the slave. 
9. Geo-rep should not go to Faulty. Check the status and logs. 
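For the log check in the last step, a small script along these lines could pull the error-level lines out of a geo-rep log. It is a sketch that assumes the line layout seen in the log excerpt in the description ("[timestamp] LEVEL [location] message"); the function name is hypothetical.

```python
import re

# Assumption: log lines look like "[timestamp] LEVEL [location] message",
# as in the excerpt in the description, with "E" marking errors.
LOG_LINE = re.compile(
    r"^\s*\[(?P<ts>[^\]]+)\]\s+(?P<level>[A-Z])\s+\[(?P<where>[^\]]+)\]\s+(?P<msg>.*)$"
)


def error_lines(log_text):
    """Return (timestamp, location, message) for every E-level log line."""
    errors = []
    for line in log_text.splitlines():
        match = LOG_LINE.match(line)
        if match and match.group("level") == "E":
            errors.append(
                (match.group("ts"), match.group("where"), match.group("msg").strip())
            )
    return errors


if __name__ == "__main__":
    # Lines copied from the log excerpt in the description.
    excerpt = (
        "[2016-01-18 08:56:05.512773] I [master(/rhs/brick2/b5):1202:crawl] "
        "_GMaster: slave's time: (1453100978, 0)\n"
        "[2016-01-18 08:56:15.938399] E [repce(/rhs/brick2/b5):207:__call__] "
        "RepceClient: call 12009:140230416361280:1453107374.25 (meta_ops) "
        "failed on peer with OSError\n"
    )
    for ts, where, msg in error_lines(excerpt):
        print(ts, where, msg)
```

Traceback lines do not match this layout and are skipped, so the output only points at where to look in the full log.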

Actual results:

For every entry, both E and M are recorded.

Expected results:

Only the Entry (E) should be recorded.
Comment 2 Kotresh HR 2017-09-21 15:34:54 EDT
The patch [1] that fixes this issue is already merged in master and in 3.12. Hence moving it to POST.

[1] https://review.gluster.org/#/c/17389/
