Bug 1329675

Summary: geo-rep+sharding : Checkpoint reports completion, but all files are not copied to slave
Product: Red Hat Gluster Storage Reporter: Sahina Bose <sabose>
Component: geo-replication Assignee: Bug Updates Notification Mailing List <rhs-bugs>
Status: CLOSED NOTABUG QA Contact: storage-qa-internal <storage-qa-internal>
Severity: high Docs Contact:
Priority: high    
Version: rhgs-3.1 CC: avishwan, chrisw, csaba, nlevinki, rhinduja, sasundar
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-04-26 05:57:52 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Bug Depends On:    
Bug Blocks: 1258386    
Attachments:
geo-rep-logs-from-master (flags: none)

Description Sahina Bose 2016-04-22 14:30:54 UTC
Description of problem:

I had an existing geo-rep session between master "vmstore" and slave "data" which I cleaned up:

i. Removed all files from slave brick using rm -rf
ii. Deleted geo-rep session from master cluster
gluster volume geo-rep vmstore 10.70.40.112::data delete
iii. Removed stime xattr from master bricks on all 3 nodes.
setfattr -x trusted.glusterfs.61d9e4cc-7e60-4557-835d-0c44cbcb8b31.decc121d-7665-4c20-b2e0-39cc8c028c39.stime /rhgs/vmstore/brick1/


Now started geo-rep between master and slave using the scheduler script:
/usr/share/glusterfs/scripts/schedule_georep.py vmstore 10.70.40.112 data --interval 10 --timeout 15

Ensured the checkpoint reported completion, then compared the contents of master and slave.
Result: files are missing on the slave.

Master:
[root@rhsdev14 ~]# ll /rhgs/vmstore/brick1/.shard/ | wc -l
98

Slave:
[root@openstack-vm1 ~]# ll /rhgs/brick/brick1/.shard/ | wc -l
6



Version-Release number of selected component (if applicable):
3.1.3 
Master - glusterfs-3.7.9-1.el7rhgs.x86_64
Slave - glusterfs-3.7.9-2.el7rhgs.x86_64

How reproducible:
Always

Steps to Reproduce:
As described above.

Actual results:
Master:
/rhev/data-center/mnt/glusterSD/10.70.42.219:_vmstore
├── 5e1a37cf-933d-424c-8e3d-eb9e40b690a7
│   ├── dom_md
│   │   ├── ids
│   │   ├── inbox
│   │   ├── leases
│   │   ├── metadata
│   │   └── outbox
│   ├── images
│   │   ├── 202efaa6-0d01-40f3-a541-10eee920d221
│   │   │   ├── eb701046-6ee1-4c9d-b097-e51a8fd283e1
│   │   │   ├── eb701046-6ee1-4c9d-b097-e51a8fd283e1.lease
│   │   │   └── eb701046-6ee1-4c9d-b097-e51a8fd283e1.meta
│   │   ├── c52e4e02-dc6c-4a77-a184-9fcab88106c2
│   │   │   ├── 766a15b9-57db-417d-bfa0-beadbbb84ad2
│   │   │   ├── 766a15b9-57db-417d-bfa0-beadbbb84ad2.lease
│   │   │   ├── 766a15b9-57db-417d-bfa0-beadbbb84ad2.meta
│   │   │   ├── 90f1e26a-00e9-4ea5-9e92-2e448b9b8bfa
│   │   │   ├── 90f1e26a-00e9-4ea5-9e92-2e448b9b8bfa.lease
│   │   │   ├── 90f1e26a-00e9-4ea5-9e92-2e448b9b8bfa.meta
│   │   │   ├── df874a88-a998-4f9e-be23-f3f581067bca
│   │   │   ├── df874a88-a998-4f9e-be23-f3f581067bca.lease
│   │   │   └── df874a88-a998-4f9e-be23-f3f581067bca.meta
│   │   ├── c75de5b7-aa88-48d7-ba1b-067181eac6ae
│   │   │   ├── ff09e16a-e8a0-452b-b95c-e160e68d09a9
│   │   │   ├── ff09e16a-e8a0-452b-b95c-e160e68d09a9.lease
│   │   │   └── ff09e16a-e8a0-452b-b95c-e160e68d09a9.meta
│   │   ├── efa94a0d-c08e-4ad9-983b-4d1d76bca865
│   │   │   ├── 8174e8b4-3605-4db3-86a1-cb62c3a079f4
│   │   │   ├── 8174e8b4-3605-4db3-86a1-cb62c3a079f4.lease
│   │   │   ├── 8174e8b4-3605-4db3-86a1-cb62c3a079f4.meta
│   │   │   ├── b1c5bc52-476c-466a-9301-e1ce862bb75b
│   │   │   ├── b1c5bc52-476c-466a-9301-e1ce862bb75b.lease
│   │   │   ├── b1c5bc52-476c-466a-9301-e1ce862bb75b.meta
│   │   │   ├── e79a8821-bb4a-436a-902d-3876f107dd99
│   │   │   ├── e79a8821-bb4a-436a-902d-3876f107dd99.lease
│   │   │   └── e79a8821-bb4a-436a-902d-3876f107dd99.meta
│   │   └── f5eacc6e-4f16-4aa5-99ad-53ac1cda75b7
│   │       ├── 476bbfe9-1805-4c43-bde6-e7de5f7bd75d
│   │       ├── 476bbfe9-1805-4c43-bde6-e7de5f7bd75d.lease
│   │       └── 476bbfe9-1805-4c43-bde6-e7de5f7bd75d.meta
│   └── master
│       ├── tasks
│       └── vms
└── __DIRECT_IO_TEST__

11 directories, 33 files


Slave:
/mnt/
├── 5e1a37cf-933d-424c-8e3d-eb9e40b690a7
│   ├── dom_md
│   │   ├── ids
│   │   ├── inbox
│   │   ├── leases
│   │   ├── metadata
│   │   └── outbox
│   ├── images
│   │   ├── c52e4e02-dc6c-4a77-a184-9fcab88106c2
│   │   │   ├── 766a15b9-57db-417d-bfa0-beadbbb84ad2.meta
│   │   │   ├── 90f1e26a-00e9-4ea5-9e92-2e448b9b8bfa
│   │   │   ├── 90f1e26a-00e9-4ea5-9e92-2e448b9b8bfa.lease
│   │   │   ├── 90f1e26a-00e9-4ea5-9e92-2e448b9b8bfa.meta
│   │   │   ├── df874a88-a998-4f9e-be23-f3f581067bca
│   │   │   ├── df874a88-a998-4f9e-be23-f3f581067bca.lease
│   │   │   └── df874a88-a998-4f9e-be23-f3f581067bca.meta
│   │   ├── c75de5b7-aa88-48d7-ba1b-067181eac6ae
│   │   │   ├── ff09e16a-e8a0-452b-b95c-e160e68d09a9
│   │   │   └── ff09e16a-e8a0-452b-b95c-e160e68d09a9.meta
│   │   ├── efa94a0d-c08e-4ad9-983b-4d1d76bca865
│   │   │   ├── 8174e8b4-3605-4db3-86a1-cb62c3a079f4
│   │   │   ├── 8174e8b4-3605-4db3-86a1-cb62c3a079f4.lease
│   │   │   ├── 8174e8b4-3605-4db3-86a1-cb62c3a079f4.meta
│   │   │   ├── b1c5bc52-476c-466a-9301-e1ce862bb75b
│   │   │   ├── b1c5bc52-476c-466a-9301-e1ce862bb75b.lease
│   │   │   ├── b1c5bc52-476c-466a-9301-e1ce862bb75b.meta
│   │   │   └── e79a8821-bb4a-436a-902d-3876f107dd99.meta
│   │   └── f5eacc6e-4f16-4aa5-99ad-53ac1cda75b7
│   │       ├── 476bbfe9-1805-4c43-bde6-e7de5f7bd75d
│   │       └── 476bbfe9-1805-4c43-bde6-e7de5f7bd75d.meta
│   └── master
│       └── tasks
└── __DIRECT_IO_TEST__

9 directories, 24 files


Expected results:
Master and slave should be in sync.

Additional info:
Logs from Active geo-rep node attached.

Comment 1 Sahina Bose 2016-04-22 14:37:11 UTC
Created attachment 1149770 [details]
geo-rep-logs-from-master

Comment 2 Aravinda VK 2016-04-25 07:24:35 UTC
RCA:

The issue was caused by deleting the stime xattr only from the brick root and not from all directories. Hybrid Crawl (the filesystem crawl used during the initial sync) compares xtime and stime: xtime is a modification time set by the marker translator on every change, while stime ("slave time") records the point up to which the slave is in sync with the master. stime is maintained on the brick root as well as on every directory; xtime is present on all files and directories.

/(xtime=10, stime=8)
   d1/(xtime=10, stime=8)
      f1(xtime=10)
      f2(xtime=7)
   d2/(xtime=7, stime=7)
      f3(xtime=6)
      f4(xtime=7)

The directory structure above shows two directories, d1 and d2, with two files in each. The hybrid crawl selects the files and directories that need to be synced based on whether xtime > stime. In this example, it picks only "d1/f1" for syncing.

When the stime xattr is removed only from the brick root, geo-rep can still fail to pick up the files and directories that need syncing:

/(xtime=10, stime=)
   d1/(xtime=10, stime=8)
      f1(xtime=10)
      f2(xtime=7)
   d2/(xtime=7, stime=7)
      f3(xtime=6)
      f4(xtime=7)

Even after the root stime is reset, geo-rep picks only "d1/f1" for syncing, because the stime values still present on the directories make it believe all the other files have already been synced to the slave.

The xtime > stime comparison is required to avoid re-crawling and re-syncing the same directories and files when a worker crashes.
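As an illustration, the selection logic described above can be sketched in Python. This is a simplified model, not the actual geo-rep code; the tree mirrors the example in this comment.

```python
# Simplified model of the hybrid-crawl xtime > stime comparison.
# A subtree is entered only when its xtime exceeds the stime it is compared
# against; inside a directory, children are compared against that
# directory's own stime.

def crawl(path, node, parent_stime, picked):
    if node["xtime"] <= parent_stime:        # nothing newer than the last sync
        return
    children = node.get("children")
    if children is None:                     # a file: needs syncing
        picked.append(path)
        return
    stime = node.get("stime", parent_stime)  # directory: use its own stime
    for name, child in children.items():
        crawl(f"{path}/{name}", child, stime, picked)

# The tree from the example above: /(xtime=10, stime=8), d1, d2, four files.
tree = {
    "xtime": 10, "stime": 8,
    "children": {
        "d1": {"xtime": 10, "stime": 8, "children": {
            "f1": {"xtime": 10},
            "f2": {"xtime": 7},
        }},
        "d2": {"xtime": 7, "stime": 7, "children": {
            "f3": {"xtime": 6},
            "f4": {"xtime": 7},
        }},
    },
}

picked = []
crawl("", tree, -1, picked)    # -1 stands for "no stime at all"
print(picked)                  # ['/d1/f1']

# Removing stime only from the brick root changes nothing: the stime values
# still present on d1 and d2 filter out everything except d1/f1.
del tree["stime"]
picked_after_reset = []
crawl("", tree, -1, picked_after_reset)
print(picked_after_reset)      # ['/d1/f1']
```

This reproduces the behaviour reported in the bug: resetting only the root stime does not trigger a full resync.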


Workaround:
-------------------
If a resync is required, delete the stime xattr from the brick root and from all directories, not just the root.
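A hedged sketch of that workaround as a single find pass. The brick path and the volume UUIDs in the xattr key below are placeholders; copy the exact stime key shown by `getfattr -d -m . <brick-root>` (run as root) on your own bricks.

```shell
# Placeholder values: substitute your brick path and the exact stime key
# reported by getfattr on the brick root.
BRICK=/rhgs/vmstore/brick1
STIME_KEY=trusted.glusterfs.MASTER_VOL_UUID.SLAVE_VOL_UUID.stime

# -type d matches the brick root itself as well as every subdirectory, so a
# single pass clears the xattr everywhere; directories that never carried the
# xattr just produce ignorable "No such attribute" errors.
find "$BRICK" -type d -exec setfattr -x "$STIME_KEY" {} \; 2>/dev/null || true
```

Run this on every master brick before recreating the session.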


Possible solution:
-------------------
Instead of deleting the stime xattr, set its value to -2. Geo-rep should then skip the xtime > stime comparison when the brick root stime is -2. This should be implemented as part of the geo-rep delete command. (BZ 1205162)
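A minimal sketch of how that sentinel check could behave. The names here are hypothetical, not the actual geo-rep code.

```python
# Proposed behaviour: the geo-rep delete command sets the brick-root stime to
# a sentinel (-2) instead of removing it, and the crawl skips the
# xtime > stime check whenever it sees that sentinel.
RESYNC_SENTINEL = -2

def needs_sync(xtime, stime):
    if stime == RESYNC_SENTINEL:   # full resync requested: sync regardless
        return True
    return xtime > stime

print(needs_sync(7, 8))                # False: entry already synced
print(needs_sync(7, RESYNC_SENTINEL))  # True: resync even "old" entries
```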

Comment 3 Sahina Bose 2016-04-26 05:57:52 UTC
Closing this, as this is not a supported use case.
When recovering from a slave, files on the slave should not be deleted.