Bug 802785

Summary: Geo-replication with ignore-deletes on: dd to a previously deleted file name does not get synced.
Product: [Community] GlusterFS
Component: geo-replication
Version: mainline
Hardware: x86_64
OS: Linux
Status: CLOSED NOTABUG
Severity: high
Priority: unspecified
Reporter: Vijaykumar Koppad <vkoppad>
Assignee: Venky Shankar <vshankar>
CC: bbandari, gluster-bugs, vbellur
Doc Type: Bug Fix
Last Closed: 2012-04-20 11:27:50 UTC

Description Vijaykumar Koppad 2012-03-13 13:58:28 UTC
Description of problem:
With a plain distribute volume as the master and a directory as the slave, with ignore-deletes enabled, if you dd to a file name that was deleted earlier on the master, the file does not get synced properly and the md5sum does not match.


Version-Release number of selected component (if applicable):
[6a8fcff3fb6955162dc4eeaeaa627bb31311627e] 3.3.0, master

How reproducible: always


Steps to Reproduce:
1. Start a geo-replication session with a distribute-replicate volume as master, with ignore-deletes on.
2. Touch a file on the master, then delete it. It won't be deleted on the slave.
3. Now dd 100MB to the same file name that you deleted.
4. Check the disk usage of both files (on master and on slave) and also check the md5sum.
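
The check in step 4 can be scripted. The following is a minimal Python sketch, not part of the original report; the function names are hypothetical. It reports the md5sum together with both the apparent size (the figure stat shows) and the allocated size (the figure du shows), assuming a POSIX system where st_blocks is counted in 512-byte units.

```python
import hashlib
import os


def file_report(path, chunk=1 << 20):
    """Return md5, apparent size and allocated bytes for one file."""
    md5 = hashlib.md5()
    with open(path, "rb") as f:
        # Read in chunks so a 100MB file is never held in memory at once.
        for block in iter(lambda: f.read(chunk), b""):
            md5.update(block)
    st = os.stat(path)
    return {
        "md5": md5.hexdigest(),
        "apparent_bytes": st.st_size,           # logical file length
        "allocated_bytes": st.st_blocks * 512,  # blocks actually allocated (POSIX assumption)
    }


def compare(master_path, slave_path):
    """Compare the master copy with the slave copy of the same file."""
    m, s = file_report(master_path), file_report(slave_path)
    return {
        "content_matches": m["md5"] == s["md5"]
        and m["apparent_bytes"] == s["apparent_bytes"],
        "master": m,
        "slave": s,
    }
```

Calling compare() with the file's path on the master mount and its path in the slave directory shows whether the contents agree even when the allocated sizes differ.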

Comment 1 Venky Shankar 2012-04-08 16:46:23 UTC
Vijaykumar,

I was not able to reproduce this issue. Do you still have the setup that was experiencing it? If not, please try it again. With the steps you mentioned, I cannot get the sync of the file to fail.

Comment 2 Vijaykumar Koppad 2012-04-09 07:54:50 UTC
I guess it is a problem with the du command, which I was using to check the disk usage.
If I use the stat command, it gives the correct result. That means geo-rep is syncing the data properly, but since du is not showing the correct disk usage, it might be a problem with glusterfs.

Comment 3 Vijaykumar Koppad 2012-04-20 11:27:50 UTC
After doing some experiments and research on this, I found that it was happening because of sparse files. This is not specific to ignore-deletes; it also happens with normal geo-replication, since rsync is run with the -S option internally. With -S, rsync handles sparse files efficiently: it does not write out long runs of zero bytes on the slave, so the slave copy occupies fewer disk blocks than the master copy.
If I remove the -S option, this is no longer the case. However, when a stripe volume is used on the slave side, I was seeing more disk space used; that issue is discussed here:
http://post-office.corp.redhat.com/archives/gluster-internal/2012-April/msg00050.html
So I am closing this bug.
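
The sparse-file effect described in Comment 3 can be illustrated with a small Python sketch. This is not rsync's implementation, only an assumed simplification of the idea behind a sparse-aware copy: blocks that consist entirely of zero bytes are skipped with a seek instead of being written. The names sparse_copy and md5_of and the 4096-byte block size are hypothetical.

```python
import hashlib
import os

BLOCK = 4096  # assumed block size for this sketch


def sparse_copy(src, dst, block=BLOCK):
    """Copy src to dst, seeking over all-zero blocks instead of writing them."""
    zero = bytes(block)
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            buf = fin.read(block)
            if not buf:
                break
            if buf == zero[:len(buf)]:
                # Skip forward; the filesystem may leave a hole here.
                fout.seek(len(buf), os.SEEK_CUR)
            else:
                fout.write(buf)
        # Set the final length in case the file ends with skipped zeros.
        fout.truncate(fout.tell())


def md5_of(path):
    """Return the md5 hex digest of a file's contents."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()
```

A copy made this way has the same length and the same md5sum as the source, but on a filesystem that supports holes it can occupy fewer blocks, which is why du and stat can disagree about the same file.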