Bug 984603 - Dist-geo-rep : hardlinks to files are not synced as actual hardlinks by first xsync crawl.
Status: CLOSED ERRATA
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: geo-replication
Version: 2.1
Hardware: x86_64 Linux
Priority: high Severity: high
Assigned To: Venky Shankar
QA Contact: Vijaykumar Koppad
Keywords: ZStream
Depends On:
Blocks: 957769 1001980
Reported: 2013-07-15 10:14 EDT by Vijaykumar Koppad
Modified: 2014-08-24 20:50 EDT

See Also:
Fixed In Version: glusterfs-3.4.0.34rhs
Doc Type: Bug Fix
Doc Text:
Previously, the logic of geo-replication's initial xsync to fetch the master cluster's file details failed to capture hardlinks properly. With this update, hardlinks are fully supported by geo-replication.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2013-11-27 10:27:38 EST
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Description Vijaykumar Koppad 2013-07-15 10:14:55 EDT
Description of problem: If hardlinks to files are created while a geo-rep session is stopped and the session is then started again, the first xsync crawl syncs the hardlinks as separate files, not as hardlinks. Consequently, total disk usage on the slave will be greater than on the master.


Version-Release number of selected component (if applicable): 3.4.0.12rhs.beta4-1.el6rhs.x86_64


How reproducible: Observed once. 


Steps to Reproduce:
1. Create and start a geo-rep relationship between master (DIST-REP) and slave.
2. Create files using the command: ./crefi.py -n 10 --multi -b 10 -d 10 --random --max=500K --min=10 <MNT_PNT>
3. Let it sync to the slave.
4. Stop the geo-rep session.
5. Create hardlinks to all the files using the command: ./crefi.py -n 10 --multi -b 10 -d 10 --random --max=500K --min=10 --fop=hardlink <MNT_PNT>
6. Start the geo-rep session.
7. Check whether syncing has completed by comparing the number of files on master and slave.
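
A file count alone cannot tell a hardlink from a separate copy, since both show up as one more name. The following is a minimal sketch of a stricter check, assuming the master and slave volumes are both mounted locally; the helper names `hardlink_groups` and `hardlinks_preserved` are hypothetical and are not part of crefi or gsyncd.

```python
import os
from collections import defaultdict


def hardlink_groups(root):
    """Return the set of hardlink groups found under root.

    Each group is a frozenset of paths (relative to root) that share
    one inode. Names with no other link inside root are ignored.
    """
    by_inode = defaultdict(set)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)
            by_inode[(st.st_dev, st.st_ino)].add(os.path.relpath(path, root))
    return {frozenset(paths) for paths in by_inode.values() if len(paths) > 1}


def hardlinks_preserved(master_root, slave_root):
    """True if both trees link exactly the same sets of names together."""
    return hardlink_groups(master_root) == hardlink_groups(slave_root)
```

Calling `hardlinks_preserved("/mnt/master", "/mnt/slave")` (both paths are placeholders) after the sync completes would return False for the behaviour reported here, because the slave tree would contain no hardlink groups.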


Actual results: hardlinks to files are not synced as actual hardlinks by first xsync crawl. 


Expected results: Hardlinks should be synced as actual hardlinks, not as separate files.


Additional info:
Comment 2 Venky Shankar 2013-07-21 02:34:29 EDT
First of all, the hybrid crawl does not handle hardlinks (which you mention in the "Expected Result" section).

The issue is actually that the entry creation for the hardlinks should result in a NOP, as the gfid already exists on the slave. So, keeping this bug in open state.
Comment 3 Amar Tumballi 2013-08-01 06:22:55 EDT
The 'hybrid crawl' mechanism we use doesn't capture hardlinks yet, and hence we don't have any option now. The behavior was similar (hardlinks used to be created as separate files) with the earlier implementation.
Comment 4 Amar Tumballi 2013-08-28 08:43:46 EDT
Venky, can you review this patch?

diff --git a/geo-replication/syncdaemon/master.py b/geo-replication/syncdaemon/master.py
index f18a60e..5ed6796 100644
--- a/geo-replication/syncdaemon/master.py
+++ b/geo-replication/syncdaemon/master.py
@@ -885,7 +885,12 @@ class GMasterXsyncMixin(GMasterChangelogMixin):
                 self.write_entry_change("E", [gfid, 'MKDIR', escape(os.path.join(pargfid, bname))])
                 self.crawl(e, xtr)
             elif stat.S_ISREG(mo):
-                self.write_entry_change("E", [gfid, 'CREATE', escape(os.path.join(pargfid, bname))])
+                # if a file has a hardlink, create a Changelog entry as 'LINK' so the slave
+                # side will decide if to create the new entry, or to create link.
+                if st.st_nlink == 1:
+                    self.write_entry_change("E", [gfid, 'CREATE', escape(os.path.join(pargfid, bname))])
+                else:
+                    self.write_entry_change("E", [gfid, 'LINK', escape(os.path.join(pargfid, bname))])
                 self.write_entry_change("D", [gfid])
             elif stat.S_ISLNK(mo):
                 self.write_entry_change("E", [gfid, 'SYMLINK', escape(os.path.join(pargfid, bname))])


With this, we may just solve it anyways.
Comment 5 Venky Shankar 2013-08-28 13:19:21 EDT
(In reply to Amar Tumballi from comment #4)

[snip]
> 
> With this, we may just solve it anyways.

This could work if the original file was already synced to the slave and was not modified when gsyncd was not running.

For a freshly created file and its hardlink, nlink would be > 1, resulting in a 'LINK' entry with a gfid that does not yet exist on the slave.

If the entry was in sync and was modified and had a hardlink created to it before the first crawl, then there would be two 'LINK' entries: one would probably be OK (the actual hardlink) but what about the other one?
Comment 6 Amar Tumballi 2013-09-04 08:13:29 EDT
(In reply to Venky Shankar from comment #5)
> 
> This could work if the original file was already synced to the slave and was
> not modified when gsyncd was not running.
> 

This works for non-existent files on the slave too now.

> For a freshly created file and its hardlink, nlink would be > 1,
> resulting in a 'LINK' entry with a gfid that does not yet exist on the
> slave.
> 
> If the entry was in sync and was modified and had a hardlink created to it
> before the first crawl, then there would be two 'LINK' entries: one would
> probably be OK (the actual hardlink) but what about the other one?

We do an 'lstat()' on the gfid on the slave side and then decide whether to do a 'MKNOD' (i.e., a fresh create) or a 'LINK'. So, sending 2 LINKs instead of one MKNOD/CREATE and another LINK is fine.
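
The slave-side decision described above can be sketched roughly as follows. This is an illustration only, not the gsyncd code: it models the gfid namespace as a plain directory of hardlinks named by gfid, and `apply_link_entry`, `gfid_dir` and the 'NOP' return for an already existing name are all assumptions made for the example.

```python
import os


def apply_link_entry(gfid_dir, gfid, entry_path):
    """Apply one 'LINK' entry and report which operation was performed.

    gfid_dir is a hypothetical directory holding one hardlink per known
    gfid. Returns 'MKNOD', 'LINK' or 'NOP'.
    """
    handle = os.path.join(gfid_dir, gfid)
    try:
        os.lstat(handle)
    except FileNotFoundError:
        # The gfid is not known yet: treat the entry as a fresh create
        # and register the new file under its gfid.
        fd = os.open(entry_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
        os.close(fd)
        os.link(entry_path, handle)
        return "MKNOD"
    try:
        # The gfid already exists: add another name for the same inode.
        os.link(handle, entry_path)
    except FileExistsError:
        # Assumption for this sketch: an existing name is left alone.
        return "NOP"
    return "LINK"
```

With this shape, two 'LINK' entries carrying the same gfid resolve to one create followed by one link, whichever of the two names is processed first.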

Also, this case is no different from the following set of operations in changelog mode (if the operations end up in the same CHANGELOG file):

bash# cd /mount/point; touch a; ln a b;
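
To illustrate why a crawl would see two 'LINK' entries for this sequence, here is a small sketch that repeats the two commands in a temporary directory on a local filesystem and applies the same nlink rule as the patch in comment 4. The function name `xsync_entry_type` is hypothetical.

```python
import os
import tempfile


def xsync_entry_type(path):
    """Mirror the patch's rule: nlink == 1 gives 'CREATE', otherwise 'LINK'."""
    return "CREATE" if os.lstat(path).st_nlink == 1 else "LINK"


with tempfile.TemporaryDirectory() as d:
    a = os.path.join(d, "a")
    b = os.path.join(d, "b")
    open(a, "w").close()                  # touch a
    before = xsync_entry_type(a)          # only one name exists so far
    os.link(a, b)                         # ln a b
    after = [xsync_entry_type(a), xsync_entry_type(b)]

print(before, after)
```

Once the second name exists, both names report the same link count, so neither one can be singled out as the original from nlink alone.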

-----

https://code.engineering.redhat.com/gerrit/#/c/12110
Comment 7 Amar Tumballi 2013-09-04 08:14:28 EDT
This bug is very much related to bug 1001498 and hence should be treated as blocker.
Comment 8 Vivek Agarwal 2013-09-30 06:31:33 EDT
Verified that with the build given in the Fixed In Version field, the steps in the description work.
Comment 9 Vijaykumar Koppad 2013-10-15 06:54:04 EDT
verified on glusterfs-3.4.0.34rhs
Comment 11 errata-xmlrpc 2013-11-27 10:27:38 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-1769.html
