Bug 1111171
| Summary: | Dist-geo-rep: geo-rep xsync crawl takes too much time to sync metadata changes. | | |
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat Gluster Storage | Reporter: | Vijaykumar Koppad <vkoppad> |
| Component: | geo-replication | Assignee: | Venky Shankar <vshankar> |
| Status: | CLOSED ERRATA | QA Contact: | Bhaskar Bandari <bbandari> |
| Severity: | high | Docs Contact: | |
| Priority: | high | | |
| Version: | rhgs-3.0 | CC: | aavati, avishwan, bbandari, bturner, csaba, david.macdonald, nlevinki, nsathyan, sharne, smanjara, ssamanta, vshankar |
| Target Milestone: | --- | | |
| Target Release: | RHGS 3.0.0 | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | glusterfs-3.6.0.25-1 | Doc Type: | Known Issue |
| Doc Text: | Geo-replication takes more time to synchronize the initial data while using Hybrid crawl. | | |
| Story Points: | --- | | |
| Clone Of: | | | |
| | 1111490 (view as bug list) | Environment: | |
| Last Closed: | 2014-09-22 19:42:21 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 1103155, 1111490 | | |
Description
Vijaykumar Koppad
2014-06-19 11:55:17 UTC
Upstream patch sent for review: http://review.gluster.org/#/c/8124/

Vijaykumar, do you have a setup where I can profile the system? Essentially, I would like to strace and check which calls are taking up most of the time. As of now, patch http://review.gluster.org/#/c/8124/ cuts down the extra stat() [on the master] and the redundant ch[mod,own]() on the slave, but I would still like to check whether any other calls are taking time.

Please review and sign off on the edited doc text.

Hit this issue while testing xsync crawl for symlinks. It took too long to sync the changes. Tested on glusterfs-3.6.0.22.

(In reply to shilpa from comment #7)
> Hit this issue while testing xsync crawl for symlinks. It took too long to
> sync the changes. Tested on glusterfs-3.6.0.22

Hey Shilpa, do you have the setup I can look into? I am eager to thrash this out.

Venky, I have provided the hardware details in the email.

(In reply to shilpa from comment #10)
> Venky,
>
> I have provided the hardware details in the email.

Thanks. I am looking into it.

Update: Followed the steps mentioned in Comment #1 (FUSE client) and did not experience the huge delay for metadata synchronization. I captured syscall invocations/timings on the bricks and they do not show delays on setattr() calls. Also, gsyncd on the master nodes is able to sync metadata without the delays mentioned in Comment #1. The slowness does not appear to be due to syncing metadata (i.e. chown() or chmod()) but due to gsyncd getting EINVAL during creation of a file which already exists.

Xsync-based crawl scans the filesystem based on xtime and generates a changelog consisting of the list of changes to be replicated. During the crawl, if a directory or file satisfies xtime(master) > xtime(slave), an entry is made in the changelog. Since the changes are identified solely by crawling the master volume, entries for files/directories are made in the changelog irrespective of their existence on the slave (otherwise we would need to query the slave, which is an expensive operation).

Now, when the actual replication is done, the changelog is replayed -- files/directories which already exist on the slave should return EEXIST (entry already exists) on creation, which is handled gracefully by geo-replication. But what we get on such an operation is EINVAL (Invalid argument), as shown in the logs below:

[2014-07-04 13:20:41.930759] W [client-rpc-fops.c:306:client3_3_mkdir_cbk] 0-slave-client-9: remote operation failed: File exists. Path: <gfid:7bc16d1b-e63f-4adf-bde9-ed9f4ebf220a>/level99
[2014-07-04 13:20:41.930934] W [client-rpc-fops.c:306:client3_3_mkdir_cbk] 0-slave-client-8: remote operation failed: File exists. Path: <gfid:7bc16d1b-e63f-4adf-bde9-ed9f4ebf220a>/level99
[2014-07-04 13:20:41.930986] W [fuse-bridge.c:1237:fuse_err_cbk] 0-glusterfs-fuse: 89160: SETXATTR() /.gfid/7bc16d1b-e63f-4adf-bde9-ed9f4ebf220a => -1 (Invalid argument)

[NOTE: geo-replication uses a special setxattr()-based interface to create entries on the slave]

EINVAL is given separate handling in geo-replication: the operation is retried a number of times. Historically this was done because entries were getting missed during creation due to a race that was unknown at the time (it is known now), in the hope that retrying would eventually fix it. After hitting the retry limit, geo-replication moves on to the next operation (after logging the failed op, as shown below), which runs into the same problem again.

[2014-07-04 18:44:47.376432] W [syncdutils(slave):480:errno_wrap] <top>: reached maximum retries (['.gfid/7bc16d1b-e63f-4adf-bde9-ed9f4ebf220a', 'glusterfs.gfid.newfile', '\x00\x00%\x8a\x00\x00%\x8a20b99f65-149e-4ff7-a043-1afa881357da\x00\x00\x00\x81\xa453b68684%%BKPGIUFLRB\x00\x00\x00\x01\xa4\x00\x00\x00\x00\x00\x00\x00\x00'])...
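To make the behaviour above concrete, here is a minimal, hypothetical Python sketch of the two pieces described in this comment: the xtime-based crawl decision and the retry-on-EINVAL handling during changelog replay. The names (needs_sync, retry_on_einval, MAX_RETRIES) and the back-off are illustrative assumptions, not the actual gsyncd/syncdutils code.

```python
import errno
import time

MAX_RETRIES = 10  # assumed retry cap; the real limit may differ


def needs_sync(master_xtime, slave_xtime):
    """Xsync-style crawl decision: pick up an entry for replication when
    the master's xtime is newer than the slave's. Treating xtimes as
    (sec, nsec) tuples lets plain tuple comparison give the ordering."""
    return master_xtime > slave_xtime


def retry_on_einval(op, *args):
    """Replay one changelog operation against the slave.

    EEXIST is benign (the entry is already on the slave) and is handled
    gracefully, but EINVAL is retried up to MAX_RETRIES before the
    operation is skipped. Because the buggy slave path reported EINVAL
    for every pre-existing entry, each such entry cost a full round of
    retries -- which is where the xsync crawl lost its time."""
    for _ in range(MAX_RETRIES):
        try:
            return op(*args)
        except OSError as e:
            if e.errno == errno.EEXIST:
                return None      # already replicated: nothing to do
            if e.errno != errno.EINVAL:
                raise            # unexpected error: surface it
            time.sleep(1)        # assumed back-off between retries
    # mirrors the "reached maximum retries" warning in the log above
    print("reached maximum retries (%r)" % (args,))
```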
SIDE NOTE: setxattr() does not have EINVAL as a valid errno (and EEXIST has another meaning there).

It looks like EINVAL was generated in cases where we get an EEXIST during entry creation and then perform a setattr() [to set uid/gid] on an inode on which a lookup has not happened [which is when DHT errors out with "layout missing" logs]. The gfid-access translator uses a fresh inode for the entry to be created if it is not found in the inode table. On failure (errno EEXIST, etc.), DHT does not populate the inode context with the layout information. gfid-access then blindly uses this inode for the setattr() after entry creation, so the EINVAL generated by DHT trickles up to the client application.

Fix: handle the error cases in gfid-access entry creation (callback path) and return the correct errno (EEXIST, etc.).

Sent patch to downstream: https://code.engineering.redhat.com/gerrit/#/c/29095/
Upstream patch: http://review.gluster.org/#/c/8124/

One more patch from Venky: https://code.engineering.redhat.com/gerrit/#/c/29133/
Upstream patch: http://review.gluster.org/#/c/8260/

Verified on the build glusterfs-3.6.0.25-1.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHEA-2014-1278.html
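For illustration, here is a minimal, hypothetical sketch of the replay-side behaviour that the fix described above enables. The real change is in the gfid-access/DHT translators (C code) on the slave; the function names below are placeholders, not the actual gsyncd API. Once the slave reports the correct errno, a pre-existing entry surfaces as EEXIST and can be skipped immediately instead of burning retries on EINVAL.

```python
import errno


def replay_create(create_entry, entry):
    """create_entry stands in for the setxattr()-based entry creation that
    geo-replication issues against the slave mount; it is a placeholder,
    not a real gsyncd function."""
    try:
        create_entry(entry)
    except OSError as e:
        if e.errno == errno.EEXIST:
            return      # entry already exists on the slave: benign, move on
        raise           # any other error is a genuine failure
```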