1650893 – fails to sync non-ascii (utf8) file and directory names, causes permanently faulty geo-replication state


Keywords:
Status:	CLOSED CURRENTRELEASE
Alias:	None
Product:	GlusterFS
Classification:	Community
Component:	geo-replication
Sub Component:
Version:	mainline
Hardware:	x86_64
OS:	Linux
Priority:	high
Severity:	urgent
Target Milestone:	---
Assignee:	Kotresh HR
QA Contact:
Docs Contact:
URL:
Whiteboard:
Depends On:	1648642
Blocks:

Reported:	2018-11-17 08:27 UTC by Kotresh HR
Modified:	2019-03-25 16:31 UTC
CC List:	4 users
Fixed In Version:	glusterfs-6.0
Clone Of:	1648642
Environment:
Last Closed:	2019-03-25 16:31:57 UTC
Regression:	---
Mount Type:	---
Documentation:	---
CRM:
Verified Versions:
Embargoed:
Dependent Products:


Links
System	ID	Private	Priority	Status	Summary	Last Updated
Gluster.org Gerrit	21668	0	None	Merged	geo-rep: Fix syncing of files with non-ascii filenames	2018-12-04 09:16:33 UTC

Description Kotresh HR 2018-11-17 08:27:25 UTC

+++ This bug was initially created as a clone of Bug #1648642 +++

Description of problem:
Geo-replication enters a permanently faulty state when a file or directory whose name contains non-ASCII characters is created on the master.

Version-Release number of selected component (if applicable):
gluster 5.0 (debian stretch 9.6)

How reproducible:
100%

Steps to Reproduce:
1. Set up geo-replication between two volumes.
2. Run "mkdir foo" on the master and check that it is synced to the slave (all OK).
3. Run "touch foo/Руководства" on the master (mkdir has the same effect).
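Steps 2 and 3 can also be scripted. The following is a minimal sketch, not part of the original report: the helper name create_non_ascii_entry and the mount_root argument are hypothetical, and mount_root is assumed to be wherever the master volume is mounted. The demo at the bottom runs against a temporary directory, so it only shows the file creation and does not touch geo-replication.

```python
import os
import tempfile


def create_non_ascii_entry(mount_root: str, name: str = "Руководства") -> str:
    """Create foo/<name> under mount_root, like 'mkdir foo; touch foo/<name>'.

    mount_root is an assumption: the path where the master volume is mounted.
    """
    parent = os.path.join(mount_root, "foo")
    os.makedirs(parent, exist_ok=True)      # step 2: mkdir foo
    path = os.path.join(parent, name)
    with open(path, "a", encoding="utf-8"):  # step 3: touch foo/<name>
        pass
    return path


if __name__ == "__main__":
    # Demo against a throwaway directory instead of a real gluster mount.
    with tempfile.TemporaryDirectory() as demo_root:
        created = create_non_ascii_entry(demo_root)
        print(os.listdir(os.path.dirname(created)))
```

On a real setup, pass the master mount point instead of the temporary directory and then check the geo-replication status.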

Actual results:
Geo-replication switches to the Faulty state.

Expected results:
Gluster should handle non-ASCII path and file names without problems.

Additional info:
You cannot fix the erroneous state by removing the offending file from the master (that is why I set Severity to "urgent"). Even when the file no longer exists on the master, geo-replication retries forever.


Master gsyncd log, showing the successful directory sync followed by the failure to sync the non-ASCII name:

[2018-11-11 10:09:26.300494] I [master(worker /srv/backup/georeptest):1448:crawl] _GMaster: slave's time    stime=(1541930932, 0)
[2018-11-11 10:09:26.793891] I [master(worker /srv/backup/georeptest):1932:syncjob] Syncer: Sync Time Taken job=2   duration=0.1246 return_code=0   num_files=1
[2018-11-11 10:09:26.860157] I [master(worker /srv/backup/georeptest):1362:process] _GMaster: Entry Time Taken  MKN=0   duration=0.1347 MKD=1   CRE=0   LIN=0   RMD=0   REN=0   SYM=0   UNL=0
[2018-11-11 10:09:26.860423] I [master(worker /srv/backup/georeptest):1372:process] _GMaster: Data/Metadata Time Taken  SETX=1  meta_duration=0.0839    SETA=1  XATT=0  data_duration=0.3008    DATA=0
[2018-11-11 10:09:26.860752] I [master(worker /srv/backup/georeptest):1382:process] _GMaster: Batch Completed   duration=0.5591 entry_stime=(1541930962, 0) num_changelogs=1    changelog_end=1541930963    mode=live_changelog stime=(1541930962, 0)   changelog_start=1541930963
[2018-11-11 10:09:57.169396] I [master(worker /srv/backup/georeptest):1448:crawl] _GMaster: slave's time    stime=(1541930962, 0)
[2018-11-11 10:09:57.298777] E [repce(worker /srv/backup/georeptest):214:__call__] RepceClient: call failed error=OSError   method=entry_ops    call=25385:139958097360640:1541930997.252009
[2018-11-11 10:09:57.299144] E [syncdutils(worker /srv/backup/georeptest):338:log_raise_exception] <top>: FAIL:
Traceback (most recent call last):
  File "/usr/lib/x86_64-linux-gnu/glusterfs/python/syncdaemon/gsyncd.py", line 322, in main
    func(args)
  File "/usr/lib/x86_64-linux-gnu/glusterfs/python/syncdaemon/subcmds.py", line 82, in subcmd_worker
    local.service_loop(remote)
  File "/usr/lib/x86_64-linux-gnu/glusterfs/python/syncdaemon/resource.py", line 1323, in service_loop
    g2.crawlwrap()
  File "/usr/lib/x86_64-linux-gnu/glusterfs/python/syncdaemon/master.py", line 599, in crawlwrap
    self.crawl()
  File "/usr/lib/x86_64-linux-gnu/glusterfs/python/syncdaemon/master.py", line 1459, in crawl
    self.changelogs_batch_process(changes)
  File "/usr/lib/x86_64-linux-gnu/glusterfs/python/syncdaemon/master.py", line 1433, in changelogs_batch_process
    self.process(batch)
  File "/usr/lib/x86_64-linux-gnu/glusterfs/python/syncdaemon/master.py", line 1268, in process
    self.process_change(change, done, retry)
  File "/usr/lib/x86_64-linux-gnu/glusterfs/python/syncdaemon/master.py", line 1165, in process_change
    failures = self.slave.server.entry_ops(entries)
  File "/usr/lib/x86_64-linux-gnu/glusterfs/python/syncdaemon/repce.py", line 233, in __call__
    return self.ins(self.meth, *a)
  File "/usr/lib/x86_64-linux-gnu/glusterfs/python/syncdaemon/repce.py", line 215, in __call__
    raise res
OSError: [Errno 12] Cannot allocate memory
[2018-11-11 10:09:57.340240] I [repce(agent /srv/backup/georeptest):97:service_loop] RepceServer: terminating on reaching EOF.
[2018-11-11 10:09:57.369302] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change status=Faulty
[2018-11-11 10:10:07.530161] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change status=Initializing...

######
gsyncd.log from slave:
######
[2018-11-11 10:06:20.175026] W [gsyncd(slave berta.tdf/srv/backup/georeptest):304:main] <top>: Session config file not exists, using the default config path=/var/lib/glusterd/geo-replication/georeptest_antares.tdf_georeptest/gsyncd.conf
[2018-11-11 10:06:20.195641] I [resource(slave berta.tdf/srv/backup/georeptest):1113:connect] GLUSTER: Mounting gluster volume locally...
[2018-11-11 10:06:21.258642] I [resource(slave berta.tdf/srv/backup/georeptest):1136:connect] GLUSTER: Mounted gluster volume   duration=1.0627
[2018-11-11 10:06:21.259554] I [resource(slave berta.tdf/srv/backup/georeptest):1163:service_loop] GLUSTER: slave listening
[2018-11-11 10:09:18.109813] E [repce(slave berta.tdf/srv/backup/georeptest):123:worker] <top>: call failed: 
Traceback (most recent call last):
  File "/usr/lib/x86_64-linux-gnu/glusterfs/python/syncdaemon/repce.py", line 118, in worker
    res = getattr(self.obj, rmeth)(*in_data[2:])
  File "/usr/lib/x86_64-linux-gnu/glusterfs/python/syncdaemon/resource.py", line 709, in entry_ops
    [ESTALE, EINVAL, EBUSY])
  File "/usr/lib/x86_64-linux-gnu/glusterfs/python/syncdaemon/syncdutils.py", line 546, in errno_wrap
    return call(*arg)
  File "/usr/lib/x86_64-linux-gnu/glusterfs/python/syncdaemon/libcxattr.py", line 83, in lsetxattr
    cls.raise_oserr()
  File "/usr/lib/x86_64-linux-gnu/glusterfs/python/syncdaemon/libcxattr.py", line 38, in raise_oserr
    raise OSError(errn, os.strerror(errn))
OSError: [Errno 12] Cannot allocate memory
[2018-11-11 10:09:18.171407] I [repce(slave berta.tdf/srv/backup/georeptest):97:service_loop] RepceServer: terminating on reaching EOF.

######
geo-rep mnt log from slave
######
[2018-11-11 10:06:20.217263] I [fuse-bridge.c:4259:fuse_init] 0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.24 kernel 7.26
[2018-11-11 10:06:20.217301] I [fuse-bridge.c:4870:fuse_graph_sync] 0-fuse: switched to graph 0
[2018-11-11 10:09:18.109597] E [gfid-access.c:203:ga_newfile_parse_args] 0-gfid-access-autoload: gfid: ab5b1116-d4f2-4ebf-985e-c9465e4383ac. Invalid length
[2018-11-11 10:09:18.109656] W [fuse-bridge.c:1428:fuse_err_cbk] 0-glusterfs-fuse: 51: SETXATTR() /.gfid/e7d9dc77-a374-45b0-a61d-17776abceb54 => -1 (Cannot allocate memory)
[2018-11-11 10:09:18.186812] I [fuse-bridge.c:5134:fuse_thread_proc] 0-fuse: initating unmount of /tmp/gsyncd-aux-mount-1j04ph5o
[2018-11-11 10:09:18.186979] W [glusterfsd.c:1481:cleanup_and_exit] (-->/lib/x86_64-linux-gnu/libpthread.so.0(+0x7494) [0x7f1849021494] -->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xfd) [0x56015197f7ed] -->/usr/sbin/glusterfs(cleanup_and_exit+0x54) [0x56015197f644] ) 0-: received signum (15), shutting down
[2018-11-11 10:09:18.187039] I [fuse-bridge.c:5897:fini] 0-fuse: Unmounting '/tmp/gsyncd-aux-mount-1j04ph5o'.
[2018-11-11 10:09:18.187059] I [fuse-bridge.c:5902:fini] 0-fuse: Closing fuse connection to '/tmp/gsyncd-aux-mount-1j04ph5o'.
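
The "Invalid length" error from gfid-access above is consistent with a name whose declared length does not match its size in bytes. This is an assumption about the class of bug, not a statement of what change 21668 actually modifies. The sketch below only illustrates the general pitfall in Python: the character count of "Руководства" differs from its UTF-8 byte count, and struct.pack silently truncates when given the smaller number. The helper pack_name and its layout are hypothetical and are not the gsyncd wire format.

```python
import struct

name = "Руководства"
encoded = name.encode("utf-8")

char_len = len(name)     # number of characters: 11
byte_len = len(encoded)  # number of UTF-8 bytes: 22


def pack_name(raw: bytes, declared_len: int) -> bytes:
    """Hypothetical layout: 4-byte big-endian length, then the name bytes."""
    return struct.pack(">I%ds" % declared_len, declared_len, raw)


# Declaring the character count truncates the name to its first 11 bytes.
wrong = pack_name(encoded, char_len)
# Declaring the byte count keeps the whole name.
right = pack_name(encoded, byte_len)

if __name__ == "__main__":
    print(char_len, byte_len)
    print(len(wrong), len(right))
```

For pure ASCII names the two lengths are equal, which would explain why "mkdir foo" syncs while the Cyrillic name does not, if the length mismatch is indeed the cause.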

Comment 1 Worker Ant 2018-11-17 08:39:13 UTC

REVIEW: https://review.gluster.org/21668 (geo-rep: Fix syncing of files with non-ascii filenames) posted (#1) for review on master by Kotresh HR

Comment 2 Worker Ant 2018-12-04 09:16:32 UTC

REVIEW: https://review.gluster.org/21668 (geo-rep: Fix syncing of files with non-ascii filenames) posted (#6) for review on master by Amar Tumballi

Comment 3 Shyamsundar 2019-03-25 16:31:57 UTC

This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-6.0, please open a new bug report.

glusterfs-6.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] https://lists.gluster.org/pipermail/announce/2019-March/000120.html
[2] https://www.gluster.org/pipermail/gluster-users/
