Bug 764187 (GLUSTER-2455)

Summary: "xtime" RePCe call fails with ENOTDIR
Product: [Community] GlusterFS
Component: geo-replication
Reporter: Csaba Henk <csaba>
Assignee: Csaba Henk <csaba>
Status: CLOSED CURRENTRELEASE
Severity: medium
Priority: low
Version: pre-2.0
CC: gluster-bugs, lakshmipathi
Hardware: All
OS: Linux
Doc Type: Bug Fix
Regression: RTA

Description Csaba Henk 2011-02-23 15:02:30 UTC
Tested with the POSIX test suite (http://www.tuxera.com/community/posix-test-suite/). Note that the versions of the test suite available on that page did not trigger this error; only the in-house copy did (it is not clear whether that copy corresponds to any release, as some releases are no longer accessible).

Python version: 2.4.3
gsyncd was invoked as: /tmp/gsyncd24  --debug :yow /scratch/fex2 -r "/tmp/gsyncd24 -LDEBUG" (i.e. a simple gluster-to-file setup).
The test suite invocation was narrowed down to: prove -r /home/csaba/work/src/posix-testsuite/tests/truncate/0{2,3}.t

Phenomenon:

[2011-02-23 16:00:23.139224] D [repce:131:push] RepceClient: call 21535:140433625843456:1298473223.14 xtime('./_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_1234/_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_1234', 'fd56f7fc-86dc-46af-ac3b-21471a6e0362') ...
[2011-02-23 16:00:23.139632] E [repce(slave):76:exception] <top>: call failed:
Traceback (most recent call last):
  File "/meta/progs/glusterfs-git/usr/libexec/python/syncdaemon/repce.py", line 72, in worker
    res = getattr(self.obj, rmeth)(*in_data[2:])
  File "/meta/progs/glusterfs-git/usr/libexec/python/syncdaemon/resource.py", line 165, in xtime
    return struct.unpack('!II', Xattr.lgetxattr(path, '.'.join([cls.GX_NSPACE, uuid, 'xtime']), 8))
  File "/meta/progs/glusterfs-git/usr/libexec/python/syncdaemon/resource.py", line 77, in lgetxattr
    cls.raise_oserr()
  File "/meta/progs/glusterfs-git/usr/libexec/python/syncdaemon/resource.py", line 67, in raise_oserr
    raise OSError(errn, os.strerror(errn))
OSError: [Errno 20] Not a directory
[2011-02-23 16:00:23.147254] E [repce:139:__call__] RepceClient: call 21535:140433625843456:1298473223.14 (xtime) failed on peer with instance
[2011-02-23 16:00:23.147579] E [gsyncd:160:exception] <top>: FAIL:
Traceback (most recent call last):
  File "/meta/progs/glusterfs-git/usr/libexec/python/syncdaemon/gsyncd.py", line 156, in main
    main_i()
  File "/meta/progs/glusterfs-git/usr/libexec/python/syncdaemon/gsyncd.py", line 289, in main_i
    local.service_loop(*[r for r in [remote] if r])
  File "/meta/progs/glusterfs-git/usr/libexec/python/syncdaemon/resource.py", line 379, in service_loop
    GMaster(self, args[0]).crawl()
  File "/meta/progs/glusterfs-git/usr/libexec/python/syncdaemon/master.py", line 63, in __init__
    self.crawl()
  File "/meta/progs/glusterfs-git/usr/libexec/python/syncdaemon/master.py", line 170, in crawl
    True)[-1], blame=e) == False:
  File "/meta/progs/glusterfs-git/usr/libexec/python/syncdaemon/master.py", line 136, in indulgently
    return fnc(e)
  File "/meta/progs/glusterfs-git/usr/libexec/python/syncdaemon/master.py", line 168, in <lambda>
    if indulgently(e, lambda e: (self.add_job(path, 'cwait', self.wait, e, xte, adct),
  File "/meta/progs/glusterfs-git/usr/libexec/python/syncdaemon/master.py", line 170, in crawl
    True)[-1], blame=e) == False:
  File "/meta/progs/glusterfs-git/usr/libexec/python/syncdaemon/master.py", line 136, in indulgently
    return fnc(e)
  File "/meta/progs/glusterfs-git/usr/libexec/python/syncdaemon/master.py", line 168, in <lambda>
    if indulgently(e, lambda e: (self.add_job(path, 'cwait', self.wait, e, xte, adct),
  File "/meta/progs/glusterfs-git/usr/libexec/python/syncdaemon/master.py", line 99, in crawl
    xtr0 = self.xtime(path, self.slave)
  File "/meta/progs/glusterfs-git/usr/libexec/python/syncdaemon/master.py", line 40, in xtime
    xt = rsc.server.xtime(path, self.uuid)
  File "/meta/progs/glusterfs-git/usr/libexec/python/syncdaemon/repce.py", line 151, in __call__
    return self.ins(self.meth, *a)
  File "/meta/progs/glusterfs-git/usr/libexec/python/syncdaemon/repce.py", line 140, in __call__
    raise res
OSError: [Errno 20] Not a directory
failed with OSError.
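
The two tracebacks above suggest that the OSError raised on the slave is handed back over RePCe and re-raised on the master (the `raise res` frame), where nothing in the crawl catches it. A minimal Python 3 sketch of that propagation pattern follows. It is an illustration only, not the repce.py code: `Slave`, `worker`, `call` and `crawl` are hypothetical names, and the real transport between the two processes is left out.

```python
import errno
import os


class Slave:
    """Hypothetical stand-in for the slave-side resource object."""

    def xtime(self, path, uuid):
        # Stand-in for the failing lgetxattr call seen in the traceback.
        raise OSError(errno.ENOTDIR, os.strerror(errno.ENOTDIR))


def worker(obj, method, args):
    """Slave side: run the call, hand back either the result or the exception."""
    try:
        return getattr(obj, method)(*args)
    except Exception as exc:
        return exc


def call(obj, method, *args):
    """Master side: re-raise whatever exception the peer handed back."""
    res = worker(obj, method, args)
    if isinstance(res, Exception):
        raise res
    return res


def crawl(slave, path, uuid):
    """Return the errno the master ends up seeing, or None on success."""
    try:
        call(slave, "xtime", path, uuid)
    except OSError as exc:
        return exc.errno
    return None
```

In this sketch `crawl` catches the error only so that it can be inspected; in the log above the equivalent exception reaches the top of gsyncd and ends the run.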

Comment 1 Csaba Henk 2011-02-24 01:08:08 UTC
The actions taken by the test can be reduced to the following sequence (executed in the master volume's top directory):

$ touch a ; sleep 1; rm a; mkdir -p a/b

The static layout of the master and the slave, respectively, that defeats gsyncd is:

$ find /mnt/gluster1 /scratch/fex2 | xargs ls -pd1
/mnt/gluster1/
/mnt/gluster1/a/
/mnt/gluster1/a/b/
/scratch/fex2/
/scratch/fex2/a
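
In the listing above, `a` on the slave has no trailing slash, so it is still a regular file while the master already has the directories `a/` and `a/b/`. Any path lookup below a regular file fails with ENOTDIR, which matches the error in the description. The Python 3 sketch below rebuilds the same layout in a temporary directory and is meant for Linux. It uses `os.lstat` as a stand-in for the `lgetxattr` call in gsyncd (an assumption made to keep the example portable), and `lookup_errno` and `reproduce` are hypothetical helper names.

```python
import errno
import os
import tempfile


def lookup_errno(path):
    """Return the errno raised by lstat(path), or None if the lookup succeeds."""
    try:
        os.lstat(path)
    except OSError as exc:
        return exc.errno
    return None


def reproduce():
    """Build the master/slave layout from the listing and look up a/b on both."""
    with tempfile.TemporaryDirectory() as top:
        master = os.path.join(top, "master")
        slave = os.path.join(top, "slave")
        # Master: a/ and a/b/ are directories.
        os.makedirs(os.path.join(master, "a", "b"))
        # Slave: a is still a regular file left over from the earlier "touch a".
        os.makedirs(slave)
        with open(os.path.join(slave, "a"), "w"):
            pass
        return (lookup_errno(os.path.join(master, "a", "b")),
                lookup_errno(os.path.join(slave, "a", "b")))
```

Calling `reproduce()` returns a pair of results: the first for the master-side lookup, the second for the slave-side lookup of the same relative path.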

Comment 2 Csaba Henk 2011-04-08 10:24:09 UTC
This bug is fixed by

http://patches.gluster.com/patch/6374/

(that patch should have been filed under this Bugzilla entry).

Comment 3 Lakshmipathi G 2011-04-15 07:49:16 UTC
Verified with 3.2.0qa12.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
master# touch a ; sleep 1; rm a; mkdir -p a/b

master# ls -lR a
a:
total 16
drwxr-xr-x 2 root root 8192 Apr 15 06:45 b
a/b:
total 0
--------
slave# ls -lR a/
a/:
total 16
drwxr-xr-x 2 root root 8192 Apr 15 06:45 b

a/b: