Bug 1200372 - Geo-rep fails with disperse volume
Summary: Geo-rep fails with disperse volume
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: disperse
Version: mainline
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Pranith Kumar K
QA Contact:
URL:
Whiteboard:
Duplicates: 1210686
Depends On:
Blocks: qe_tracker_everglades
 
Reported: 2015-03-10 12:32 UTC by shilpa
Modified: 2015-05-14 17:35 UTC
CC List: 6 users

Fixed In Version: glusterfs-3.7.0beta1
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-05-14 17:26:52 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description shilpa 2015-03-10 12:32:32 UTC
Description of problem: Geo-rep session goes into a faulty state when created on a disperse volume


Version-Release number of selected component (if applicable):
glusterfs-3.7dev-0.667.gitadef0c8.el6.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Create and start a volume with disperse = 3, redundancy = 1 on both the master and the slave cluster. 
2. Create geo-rep session and start it.
 

Actual results:

Files fail to sync and geo-rep status is faulty


Expected results:


Additional info:

1. On the master cluster:
# gluster volume info
 
Volume Name: master
Type: Distributed-Disperse
Volume ID: fad80ec1-2ef8-47b5-a356-baa3f4e9c039
Status: Started
Number of Bricks: 2 x (2 + 1) = 6
Transport-type: tcp
Bricks:
Brick1: 10.x.x.x:/rhs/brick1/m1
Brick2: 10.x.x.x:/rhs/brick1/m2
Brick3: 10.x.x.x:/rhs/brick1/m3
Brick4: 10.x.x.x:/rhs/brick1/m4
Brick5: 10.x.x.x:/rhs/brick1/m5
Brick6: 10.x.x.x:/rhs/brick1/m6
Options Reconfigured:
changelog.changelog: on
geo-replication.ignore-pid-check: on
geo-replication.indexing: on

3. Set up passwordless SSH between the master and one slave node, then run:

# gluster system:: execute gsec_create

4. Create and start a geo-rep session from the master volume to the slave volume:

# gluster volume geo-rep master 10.x.x.x::slave create push-pem
# gluster volume geo-rep master 10.x.x.x::slave start

5. Check the geo-rep status:
# gluster volume geo-rep master 10.70.37.56::slave status 
MASTER NODE               MASTER VOL    MASTER BRICK      SLAVE USER    SLAVE                 STATUS    CHECKPOINT STATUS    CRAWL STATUS       
-----------------------------------------------------------------------------------------------------------------------------------------
ecnode1    master        /rhs/brick1/m1    root          10.x.x.x::slave    faulty    N/A                  N/A                
ecnode1    master        /rhs/brick1/m4    root          10.x.x.x::slave    faulty    N/A                  N/A                
ecnode2   master        /rhs/brick1/m2    root          10.x.x.x::slave    faulty    N/A                  N/A                
ecnode2   master        /rhs/brick1/m5    root          10.x.x.x::slave    faulty    N/A                  N/A                
ecnode3    master        /rhs/brick1/m3    root          10.x.x.x::slave    faulty    N/A                  N/A                
ecnode3    master        /rhs/brick1/m6    root          10.x.x.x::slave    faulty    N/A                  N/A 

Logs from /var/log/glusterfs/geo-replication/master:

 E [syncdutils(/rhs/brick1/m4):275:log_raise_exception] <top>: FAIL: 
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 164, in main
    main_i()
  File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 646, in main_i
    local.service_loop(*[r for r in [remote] if r])
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 1333, in service_loop
    g3.crawlwrap(oneshot=True)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 486, in crawlwrap
    volinfo_sys = self.volinfo_hook()
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 165, in volinfo_hook
    return self.get_sys_volinfo()
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 331, in get_sys_volinfo
    self.master.server.aggregated.native_volume_info())
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 1004, in native_volume_info
    'volume-mark']))
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 961, in _attr_unpack_dict
    buf = Xattr.lgetxattr('.', xattr, struct.calcsize(fmt_string))
  File "/usr/libexec/glusterfs/python/syncdaemon/libcxattr.py", line 55, in lgetxattr
    return cls._query_xattr(path, siz, 'lgetxattr', attr)
  File "/usr/libexec/glusterfs/python/syncdaemon/libcxattr.py", line 47, in _query_xattr
    cls.raise_oserr()
  File "/usr/libexec/glusterfs/python/syncdaemon/libcxattr.py", line 37, in raise_oserr
    raise OSError(errn, os.strerror(errn))
OSError: [Errno 22] Invalid argument

From etc-glusterfs-glusterd.vol.log:

[2015-03-10 06:53:47.130372] I [glusterd-geo-rep.c:3586:glusterd_read_status_file] 0-: Using passed config template(/var/lib/glusterd/geo-replication/master_10.70.37.56_slave/gsyncd.conf).
[2015-03-10 06:53:47.275735] E [glusterd-geo-rep.c:3266:glusterd_gsync_read_frm_status] 0-: Unable to read gsyncd status file
[2015-03-10 06:53:47.275789] E [glusterd-geo-rep.c:3673:glusterd_read_status_file] 0-: Unable to read the statusfile for /rhs/brick1/m1 brick for  master(master), 10.70.37.56::slave(slave) session
[2015-03-10 06:53:47.419816] E [glusterd-geo-rep.c:3266:glusterd_gsync_read_frm_status] 0-: Unable to read gsyncd status file
[2015-03-10 06:53:47.419868] E [glusterd-geo-rep.c:3673:glusterd_read_status_file] 0-: Unable to read the statusfile for /rhs/brick1/m4 brick for  master(master), 10.70.37.56::slave(slave) session


From the fuse mount log:

[2015-03-10 01:44:23.725119] W [fuse-bridge.c:3327:fuse_xattr_cbk] 0-glusterfs-fuse: 6: GETXATTR(trusted.glusterfs.volume-mark) / => -1 (Invalid argument)
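
Both the gsyncd traceback and the fuse-bridge warning point at the same failing operation: a getxattr of trusted.glusterfs.volume-mark on the root of the volume returning EINVAL. Below is a minimal diagnostic sketch (not part of gsyncd) that issues the same call directly; the mount point is a placeholder, and whether a plain FUSE mount of the master volume serves this virtual xattr exactly like gsyncd's auxiliary mount is an assumption.

#!/usr/bin/env python3
# Hypothetical diagnostic, not part of gsyncd: issue the same lgetxattr()
# that gsyncd performs, against a FUSE mount of the master volume.
import errno
import os
import sys

mount = sys.argv[1] if len(sys.argv) > 1 else "/mnt/master"   # placeholder path
xattr = "trusted.glusterfs.volume-mark"

try:
    # follow_symlinks=False makes this an lgetxattr(), matching libcxattr.py
    buf = os.getxattr(mount, xattr, follow_symlinks=False)
    print("%s: got %d bytes" % (xattr, len(buf)))
except OSError as e:
    if e.errno == errno.EINVAL:
        print("EINVAL on %s -- the same failure gsyncd hits above" % xattr)
    else:
        raise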

Comment 1 Anand Avati 2015-03-16 07:21:02 UTC
REVIEW: http://review.gluster.org/9892 (libxlator: Change marker xattr handling interface) posted (#1) for review on master by Pranith Kumar Karampuri (pkarampu)

Comment 2 Anand Avati 2015-03-17 02:22:48 UTC
REVIEW: http://review.gluster.org/9892 (libxlator: Change marker xattr handling interface) posted (#2) for review on master by Pranith Kumar Karampuri (pkarampu)

Comment 3 Anand Avati 2015-03-19 14:32:08 UTC
REVIEW: http://review.gluster.org/9892 (libxlator: Change marker xattr handling interface) posted (#3) for review on master by Pranith Kumar Karampuri (pkarampu)

Comment 4 Anand Avati 2015-03-23 06:01:48 UTC
REVIEW: http://review.gluster.org/9892 (libxlator: Change marker xattr handling interface) posted (#4) for review on master by Pranith Kumar Karampuri (pkarampu)

Comment 5 Anand Avati 2015-03-25 18:10:55 UTC
COMMIT: http://review.gluster.org/9892 committed in master by Vijay Bellur (vbellur) 
------
commit 21cd43cfd62f69cd011fced7ebec93b9347f9fce
Author: Pranith Kumar K <pkarampu>
Date:   Wed Mar 11 17:43:12 2015 +0530

    libxlator: Change marker xattr handling interface
    
    - Changed the implementation of marker xattr handling to take just a
      function which populates important data that is different from
      default 'gauge' values and subvolumes where the call needs to be
      wound.
    - Removed duplicate code I found while reading the code and moved it to
      cluster_marker_unwind. Removed unused structure members.
    - Changed dht/afr/stripe implementations to follow the new implementation
    - Implemented marker xattr handling for ec.
    
    Change-Id: Ib0c3626fe31eb7c8aae841eabb694945bf23abd4
    BUG: 1200372
    Signed-off-by: Pranith Kumar K <pkarampu>
    Reviewed-on: http://review.gluster.org/9892
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Xavier Hernandez <xhernandez>
    Reviewed-by: Shyamsundar Ranganathan <srangana>
    Reviewed-by: Ravishankar N <ravishankar>
    Reviewed-by: Vijay Bellur <vbellur>
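
For context, a rough sketch in Python of the aggregation idea the commit describes (purely illustrative, not the actual libxlator C interface; every name below is hypothetical): each cluster translator supplies only the marker fields that differ from the default "gauge" values plus the subvolumes to wind to, and shared code merges the per-subvolume replies.

# Purely illustrative sketch; all names are hypothetical.
from dataclasses import dataclass, field
from typing import Dict, List

DEFAULTS = {"xtime": 0, "dirty": 0}   # hypothetical default gauge values

@dataclass
class SubvolReply:
    subvol: str
    data: Dict[str, int] = field(default_factory=dict)  # only non-default fields

def aggregate_marker(replies: List[SubvolReply]) -> Dict[str, int]:
    """Merge per-subvolume marker data: keep the newest xtime, OR the dirty
    flags, and fall back to the defaults for any field a subvolume omitted."""
    merged = dict(DEFAULTS)
    for r in replies:
        data = {**DEFAULTS, **r.data}
        merged["xtime"] = max(merged["xtime"], data["xtime"])
        merged["dirty"] = merged["dirty"] | data["dirty"]
    return merged

if __name__ == "__main__":
    # Two-subvolume example: only the first reports a non-default xtime.
    print(aggregate_marker([SubvolReply("ec-0", {"xtime": 1426000000}),
                            SubvolReply("ec-1")]))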

Comment 6 Anand Avati 2015-03-26 16:06:08 UTC
REVIEW: http://review.gluster.org/10015 (xlators/lib: Handle NULL 'name' for marker xattrs) posted (#1) for review on master by Pranith Kumar Karampuri (pkarampu)

Comment 7 Anand Avati 2015-03-27 07:21:38 UTC
COMMIT: http://review.gluster.org/10015 committed in master by Vijay Bellur (vbellur) 
------
commit d15dedd8c99e84018a50130a8ffe5e971b9f7bd4
Author: Pranith Kumar K <pkarampu>
Date:   Thu Mar 26 21:08:12 2015 +0530

    xlators/lib: Handle NULL 'name' for marker xattrs
    
    Change-Id: I18f00b7e92f483673250821c457d1e8be2eef081
    BUG: 1200372
    Signed-off-by: Pranith Kumar K <pkarampu>
    Reviewed-on: http://review.gluster.org/10015
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Aravinda VK <avishwan>
    Reviewed-by: Vijay Bellur <vbellur>

Comment 8 Nithya Balachandran 2015-04-20 12:46:42 UTC
*** Bug 1210686 has been marked as a duplicate of this bug. ***

Comment 9 Niels de Vos 2015-05-14 17:26:52 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.7.0, please open a new bug report.

glusterfs-3.7.0 has been announced on the Gluster mailing lists [1], packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://thread.gmane.org/gmane.comp.file-systems.gluster.devel/10939
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user


