Bug 1207712

Summary: Input/Output error with disperse volume when geo-replication is started
Product: [Community] GlusterFS
Reporter: Bhaskarakiran <byarlaga>
Component: geo-replication
Assignee: Pranith Kumar K <pkarampu>
Status: CLOSED CURRENTRELEASE
QA Contact:
Severity: urgent
Docs Contact:
Priority: unspecified
Version: mainline
CC: avishwan, bugs, byarlaga, gluster-bugs, mzywusko, pkarampu
Target Milestone: ---
Keywords: Reopened
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: glusterfs-3.8rc2
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 1224171 (view as bug list)
Environment:
Last Closed: 2016-06-16 12:46:47 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1186580, 1224171
Attachments:
log file of the master (flags: none)

Description Bhaskarakiran 2015-03-31 14:20:00 UTC
Created attachment 1009088 [details]
log file of the master

Description of problem:
======================
Input/Output errors are seen on a disperse volume when geo-replication is started.

Version-Release number of selected component (if applicable):
=============================================================
[root@vertigo ~]# gluster --version
glusterfs 3.7dev built on Mar 31 2015 01:05:54
Repository revision: git://git.gluster.com/glusterfs.git
Copyright (c) 2006-2011 Gluster Inc. <http://www.gluster.com>
GlusterFS comes with ABSOLUTELY NO WARRANTY.
You may redistribute copies of GlusterFS under the terms of the GNU General Public License.
[root@vertigo ~]# 

How reproducible:
=================
100%

Steps to Reproduce:
1. Create a 1x(4+2) disperse volume on both the master and the slave.
2. Establish geo-replication between the two volumes (a command-line sketch follows below).
3. Once the session is started, Input/Output errors are thrown in the log file.
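For reference, a minimal command-line sketch of these steps, using the brick layout and slave volume from the configuration shown below; hostnames, brick paths and the passwordless-SSH/push-pem setup are specific to this environment and may differ elsewhere:

    # Master cluster: create and start the 1x(4+2) disperse volume
    gluster volume create geo-master disperse 6 redundancy 2 \
        ninja:/rhs/brick1/geo-1 vertigo:/rhs/brick1/geo-2 \
        ninja:/rhs/brick2/geo-3 vertigo:/rhs/brick2/geo-4 \
        ninja:/rhs/brick3/geo-5 vertigo:/rhs/brick3/geo-6
    gluster volume start geo-master

    # Slave cluster: create and start a matching disperse volume
    # (disperse-slave, see "Slave configuration" below), then from the master:
    gluster system:: execute gsec_create
    gluster volume geo-replication geo-master dhcp37-164::disperse-slave create push-pem
    gluster volume geo-replication geo-master dhcp37-164::disperse-slave start

    # The errors appear in the master-side geo-replication log, e.g.
    # /var/log/glusterfs/geo-replication/geo-master/*.log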

Actual results:
I/O error

Expected results:
Geo-replication should start and sync data without Input/Output errors.

Additional info:
================

[root@vertigo ~]# gluster v info geo-master
 
Volume Name: geo-master
Type: Disperse
Volume ID: fdb55cd4-34e7-4c15-a407-d9a831a09737
Status: Started
Number of Bricks: 1 x (4 + 2) = 6
Transport-type: tcp
Bricks:
Brick1: ninja:/rhs/brick1/geo-1
Brick2: vertigo:/rhs/brick1/geo-2
Brick3: ninja:/rhs/brick2/geo-3
Brick4: vertigo:/rhs/brick2/geo-4
Brick5: ninja:/rhs/brick3/geo-5
Brick6: vertigo:/rhs/brick3/geo-6
Options Reconfigured:
changelog.changelog: on
geo-replication.ignore-pid-check: on
geo-replication.indexing: on
[root@vertigo ~]# gluster v status geo-master
Status of volume: geo-master
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick ninja:/rhs/brick1/geo-1               49202     0          Y       4714 
Brick vertigo:/rhs/brick1/geo-2             49203     0          Y       4643 
Brick ninja:/rhs/brick2/geo-3               49203     0          Y       4731 
Brick vertigo:/rhs/brick2/geo-4             49204     0          Y       4660 
Brick ninja:/rhs/brick3/geo-5               49204     0          Y       4748 
Brick vertigo:/rhs/brick3/geo-6             49205     0          Y       4677 
NFS Server on localhost                     2049      0          Y       5224 
NFS Server on ninja                         2049      0          Y       5090 
 
Task Status of Volume geo-master
------------------------------------------------------------------------------
There are no active volume tasks
 
[root@vertigo ~]# 

Slave configuration:
====================

[root@dhcp37-164 ~]# gluster v info
 
Volume Name: disperse-slave
Type: Disperse
Volume ID: 1cbbe781-ee69-4295-bd17-a1dff37637ab
Status: Started
Number of Bricks: 1 x (4 + 2) = 6
Transport-type: tcp
Bricks:
Brick1: dhcp37-164:/rhs/brick1/b1
Brick2: dhcp37-95:/rhs/brick1/b1
Brick3: dhcp37-164:/rhs/brick2/b2
Brick4: dhcp37-95:/rhs/brick2/b2
Brick5: dhcp37-164:/rhs/brick3/b3
Brick6: dhcp37-95:/rhs/brick3/b3
[root@dhcp37-164 ~]# gluster v status
Status of volume: disperse-slave
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick dhcp37-164:/rhs/brick1/b1             49152     0          Y       4066 
Brick dhcp37-95:/rhs/brick1/b1              49152     0          Y       6988 
Brick dhcp37-164:/rhs/brick2/b2             49153     0          Y       4083 
Brick dhcp37-95:/rhs/brick2/b2              49153     0          Y       7005 
Brick dhcp37-164:/rhs/brick3/b3             49154     0          Y       4100 
Brick dhcp37-95:/rhs/brick3/b3              49154     0          Y       7022 
NFS Server on localhost                     2049      0          Y       4120 
NFS Server on 10.70.37.95                   2049      0          Y       7044 
 
Task Status of Volume disperse-slave
------------------------------------------------------------------------------
There are no active volume tasks
 
[root@dhcp37-164 ~]# 

The log file of the master is attached.

Comment 1 Anand Avati 2015-03-31 20:34:16 UTC
REVIEW: http://review.gluster.org/10077 (cluster/ec: Ignore volume-mark key for comparing dicts) posted (#1) for review on master by Pranith Kumar Karampuri (pkarampu)

Comment 2 Anand Avati 2015-03-31 20:34:29 UTC
REVIEW: http://review.gluster.org/10078 (cluster/ec: Fix dictionary compare function) posted (#1) for review on master by Pranith Kumar Karampuri (pkarampu)

Comment 3 Anand Avati 2015-03-31 20:34:32 UTC
REVIEW: http://review.gluster.org/10079 (cluster/ec: Handle stime, xtime differently) posted (#1) for review on master by Pranith Kumar Karampuri (pkarampu)

Comment 4 Anand Avati 2015-04-10 11:10:09 UTC
COMMIT: http://review.gluster.org/10077 committed in master by Vijay Bellur (vbellur) 
------
commit fcb55d54a62c8d4a2e8ce4596cd462c471f74dd3
Author: Pranith Kumar K <pkarampu>
Date:   Tue Mar 31 18:09:25 2015 +0530

    cluster/ec: Ignore volume-mark key for comparing dicts
    
    Change-Id: Id60107e9fb96588d24fa2f3be85c764b7f08e3d1
    BUG: 1207712
    Signed-off-by: Pranith Kumar K <pkarampu>
    Reviewed-on: http://review.gluster.org/10077
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Xavier Hernandez <xhernandez>

Comment 5 Anand Avati 2015-04-12 11:09:35 UTC
REVIEW: http://review.gluster.org/10078 (cluster/ec: Fix dictionary compare function) posted (#2) for review on master by Pranith Kumar Karampuri (pkarampu)

Comment 6 Anand Avati 2015-04-12 11:13:49 UTC
REVIEW: http://review.gluster.org/10078 (cluster/ec: Fix dictionary compare function) posted (#3) for review on master by Pranith Kumar Karampuri (pkarampu)

Comment 7 Anand Avati 2015-04-28 17:23:34 UTC
REVIEW: http://review.gluster.org/10078 (cluster/ec: Fix dictionary compare function) posted (#4) for review on master by Pranith Kumar Karampuri (pkarampu)

Comment 8 Anand Avati 2015-05-04 02:54:08 UTC
REVIEW: http://review.gluster.org/10078 (cluster/ec: Fix dictionary compare function) posted (#5) for review on master by Pranith Kumar Karampuri (pkarampu)

Comment 9 Anand Avati 2015-05-04 08:56:17 UTC
REVIEW: http://review.gluster.org/10078 (cluster/ec: Fix dictionary compare function) posted (#6) for review on master by Avra Sengupta (asengupt)

Comment 10 Anand Avati 2015-05-05 02:46:35 UTC
COMMIT: http://review.gluster.org/10078 committed in master by Pranith Kumar Karampuri (pkarampu) 
------
commit c8cd488b794d7abb3d37f32a6d8d0a3b365aa46e
Author: Pranith Kumar K <pkarampu>
Date:   Tue Mar 31 23:07:09 2015 +0530

    cluster/ec: Fix dictionary compare function
    
    If both dicts are NULL then equal. If one of the dicts is NULL but the other
    has only ignorable keys then also they are equal. If both dicts are non-null
    then check if for each non-ignorable key, values are same or not.  value_ignore
    function is used to skip comparing values for the keys which must be present in
    both the dictionaries but the value could be different.
    
    geo-rep's stime xattr doesn't need to be present in list xattr but when
    getxattr comes on stime xattr even if there aren't enough responses with the
    xattr we should still give out an answer which is maximum of the stimes
    available.
    
    Change-Id: I8de2ceaa2db785b797f302f585d88e73b154167d
    BUG: 1207712
    Signed-off-by: Pranith Kumar K <pkarampu>
    Reviewed-on: http://review.gluster.org/10078
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Xavier Hernandez <xhernandez>
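For context, the two keys involved here are geo-replication's volume-mark xattr (a virtual key answered by the marker translator) and the per-session stime xattr that gsyncd stores on the bricks. A rough way to look at the stored stime on a master brick root; the exact key name embeds the master and slave volume UUIDs, so it is matched rather than spelled out:

    # Run as root on a master brick host; illustrative only.
    getfattr -d -m . -e hex /rhs/brick1/geo-1 2>/dev/null | grep -i stime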

Comment 11 Niels de Vos 2015-05-14 17:27:12 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.7.0, please open a new bug report.

glusterfs-3.7.0 has been announced on the Gluster mailinglists [1], packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailinglist [2] and the update infrastructure for your distribution.

[1] http://thread.gmane.org/gmane.comp.file-systems.gluster.devel/10939
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user

Comment 14 Niels de Vos 2016-06-16 12:46:47 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.8.0, please open a new bug report.

glusterfs-3.8.0 has been announced on the Gluster mailinglists [1], packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailinglist [2] and the update infrastructure for your distribution.

[1] http://blog.gluster.org/2016/06/glusterfs-3-8-released/
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user