Bug 980789

Summary: Dist-geo-rep: 'gluster volume geo status' shows one node's status as 'defunct' if one of the bricks on that node is not up (the replica of that brick is up)
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: Rachana Patel <racpatel>
Component: geo-replication
Assignee: Bug Updates Notification Mailing List <rhs-bugs>
Status: CLOSED EOL
QA Contact: storage-qa-internal <storage-qa-internal>
Severity: medium
Priority: high
Version: 2.1
CC: avishwan, chrisw, csaba, david.macdonald, mzywusko, rhs-bugs, vagarwal, vbhat
Keywords: ZStream
Hardware: x86_64
OS: Linux
Whiteboard: status
Doc Type: Bug Fix
Type: Bug
Last Closed: 2015-11-25 08:48:53 UTC

Description Rachana Patel 2013-07-03 08:52:22 UTC
Description of problem:
Dist-geo-rep: 'gluster volume geo status' shows one node's status as 'defunct' if one of the bricks on that node is not up (the replica of that brick is up).

Version-Release number of selected component (if applicable):
3.4.0.12rhs.beta1-1.el6rhs.x86_64

How reproducible:
not sure

Steps to Reproduce:
1. Create a geo-rep session between the master (dist-rep volume) and slave (any volume) clusters (the session commands are sketched after step 3 below).

2. In the master cluster, kill one of the brick processes and make sure its replica brick is up (a minimal sketch of this step follows the volume info output below).

[root@wall ~]# gluster volume status master2
Status of volume: master2
Gluster process						Port	Online	Pid
------------------------------------------------------------------------------
Brick 10.70.42.158:/rhs/brick1/ma1			49154	Y	16188
Brick 10.70.42.246:/rhs/brick1/ma1			49154	Y	16141
Brick 10.70.42.191:/rhs/brick1/ma1			N/A	N	16922
Brick 10.70.42.158:/rhs/brick1/ma2			49155	Y	16200
Brick 10.70.42.246:/rhs/brick1/ma2			49155	Y	16153
Brick 10.70.42.191:/rhs/brick1/ma2			49155	Y	16934
NFS Server on localhost					2049	Y	17058
Self-heal Daemon on localhost				N/A	Y	16955
NFS Server on 5dddc52f-259a-4b45-ad60-8d1a917624ce	2049	Y	13591
Self-heal Daemon on 5dddc52f-259a-4b45-ad60-8d1a917624ce	N/A	Y	13560
NFS Server on 50a95d83-f6fe-4996-9287-3005131c948b	2049	Y	16325
Self-heal Daemon on 50a95d83-f6fe-4996-9287-3005131c948b	N/A	Y	16221
NFS Server on cf1941b8-7d06-482b-bacc-930b5a1401f4	2049	Y	16267
Self-heal Daemon on cf1941b8-7d06-482b-bacc-930b5a1401f4	N/A	Y	16174
 
There are no active volume tasks
[root@wall ~]# gluster v info master2
 
Volume Name: master2
Type: Distributed-Replicate
Volume ID: 8574ece7-b738-4f22-8b14-f4414521cd84
Status: Started
Number of Bricks: 3 x 2 = 6
Transport-type: tcp
Bricks:
Brick1: 10.70.42.158:/rhs/brick1/ma1
Brick2: 10.70.42.246:/rhs/brick1/ma1
Brick3: 10.70.42.191:/rhs/brick1/ma1
Brick4: 10.70.42.158:/rhs/brick1/ma2
Brick5: 10.70.42.246:/rhs/brick1/ma2
Brick6: 10.70.42.191:/rhs/brick1/ma2
Options Reconfigured:
changelog.encoding: ascii
changelog.rollover-time: 15
changelog.fsync-interval: 3
geo-replication.indexing: on
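
A minimal sketch of step 2, assuming it is run on the node hosting the brick to be stopped (10.70.42.191 in the output above) and that the PID is read from the 'gluster volume status' output; the volume and brick path are the ones from this report:

# Step 2 sketch: stop one brick while its replica stays up.
VOL=master2
BRICK=/rhs/brick1/ma1
# Read the PID column for this node's brick from the status output and kill that process.
PID=$(gluster volume status "$VOL" | grep "10.70.42.191:$BRICK" | awk '{print $NF}')
kill "$PID"
# The brick should now show Online = N while its replica on another node stays Y.
gluster volume status "$VOL"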


3. Start the session, keep writing data to the mount point at different intervals, and verify whether the status stays healthy and the data is in sync.
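
The session commands for steps 1 and 3 might look roughly like the following; the slave host and volume names are placeholders, and passwordless SSH from the master node to the slave is assumed to be already set up:

# Steps 1 and 3 sketch: create and start the geo-rep session, then poll its status.
MASTERVOL=master2
SLAVE=slavehost::slavevol   # placeholder slave host and volume

# Create and start the session (push-pem distributes the pem keys to the slave nodes).
gluster volume geo-replication "$MASTERVOL" "$SLAVE" create push-pem
gluster volume geo-replication "$MASTERVOL" "$SLAVE" start

# Write data to the master mount at intervals and watch the per-node status;
# no node is expected to report 'defunct'.
while true; do
    gluster volume geo-replication "$MASTERVOL" "$SLAVE" status
    sleep 600
done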


Actual results:
Initially, the status for that node kept switching between 'faulty' and 'Stable'; after 14-15 hours, I noticed that the status was 'defunct'.


Expected results:
'defunct' is not an expected state.

Additional info:
The sync operation was working fine; data between master and slave was in sync.
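
One hedged way to spot-check that master and slave hold the same data, assuming both volumes are mounted locally at the hypothetical paths /mnt/master2 and /mnt/slavevol (requires bash for the process substitution):

# Sync spot-check sketch: compare per-file checksums across the two mounts.
MASTER_MNT=/mnt/master2    # hypothetical mount of the master volume
SLAVE_MNT=/mnt/slavevol    # hypothetical mount of the slave volume

# Checksum every regular file relative to its mount root and diff the sorted lists;
# empty output means the data sets match.
diff <(cd "$MASTER_MNT" && find . -type f -exec md5sum {} + | sort -k2) \
     <(cd "$SLAVE_MNT"  && find . -type f -exec md5sum {} + | sort -k2)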

Comment 3 Aravinda VK 2015-11-25 08:48:53 UTC
Closing this bug since RHGS 2.1 release reached EOL. Required bugs are cloned to RHGS 3.1. Please re-open this issue if found again.
