Bug 1457976

Summary: Geo-replication status goes Faulty after rebooting one source node
Product: [Community] GlusterFS Reporter: Mark <deligatedgeek>
Component: geo-replication    Assignee: bugs <bugs>
Status: CLOSED EOL QA Contact:
Severity: medium Docs Contact:
Priority: unspecified    
Version: 3.8    CC: avishwan, bugs
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-11-07 10:40:08 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Attachments:
Logs from georeplication volume (flags: none)

Description Mark 2017-06-01 16:23:26 UTC
Description of problem:

Environment: 4 CentOS 7.2 servers, 2 in the UK and one each in New York and Sydney.

A replica 2 GlusterFS source volume was created using 1 brick from each of the 2 UK servers; this worked great.

Geo-replication was configured from this source volume to a destination volume in NY and another in Sydney; this also worked great.

The status output showed the first source node as Active and the second source node as Passive.
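Roughly, the setup was created along these lines (a sketch only, using hypothetical host and volume names uk1, uk2, ny1, srcvol and nyvol rather than the exact commands and brick paths used):

# On a UK node: create and start the replica 2 source volume, one brick per UK server
gluster volume create srcvol replica 2 uk1:/bricks/srcvol/brick uk2:/bricks/srcvol/brick
gluster volume start srcvol

# On the New York server: create and start the destination volume (Sydney set up the same way)
gluster volume create nyvol ny1:/bricks/nyvol/brick
gluster volume start nyvol

# On a UK node: create and start the geo-replication session to New York
gluster volume geo-replication srcvol ny1::nyvol create push-pem
gluster volume geo-replication srcvol ny1::nyvol start

# Status shows one source node as Active and the other as Passive
gluster volume geo-replication srcvol ny1::nyvol status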

The first source node was shut down, and after a short time the second source node became Active; replication continued.

The first source node was started and added its brick back into the source volume, but the geo-replication status for the first node became Faulty while the second node showed Passive. Upon shutting the first node down again, the second node's geo-replication status became Active, but when the first node was started again the status became Faulty once more.

I searched for the error and found that it may be related to the index being rotated, meaning the first node had lost track of where it was, but there were no instructions on how to fix this.

I had to delete the geo-replication session and the destination volume, then recreate both, to fix the issue.
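For reference, the recovery was essentially the following (again only a sketch, reusing the hypothetical names from above; the recreation of the destination volume itself is not shown):

gluster volume geo-replication srcvol ny1::nyvol stop
gluster volume geo-replication srcvol ny1::nyvol delete
# ... recreate the destination volume, then recreate and restart the session
gluster volume geo-replication srcvol ny1::nyvol create push-pem force
gluster volume geo-replication srcvol ny1::nyvol start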

Version-Release number of selected component (if applicable):
3.8.5

How reproducible:
twice so far

Steps to Reproduce:
1. Create the environment described above.
2. Shut down one source node.
3. Start that source node again.

Actual results:
Geo-replication status for the restarted node goes Faulty.


Expected results:
The restarted source node goes Active again and replication continues.

Additional info:

Comment 1 Aravinda VK 2017-06-12 06:11:26 UTC
Please upload the geo-replication logs of the Faulty node from the /var/log/glusterfs/geo-replication directory.
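(If it helps, something like the following on the Faulty node should bundle everything under that directory for attaching here, assuming a standard install layout:)

tar czf geo-rep-logs.tar.gz /var/log/glusterfs/geo-replication/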

Comment 2 Mark 2017-06-12 09:18:01 UTC
Created attachment 1286989 [details]
Logs from georeplication volume

This file contains the logs from the geo-replication directory for this volume.

Comment 3 Mark 2017-06-12 09:21:03 UTC
I know the resolution may take time; is there a workaround if this occurs again?

Comment 4 Niels de Vos 2017-11-07 10:40:08 UTC
This bug is getting closed because the 3.8 version is marked End-Of-Life. There will be no further updates to this version. Please open a new bug against a version that still receives bugfixes if you are still facing this issue in a more current release.