Bug 764497 (GLUSTER-2765)

Summary: geo-replication should have mercy on brick failure
Product: [Community] GlusterFS
Reporter: Csaba Henk <csaba>
Component: geo-replication
Assignee: kaushik <kbudiger>
Status: CLOSED CURRENTRELEASE
Severity: low
Priority: medium
Version: mainline
CC: gluster-bugs
Hardware: x86_64
OS: Linux
Doc Type: Bug Fix

Description Csaba Henk 2011-04-15 06:45:56 UTC
If a brick goes down and the files it provides vanish from the master's view, gsyncd would just delete them from the slave as well.

That's quite unacceptable. So if this happens, force gsyncd's glusterfs client to exit, thereby rendering geo-replication defunct. When the user puts the brick back with replace-brick, the geo-replication session will be deleted; once the user knows that things are back to a normal state (including a manual sync-back of the files of the lost brick), they can start it again.
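
A minimal sketch of the resulting recovery workflow, assuming 3.2-era gluster CLI syntax (volume, brick and slave names below are placeholders):

  # 1. the gsyncd client has exited, leaving the session defunct; stop it
  gluster volume geo-replication mastervol example.com:/data/remote_dir stop

  # 2. put the brick back (replace-brick is refused while a session is
  #    still active -- see comment 3)
  gluster volume replace-brick mastervol node1:/exp/brick node2:/exp/brick start
  gluster volume replace-brick mastervol node1:/exp/brick node2:/exp/brick commit

  # 3. once the lost brick's files have been manually synced back,
  #    restart the session
  gluster volume geo-replication mastervol example.com:/data/remote_dir start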

Comment 1 Anand Avati 2011-04-15 07:58:36 UTC
PATCH: http://patches.gluster.com/patch/6892 in master (glusterd / geo-replication: have gsync's glusterfs client use assert-no-child-down for dht volume)
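
Concretely, this would have the client volfile generated for gsyncd set the option on its distribute (dht) volume, so the mount bails out rather than serving an incomplete namespace. A hypothetical volfile fragment (volume and subvolume names made up):

  volume mastervol-dht
      type cluster/distribute
      # abort instead of operating with a missing brick
      option assert-no-child-down on
      subvolumes mastervol-client-0 mastervol-client-1
  end-volume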

Comment 2 Anand Avati 2011-04-15 07:58:44 UTC
PATCH: http://patches.gluster.com/patch/6894 in master (DHT: Make assert-no-child-down a boolean option)
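
For illustration, a boolean option in a GlusterFS translator is declared through an entry in its options table; a sketch of how assert-no-child-down could look in dht's table (the comment is ours, not the patch's):

  struct volume_options options[] = {
          { .key  = {"assert-no-child-down"},
            .type = GF_OPTION_TYPE_BOOL,
            /* when on, dht fails the graph instead of coming up
             * with one of its subvolumes down */
          },
          { .key = {NULL} },
  };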

Comment 3 Anand Avati 2011-04-19 06:29:43 UTC
PATCH: http://patches.gluster.com/patch/6944 in master (mgmt/glusterd: do not allow replace-brick operations when geo-rep sessions are active on this volume.)
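
The user-visible effect should be along these lines; the transcript and error wording are illustrative, not glusterd's actual message:

  # with a geo-replication session active on mastervol:
  gluster volume replace-brick mastervol node1:/exp/brick node2:/exp/brick start
  replace-brick failed: geo-replication session(s) active on volume mastervol,
  stop them before running replace-brick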

Comment 4 Csaba Henk 2011-04-25 09:22:06 UTC
The procedure for brick restoration using geo-replication is described here:

https://gist.github.com/e87ebea373bb67cf52b1

Please use this as the reference for verification.