Bug 763825 (GLUSTER-2093)

Summary: volumes cannot start when one node in a replicated setup is down
Product: [Community] GlusterFS
Reporter: Allen Lu <allen>
Component: nfs
Assignee: Shehjar Tikoo <shehjart>
Severity: high
Priority: high
Version: 3.1.0
CC: anush, gluster-bugs, vijay
Target Milestone: ---
Target Release: ---
Hardware: All
OS: Linux
Doc Type: Bug Fix
Regression: RTP
Mount Type: nfs
Documentation: DNR

Description Allen Lu 2010-11-11 14:58:11 EST
If one server is taken down, volumes cannot start.
The issue is that the NFS server checks for the existence of all subvolumes
before it registers with portmap, so it waits forever trying to ping all
subvolumes. The NFS server should proceed if it can find one of the replicated volumes.
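
To make the difference between the reported and the requested behaviour concrete, here is a minimal sketch of the two readiness checks. It is a hypothetical model only: the function names and the brick labels are invented for illustration and are not GlusterFS identifiers.

```python
# Hypothetical model; names are illustrative, not GlusterFS identifiers.

def ready_when_all_up(subvolume_up):
    """Reported behaviour: register only once every subvolume responds."""
    return all(subvolume_up.values())


def ready_when_any_up(subvolume_up):
    """Requested behaviour: register once at least one subvolume responds."""
    return any(subvolume_up.values())


# Two-way replica with the second server powered off (assumed scenario).
status = {"brick-on-server1": True, "brick-on-server2": False}

print(ready_when_all_up(status))  # False
print(ready_when_any_up(status))  # True
```

With the first check, a single unreachable server keeps the result false indefinitely, which matches the "waits forever" symptom described above.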
Comment 1 Shehjar Tikoo 2010-11-15 03:37:23 EST
What gluster commands did you use? 

I cannot understand what you mean by, "if one server is taken down, volumes cannot start."

Are you saying that the "gluster volume start <volname>" command does not start the NFS server when a physical server containing a brick is down?
Comment 2 Anand Avati 2010-11-18 06:12:24 EST
PATCH: http://patches.gluster.com/patch/5748 in master (nfs: treat GF_EVENT_CHILD_CONNECTING as subvolume up status)
Comment 3 Anand Avati 2010-11-25 06:35:28 EST
PATCH: http://patches.gluster.com/patch/5782 in master (nfs: Export subvolumes on per-subvolume CHILD-UP)
Comment 4 Anand Avati 2010-12-03 10:03:34 EST
PATCH: http://patches.gluster.com/patch/5786 in master (nfs: Start nfs process even if portmap registration fails)
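
Going only by the titles of the last two patches, the resulting behaviour might be modelled roughly as below. This is a sketch under stated assumptions, not the actual GlusterFS code: the class, its methods, the `CHILD_UP` constant, and the volume name are all hypothetical.

```python
# Hypothetical model based only on the patch titles; not GlusterFS code.
CHILD_UP = "CHILD_UP"  # stands in for a per-subvolume CHILD-UP notification


class NfsServerModel:
    def __init__(self, register_with_portmap):
        # register_with_portmap: callable returning True on success (assumed).
        self._register = register_with_portmap
        self.portmap_registered = False
        self.running = False
        self.exported = set()

    def start(self):
        # Keep the process running even if portmap registration fails.
        self.portmap_registered = bool(self._register())
        self.running = True

    def notify(self, subvolume, event):
        # Export each subvolume as soon as it reports up,
        # without waiting for the remaining subvolumes.
        if event == CHILD_UP:
            self.exported.add(subvolume)


# Portmap registration fails, one subvolume comes up.
server = NfsServerModel(register_with_portmap=lambda: False)
server.start()
server.notify("vol-a", CHILD_UP)
print(server.running, server.portmap_registered, sorted(server.exported))
# True False ['vol-a']
```

In this model a subvolume that never comes up simply stays out of the exported set and does not block the others.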