Red Hat Bugzilla – Bug 763825
volumes cannot start when one node in a replicated setup is down
Last modified: 2015-12-01 11:45:32 EST
If one server is taken down, volumes cannot start.
The issue is that the NFS server checks for the existence of all subvolumes
before it registers with portmap, so it waits forever trying to ping all of
them. The NFS server should proceed if it can find one of the replicated volumes.
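The difference between the reported behaviour and the requested one can be sketched roughly as below. This is only an illustration, not GlusterFS code: the function name `ready_to_register`, the `require_all` flag, and the brick names are all hypothetical.

```python
def ready_to_register(subvolume_up, require_all):
    """Decide whether a server may go on to register with portmap.

    subvolume_up maps a (hypothetical) subvolume name to True if it
    answered a ping and False otherwise.
    """
    states = list(subvolume_up.values())
    if not states:
        # Nothing to export yet, so there is nothing to register.
        return False
    # require_all=True models the reported behaviour (every subvolume
    # must be reachable); require_all=False models the requested one.
    return all(states) if require_all else any(states)


# One of two replicated servers is down.
replica_pair = {"server1:/brick": True, "server2:/brick": False}

blocked = ready_to_register(replica_pair, require_all=True)
proceeds = ready_to_register(replica_pair, require_all=False)
print(blocked, proceeds)
```

With one server down, the "all" policy never becomes ready, which matches the endless wait described in the report, while the "any" policy lets startup continue.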
What gluster commands did you use?
I cannot understand what you mean by, "if one server is taken down, volumes cannot start."
Are you saying that the "gluster volume start <volname>" command does not start the NFS server when a physical server containing a brick is down?
PATCH: http://patches.gluster.com/patch/5748 in master (nfs: treat GF_EVENT_CHILD_CONNECTING as subvolume up status)
PATCH: http://patches.gluster.com/patch/5782 in master (nfs: Export subvolumes on per-subvolume CHILD-UP)
PATCH: http://patches.gluster.com/patch/5786 in master (nfs: Start nfs process even if portmap registration fails)
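Going only by the three patch titles above, the combined behaviour might look something like the following sketch. It is an assumption-laden model rather than the actual patches: the `NfsServerSketch` class, the `Event` enum, the volume names, and the use of `OSError` for a failed portmap registration are all hypothetical, and events other than CHILD_UP and CHILD_CONNECTING are simply ignored here.

```python
from enum import Enum, auto


class Event(Enum):
    # Hypothetical stand-ins for the GF_EVENT_* notifications.
    CHILD_UP = auto()
    CHILD_CONNECTING = auto()
    CHILD_DOWN = auto()


class NfsServerSketch:
    def __init__(self, subvolumes, register_with_portmap):
        # Track export state per subvolume instead of one global flag.
        self.exported = {name: False for name in subvolumes}
        self.portmap_registered = False
        self.running = False
        self._register = register_with_portmap

    def start(self):
        # Keep the process running even if portmap registration fails.
        try:
            self._register()
            self.portmap_registered = True
        except OSError:
            self.portmap_registered = False
        self.running = True

    def notify(self, subvolume, event):
        # Export each subvolume as soon as its own event arrives;
        # CHILD_CONNECTING is treated the same as CHILD_UP.
        if event in (Event.CHILD_UP, Event.CHILD_CONNECTING):
            self.exported[subvolume] = True


def failing_portmap():
    # Simulates an unreachable portmap service (an assumption).
    raise OSError("portmap unreachable")


server = NfsServerSketch(["vol-a", "vol-b"], failing_portmap)
server.start()
server.notify("vol-a", Event.CHILD_UP)
server.notify("vol-b", Event.CHILD_CONNECTING)
print(server.running, server.portmap_registered, server.exported)
```

In this model no subvolume waits on any other, and a failed portmap registration is recorded without stopping the server.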