Bug 1227656 - Unable to mount a replicated volume without all bricks online.
Summary: Unable to mount a replicated volume without all bricks online.
Keywords:
Status: CLOSED EOL
Alias: None
Product: GlusterFS
Classification: Community
Component: glusterd
Version: 3.7.5
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: ---
Assignee: Kaushal
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2015-06-03 08:54 UTC by Richard
Modified: 2017-03-08 10:59 UTC
CC List: 5 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-03-08 10:59:59 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:


Attachments
script to test problem (1.38 KB, text/plain)
2015-06-03 13:53 UTC, Richard

Description Richard 2015-06-03 08:54:05 UTC
Description of problem:

I have a distributed volume set up with a single brick. When I add a new brick to the volume and convert it into a replicated volume, glusterd dies.

Version-Release number of selected component (if applicable): 

This is happening on all 3.5.x, 3.6.x (including 3.6.4beta1) and 3.7.x releases. However, 3.4.7 works perfectly.

Steps to Reproduce:

1. Create a new node and make a distributed volume with a single brick.
2. Create a 2nd node and add it to the distributed volume like so (a command sketch follows the steps):

gluster volume add-brick myvol replica 2

3. glusterd on all servers dies.
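
To make steps 1 and 2 concrete, here is a minimal sketch of the commands; the hostnames (node1, node2), volume name (myvol) and brick path (/data/brick1) are placeholders rather than the exact values from my setup:

# on node1: create and start a single-brick distributed volume
gluster volume create myvol node1:/data/brick1
gluster volume start myvol

# on node1: peer the 2nd node, then convert to replica 2 by adding its brick
gluster peer probe node2
gluster volume add-brick myvol replica 2 node2:/data/brick1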

Actual results: glusterd dies, no gluster processes exist.

Expected results: volume should convert into a replicated volume.

Comment 1 Gaurav Kumar Garg 2015-06-03 11:09:58 UTC
Richard,

This bug is not reproducible on the current 3.7.1 branch.

Could you attach your glusterd logs to this bug?

As per your bug description, you first create a volume with one brick on one node, then you attach the 2nd node to the first node, and then you convert this volume to replica 2 by adding a brick, like:

gluster volume add-brick <VOLNAME> replica 2 server:/<path of new brick>


I performed the same steps with the 3.7.x branch code and the problem is not reproducible.

We need to see your glusterd log to analyse this problem further. Could you attach the glusterd log to this bug?

Comment 2 Richard 2015-06-03 13:06:34 UTC
Oh, I forgot one important step :-)

Power off node #2, then power off node #1.
Boot node #1 only, and:

1) start glusterd
2) attempt to mount the volume again

In my case step #2 fails after the reboot.

I will try again on my setup and gather some logs.

Comment 3 Richard 2015-06-03 13:53:02 UTC
Created attachment 1034324 [details]
script to test problem

The script I have used to diagnose this problem

Comment 4 Richard 2015-06-03 13:53:35 UTC
You don't even need to reboot. Just stop and restart the services: you can't mount the replicated volume until ALL glusterd services are started on all brick nodes.
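
For illustration, a minimal sketch of the stop/restart sequence that triggers the mount failure; the hostnames (node1, node2), volume name (myvol) and mount point (/mnt/myvol) are placeholders:

# on both nodes: stop the brick processes, then glusterd
service glusterfsd stop
service glusterd stop

# on node1 only: bring glusterd back
service glusterd start

# on a client: this mount fails until glusterd is also started on node2
mount -t glusterfs node1:/myvol /mnt/myvol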

Comment 5 Kaushal 2015-06-04 11:10:49 UTC
This is a bug with how server quorum has been implemented.

If server quorum is enabled on a volume, a GlusterD will only allow the brick processes of that volume to run if it sees a quorum of other GlusterDs. What this means is that a GlusterD, when starting, should not immediately start any bricks of such a volume.

But as it is currently implemented, GlusterD will not start any bricks, irrespective of whether server quorum is enabled on any volume or not, if it sees that there is more than 1 peer in the cluster. Immediately after the GlusterD connects to another GlusterD, the bricks are started.

This issue has been known for some time, and is particularly troublesome with 2-node clusters. The currently suggested workaround is to have a dummy 3rd node, which doesn't host any bricks but just runs GlusterD, be a part of the cluster. This node will always remain online, allowing the other GlusterDs to start bricks quickly when they restart.
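
For reference, a minimal sketch of the dummy third-node workaround described above, together with the volume options that control server quorum; the hostname node3 and volume name myvol are placeholders, and the quorum options are shown only for context (they are not explicitly set in this report):

# from an existing node: add a brick-less third peer that only runs glusterd
gluster peer probe node3

# server quorum is controlled per volume / cluster-wide with these options
gluster volume set myvol cluster.server-quorum-type server
gluster volume set all cluster.server-quorum-ratio 51%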

Comment 6 Richard 2015-06-04 12:42:21 UTC
I didn't enable quorum; I have defaults for anything that isn't set in the test script I added to this bug report.

Comment 7 Richard 2015-06-04 12:44:13 UTC
Even so, the volume is now a "replicated" volume, so it should start with only one brick, or have I misunderstood something?

It seems that quorum still sees the volume as distributed.

Comment 8 Richard 2015-06-04 13:43:44 UTC
Also, it doesn't explain why the glusterd process crashes on the 1st node. I've managed to collect this now:

[2015-06-04 13:36:38.750676] and [2015-06-04 13:38:12.477453]
pending frames:
frame : type(0) op(0)
patchset: git://git.gluster.com/glusterfs.git
signal received: 6
time of crash:
2015-06-04 13:38:14
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.7.1
/usr/lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xb6)[0x7f6a6f19bb66]
/usr/lib64/libglusterfs.so.0(gf_print_trace+0x33f)[0x7f6a6f1ba59f]
/lib64/libc.so.6(+0x326a0)[0x7f6a6ddc16a0]
/lib64/libc.so.6(gsignal+0x35)[0x7f6a6ddc1625]
/lib64/libc.so.6(abort+0x175)[0x7f6a6ddc2e05]
/lib64/libc.so.6(+0x70537)[0x7f6a6ddff537]
/lib64/libc.so.6(__fortify_fail+0x37)[0x7f6a6de91527]
/lib64/libc.so.6(+0x100410)[0x7f6a6de8f410]
/lib64/libc.so.6(+0xffb0b)[0x7f6a6de8eb0b]
/lib64/libc.so.6(__snprintf_chk+0x7a)[0x7f6a6de8e9da]
/usr/lib64/glusterfs/3.7.1/xlator/mgmt/glusterd.so(glusterd_handle_defrag_start+0x271)[0x7f6a63e9f061]
/usr/lib64/glusterfs/3.7.1/xlator/mgmt/glusterd.so(glusterd_op_rebalance+0x497)[0x7f6a63e9fc97]
/usr/lib64/glusterfs/3.7.1/xlator/mgmt/glusterd.so(glusterd_op_commit_perform+0x209)[0x7f6a63e59e79]
/usr/lib64/glusterfs/3.7.1/xlator/mgmt/glusterd.so(+0x60202)[0x7f6a63e5b202]
/usr/lib64/glusterfs/3.7.1/xlator/mgmt/glusterd.so(glusterd_op_sm+0x1f9)[0x7f6a63e55679]
/usr/lib64/glusterfs/3.7.1/xlator/mgmt/glusterd.so(__glusterd_stage_op_cbk+0x48e)[0x7f6a63e7effe]
/usr/lib64/glusterfs/3.7.1/xlator/mgmt/glusterd.so(glusterd_big_locked_cbk+0x60)[0x7f6a63e7ca90]
/usr/lib64/libgfrpc.so.0(rpc_clnt_handle_reply+0xa5)[0x7f6a6ef6bd75]
/usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x142)[0x7f6a6ef6d212]
/usr/lib64/libgfrpc.so.0(rpc_transport_notify+0x28)[0x7f6a6ef688e8]
/usr/lib64/glusterfs/3.7.1/rpc-transport/socket.so(+0x9bcd)[0x7f6a6225abcd]
/usr/lib64/glusterfs/3.7.1/rpc-transport/socket.so(+0xb6fd)[0x7f6a6225c6fd]
/usr/lib64/libglusterfs.so.0(+0x80f70)[0x7f6a6f1f7f70]
/lib64/libpthread.so.0(+0x79d1)[0x7f6a6e50d9d1]
/lib64/libc.so.6(clone+0x6d)[0x7f6a6de778fd]

Comment 9 Gaurav Kumar Garg 2015-06-04 17:05:52 UTC
Hi Richard,

After executing the steps you listed in the attached file, glusterd does not crash. Only the mount failure occurs, as you mentioned in the list of steps. The mount failure occurs because you executed the "service glusterfsd stop" command on both nodes, then executed "service glusterd stop" on both nodes, and then started glusterd on only one node.

So this (the mount failure) is expected behaviour, because your brick process (glusterfsd) didn't come back online. While mounting the volume, it checks whether the brick is online or not; if it is not, the mount fails.
You can also check in your volume status that the brick is not online.

To solve the mount failure you need to restart the volume, either by stopping and starting it or by executing "gluster volume start <VOLNAME> force".
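
For reference, a sketch of the check and the workaround described above, assuming the volume is named myvol:

# confirm that the brick is shown as offline
gluster volume status myvol

# restart the volume without taking it down first
gluster volume start myvol force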


The main thing is the glusterd crash. As far as I am concerned, glusterd should not crash when performing these steps. Are you executing any other command, or have you previously executed any rebalance/remove-brick command?

Could you paste the output of "gluster volume status" here?

Comment 10 Atin Mukherjee 2015-06-05 04:05:07 UTC
(In reply to Richard from comment #8)
> Also, it doesn't explain why the glusterd process crashes on the 1st node.
> I've managed to collect this now:
> 
> [backtrace snipped]

The fix for this crash is @ http://review.gluster.org/#/c/11090

Comment 11 Atin Mukherjee 2015-06-05 04:11:17 UTC
I accidentally moved the status to POST, reverting back to ASSIGNED as the patch is only available in mainline. Once the backport request is sent, it can be moved to POST.

Comment 12 Richard 2015-06-09 09:18:05 UTC
(In reply to Gaurav Kumar Garg from comment #9)
> Hi Richard,
> 
> After executing the steps you listed in the attached file, glusterd does
> not crash. Only the mount failure occurs, as you mentioned in the list of
> steps. The mount failure occurs because you executed the "service
> glusterfsd stop" command on both nodes, then executed "service glusterd
> stop" on both nodes, and then started glusterd on only one node.
> 

> So this (the mount failure) is expected behaviour, because your brick
> process (glusterfsd) didn't come back online. While mounting the volume, it
> checks whether the brick is online or not; if it is not, the mount fails.
> You can also check in your volume status that the brick is not online.

I disagree, as it never used to fail in v3.4.7. You should be able to start up one brick of a replicated volume without needing all replicated bricks to be online. That's the whole point of it being replicated, so it should start with only one brick online.

> To solve the mount failure you need to restart the volume, either by
> stopping and starting it or by executing "gluster volume start <VOLNAME>
> force".

That does not help. The only thing I currently have to do to fix the problem is to start glusterd on the other brick node and re-attempt the mount. I never need to run a gluster volume start command, and I never needed to do so with v3.4.7.

> The main thing is the glusterd crash. As far as I am concerned, glusterd
> should not crash when performing these steps. Are you executing any other
> command, or have you previously executed any rebalance/remove-brick
> command?

The commands in the script are the ones I use; it is my test script so I can repeat the process.

> Could you paste the output of "gluster volume status" here?

No, once glusterd has crashed I can't get any output.

Comment 13 Richard 2015-06-09 09:20:05 UTC
btw, glusterd doesn't crash every time. if I get to repeat it on demand I will update this bug report.

Comment 14 Atin Mukherjee 2015-06-09 09:24:48 UTC
(In reply to Richard from comment #13)
> btw, glusterd doesn't crash every time. if I get to repeat it on demand I
> will update this bug report.

We have already RCAed the crash and patch review.gluster.org/11090 is posted to fix it.

Comment 15 Richard 2015-06-09 11:11:56 UTC
(In reply to Atin Mukherjee from comment #14)
> (In reply to Richard from comment #13)
> > btw, glusterd doesn't crash every time. if I get to repeat it on demand I
> > will update this bug report.
> 
> We have already RCAed the crash and patch review.gluster.org/11090 is posted
> to fix it.

Ok, thanks for the update.

Is there any news on the issue of mounting a volume that has been converted from distributed to replicated? I still can't get round this without starting all bricks involved, which negates the point of replication.

Thanks,
Rich

Comment 16 Richard 2015-06-16 11:52:54 UTC
Anyone have an update, or do I need to open a new bug for the mount problem?

Comment 17 Richard 2015-06-29 08:36:53 UTC
OK, testing with the 3.7.2-3.el6.x86_64 RPMs, the glusterd process does not die anymore. Thank you :-)

However, I am still unable to remount a replicated volume of two bricks without both glusterd services being online. This defeats the point of replication.

This isn't how it used to be in 3.4.7, but from 3.5.x onwards this is broken.

Comment 18 G Kuri 2015-07-01 16:01:42 UTC
I posted a message to the mailing list about the same issue, unable to mount a volume on a 2 node cluster with replicas enabled, when 1 of the nodes is down. I am only able to replicate the issue when the client that is trying to mount the volume happens to be on the same subnet as the node that is down. If the node that is down is on a different subnet than the client, the client mounts the volume fine.

I haven't tried to add a dummy 3rd node as suggested, but I was going to try doing that and see what happens.
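
For context, a minimal sketch of the mount invocation in such a two-node setup; the hostnames (node1, node2), volume name (myvol) and mount point (/mnt/myvol) are placeholders. The backup-volfile-servers option only affects fetching the volfile from an alternate node, so it is not by itself a workaround for the brick-offline failure discussed above:

# ask node2 for the volfile if node1 is unreachable
mount -t glusterfs -o backup-volfile-servers=node2 node1:/myvol /mnt/myvol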

Comment 19 Richard 2015-07-01 21:06:55 UTC
(In reply to G Kuri from comment #18)

Thank you for adding your comment. It means that what I'm doing is correct and my setup is just never going to work until this bug is resolved.

I hope they backport the fix right back to 3.5.x and 3.6.x, as those don't work for me either.

Developers, I'm happy to test any RPMs you can throw my way if need be.

Comment 20 Kaushal 2015-07-30 13:17:49 UTC
This bug could not be fixed in time for glusterfs-3.7.3. This is now being tracked for being fixed in glusterfs-3.7.4.

Comment 21 Richard 2015-10-12 13:27:58 UTC
This is still broken in glusterfs-3.7.5

Comment 22 Atin Mukherjee 2016-06-22 05:15:49 UTC
Richard,

Can you please retest this with the latest 3.7 version and get back?

~Atin

Comment 23 Richard 2016-06-22 06:50:08 UTC
I'm afraid I stopped using GlusterFS due to this bug.
It is good to know some of the bugs I reported are actually being fixed; I may look at the product again and see how reliable it is.

Comment 24 Kaushal 2017-03-08 10:59:59 UTC
This bug is getting closed because GlusterFS-3.7 has reached its end-of-life.

Note: This bug is being closed using a script. No verification has been performed to check if it still exists on newer releases of GlusterFS.
If this bug still exists in newer GlusterFS releases, please reopen this bug against the newer release.

