Bug 1042830 - glusterd crashes when rebalancing a volume with the name StartMigrationDuringRebalanceTest
Summary: glusterd crashes when rebalancing a volume with the name StartMigrationDu...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: glusterfs
Version: 2.1
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: RHGS 2.1.2
Assignee: krishnan parthasarathi
QA Contact: Dustin Tsang
URL:
Whiteboard:
Depends On:
Blocks: 1046308
 
Reported: 2013-12-13 13:34 UTC by Dustin Tsang
Modified: 2015-11-03 23:05 UTC (History)
13 users (show)

Fixed In Version: glusterfs-3.4.0.53rhs
Doc Type: Bug Fix
Doc Text:
Previously, the glusterd service would crash when a rebalance operation was started on a volume whose name was 33 characters long. With this update, the glusterd service no longer crashes, regardless of the length of the volume name.
Clone Of:
Clones: 1046308 (view as bug list)
Environment:
Last Closed: 2014-02-25 08:08:58 UTC
Embargoed:
dtsang: needinfo-


Attachments (Terms of Use)
sos report (2.70 MB, application/x-xz)
2013-12-13 13:34 UTC, Dustin Tsang
no flags Details
etc-glusterfs-glusterd.vol.log (158.31 KB, text/x-log)
2013-12-13 13:35 UTC, Dustin Tsang
no flags Details
glusterd.log (459 bytes, text/x-log)
2013-12-13 13:37 UTC, Dustin Tsang
no flags Details
core dump (199.61 KB, application/x-xz)
2013-12-19 13:39 UTC, Dustin Tsang
no flags Details


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHEA-2014:0208 0 normal SHIPPED_LIVE Red Hat Storage 2.1 enhancement and bug fix update #2 2014-02-25 12:20:30 UTC

Description Dustin Tsang 2013-12-13 13:34:46 UTC
Created attachment 836326 [details]
sos report

Description of problem:
glusterd crashes when rebalancing a volume with the name StartMigrationDuringRebalanceTest. glusterd crashes again shortly after the process is restarted.


Version-Release number of selected component (if applicable):
glusterfs-server-3.4.0.49rhs-1.el6rhs.x86_64

How reproducible:
100% reproducible

Steps to Reproduce:
1. create a simple dist volume with the name StartMigrationDuringRebalanceTest
2. start the volume
3. start rebalance on the volume

Actual results:
glusterd crashes and rebalance fails to start


Expected results:
rebalance starts successfully

Additional info:
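The volume name StartMigrationDuringRebalanceTest is exactly 33 characters long, and a crash that appears only at that length is the classic signature of a name being copied into a fixed-size buffer sized for 32 characters plus the terminating NUL. The following C sketch only illustrates that failure mode under that assumption; the buffer name and size are hypothetical and are not taken from the glusterd source.

#include <stdio.h>
#include <string.h>

int main(void)
{
    /* 33-character volume name from this bug report. */
    const char *volname = "StartMigrationDuringRebalanceTest";
    /* Hypothetical buffer: room for 32 characters plus the NUL. */
    char buf[33];

    printf("name length: %zu\n", strlen(volname));  /* prints 33 */

    /* Unsafe copy (left commented out so the sketch runs cleanly):
     * strcpy(buf, volname) would write 34 bytes (33 characters plus
     * the NUL) into a 33-byte buffer, one byte past the end -- the
     * kind of overflow that can bring down a daemon. */

    /* Bounds-checked copy: truncates and always NUL-terminates. */
    snprintf(buf, sizeof(buf), "%s", volname);
    printf("bounded copy: %s\n", buf);

    return 0;
}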

Comment 1 Dustin Tsang 2013-12-13 13:35:57 UTC
Created attachment 836327 [details]
etc-glusterfs-glusterd.vol.log

Comment 2 Dustin Tsang 2013-12-13 13:37:10 UTC
Created attachment 836328 [details]
glusterd.log

Comment 4 Dustin Tsang 2013-12-16 14:13:50 UTC
Hi Dusmant,

The bug is reproducible using the gluster CLI, the RHSC GUI, and the RHSC REST API.
Please let me know what other information I can provide.

>Steps to Reproduce: 
>1. create a simple dist volume with the name StartMigrationDuringRebalanceTest
>2. start the volume
>[Dusmant] From CLI or through RHSC or through REST API? 
>3. start rebalance on the volume 
>[Dusmant] From CLI or through RHSC or through REST API?

Comment 5 Dustin Tsang 2013-12-16 14:49:11 UTC
Here is the input/output from the gluster cli:

[root@rhs-21u2-20131208-c ~]# gluster peer probe latest-d
peer probe: success. 
[root@rhs-21u2-20131208-c ~]# gluster vol create StartMigrationDuringRebalanceTest
Usage: volume create <NEW-VOLNAME> [stripe <COUNT>] [replica <COUNT>] [device vg] [transport <tcp|rdma|tcp,rdma>] <NEW-BRICK> ... [force]
[root@rhs-21u2-20131208-c ~]# gluster vol create StartMigrationDuringRebalanceTest latest-c:/bricks/foo latest-d:/bricks/foo
volume create: StartMigrationDuringRebalanceTest: success: please start the volume to access data
[root@rhs-21u2-20131208-c ~]# gluster vol start StartMigrationDuringRebalanceTest
volume start: StartMigrationDuringRebalanceTest: success
[root@rhs-21u2-20131208-c ~]# gluster vol rebalance StartMigrationDuringRebalanceTest start
Connection failed. Please check if gluster daemon is operational.

Comment 6 krishnan parthasarathi 2013-12-17 03:15:55 UTC
Dustin,

Could you attach the core file of the glusterd crash you are seeing? I couldn't find a backtrace corresponding to a crash in the log files, nor a core file in the sosreport attached to this bug.

Comment 7 Dustin Tsang 2013-12-17 17:21:27 UTC
Krishnan, I wasn't able to locate the core dump either.
If you run the steps listed in comment #5, glusterd goes down on both nodes in the cluster while starting rebalance.

Comment 8 Vivek Agarwal 2013-12-19 12:15:16 UTC
The information provided is insufficient. There is no core file, and the log files do not contain a backtrace from which to draw any conclusion. The issue is not reproducible on our side. We have asked Dustin Tsang to provide more information in the bug, but have received no response.

Comment 9 Dustin Tsang 2013-12-19 13:39:12 UTC
Created attachment 838985 [details]
core dump

Comment 10 Dustin Tsang 2013-12-19 13:39:49 UTC
Hi Vivek, core dump attached.

Comment 12 Pavithra 2014-01-08 09:36:20 UTC
KP, 
I've made minor edits. Can you please verify the doc text for technical accuracy?

Comment 13 Dustin Tsang 2014-01-13 17:10:07 UTC
In glusterfs-3.4.0.55rhs-1.el6rhs.x86_64:

However, the rebalance command no longer causes a crash; rebalance is still unsuccessful for volumes with names of 33 or more characters.


Steps:
1. create a volume with a name 33 or more characters in length
2. start rebalance
=> rebalance starts successfully
3. poll rebalance status with `gluster volume rebalance $VOLNAME status`
=>
[root@rhs-21u2-20131223-errata-a ~]# gluster vol rebalance aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa status
volume rebalance: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa: failed: error
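The status failure without a crash is consistent with the over-long name now being detected (or truncated) and surfaced as an error rather than overflowing a buffer. Below is a hypothetical C sketch of that detect-and-fail pattern, assuming a 32-character limit; the constant, function, and message are illustrative and are not the actual glusterd fix.

#include <stdio.h>
#include <string.h>

/* Assumed limit, for illustration only. */
#define MAX_VOLNAME_LEN 32

/* Hypothetical check: refuse a name that would not fit in a
 * fixed-size buffer instead of overflowing it. Returns 0 on
 * success and -1 on error, mirroring the "failed: error"
 * output shown above. */
static int check_volname(const char *volname)
{
    if (strlen(volname) > MAX_VOLNAME_LEN) {
        fprintf(stderr, "volume rebalance: %s: failed: error\n", volname);
        return -1;
    }
    return 0;
}

int main(void)
{
    /* A 33-character name, at the failure threshold described above. */
    const char *volname = "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa";
    return check_volname(volname) ? 1 : 0;
}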

Comment 15 Dustin Tsang 2014-01-15 20:25:06 UTC
Krishnan, setting to verified.

Comment 17 errata-xmlrpc 2014-02-25 08:08:58 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHEA-2014-0208.html

