Bug 1042830 - glusterd crashes when rebalancing a volume with the name StartMigrationDuringRebalanceTest
Summary: glusterd crashes when rebalancing a volume with the name StartMigrationDu...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: glusterfs
Version: 2.1
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: RHGS 2.1.2
Assignee: krishnan parthasarathi
QA Contact: Dustin Tsang
URL:
Whiteboard:
Depends On:
Blocks: 1046308
 
Reported: 2013-12-13 13:34 UTC by Dustin Tsang
Modified: 2015-11-03 23:05 UTC (History)
13 users (show)

Fixed In Version: glusterfs-3.4.0.53rhs
Doc Type: Bug Fix
Doc Text:
Previously, the glusterd service would crash when a rebalance operation was started on a volume whose name was 33 characters long. With this update, the glusterd service no longer crashes, regardless of the length of the volume name.
Clone Of:
Clones: 1046308 (view as bug list)
Environment:
Last Closed: 2014-02-25 08:08:58 UTC
Embargoed:
dtsang: needinfo-


Attachments (Terms of Use)
sos report (2.70 MB, application/x-xz)
2013-12-13 13:34 UTC, Dustin Tsang
no flags Details
etc-glusterfs-glusterd.vol.log (158.31 KB, text/x-log)
2013-12-13 13:35 UTC, Dustin Tsang
no flags Details
glusterd.log (459 bytes, text/x-log)
2013-12-13 13:37 UTC, Dustin Tsang
no flags Details
core dump (199.61 KB, application/x-xz)
2013-12-19 13:39 UTC, Dustin Tsang
no flags Details


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHEA-2014:0208 0 normal SHIPPED_LIVE Red Hat Storage 2.1 enhancement and bug fix update #2 2014-02-25 12:20:30 UTC

Description Dustin Tsang 2013-12-13 13:34:46 UTC
Created attachment 836326 [details]
sos report

Description of problem:
glusterd crashes when rebalancing a volume with the name StartMigrationDuringRebalanceTest. glusterd crashes again shortly after the process is restarted.


Version-Release number of selected component (if applicable):
glusterfs-server-3.4.0.49rhs-1.el6rhs.x86_64

How reproducible:
100% reproducible

Steps to Reproduce:
1. create a simple dist volume with the name StartMigrationDuringRebalanceTest
2. start the volume
3. start rebalance on the volume

Actual results:
glusterd crashes and rebalance fails to start


Expected results:
rebalance starts successfully

Additional info:
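The volume name StartMigrationDuringRebalanceTest is exactly 33 characters long, and a crash that appears only at that length is the classic signature of a name being copied into a fixed-size buffer sized for 32 characters plus the terminating NUL. The following C sketch only illustrates that failure mode under that assumption; the buffer name and size are hypothetical and are not taken from the glusterd source.

#include <stdio.h>
#include <string.h>

int main(void)
{
    /* 33-character volume name from this bug report. */
    const char *volname = "StartMigrationDuringRebalanceTest";
    /* Hypothetical buffer: room for 32 characters plus the NUL. */
    char buf[33];

    printf("name length: %zu\n", strlen(volname));  /* prints 33 */

    /* Unsafe copy (left commented out so the sketch runs cleanly):
     * strcpy(buf, volname) would write 34 bytes (33 characters plus
     * the NUL) into a 33-byte buffer, one byte past the end -- the
     * kind of overflow that can bring down a daemon. */

    /* Bounds-checked copy: truncates and always NUL-terminates. */
    snprintf(buf, sizeof(buf), "%s", volname);
    printf("bounded copy: %s\n", buf);

    return 0;
}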

Comment 1 Dustin Tsang 2013-12-13 13:35:57 UTC
Created attachment 836327 [details]
etc-glusterfs-glusterd.vol.log

Comment 2 Dustin Tsang 2013-12-13 13:37:10 UTC
Created attachment 836328 [details]
glusterd.log

Comment 4 Dustin Tsang 2013-12-16 14:13:50 UTC
Hi Dusmant,

The bug is reproducible using the gluster CLI, the RHSC GUI, and the RHSC REST API.
Please let me know what other information I can provide.

>Steps to Reproduce: 
>1. create a simple dist volume with the name StartMigrationDuringRebalanceTest
>2. start the volume
>[Dusmant] From CLI or through RHSC or through REST API? 
>3. start rebalance on the volume 
>[Dusmant] From CLI or through RHSC or through REST API?

Comment 5 Dustin Tsang 2013-12-16 14:49:11 UTC
Here is the input/output from the gluster cli:

[root@rhs-21u2-20131208-c ~]# gluster peer probe latest-d
peer probe: success. 
[root@rhs-21u2-20131208-c ~]# gluster vol create StartMigrationDuringRebalanceTest
Usage: volume create <NEW-VOLNAME> [stripe <COUNT>] [replica <COUNT>] [device vg] [transport <tcp|rdma|tcp,rdma>] <NEW-BRICK> ... [force]
[root@rhs-21u2-20131208-c ~]# gluster vol create StartMigrationDuringRebalanceTest latest-c:/bricks/foo latest-d:/bricks/foo
volume create: StartMigrationDuringRebalanceTest: success: please start the volume to access data
[root@rhs-21u2-20131208-c ~]# gluster vol start StartMigrationDuringRebalanceTest
volume start: StartMigrationDuringRebalanceTest: success
[root@rhs-21u2-20131208-c ~]# gluster vol rebalance StartMigrationDuringRebalanceTest start
Connection failed. Please check if gluster daemon is operational.

Comment 6 krishnan parthasarathi 2013-12-17 03:15:55 UTC
Dustin,

Could you attach the core file of the glusterd crash you are seeing? I couldn't find a backtrace corresponding to a crash in the log files, nor a core file in the sosreport attached to this bug.

Comment 7 Dustin Tsang 2013-12-17 17:21:27 UTC
Krishnan, I wasn't able to locate the core dump either.
If you run the steps listed in comment #5, glusterd goes down on both nodes in the cluster while starting rebalance.

Comment 8 Vivek Agarwal 2013-12-19 12:15:16 UTC
The information provided is insufficient. There is no core file, and the log files do not contain a backtrace from which to draw any conclusion. The issue is not reproducible on our side. We have asked Dustin Tsang to provide more information in the bug, but have received no response.

Comment 9 Dustin Tsang 2013-12-19 13:39:12 UTC
Created attachment 838985 [details]
core dump

Comment 10 Dustin Tsang 2013-12-19 13:39:49 UTC
Hi Vivek, core dump attached.

Comment 12 Pavithra 2014-01-08 09:36:20 UTC
KP, 
I've made minor edits. Can you please verify the doc text for technical accuracy?

Comment 13 Dustin Tsang 2014-01-13 17:10:07 UTC
In glusterfs-3.4.0.55rhs-1.el6rhs.x86_64:

However, the rebalance command no longer causes a crash; rebalance is still unsuccessful for volumes with names of 33 or more characters.


Steps:
1. create a volume with a name 33 or more characters in length
2. start rebalance
=> rebalance starts successfully
3. poll rebalance status with `gluster volume rebalance $VOLNAME status`
=>
[root@rhs-21u2-20131223-errata-a ~]# gluster vol rebalance aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa status
volume rebalance: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa: failed: error
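The status failure without a crash is consistent with the over-long name now being detected (or truncated) and surfaced as an error rather than overflowing a buffer. Below is a hypothetical C sketch of that detect-and-fail pattern, assuming a 32-character limit; the constant, function, and message are illustrative and are not the actual glusterd fix.

#include <stdio.h>
#include <string.h>

/* Assumed limit, for illustration only. */
#define MAX_VOLNAME_LEN 32

/* Hypothetical check: refuse a name that would not fit in a
 * fixed-size buffer instead of overflowing it. Returns 0 on
 * success and -1 on error, mirroring the "failed: error"
 * output shown above. */
static int check_volname(const char *volname)
{
    if (strlen(volname) > MAX_VOLNAME_LEN) {
        fprintf(stderr, "volume rebalance: %s: failed: error\n", volname);
        return -1;
    }
    return 0;
}

int main(void)
{
    /* A 33-character name, at the failure threshold described above. */
    const char *volname = "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa";
    return check_volname(volname) ? 1 : 0;
}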

Comment 15 Dustin Tsang 2014-01-15 20:25:06 UTC
Krishnan, setting to verified.

Comment 17 errata-xmlrpc 2014-02-25 08:08:58 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHEA-2014-0208.html

