Bug 1042830

Summary: glusterd crashes when rebalancing a volume with the name StartMigrationDuringRebalanceTest
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: Dustin Tsang <dtsang>
Component: glusterfs
Assignee: krishnan parthasarathi <kparthas>
Status: CLOSED ERRATA
QA Contact: Dustin Tsang <dtsang>
Severity: high
Priority: high
Version: 2.1
CC: dpati, dtsang, knarra, kparthas, mmahoney, mmccune, nsathyan, pprakash, psriniva, sdharane, ssampat, vagarwal, vbellur
Target Milestone: ---
Keywords: ZStream
Target Release: RHGS 2.1.2
Flags: dtsang: needinfo-
Hardware: Unspecified
OS: Unspecified
Fixed In Version: glusterfs-3.4.0.53rhs
Doc Type: Bug Fix
Doc Text:
Previously, the glusterd service would crash when a rebalance operation was started on a volume whose name was 33 characters long. With this update, glusterd no longer crashes, regardless of the length of the volume name.
Story Points: ---
Clones: 1046308
Last Closed: 2014-02-25 08:08:58 UTC
Type: Bug
Bug Blocks: 1046308
Attachments:
- sos report
- etc-glusterfs-glusterd.vol.log
- glusterd.log
- core dump

Description Dustin Tsang 2013-12-13 13:34:46 UTC
Created attachment 836326 [details]
sos report

Description of problem:
glusterd crashes when rebalancing a volume with the name StartMigrationDuringRebalanceTest, and crashes again shortly after the process is restarted.


Version-Release number of selected component (if applicable):
glusterfs-server-3.4.0.49rhs-1.el6rhs.x86_64

How reproducible:
100% reproducible

Steps to Reproduce:
1. create a simple dist volume with the name StartMigrationDuringRebalanceTest
2. start the volume
3. start rebalance on the volume
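A minimal CLI sketch of these steps (the peer names and brick paths are placeholders, not the actual test systems):

# on node-a, with node-b as a second peer (hypothetical host names)
gluster peer probe node-b
gluster volume create StartMigrationDuringRebalanceTest node-a:/bricks/foo node-b:/bricks/foo
gluster volume start StartMigrationDuringRebalanceTest
gluster volume rebalance StartMigrationDuringRebalanceTest start
# glusterd crashes at this point (see "Actual results" below)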

Actual results:
glusterd crashes and rebalance fails to start


Expected results:
rebalance starts successfully

Additional info:

Comment 1 Dustin Tsang 2013-12-13 13:35:57 UTC
Created attachment 836327 [details]
etc-glusterfs-glusterd.vol.log

Comment 2 Dustin Tsang 2013-12-13 13:37:10 UTC
Created attachment 836328 [details]
glusterd.log

Comment 4 Dustin Tsang 2013-12-16 14:13:50 UTC
Hi Dusmant,

The bug is reproducible using the gluster CLI, the RHSC GUI, and the RHSC REST API.
Please let me know what other information I can provide.

>Steps to Reproduce:
>1. create a simple dist volume with the name StartMigrationDuringRebalanceTest
>2. start the volume
>[Dusmant] From CLI or through RHSC or through REST API?
>3. start rebalance on the volume
>[Dusmant] From CLI or through RHSC or through REST API?

Comment 5 Dustin Tsang 2013-12-16 14:49:11 UTC
Here is the input/output from the gluster CLI:

[root@rhs-21u2-20131208-c ~]# gluster peer probe latest-d
peer probe: success. 
[root@rhs-21u2-20131208-c ~]# gluster vol create StartMigrationDuringRebalanceTest
Usage: volume create <NEW-VOLNAME> [stripe <COUNT>] [replica <COUNT>] [device vg] [transport <tcp|rdma|tcp,rdma>] <NEW-BRICK> ... [force]
[root@rhs-21u2-20131208-c ~]# gluster vol create StartMigrationDuringRebalanceTest latest-c:/bricks/foo latest-d:/bricks/foo
volume create: StartMigrationDuringRebalanceTest: success: please start the volume to access data
[root@rhs-21u2-20131208-c ~]# gluster vol start StartMigrationDuringRebalanceTest
volume start: StartMigrationDuringRebalanceTest: success
[root@rhs-21u2-20131208-c ~]# gluster vol rebalance StartMigrationDuringRebalanceTest start
Connection failed. Please check if gluster daemon is operational.

Comment 6 krishnan parthasarathi 2013-12-17 03:15:55 UTC
Dustin,

Could you attach the core file from the glusterd crash you are seeing? I couldn't find a backtrace corresponding to a crash in the log files, nor a core file in the sosreport attached to this bug.

Comment 7 Dustin Tsang 2013-12-17 17:21:27 UTC
Krishnan, I wasn't able to locate the core dump either.
If you run the steps listed in comment #5, glusterd goes down on both nodes in the cluster while starting rebalance.
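
For reference, a generic RHEL 6 recipe for capturing a glusterd core the next time it goes down (the settings shown are common defaults, not values confirmed on these machines):

# allow unlimited-size core dumps for processes started from this shell
ulimit -c unlimited
# check where the kernel writes cores; typically a file named "core" or
# "core.<pid>" in the crashing process's working directory, unless abrt
# or a custom pattern is configured
cat /proc/sys/kernel/core_pattern
# restart glusterd so it picks up the new limit, then re-run the steps in comment #5
service glusterd restart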

Comment 8 Vivek Agarwal 2013-12-19 12:15:16 UTC
The information provided is insufficient: there is no core file, and the log files don't provide any backtrace from which to draw a conclusion. The issue is not reproducible. We have asked Dustin Tsang to provide more information in the bug, with no response.

Comment 9 Dustin Tsang 2013-12-19 13:39:12 UTC
Created attachment 838985 [details]
core dump

Comment 10 Dustin Tsang 2013-12-19 13:39:49 UTC
Hi Vivek, core dump attached.

Comment 12 Pavithra 2014-01-08 09:36:20 UTC
KP, 
I've made minor edits. Can you please verify the doc text for technical accuracy?

Comment 13 Dustin Tsang 2014-01-13 17:10:07 UTC
in glusterfs-3.4.0.55rhs-1.el6rhs.x86_64:

The rebalance command no longer causes a crash; however, rebalance is unsuccessful for volumes with names greater than or equal to 33 characters in length (see the script sketch after the steps below).


steps:
1. create a volume with a name greater than or equal to 33 characters in length
2. start rebalance
=> start rebalance succeeds
3. poll rebalance status with `gluster volume rebalance $VOLNAME status`
=>
[root@rhs-21u2-20131223-errata-a ~]# gluster vol rebalance aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa status
volume rebalance: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa: failed: error
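
The same check in script form, as referenced above (the brick hosts are placeholders; the printf trick just builds a name of exactly 33 characters):

# build a 33-character volume name
VOLNAME=$(printf 'a%.0s' {1..33})
gluster volume create "$VOLNAME" node-a:/bricks/foo node-b:/bricks/foo
gluster volume start "$VOLNAME"
gluster volume rebalance "$VOLNAME" start    # succeeds; glusterd stays up
gluster volume rebalance "$VOLNAME" status   # still returns "failed: error"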

Comment 15 Dustin Tsang 2014-01-15 20:25:06 UTC
Krishnan, setting to verified.

Comment 17 errata-xmlrpc 2014-02-25 08:08:58 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHEA-2014-0208.html