Bug 1042830

Summary: glusterd crashes when rebalancing a volume with the name StartMigrationDuringRebalanceTest
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: Dustin Tsang <dtsang>
Component: glusterfs
Assignee: krishnan parthasarathi <kparthas>
Status: CLOSED ERRATA
QA Contact: Dustin Tsang <dtsang>
Severity: high
Priority: high
Version: 2.1
CC: dpati, dtsang, knarra, kparthas, mmahoney, mmccune, nsathyan, pprakash, psriniva, sdharane, ssampat, vagarwal, vbellur
Target Milestone: ---
Keywords: ZStream
Target Release: RHGS 2.1.2
Flags: dtsang: needinfo-
Hardware: Unspecified
OS: Unspecified
Fixed In Version: glusterfs-3.4.0.53rhs
Doc Type: Bug Fix
Doc Text:
Previously, the glusterd service would crash when a rebalance operation was started on a volume whose name was 33 characters long. With this update, glusterd no longer crashes, regardless of the length of the volume name.
Story Points: ---
Clones: 1046308
Last Closed: 2014-02-25 08:08:58 UTC
Type: Bug
Bug Blocks: 1046308
Attachments:
- sos report
- etc-glusterfs-glusterd.vol.log
- glusterd.log
- core dump

Description Dustin Tsang 2013-12-13 13:34:46 UTC
Created attachment 836326 [details]
sos report

Description of problem:
glusterd crashes when rebalancing a volume with the name StartMigrationDuringRebalanceTest, and crashes again shortly after the process is restarted.


Version-Release number of selected component (if applicable):
glusterfs-server-3.4.0.49rhs-1.el6rhs.x86_64

How reproducible:
100% reproducible

Steps to Reproduce:
1. create a simple dist volume with the name StartMigrationDuringRebalanceTest
2. start the volume
3. start rebalance on the volume
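A minimal CLI sketch of these steps (the peer names and brick paths are placeholders, not the actual test systems):

# on node-a, with node-b as a second peer (hypothetical host names)
gluster peer probe node-b
gluster volume create StartMigrationDuringRebalanceTest node-a:/bricks/foo node-b:/bricks/foo
gluster volume start StartMigrationDuringRebalanceTest
gluster volume rebalance StartMigrationDuringRebalanceTest start
# glusterd crashes at this point (see "Actual results" below)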

Actual results:
glusterd crashes and rebalance fails to start


Expected results:
rebalance starts successfully

Additional info:

Comment 1 Dustin Tsang 2013-12-13 13:35:57 UTC
Created attachment 836327 [details]
etc-glusterfs-glusterd.vol.log

Comment 2 Dustin Tsang 2013-12-13 13:37:10 UTC
Created attachment 836328 [details]
glusterd.log

Comment 4 Dustin Tsang 2013-12-16 14:13:50 UTC
Hi Dusmant,

The bug is reproducible using the gluster CLI, the RHSC GUI, and the RHSC REST API.
Please let me know what other information I can provide.

>Steps to Reproduce:
>1. create a simple dist volume with the name StartMigrationDuringRebalanceTest
>2. start the volume
>[Dusmant] From CLI or through RHSC or through REST API?
>3. start rebalance on the volume
>[Dusmant] From CLI or through RHSC or through REST API?

Comment 5 Dustin Tsang 2013-12-16 14:49:11 UTC
Here is the input/output from the gluster CLI:

[root@rhs-21u2-20131208-c ~]# gluster peer probe latest-d
peer probe: success. 
[root@rhs-21u2-20131208-c ~]# gluster vol create StartMigrationDuringRebalanceTest
Usage: volume create <NEW-VOLNAME> [stripe <COUNT>] [replica <COUNT>] [device vg] [transport <tcp|rdma|tcp,rdma>] <NEW-BRICK> ... [force]
[root@rhs-21u2-20131208-c ~]# gluster vol create StartMigrationDuringRebalanceTest latest-c:/bricks/foo latest-d:/bricks/foo
volume create: StartMigrationDuringRebalanceTest: success: please start the volume to access data
[root@rhs-21u2-20131208-c ~]# gluster vol start StartMigrationDuringRebalanceTest
volume start: StartMigrationDuringRebalanceTest: success
[root@rhs-21u2-20131208-c ~]# gluster vol rebalance StartMigrationDuringRebalanceTest start
Connection failed. Please check if gluster daemon is operational.

Comment 6 krishnan parthasarathi 2013-12-17 03:15:55 UTC
Dustin,

Could you attach the core file from the glusterd crash you are seeing? I couldn't find a backtrace corresponding to a crash in the log files, nor a core file in the sosreport attached to this bug.

Comment 7 Dustin Tsang 2013-12-17 17:21:27 UTC
Krishnan, I wasn't able to locate the core dump either.
If you run the steps listed in comment #5, glusterd goes down on both nodes in the cluster while starting rebalance.
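
For reference, a generic RHEL 6 recipe for capturing a glusterd core the next time it goes down (the settings shown are common defaults, not values confirmed on these machines):

# allow unlimited-size core dumps for processes started from this shell
ulimit -c unlimited
# check where the kernel writes cores; typically a file named "core" or
# "core.<pid>" in the crashing process's working directory, unless abrt
# or a custom pattern is configured
cat /proc/sys/kernel/core_pattern
# restart glusterd so it picks up the new limit, then re-run the steps in comment #5
service glusterd restart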

Comment 8 Vivek Agarwal 2013-12-19 12:15:16 UTC
The information provided is insufficient: there is no core file, and the log files don't provide any backtrace from which to draw a conclusion. The issue is not reproducible. We have asked Dustin Tsang to provide more information in the bug, with no response.

Comment 9 Dustin Tsang 2013-12-19 13:39:12 UTC
Created attachment 838985 [details]
core dump

Comment 10 Dustin Tsang 2013-12-19 13:39:49 UTC
Hi Vivek, core dump attached.

Comment 12 Pavithra 2014-01-08 09:36:20 UTC
KP, 
I've made minor edits. Can you please verify the doc text for technical accuracy?

Comment 13 Dustin Tsang 2014-01-13 17:10:07 UTC
in glusterfs-3.4.0.55rhs-1.el6rhs.x86_64:

The rebalance command no longer causes a crash; however, rebalance is unsuccessful for volumes with names greater than or equal to 33 characters in length (see the script sketch after the steps below).


steps:
1. create a volume with a name greater than or equal to 33 characters in length
2. start rebalance
=> start rebalance succeeds
3. poll rebalance status with `gluster volume rebalance $VOLNAME status`
=>
[root@rhs-21u2-20131223-errata-a ~]# gluster vol rebalance aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa status
volume rebalance: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa: failed: error
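
The same check in script form, as referenced above (the brick hosts are placeholders; the printf trick just builds a name of exactly 33 characters):

# build a 33-character volume name
VOLNAME=$(printf 'a%.0s' {1..33})
gluster volume create "$VOLNAME" node-a:/bricks/foo node-b:/bricks/foo
gluster volume start "$VOLNAME"
gluster volume rebalance "$VOLNAME" start    # succeeds; glusterd stays up
gluster volume rebalance "$VOLNAME" status   # still returns "failed: error"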

Comment 15 Dustin Tsang 2014-01-15 20:25:06 UTC
Krishnan, setting to verified.

Comment 17 errata-xmlrpc 2014-02-25 08:08:58 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHEA-2014-0208.html