862981 – [RHEV-RHS] Crash in rebalance process

Summary: [RHEV-RHS] Crash in rebalance process

Keywords:
Status:	CLOSED CURRENTRELEASE
Alias:	None
Product:	Red Hat Gluster Storage
Classification:	Red Hat Storage
Component:	glusterfs
Sub Component:
Version:	unspecified
Hardware:	x86_64
OS:	Linux
Priority:	high
Severity:	high
Target Milestone:	---
Target Release:	---
Assignee:	Raghavendra Bhat
QA Contact:	shylesh
Docs Contact:
URL:
Whiteboard:
Depends On:	875076
Blocks:

Reported:	2012-10-04 05:56 UTC by shylesh
Modified:	2015-08-10 19:30 UTC
CC List:	6 users
Fixed In Version:	glusterfs-3.3.0rhsvirt1-7.el6rhs
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2015-08-10 07:46:00 UTC
Embargoed:
Dependent Products:

Description shylesh 2012-10-04 05:56:59 UTC

Description of problem:
The rebalance process crashed after glusterd was restarted while a rebalance was running.

Version-Release number of selected component (if applicable):
[root@rhs-gp-srv4 core]# rpm -qa | grep gluster
glusterfs-fuse-3.3.0rhsvirt1-6.el6rhs.x86_64
glusterfs-debuginfo-3.3.0rhsvirt1-6.el6rhs.x86_64
vdsm-gluster-4.9.6-14.el6rhs.noarch
gluster-swift-plugin-1.0-5.noarch
gluster-swift-container-1.4.8-4.el6.noarch
org.apache.hadoop.fs.glusterfs-glusterfs-0.20.2_0.2-1.noarch
glusterfs-devel-3.3.0rhsvirt1-6.el6rhs.x86_64
glusterfs-3.3.0rhsvirt1-6.el6rhs.x86_64
glusterfs-server-3.3.0rhsvirt1-6.el6rhs.x86_64
glusterfs-rdma-3.3.0rhsvirt1-6.el6rhs.x86_64
gluster-swift-proxy-1.4.8-4.el6.noarch
gluster-swift-account-1.4.8-4.el6.noarch
gluster-swift-doc-1.4.8-4.el6.noarch
glusterfs-geo-replication-3.3.0rhsvirt1-6.el6rhs.x86_64
gluster-swift-1.4.8-4.el6.noarch
gluster-swift-object-1.4.8-4.el6.noarch


 


Steps to Reproduce:
1. Created a single-brick distribute volume, which was serving as a VM store.
2. Added one more brick and started a rebalance.
3. Restarted glusterd while the rebalance was in progress.
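
The steps above could be scripted roughly as follows. This is only a sketch: the volume name and brick paths are placeholders that do not come from this report, the VM workload on the volume is not set up here, and `service glusterd restart` assumes the RHEL 6 init script. Commands are only printed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Placeholder names (assumptions, not taken from the report).
VOL="${VOL:-vmstore}"
BRICK1="${BRICK1:-server1:/bricks/b1}"
BRICK2="${BRICK2:-server1:/bricks/b2}"

# Print each command by default; set DRY_RUN=0 to really execute it.
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    else
        echo "+ $*"
    fi
}

reproduce() {
    # Step 1: single-brick distribute volume.
    run gluster volume create "$VOL" "$BRICK1"
    run gluster volume start "$VOL"
    # Step 2: add one more brick and start the rebalance.
    run gluster volume add-brick "$VOL" "$BRICK2"
    run gluster volume rebalance "$VOL" start
    # Step 3: restart glusterd while the rebalance is still running.
    run service glusterd restart
    run gluster volume rebalance "$VOL" status
}

reproduce
```

Run it with DRY_RUN=0 only on a disposable test node, since it creates a volume and restarts the management daemon.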
  
Actual results:
Crash of rebalance process
 


Additional info:
Core was generated by `/usr/sbin/glusterfs -s localhost --volfile-id rebal --xlator-option *dht.use-re'.
Program terminated with signal 6, Aborted.
#0  0x0000003910a32885 in raise () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.47.el6_2.12.x86_64 libgcc-4.4.6-3.el6.x86_64 openssl-1.0.0-20.el6_2.5.x86_64 zlib-1.2.3-27.el6.x86_64


bt
====
(gdb) bt
#0  0x0000003910a32885 in raise () from /lib64/libc.so.6
#1  0x0000003910a34065 in abort () from /lib64/libc.so.6
#2  0x0000003910a6f977 in __libc_message () from /lib64/libc.so.6
#3  0x0000003910a75296 in malloc_printerr () from /lib64/libc.so.6
#4  0x00007f5bb2faa707 in gf_defrag_start_crawl (data=<value optimized out>) at dht-rebalance.c:1499
#5  0x000000397ee4bd72 in synctask_wrap (old_task=<value optimized out>) at syncop.c:120
#6  0x0000003910a43610 in ?? () from /lib64/libc.so.6
#7  0x0000000000000000 in ?? ()



(gdb) f 4 
#4  0x00007f5bb2faa707 in gf_defrag_start_crawl (data=<value optimized out>) at dht-rebalance.c:1499
1499                    GF_FREE (defrag);
(gdb) l
1494                    defrag->is_exiting = 1;
1495            }
1496            UNLOCK (&defrag->lock);
1497
1498            if (defrag)
1499                    GF_FREE (defrag);
1500
1501            return ret;
1502    }
1503
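
The gdb session above can probably be repeated non-interactively with gdb's batch mode. A minimal sketch, assuming the glusterfs-debuginfo package is installed; the core file path is a placeholder because the report does not name the file:

```shell
#!/bin/sh
# Binary named in the "Core was generated by" line of the report.
BIN="${BIN:-/usr/sbin/glusterfs}"

# Print the backtrace, select frame 4 and list the source around it.
# The core path argument is a placeholder (assumption).
inspect_core() {
    core="${1:-/path/to/core}"
    if [ -r "$core" ]; then
        # -batch makes gdb exit after running the -ex commands in order.
        gdb -batch -ex 'bt' -ex 'frame 4' -ex 'list' "$BIN" "$core"
    else
        echo "core file not readable: $core" >&2
        return 1
    fi
}

inspect_core "$@" || echo "nothing inspected"
```

Pass the path of the core file as the first argument; without a readable core the script only reports that nothing was inspected.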
