Bug 1022510

Summary: GlusterFS client crashes during add-brick and rebalance
Product: [Community] GlusterFS
Component: core
Version: 3.4.1
Status: CLOSED EOL
Severity: urgent
Priority: urgent
Reporter: Samuli Heinonen <samppah>
Assignee: bugs <bugs>
CC: bugs, gluster-bugs, joe, johan.huysmans, lmohanty, spalai, viktor.krivak
Hardware: Unspecified
OS: Unspecified
Type: Bug
Doc Type: Bug Fix
Last Closed: 2015-10-07 13:15:42 UTC
Attachments:
GlusterFS client log during rebalance (flags: none)
Backtrace of coredump (flags: none)

Description Samuli Heinonen 2013-10-23 12:46:05 UTC
Created attachment 815391 [details]
GlusterFS client log during rebalance

Description of problem:
GlusterFS client crashes during rebalance after add-brick.


GlusterFS setup before add-brick

Volume Name: dev-el6-sata1
Type: Replicate
Volume ID: 840eccd5-b3fb-4dc8-b67d-966bd22e8557
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: boar1:/gluster/sata/brick1/dev-el6-sata1
Brick2: boar2:/gluster/sata/brick1/dev-el6-sata1
Options Reconfigured:
server.allow-insecure: on
performance.client-io-threads: enable
storage.owner-uid: 36
storage.owner-gid: 36
network.ping-timeout: 10
performance.quick-read: off
performance.io-cache: off
performance.stat-prefetch: off
network.remote-dio: enable


Version-Release number of selected component (if applicable):
Servers:
CentOS 6.4: 
glusterfs-fuse-3.4.1-1.el6.x86_64
glusterfs-server-3.4.1-1.el6.x86_64
glusterfs-libs-3.4.1-1.el6.x86_64
glusterfs-3.4.1-1.el6.x86_64
glusterfs-cli-3.4.1-1.el6.x86_64

Client:
RHEL 6.5 (beta)
glusterfs-3.4.1-2.el6.x86_64
glusterfs-libs-3.4.1-2.el6.x86_64
glusterfs-fuse-3.4.1-2.el6.x86_64
glusterfs-api-3.4.1-2.el6.x86_64
glusterfs-rdma-3.4.1-2.el6.x86_64
glusterfs-cli-3.4.1-2.el6.x86_64


Steps to Reproduce:
Backend filesystem is on logical volume mounted as:
/dev/mapper/sata--brick1-export on /gluster/sata/brick1 type xfs (rw,noatime,inode64,nobarrier)

For testing purposes, the new bricks are on the same logical volume as the existing ones.

1. gluster vol add-brick dev-el6-sata1 replica 2 boar1:/gluster/sata/brick1/dev-el6-sata2  boar2:/gluster/sata/brick1/dev-el6-sata2
2. gluster vol rebalance dev-el6-sata1 fix-layout start
3. gluster vol rebalance dev-el6-sata1 start
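
Rebalance progress can be watched from the servers while reproducing; a minimal sketch using the standard gluster CLI (status output not included in this report):

# show migration progress and failure counts per node
gluster vol rebalance dev-el6-sata1 status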

Actual results:
GlusterFS client crashes during rebalance and the mount point becomes inaccessible ("Transport endpoint is not connected"). After the rebalance has finished, umount -fl is required to unmount the volume.
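
A hedged sketch of the recovery described above; /mnt/dev-el6-sata1 is an assumed mount point, not taken from this report:

# lazily force-unmount the dead mount, then remount the volume
umount -fl /mnt/dev-el6-sata1
mount -t glusterfs boar1:/dev-el6-sata1 /mnt/dev-el6-sata1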


Expected results:
The GlusterFS client does not crash and the mount point remains usable during rebalance.

Comment 1 Samuli Heinonen 2013-10-23 12:47:11 UTC
Created attachment 815392 [details]
Backtrace of coredump
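
For anyone needing to produce a comparable backtrace, a minimal sketch assuming the stock CentOS packages; core.12345 is a placeholder core file name:

# install debug symbols, then dump backtraces of all threads from the core
debuginfo-install -y glusterfs
gdb -batch -ex 'thread apply all bt full' /usr/sbin/glusterfs core.12345 > backtrace.txt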

Comment 2 Joe Julian 2014-06-05 08:47:11 UTC
*** Bug 1104940 has been marked as a duplicate of this bug. ***

Comment 3 Joe Julian 2014-06-05 16:16:03 UTC
afaict, this bug occurs as the file is migrated to a different server and a fuse cache invalidation is triggered.

Comment 4 Joe Julian 2014-06-06 03:48:23 UTC
I'm not sure if this is relevant.

On the source server, some of the files that were migrated to the destination still show as open in lsof, despite having been deleted.
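
A quick way to check for that, using standard lsof (+L1 lists open files whose link count is zero, i.e. deleted but still held open):

# deleted-but-open files under the brick path on the source server
lsof +L1 | grep /gluster/sata/brick1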

Comment 5 Susant Kumar Palai 2014-06-11 10:38:27 UTC
Hey Joe, a patch (http://review.gluster.org/#/c/8029/) has been sent addressing the same crash, as part of bug https://bugzilla.redhat.com/show_bug.cgi?id=961615.

Comment 6 Pranith Kumar K 2014-06-16 10:18:07 UTC
*** Bug 1019874 has been marked as a duplicate of this bug. ***

Comment 7 Joe Julian 2014-06-16 13:31:36 UTC
In bug 961615 (above) I tested the backport against 3.4.4. Prior to applying the patch I could crash the clients every time; after the patch I could not. (Yes, I reviewed and verified it.)
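
For anyone repeating that test, a sketch of pulling the backport into a 3.4.4 source tree; the refspec follows Gerrit's usual refs/changes layout, and patch set 1 is an assumption, so check the change page for the latest:

# fetch change 8029 from Gerrit and apply it on top of the tree
git fetch http://review.gluster.org/glusterfs refs/changes/29/8029/1
git cherry-pick FETCH_HEAD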

Comment 8 Niels de Vos 2015-05-17 21:57:24 UTC
GlusterFS 3.7.0 has been released (http://www.gluster.org/pipermail/gluster-users/2015-May/021901.html), and the Gluster project maintains N-2 supported releases. The last two releases before 3.7 are still maintained; at the moment these are 3.6 and 3.5.

This bug has been filed against the 3.4 release, and will not get fixed in a 3.4 version any more. Please verify whether newer versions are affected by the reported problem. If that is the case, update the bug with a note, and update the version if you can. In case updating the version is not possible, leave a comment in this bug report with the version you tested, and set the "Need additional information the selected bugs from" field below the comment box to "bugs".

If there is no response by the end of the month, this bug will get automatically closed.

Comment 9 Kaleb KEITHLEY 2015-10-07 13:15:42 UTC
GlusterFS 3.4.x has reached end-of-life.

If this bug still exists in a later release, please reopen this bug and change the version, or open a new bug.