Bug 911361

Summary: Bricks grow when other bricks heal
Product: [Community] GlusterFS
Component: replicate
Version: 3.3.1
Hardware: x86_64
OS: Linux
Status: CLOSED DEFERRED
Severity: high
Priority: unspecified
Reporter: john.r.moser
Assignee: bugs <bugs>
CC: bugs, gluster-bugs
Doc Type: Bug Fix
Type: Bug
Last Closed: 2014-12-14 19:40:30 UTC

Description john.r.moser 2013-02-14 20:30:59 UTC
Description of problem:

Installed GlusterFS on two servers, gluster-1 and gluster-2.  Created a volume 'web' with replica 2, with bricks gluster-1:/mnt/silo0 and gluster-2:/mnt/silo0; each brick is ext4 at 516057528 1K blocks.
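For reference, a sketch of that setup with the gluster CLI (the exact invocations aren't quoted in the report, so defaults are assumed):

    # on gluster-1: add the second server to the trusted pool
    gluster peer probe gluster-2

    # create the 2-way replicated volume over the two bricks, then start it
    gluster volume create web replica 2 gluster-1:/mnt/silo0 gluster-2:/mnt/silo0
    gluster volume start web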

Mounted with glusterfs fuse on /mnt/exports/web on both.
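Presumably something like the following on each server (the exact mount command isn't quoted in the report):

    # localhost:/web would work equally well on either server
    mount -t glusterfs gluster-1:/web /mnt/exports/web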

Proceeded to rsync a large directory with tons of files from a Web server to /mnt/exports/web on gluster-1.
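For example (the source host and path here are hypothetical):

    # -a preserves permissions, times, and symlinks
    rsync -a webserver:/srv/www/ /mnt/exports/web/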

Shut down gluster-2.

When gluster-2 came back up, it did nothing.  Eventually I ran self-heal with 'gluster volume heal web', then 'gluster volume heal web full', and finally resorted to a 'find /mnt/exports/web -noleaf -print0 | xargs --null stat'.
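For clarity, that sequence with what each step does:

    gluster volume heal web         # heal only files flagged as needing it
    gluster volume heal web full    # crawl and heal the entire volume
    # last resort: stat every file through the FUSE mount, which forces
    # the client to check (and self-heal) each one
    find /mnt/exports/web -noleaf -print0 | xargs --null stat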

On gluster-1, /mnt/silo0 had the same Used space as /mnt/exports/web.

On gluster-2, /mnt/silo0 was about 800MB smaller.

Made multiple attempts at self-heal.  Checked the logs and found lots of entries under 'heal-failed', mostly UUIDs.

I then used 'gluster volume remove-brick web replica 1 gluster-2:/mnt/silo0' to drop gluster-2 out.  Then I unmounted the partition, reformatted it as ext4, and mounted it again.  Finally I reintroduced it with 'gluster volume add-brick web replica 2 gluster-2:/mnt/silo0'.
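Roughly, the full sequence (the device name here is hypothetical, and depending on the release remove-brick may also want a 'force' keyword):

    # drop gluster-2's brick, shrinking the volume to replica 1
    gluster volume remove-brick web replica 1 gluster-2:/mnt/silo0

    # on gluster-2: rebuild the brick filesystem from scratch
    umount /mnt/silo0
    mkfs.ext4 /dev/sdb1          # hypothetical device backing /mnt/silo0
    mount /dev/sdb1 /mnt/silo0

    # reintroduce the empty brick, back to replica 2
    gluster volume add-brick web replica 2 gluster-2:/mnt/silo0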

Even after all that, the usage numbers were still wrong.

I dropped gluster-1 and repeated the same procedure, then dropped gluster-2 and brought it back up.  What I see happening is that the exported volume AND the brick on gluster-1 are growing in size!

gluster-1's /mnt/silo0 started at 3421412 1K blocks used, as did the exported volume.

After gluster-2 finished self-heal, gluster-1's /mnt/silo0 and the exported volume sit at 3666816 1K blocks used, while gluster-2's /mnt/silo0 sits at 3421412.  Nothing was writing to the exported volume during the heal.

Comment 1 Pranith Kumar K 2013-03-16 02:18:23 UTC
John,
     What commands are you using to see the disk usage on these bricks?

Pranith

Comment 2 john.r.moser 2013-03-16 02:52:33 UTC
df
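That is, comparing the Used column (1K blocks) of something like:

    # run on each server; brick vs. FUSE mount
    df /mnt/silo0 /mnt/exports/web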

It's weird: the volume that's being read from gets larger.

Comment 3 Pranith Kumar K 2013-03-16 04:04:40 UTC
John,
Would it be possible to talk to you on IRC? It is irc.gnu.org, #glusterfs.
Give me your nick so that I can ping you.
Pranith.

Comment 4 Pranith Kumar K 2013-03-16 06:41:25 UTC
John,
There was a typo: the channel is #gluster, not #glusterfs.

Pranith

Comment 5 Niels de Vos 2014-11-27 14:54:10 UTC
The version that this bug has been reported against does not get any updates from the Gluster Community anymore. Please verify whether this report is still valid against a current (3.4, 3.5 or 3.6) release and update the version, or close this bug.

If there has been no update before 9 December 2014, this bug will be closed automatically.