Bug 764653 (GLUSTER-2921) - gfid not being replicated to all replica in 4 x 3 volume
Summary: gfid not being replicated to all replica in 4 x 3 volume
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: GLUSTER-2921
Product: GlusterFS
Classification: Community
Component: replicate
Version: 3.1.4
Hardware: x86_64
OS: Linux
Priority: urgent
Severity: high
Target Milestone: ---
Assignee: Pranith Kumar K
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2011-05-20 20:11 UTC by Joe Julian
Modified: 2015-12-01 16:45 UTC (History)
5 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:
Regression: ---
Mount Type: fuse
Documentation: ---
CRM:
Verified Versions:


Attachments (Terms of Use)
Trace level log files and state dumps from the bricks and client. (49.63 KB, application/x-bzip)
2011-05-20 17:11 UTC, Joe Julian

Description Joe Julian 2011-05-20 20:11:43 UTC
Having upgraded from 3.0.7, I find that gfids are getting out of sync across the servers. My original volfile configuration is the well-known one from my blog ( http://goo.gl/EH4x ).

The new configuration matches the bricks:
Volume Name: gluster1
Type: Distributed-Replicate
Status: Started
Number of Bricks: 4 x 3 = 12
Transport-type: tcp
Bricks:
Brick1: ewcs2:/cluster/0
Brick2: ewcs4:/cluster/0
Brick3: ewcs7:/cluster/0
Brick4: ewcs2:/cluster/1
Brick5: ewcs4:/cluster/1
Brick6: ewcs7:/cluster/1
Brick7: ewcs2:/cluster/2
Brick8: ewcs4:/cluster/2
Brick9: ewcs7:/cluster/2
Brick10: ewcs2:/cluster/3
Brick11: ewcs4:/cluster/3
Brick12: ewcs7:/cluster/3
Options Reconfigured:
diagnostics.brick-log-level: INFO
network.frame-timeout: 600
diagnostics.client-log-level: INFO


(network.frame-timeout is overridden just for debugging purposes)
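For reference, a 4 x 3 distributed-replicate layout like the one listed above would typically be created with a command along these lines. This is a reconstruction from the brick listing, not taken from the report; GlusterFS forms replica sets from consecutive bricks on the command line.

```shell
# Hypothetical reconstruction of the volume-create command for the
# 4 x 3 layout above: each group of three consecutive bricks becomes
# one replica set, giving 4 distribute subvolumes of replica 3.
gluster volume create gluster1 replica 3 transport tcp \
    ewcs2:/cluster/0 ewcs4:/cluster/0 ewcs7:/cluster/0 \
    ewcs2:/cluster/1 ewcs4:/cluster/1 ewcs7:/cluster/1 \
    ewcs2:/cluster/2 ewcs4:/cluster/2 ewcs7:/cluster/2 \
    ewcs2:/cluster/3 ewcs4:/cluster/3 ewcs7:/cluster/3
```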

The attached trace level logs and state dumps were generated from the client command:
stat /mnt/glusterfs/centos/5.6/centosplus/i386/RPMS/xfsdump-2.2.46-1.el5.centos.i386.rpm

where /mnt/glusterfs was the client mountpoint.
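A gfid mismatch like the one described can be confirmed directly by reading the trusted.gfid extended attribute for the same file on each brick of its replica set and comparing the values. The brick paths and hostnames below follow the layout above and assume the file hashed to the first replica set; adjust to wherever the file actually resides.

```shell
# Read the gfid xattr for the same file on each brick of a replica set.
# On a healthy replica set, all three hex values are identical.
F=cluster/0/centos/5.6/centosplus/i386/RPMS/xfsdump-2.2.46-1.el5.centos.i386.rpm
ssh ewcs2 getfattr -n trusted.gfid -e hex "/$F"
ssh ewcs4 getfattr -n trusted.gfid -e hex "/$F"
ssh ewcs7 getfattr -n trusted.gfid -e hex "/$F"
```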

I don't believe this bug is related to bug 764196, but it might be worth looking into.

Comment 1 Pranith Kumar K 2011-05-23 08:51:25 UTC
(In reply to comment #0)

hi,
  We are looking into this with priority. Just to check: is there only one client, or more than one, accessing the volume? That is, are there multiple mount points for the same volume?

Pranith

Comment 2 Joe Julian 2011-05-23 11:32:52 UTC
/me tries to remember how he built his test at 3am...

I'm sure there were multiple clients, at least on multiple workstations. 

On just this one workstation I would normally have 2 mountpoints (one mounted with debug logging, so I don't have to change the log level for the entire volume just for one quick test), but for these logs I mounted it only once.

Comment 3 Pranith Kumar K 2011-08-08 07:28:04 UTC
I think this bug is already fixed on master; I need to verify and update. This should no longer happen for the replicas, since the gfid is now assigned only after taking entry locks on all the children.

Comment 4 Pranith Kumar K 2011-10-19 10:20:53 UTC
This is fixed in both 3.2.5 and 3.3.
