Bug 1248123

Summary: writes to glusterfs folder are not synced to other nodes unless they are explicitly read from gluster mount
Product: [Community] GlusterFS
Component: replicate
Version: 3.7.2
Hardware: x86_64
OS: Linux
Status: CLOSED WORKSFORME
Severity: unspecified
Priority: unspecified
Reporter: Ivo Petrov <cmptuomp3>
Assignee: bugs <bugs>
CC: atalur, bugs, cmptuomp3, gluster-bugs, kdhananj, pkarampu, ravishankar, rtalur
Keywords: Triaged
Type: Bug
Doc Type: Bug Fix
Last Closed: 2015-09-24 01:55:33 UTC

Description Ivo Petrov 2015-07-29 16:05:26 UTC
Description of problem: writes to glusterfs folder are not synced to other nodes unless they are explicitly read from glusterfs mount

For example, I can write 1000 files to a folder in the glusterfs mount; if I then ssh to another server and look at the actual brick, the files are not there. If I `cat` each file from the glusterfs mount, the files appear in the brick. Aren't files supposed to be copied to the bricks automatically? And if the server on which I performed the writes goes down in the meantime, are the files lost?


Version-Release number of selected component (if applicable): 3.7.2-3.el7


How reproducible: very


Steps to Reproduce:
1. touch /ssd_data/test.file on 10.0.1.3 (/ssd_data is the glusterfs mount)
2. try to open /ssd_storage/test.file on 10.0.1.2 (not found; /ssd_storage is the brick mount); see the shell sketch below
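
A minimal shell sketch of the reproducer (paths taken from the volume info below; the brick subdirectory GLUSTER_DO_NOT_MODIFY is specific to this setup):

# On 10.0.1.3 -- /ssd_data is the FUSE mount of the ssd-gluster-data volume
touch /ssd_data/test.file

# On 10.0.1.2 -- /ssd_storage/GLUSTER_DO_NOT_MODIFY is the local brick directory
ls -l /ssd_storage/GLUSTER_DO_NOT_MODIFY/test.file
# actual result: "No such file or directory"; expected: the file is present

# Back on 10.0.1.3 -- reading the file through the mount makes it show up on the other bricks
cat /ssd_data/test.file > /dev/null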

Actual results:
 file not found

Expected results:
 file is present



Additional info:

Number of volumes: 2
Volume Names: ssd-gluster-data, sata-gluster-data
Type of volumes: replica 3

Output of gluster volume info

gluster volume info
 
Volume Name: sata-gluster-data
Type: Replicate
Volume ID: dbb498cc-fa8f-4513-8d6c-1aa9a159b7cf
Status: Started
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: 10.0.1.3:/sata_storage/GLUSTER_DO_NOT_MODIFY
Brick2: 10.0.1.2:/sata_storage/GLUSTER_DO_NOT_MODIFY
Brick3: 10.0.1.1:/sata_storage/GLUSTER_DO_NOT_MODIFY
Options Reconfigured:
nfs.disable: off
cluster.server-quorum-type: server
cluster.server-quorum-ratio: 60
 
Volume Name: ssd-gluster-data
Type: Replicate
Volume ID: 967b2328-c5d3-4ea1-8cd0-8114796f3b50
Status: Started
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: 10.0.1.3:/ssd_storage/GLUSTER_DO_NOT_MODIFY
Brick2: 10.0.1.2:/ssd_storage/GLUSTER_DO_NOT_MODIFY
Brick3: 10.0.1.1:/ssd_storage/GLUSTER_DO_NOT_MODIFY
Options Reconfigured:
nfs.disable: off
cluster.server-quorum-type: server
cluster.server-quorum-ratio: 60

Output of gluster volume status

gluster volume status
Status of volume: sata-gluster-data
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.0.1.3:/sata_storage/GLUSTER_DO_NOT
_MODIFY                                     49153     0          Y       16358
Brick 10.0.1.2:/sata_storage/GLUSTER_DO_NOT
_MODIFY                                     49155     0          Y       8691 
Brick 10.0.1.1:/sata_storage/GLUSTER_DO_NOT
_MODIFY                                     49154     0          Y       24495
NFS Server on localhost                     2049      0          Y       8705 
Self-heal Daemon on localhost               N/A       N/A        Y       8713 
NFS Server on 10.0.1.1                      2049      0          Y       24509
Self-heal Daemon on 10.0.1.1                N/A       N/A        Y       24517
NFS Server on 10.0.1.3                      2049      0          Y       30660
Self-heal Daemon on 10.0.1.3                N/A       N/A        Y       30699
 
Task Status of Volume sata-gluster-data
------------------------------------------------------------------------------
There are no active volume tasks
 
Status of volume: ssd-gluster-data
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.0.1.3:/ssd_storage/GLUSTER_DO_NOT_
MODIFY                                      49152     0          Y       16366
Brick 10.0.1.2:/ssd_storage/GLUSTER_DO_NOT_
MODIFY                                      49152     0          Y       8698 
Brick 10.0.1.1:/ssd_storage/GLUSTER_DO_NOT_
MODIFY                                      49153     0          Y       24502
NFS Server on localhost                     2049      0          Y       8705 
Self-heal Daemon on localhost               N/A       N/A        Y       8713 
NFS Server on 10.0.1.1                      2049      0          Y       24509
Self-heal Daemon on 10.0.1.1                N/A       N/A        Y       24517
NFS Server on 10.0.1.3                      2049      0          Y       30660
Self-heal Daemon on 10.0.1.3                N/A       N/A        Y       30699
 
Task Status of Volume ssd-gluster-data
------------------------------------------------------------------------------
There are no active volume tasks

fstab:

10.0.1.3:/ssd-gluster-data		/ssd_data	glusterfs	defaults,_netdev,backup-volfile-servers=10.0.1.1:10.0.1.2	0 0
10.0.1.3:/sata-gluster-data		/sata_data	glusterfs	defaults,_netdev,backup-volfile-servers=10.0.1.1:10.0.1.2	0 0
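
For reference, the first fstab entry corresponds roughly to the following manual mount command (a sketch; backup-volfile-servers only affects where the client fetches the volume file from at mount time):

mount -t glusterfs -o backup-volfile-servers=10.0.1.1:10.0.1.2 \
    10.0.1.3:/ssd-gluster-data /ssd_data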


Client Information
OS Type: Linux
Mount type: fuse.glusterfs
OS Version: CentOS 7, x86_64

Comment 1 Ivo Petrov 2015-07-29 16:51:11 UTC
Forgot to add: my ssd_data.log is full of:
[2015-07-26 14:05:43.676576] E [MSGID: 114031] [client-rpc-fops.c:251:client3_3_mknod_cbk] 0-ssd-gluster-data-client-2: remote operation failed: Permission denied. Path: (null) [Permission denied]
[2015-07-26 14:05:43.841046] E [MSGID: 114031] [client-rpc-fops.c:251:client3_3_mknod_cbk] 0-ssd-gluster-data-client-1: remote operation failed: Permission denied. Path: (null) [Permission denied]
[2015-07-26 14:05:43.841409] E [MSGID: 114031] [client-rpc-fops.c:251:client3_3_mknod_cbk] 0-ssd-gluster-data-client-2: remote operation failed: Permission denied. Path: (null) [Permission denied]
[2015-07-26 14:05:44.251452] E [MSGID: 114031] [client-rpc-fops.c:251:client3_3_mknod_cbk] 0-ssd-gluster-data-client-1: remote operation failed: Permission denied. Path: (null) [Permission denied]
[2015-07-26 14:05:44.251818] E [MSGID: 114031] [client-rpc-fops.c:251:client3_3_mknod_cbk] 0-ssd-gluster-data-client-2: remote operation failed: Permission denied. Path: (null) [Permission denied]
[2015-07-26 14:05:44.643556] E [MSGID: 114031] [client-rpc-fops.c:321:client3_3_mkdir_cbk] 0-ssd-gluster-data-client-1: remote operation failed: Permission denied. Path: (null) [Permission denied]
[2015-07-26 14:05:44.643940] E [MSGID: 114031] [client-rpc-fops.c:321:client3_3_mkdir_cbk] 0-ssd-gluster-data-client-2: remote operation failed: Permission denied. Path: (null) [Permission denied]
[2015-07-26 14:05:44.816162] E [MSGID: 114031] [client-rpc-fops.c:251:client3_3_mknod_cbk] 0-ssd-gluster-data-client-1: remote operation failed: Permission denied. Path: (null) [Permission denied]


that's on servers 10.0.1.1 and 10.0.1.2

10.0.1.3 has just:
[2015-07-29 15:19:11.311369] W [fuse-bridge.c:1263:fuse_err_cbk] 0-glusterfs-fuse: 352139993: REMOVEXATTR() /somefile.. => -1 (No data available)


selinux is disabled on all 3 servers

Comment 2 Krutika Dhananjay 2015-08-04 12:41:22 UTC
Ivo,

Could you share the brick logs from 10.0.1.1 and 10.0.1.2?

Thanks,
Krutika

Comment 3 Ivo Petrov 2015-08-04 12:50:26 UTC
Hello Krutika,
 see my first comment here - that's all that's relevant in those logs.

Comment 4 Krutika Dhananjay 2015-08-04 12:53:52 UTC
Ivo,

Sorry I did not make myself clear. Could you share the logs at /var/log/glusterfs/bricks/*.log on 10.0.1.1 and 10.0.1.2?

-Krutika

Comment 5 Ivo Petrov 2015-08-04 13:12:52 UTC
Sorry - on 10.0.1.1 and 10.0.1.2 I have tons of 

[2015-08-04 13:10:28.801167] E [MSGID: 115057] [server-rpc-fops.c:535:server_mknod_cbk] 0-ssd-gluster-data-server: 193268: MKNOD /$FILE (04cc4296-7183-4efb-9bcc-7e6857eb2076/$FILE) ==> (Permission denied) [Permission denied]

When healing (cat from the glusterfs mount), the log is full of:


[2015-08-04 12:59:51.012863] I [dict.c:467:dict_get] (--> /usr/lib64/libglusterfs.so.0(_gf_log_callingfn+0x186)[0x7f4c15b2cef6] (--> /usr/lib64/libglusterfs.so.0(dict_get+0x90)[0x7f4c15b25ce0] (--> /usr/lib64/glusterfs/3.7.2/xlator/storage/posix.so(posix_setxattr+0x1b9)[0x7f4c0823c439] (--> /usr/lib64/libglusterfs.so.0(default_setxattr+0x75)[0x7f4c15b313a5] (--> /usr/lib64/glusterfs/3.7.2/xlator/features/changetimerecorder.so(ctr_setxattr+0x186)[0x7f4c037cc696] ))))) 0-dict: !this || key=dht-get-iatt-in-xattr

Comment 6 Krutika Dhananjay 2015-08-05 06:26:40 UTC
(In reply to Ivo Petrov from comment #5)
> Sorry - on 10.0.1.1 and 10.0.1.2 I have tons of 
> 
> [2015-08-04 13:10:28.801167] E [MSGID: 115057]
> [server-rpc-fops.c:535:server_mknod_cbk] 0-ssd-gluster-data-server: 193268:
> MKNOD /$FILE (04cc4296-7183-4efb-9bcc-7e6857eb2076/$FILE) ==> (Permission
> denied) [Permission denied]
> 
> when healing (cat from glusterfs mount) log is full with:
> 
> 
> [2015-08-04 12:59:51.012863] I [dict.c:467:dict_get] (-->
> /usr/lib64/libglusterfs.so.0(_gf_log_callingfn+0x186)[0x7f4c15b2cef6] (-->
> /usr/lib64/libglusterfs.so.0(dict_get+0x90)[0x7f4c15b25ce0] (-->
> /usr/lib64/glusterfs/3.7.2/xlator/storage/posix.
> so(posix_setxattr+0x1b9)[0x7f4c0823c439] (-->
> /usr/lib64/libglusterfs.so.0(default_setxattr+0x75)[0x7f4c15b313a5] (-->
> /usr/lib64/glusterfs/3.7.2/xlator/features/changetimerecorder.
> so(ctr_setxattr+0x186)[0x7f4c037cc696] ))))) 0-dict: !this ||
> key=dht-get-iatt-in-xattr

This looks like an error that could *possibly* be originating from the posix-acl translator. CC'ing Raghavendra Talur.

Raghavendra,

Would you be able to help us out here? Basically, entry creations on Ivo's 3-way replicated volume seem to be succeeding on only the first sub-volume of AFR and failing with EACCES (?) on the remaining two.

Comment 7 Raghavendra Talur 2015-08-08 18:03:46 UTC
(In reply to Ivo Petrov from comment #5)
> when healing (cat from glusterfs mount) log is full with:
> 
> 
> [2015-08-04 12:59:51.012863] I [dict.c:467:dict_get] (-->
> /usr/lib64/libglusterfs.so.0(_gf_log_callingfn+0x186)[0x7f4c15b2cef6] (-->
> /usr/lib64/libglusterfs.so.0(dict_get+0x90)[0x7f4c15b25ce0] (-->
> /usr/lib64/glusterfs/3.7.2/xlator/storage/posix.
> so(posix_setxattr+0x1b9)[0x7f4c0823c439] (-->
> /usr/lib64/libglusterfs.so.0(default_setxattr+0x75)[0x7f4c15b313a5] (-->
> /usr/lib64/glusterfs/3.7.2/xlator/features/changetimerecorder.
> so(ctr_setxattr+0x186)[0x7f4c037cc696] ))))) 0-dict: !this ||
> key=dht-get-iatt-in-xattr

This is fixed in version 3.7.3; see this bug:
https://bugzilla.redhat.com/show_bug.cgi?id=1235923

Comment 8 Raghavendra Talur 2015-08-08 18:09:18 UTC
(In reply to Ivo Petrov from comment #5)
> Sorry - on 10.0.1.1 and 10.0.1.2 I have tons of 
> 
> [2015-08-04 13:10:28.801167] E [MSGID: 115057]
> [server-rpc-fops.c:535:server_mknod_cbk] 0-ssd-gluster-data-server: 193268:
> MKNOD /$FILE (04cc4296-7183-4efb-9bcc-7e6857eb2076/$FILE) ==> (Permission
> denied) [Permission denied]

For the original problem, my first question is about the user/process that is doing the writes on the gluster mount. Is it root or someone else?

If it is someone else, does that user exist on the other two nodes? (See the sketch below.)
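
A quick way to check this (a sketch; "appuser" is a placeholder for whatever account performs the writes):

# Run on each of 10.0.1.1, 10.0.1.2 and 10.0.1.3; the account should exist everywhere,
# ideally with the same numeric UID/GID on every node.
id appuser
getent passwd appuser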


If none of the above helps us determine what's wrong, we will have to check the brick logs at debug level.
You can do that by executing "gluster volume set <VOLNAME> brick-log-level DEBUG".
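
For example, something along these lines (a sketch; the reset command uses the generic "gluster volume reset" syntax to restore the default afterwards):

# Raise the brick log level for the affected volume (run on any server node)
gluster volume set ssd-gluster-data brick-log-level DEBUG

# Reproduce the failing write through the FUSE mount, then collect
# /var/log/glusterfs/bricks/*.log from 10.0.1.1 and 10.0.1.2.

# Restore the default log level afterwards
gluster volume reset ssd-gluster-data brick-log-level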

Comment 9 Pranith Kumar K 2015-09-23 10:20:10 UTC
Ivo Petrov,
      Are you still experiencing this bug? Could you provide the information asked by Raghavendra Talur in comment #8?

Pranith

Comment 10 Ivo Petrov 2015-09-23 16:09:12 UTC
Hello,
I'm sorry, I missed comment #8.

I've tested this again and it seems to be working as expected now. The only difference is that I unmounted and remounted the problematic mount on that node, pointing it not at itself (the local IP) but at another node. After that I checked consistency with a script I wrote, which does roughly the following (see the sketch below):

get the list of all files from a "good" node
get the list of all files from the "problematic" node
diff the two lists and `cat` to /dev/null every file that differs on the problematic node (the cat goes through the glusterfs mount, which forces the files to be synced locally)
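
A minimal sketch of that kind of check (hostnames and brick/mount paths are placeholders based on this setup; run on the problematic node, where /ssd_data is the glusterfs mount):

# List the files on a good node's brick and on the problematic node's brick
ssh 10.0.1.1 'cd /ssd_storage/GLUSTER_DO_NOT_MODIFY && find . -type f | sort' > good.list
ssh 10.0.1.2 'cd /ssd_storage/GLUSTER_DO_NOT_MODIFY && find . -type f | sort' > bad.list

# Read every file missing on the problematic node through the glusterfs mount,
# which triggers a heal of that file.
comm -23 good.list bad.list | while read -r f; do
    cat "/ssd_data/$f" > /dev/null
done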

After all of the above, I can no longer reproduce the original issue. The mounts/setup are currently as in the original post and everything is working as expected.

Comment 11 Pranith Kumar K 2015-09-24 01:55:33 UTC
Thanks Ivo Petrov. Closing the bug as per Comment #10