Bug 1670253 - Writes on Gluster 5 volumes fail with EIO when "cluster.consistent-metadata" is set
Summary: Writes on Gluster 5 volumes fail with EIO when "cluster.consistent-metadata" is set
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: core
Version: mainline
Hardware: Unspecified
OS: Linux
Target Milestone: ---
Assignee: Ravishankar N
QA Contact:
URL:
Whiteboard:
Depends On: 1665145
Blocks:
 
Reported: 2019-01-29 04:15 UTC by Ravishankar N
Modified: 2019-12-30 22:06 UTC

Fixed In Version: glusterfs-6.0
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1665145
Environment:
Last Closed: 2019-01-31 06:18:47 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:


Attachments: None


Links
System: Gluster.org Gerrit | ID: 22110 | Priority: None | Status: Merged | Summary: readdir-ahead: do not zero-out iatt in fop cbk | Last Updated: 2019-01-31 06:18:46 UTC

Description Ravishankar N 2019-01-29 04:15:10 UTC
+++ This bug was initially created as a clone of Bug #1665145 +++

After upgrading a Gluster cluster from version 3.12.15 to 5.2, writes to volumes with "cluster.consistent-metadata" enabled fail with EIO. Reproduction steps follow.

## Package information

Package source: http://mirror.centos.org/centos/7/storage/x86_64/gluster-5/

# gluster --version
glusterfs 5.2

# lsb_release -a
LSB Version:    :core-4.1-amd64:core-4.1-noarch:cxx-4.1-amd64:cxx-4.1-noarch:desktop-4.1-amd64:desktop-4.1-noarch:languages-4.1-amd64:languages-4.1-noarch:printing-4.1-amd64:printing-4.1-noarch
Distributor ID: RedHatEnterpriseServer
Description:    Red Hat Enterprise Linux Server release 7.6 (Maipo)
Release:        7.6
Codename:       Maipo

## Cluster information

Peers:

* storage1
* storage2
* storage3

# gluster peer status
Number of Peers: 2

Hostname: storage2
Uuid: 0acf152f-4cba-4757-bdab-113c34333eeb
State: Peer in Cluster (Connected)

Hostname: storage3
Uuid: c0302a36-149a-44ca-984f-8120a1fa360a
State: Peer in Cluster (Connected)

## Reproduction steps

# On each storage server
mkdir /data/testpv1

# Create volume
gluster volume create testpv1 replica 3 storage1:/data/testpv1/brick storage2:/data/testpv1/brick storage3:/data/testpv1/brick force
gluster volume start testpv1

# Mount volume on a client
mount -t glusterfs -o log-level=DEBUG storage1:/testpv1 /mnt/gluster

# Writes succeed
$ echo foo > /mnt/gluster/a; ls -l /mnt/gluster/a
-rw-r--r--. 1 root root 4 10. Jan 15:53 /mnt/gluster/a

# Enable "cluster.consistent-metadata"
gluster volume set testpv1 cluster.consistent-metadata on

# Writes fail with EIO after truncation
$ echo foo > /mnt/gluster/a; ls -l /mnt/gluster/a
bash: /mnt/gluster/a: Input/output error
-rw-r--r--. 1 root root 0 10. Jan 15:53 /mnt/gluster/a

# Disable "cluster.consistent-metadata" again
gluster volume set testpv1 cluster.consistent-metadata off

# Writes work again
$ echo foo > /mnt/gluster/a; ls -l /mnt/gluster/a
-rw-r--r--. 1 root root 4 10. Jan 15:54 /mnt/gluster/a
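
The option state can also be confirmed on any of the peers between steps; a minimal check, assuming the testpv1 volume created above:

# Query the effective value of the option
gluster volume get testpv1 cluster.consistent-metadata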

# Debug logs from FUSE client

[2019-01-10 14:56:30.725669] D [MSGID: 0] [dht-common.c:1559:dht_revalidate_cbk] 0-testpv1-dht: revalidate lookup of / returned with op_ret 0 [Structure needs cleaning]
[2019-01-10 14:56:30.725747] D [MSGID: 0] [dht-common.c:1648:dht_revalidate_cbk] 0-testpv1-dht: internal xattr trusted.glusterfs.dht.mds is not present on path / gfid is 00000000-0000-0000-0000-000000000001
[2019-01-10 14:56:30.726483] D [write-behind.c:1764:wb_process_queue] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x13b)[0x7fcad4e13f1b] (--> /usr/lib64/glusterfs/5.2/xlator/performance/write-behind.so(+0x9a0b)[0x7fcac6b20a0b] (--> /usr/lib64/glusterfs/5.2/xlator/performance/write-behind.so(+0xd288)[0x7fcac6b24288] (--> /lib64/libglusterfs.so.0(default_lookup+0xdd)[0x7fcad4ea5c1d] (--> /lib64/libglusterfs.so.0(default_lookup+0xdd)[0x7fcad4ea5c1d] ))))) 0-testpv1-write-behind: processing queues
[2019-01-10 14:56:30.726519] D [MSGID: 0] [write-behind.c:1710:__wb_pick_winds] 0-testpv1-write-behind: (unique=24, fop=LOOKUP, gfid=60fcb860-2470-45b4-9060-853979a8bc4d, gen=0): picking the request for winding
[2019-01-10 14:56:30.726572] D [MSGID: 0] [dht-common.c:3374:dht_do_revalidate] 0-testpv1-dht: path = /a. Calling revalidate lookup on testpv1-replicate-0
[2019-01-10 14:56:30.726939] D [write-behind.c:419:__wb_request_unref] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x13b)[0x7fcad4e13f1b] (--> /usr/lib64/glusterfs/5.2/xlator/performance/write-behind.so(+0x26b2)[0x7fcac6b196b2] (--> /usr/lib64/glusterfs/5.2/xlator/performance/write-behind.so(+0x504c)[0x7fcac6b1c04c] (--> /usr/lib64/glusterfs/5.2/xlator/performance/write-behind.so(+0x98c1)[0x7fcac6b208c1] (--> /usr/lib64/glusterfs/5.2/xlator/performance/write-behind.so(+0x99cb)[0x7fcac6b209cb] ))))) 0-testpv1-write-behind: (unique = 24, fop=LOOKUP, gfid=60fcb860-2470-45b4-9060-853979a8bc4d, gen=0): destroying request, removing from all queues
[2019-01-10 14:56:30.727875] D [MSGID: 0] [dht-common.c:1559:dht_revalidate_cbk] 0-testpv1-dht: revalidate lookup of /a returned with op_ret 0
[2019-01-10 14:56:30.728405] D [write-behind.c:1764:wb_process_queue] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x13b)[0x7fcad4e13f1b] (--> /usr/lib64/glusterfs/5.2/xlator/performance/write-behind.so(+0x9a0b)[0x7fcac6b20a0b] (--> /usr/lib64/glusterfs/5.2/xlator/performance/write-behind.so(+0xc268)[0x7fcac6b23268] (--> /usr/lib64/glusterfs/5.2/xlator/performance/read-ahead.so(+0x7331)[0x7fcac690e331] (--> /usr/lib64/glusterfs/5.2/xlator/performance/readdir-ahead.so(+0x44f2)[0x7fcac66fb4f2] ))))) 0-testpv1-write-behind: processing queues
[2019-01-10 14:56:30.728442] D [MSGID: 0] [write-behind.c:1710:__wb_pick_winds] 0-testpv1-write-behind: (unique=27, fop=TRUNCATE, gfid=60fcb860-2470-45b4-9060-853979a8bc4d, gen=0): picking the request for winding
[2019-01-10 14:56:30.728722] D [write-behind.c:419:__wb_request_unref] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x13b)[0x7fcad4e13f1b] (--> /usr/lib64/glusterfs/5.2/xlator/performance/write-behind.so(+0x26b2)[0x7fcac6b196b2] (--> /usr/lib64/glusterfs/5.2/xlator/performance/write-behind.so(+0x504c)[0x7fcac6b1c04c] (--> /usr/lib64/glusterfs/5.2/xlator/performance/write-behind.so(+0x98c1)[0x7fcac6b208c1] (--> /usr/lib64/glusterfs/5.2/xlator/performance/write-behind.so(+0x99cb)[0x7fcac6b209cb] ))))) 0-testpv1-write-behind: (unique = 27, fop=TRUNCATE, gfid=60fcb860-2470-45b4-9060-853979a8bc4d, gen=0): destroying request, removing from all queues
[2019-01-10 14:56:30.729207] D [MSGID: 0] [afr-transaction.c:2070:afr_post_nonblocking_inodelk_cbk] 0-testpv1-replicate-0: Non blocking inodelks done. Proceeding to FOP
[2019-01-10 14:56:30.729245] D [MSGID: 101016] [glusterfs3.h:746:dict_to_xdr] 0-dict: key 'trusted.afr.dirty' would not be sent on wire in the future [Invalid argument]
[2019-01-10 14:56:30.729271] D [MSGID: 101016] [glusterfs3.h:746:dict_to_xdr] 0-dict: key 'trusted.afr.testpv1-client-2' would not be sent on wire in the future [Invalid argument]
[2019-01-10 14:56:30.729286] D [MSGID: 101016] [glusterfs3.h:746:dict_to_xdr] 0-dict: key 'trusted.afr.testpv1-client-1' would not be sent on wire in the future [Invalid argument]
[2019-01-10 14:56:30.729297] D [MSGID: 101016] [glusterfs3.h:746:dict_to_xdr] 0-dict: key 'trusted.afr.testpv1-client-0' would not be sent on wire in the future [Invalid argument]
[2019-01-10 14:56:30.729345] D [MSGID: 101016] [glusterfs3.h:746:dict_to_xdr] 0-dict: key 'trusted.afr.dirty' would not be sent on wire in the future [Invalid argument]
[2019-01-10 14:56:30.729368] D [MSGID: 101016] [glusterfs3.h:746:dict_to_xdr] 0-dict: key 'trusted.afr.testpv1-client-2' would not be sent on wire in the future [Invalid argument]
[2019-01-10 14:56:30.729380] D [MSGID: 101016] [glusterfs3.h:746:dict_to_xdr] 0-dict: key 'trusted.afr.testpv1-client-1' would not be sent on wire in the future [Invalid argument]
[2019-01-10 14:56:30.729396] D [MSGID: 101016] [glusterfs3.h:746:dict_to_xdr] 0-dict: key 'trusted.afr.testpv1-client-0' would not be sent on wire in the future [Invalid argument]
[2019-01-10 14:56:30.729435] D [MSGID: 101016] [glusterfs3.h:746:dict_to_xdr] 0-dict: key 'trusted.afr.dirty' would not be sent on wire in the future [Invalid argument]
[2019-01-10 14:56:30.729450] D [MSGID: 101016] [glusterfs3.h:746:dict_to_xdr] 0-dict: key 'trusted.afr.testpv1-client-2' would not be sent on wire in the future [Invalid argument]
[2019-01-10 14:56:30.729461] D [MSGID: 101016] [glusterfs3.h:746:dict_to_xdr] 0-dict: key 'trusted.afr.testpv1-client-1' would not be sent on wire in the future [Invalid argument]
[2019-01-10 14:56:30.729472] D [MSGID: 101016] [glusterfs3.h:746:dict_to_xdr] 0-dict: key 'trusted.afr.testpv1-client-0' would not be sent on wire in the future [Invalid argument]
[2019-01-10 14:56:30.730173] D [MSGID: 0] [client-rpc-fops_v2.c:1553:client4_0_xattrop_cbk] 0-testpv1-client-0: resetting op_ret to 0 from 0
[2019-01-10 14:56:30.730289] D [MSGID: 0] [client-rpc-fops_v2.c:1553:client4_0_xattrop_cbk] 0-testpv1-client-2: resetting op_ret to 0 from 0
[2019-01-10 14:56:30.730516] D [MSGID: 0] [client-rpc-fops_v2.c:1553:client4_0_xattrop_cbk] 0-testpv1-client-1: resetting op_ret to 0 from 0
[2019-01-10 14:56:30.731446] D [MSGID: 101016] [glusterfs3.h:746:dict_to_xdr] 0-dict: key 'trusted.afr.dirty' would not be sent on wire in the future [Invalid argument]
[2019-01-10 14:56:30.731483] D [MSGID: 101016] [glusterfs3.h:746:dict_to_xdr] 0-dict: key 'trusted.afr.testpv1-client-2' would not be sent on wire in the future [Invalid argument]
[2019-01-10 14:56:30.731499] D [MSGID: 101016] [glusterfs3.h:746:dict_to_xdr] 0-dict: key 'trusted.afr.testpv1-client-1' would not be sent on wire in the future [Invalid argument]
[2019-01-10 14:56:30.731513] D [MSGID: 101016] [glusterfs3.h:746:dict_to_xdr] 0-dict: key 'trusted.afr.testpv1-client-0' would not be sent on wire in the future [Invalid argument]
[2019-01-10 14:56:30.731582] D [MSGID: 101016] [glusterfs3.h:746:dict_to_xdr] 0-dict: key 'trusted.afr.dirty' would not be sent on wire in the future [Invalid argument]
[2019-01-10 14:56:30.731609] D [MSGID: 101016] [glusterfs3.h:746:dict_to_xdr] 0-dict: key 'trusted.afr.testpv1-client-2' would not be sent on wire in the future [Invalid argument]
[2019-01-10 14:56:30.731624] D [MSGID: 101016] [glusterfs3.h:746:dict_to_xdr] 0-dict: key 'trusted.afr.testpv1-client-1' would not be sent on wire in the future [Invalid argument]
[2019-01-10 14:56:30.731637] D [MSGID: 101016] [glusterfs3.h:746:dict_to_xdr] 0-dict: key 'trusted.afr.testpv1-client-0' would not be sent on wire in the future [Invalid argument]
[2019-01-10 14:56:30.731680] D [MSGID: 101016] [glusterfs3.h:746:dict_to_xdr] 0-dict: key 'trusted.afr.dirty' would not be sent on wire in the future [Invalid argument]
[2019-01-10 14:56:30.731710] D [MSGID: 101016] [glusterfs3.h:746:dict_to_xdr] 0-dict: key 'trusted.afr.testpv1-client-2' would not be sent on wire in the future [Invalid argument]
[2019-01-10 14:56:30.731726] D [MSGID: 101016] [glusterfs3.h:746:dict_to_xdr] 0-dict: key 'trusted.afr.testpv1-client-1' would not be sent on wire in the future [Invalid argument]
[2019-01-10 14:56:30.731740] D [MSGID: 101016] [glusterfs3.h:746:dict_to_xdr] 0-dict: key 'trusted.afr.testpv1-client-0' would not be sent on wire in the future [Invalid argument]
[2019-01-10 14:56:30.732161] D [MSGID: 0] [client-rpc-fops_v2.c:1553:client4_0_xattrop_cbk] 0-testpv1-client-0: resetting op_ret to 0 from 0
[2019-01-10 14:56:30.732480] D [MSGID: 0] [client-rpc-fops_v2.c:1553:client4_0_xattrop_cbk] 0-testpv1-client-2: resetting op_ret to 0 from 0
[2019-01-10 14:56:30.732548] D [MSGID: 0] [client-rpc-fops_v2.c:1553:client4_0_xattrop_cbk] 0-testpv1-client-1: resetting op_ret to 0 from 0
[2019-01-10 14:56:30.735749] D [write-behind.c:1764:wb_process_queue] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x13b)[0x7fcad4e13f1b] (--> /usr/lib64/glusterfs/5.2/xlator/performance/write-behind.so(+0x9a0b)[0x7fcac6b20a0b] (--> /usr/lib64/glusterfs/5.2/xlator/performance/write-behind.so(+0xd288)[0x7fcac6b24288] (--> /lib64/libglusterfs.so.0(default_lookup+0xdd)[0x7fcad4ea5c1d] (--> /lib64/libglusterfs.so.0(default_lookup+0xdd)[0x7fcad4ea5c1d] ))))) 0-testpv1-write-behind: processing queues
[2019-01-10 14:56:30.735787] D [MSGID: 0] [write-behind.c:1710:__wb_pick_winds] 0-testpv1-write-behind: (unique=29, fop=LOOKUP, gfid=60fcb860-2470-45b4-9060-853979a8bc4d, gen=0): picking the request for winding
[2019-01-10 14:56:30.735827] D [MSGID: 0] [dht-common.c:3374:dht_do_revalidate] 0-testpv1-dht: path = /a. Calling revalidate lookup on testpv1-replicate-0
[2019-01-10 14:56:30.736354] D [write-behind.c:419:__wb_request_unref] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x13b)[0x7fcad4e13f1b] (--> /usr/lib64/glusterfs/5.2/xlator/performance/write-behind.so(+0x26b2)[0x7fcac6b196b2] (--> /usr/lib64/glusterfs/5.2/xlator/performance/write-behind.so(+0x504c)[0x7fcac6b1c04c] (--> /usr/lib64/glusterfs/5.2/xlator/performance/write-behind.so(+0x98c1)[0x7fcac6b208c1] (--> /usr/lib64/glusterfs/5.2/xlator/performance/write-behind.so(+0x99cb)[0x7fcac6b209cb] ))))) 0-testpv1-write-behind: (unique = 29, fop=LOOKUP, gfid=60fcb860-2470-45b4-9060-853979a8bc4d, gen=0): destroying request, removing from all queues
[2019-01-10 14:56:30.738253] D [MSGID: 0] [dht-common.c:1559:dht_revalidate_cbk] 0-testpv1-dht: revalidate lookup of /a returned with op_ret 0
[2019-01-10 14:56:30.738311] D [fuse-bridge.c:53:fuse_invalidate] 0-fuse: Invalidate inode id 140508580080760.
[2019-01-10 14:56:30.738390] E [fuse-bridge.c:219:check_and_dump_fuse_W] 0-glusterfs-fuse: writing to fuse device failed: No such file or directory

Comment 1 Worker Ant 2019-01-29 04:20:28 UTC
REVIEW: https://review.gluster.org/22110 (readdir-ahead: do not zero-out iatt in fop cbk) posted (#1) for review on master by Ravishankar N

Comment 2 Worker Ant 2019-01-31 06:18:47 UTC
REVIEW: https://review.gluster.org/22110 (readdir-ahead: do not zero-out iatt in fop cbk) merged (#3) on master by Raghavendra G

Comment 3 Shyamsundar 2019-03-25 16:33:11 UTC
This bug is being closed because a release that should address the reported issue is now available. If the problem is still not fixed with glusterfs-6.0, please open a new bug report.

glusterfs-6.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] https://lists.gluster.org/pipermail/announce/2019-March/000120.html
[2] https://www.gluster.org/pipermail/gluster-users/

Comment 4 Florijan Hamzic 2019-11-22 21:48:53 UTC
I am running GlusterFS 7.0-1 on Debian. The transfer of small files is very slow, and I found this in the logs:

[2019-11-22 21:45:40.490618] E [fuse-bridge.c:227:check_and_dump_fuse_W] (--> /usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_callingfn+0x138)[0x7fed163d9da8] (--> /usr/lib/x86_64-linux-gnu/glusterfs/7.0/xlator/mount/fuse.so(+0x7bca)[0x7fed13b07bca] (--> /usr/lib/x86_64-linux-gnu/glusterfs/7.0/xlator/mount/fuse.so(+0x7d33)[0x7fed13b07d33] (--> /lib/x86_64-linux-gnu/libpthread.so.0(+0x74a4)[0x7fed156334a4] (--> /lib/x86_64-linux-gnu/libc.so.6(clone+0x3f)[0x7fed14edbd0f] ))))) 0-glusterfs-fuse: writing to fuse device failed: No such file or directory

Comment 5 Mikhail 2019-11-27 09:18:35 UTC
I have the same issue under CentOS 7.7.1908.

[2019-11-27 09:11:18.145047] E [fuse-bridge.c:220:check_and_dump_fuse_W] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x13b)[0x7fbc7c8cbadb] (--> /usr/lib64/glusterfs/6.6/xlator/mount/fuse.so(+0x8211)[0x7fbc73c76211] (--> /usr/lib64/glusterfs/6.6/xlator/mount/fuse.so(+0x8aea)[0x7fbc73c76aea] (--> /lib64/libpthread.so.0(+0x7e65)[0x7fbc7b709e65] (--> /lib64/libc.so.6(clone+0x6d)[0x7fbc7afcf88d] ))))) 0-glusterfs-fuse: writing to fuse device failed: No such file or directory

Comment 6 Ravishankar N 2019-11-27 12:07:26 UTC
Florijan, Mikhail, please raise a separate bug with all the details, such as the Gluster version, logs, and steps to reproduce. This bug and its fix were specific to the issue given in the bug's description.
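
For reference, a minimal sketch of how those details are usually gathered (the client log file name below is a placeholder; it is normally derived from the mount point, and paths may differ by distribution):

# Version and volume layout/health
gluster --version
gluster volume info
gluster volume status

# FUSE client log on the affected client
tail -n 200 /var/log/glusterfs/<mount-point>.log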

Comment 7 Arkadiy Night 2019-12-30 21:08:09 UTC
Ravishankar, I just updated my Gluster servers and clients to 6.6 on CentOS 7 and see this in my client logs:

[2019-12-30 21:03:13.865357] E [fuse-bridge.c:220:check_and_dump_fuse_W] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x13b)[0x7f2921721adb] (--> /usr/lib64/glusterfs/6.6/xlator/mount/fuse.so(+0x8211)[0x7f291eff6211] (--> /usr/lib64/glusterfs/6.6/xlator/mount/fuse.so(+0x8aea)[0x7f291eff6aea] (--> /lib64/libpthread.so.0(+0x7e65)[0x7f292055fe65] (--> /lib64/libc.so.6(clone+0x6d)[0x7f291fe2588d] ))))) 0-glusterfs-fuse: writing to fuse device failed: No such file or directory
[2019-12-30 21:03:16.039712] E [fuse-bridge.c:220:check_and_dump_fuse_W] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x13b)[0x7f2921721adb] (--> /usr/lib64/glusterfs/6.6/xlator/mount/fuse.so(+0x8211)[0x7f291eff6211] (--> /usr/lib64/glusterfs/6.6/xlator/mount/fuse.so(+0x8aea)[0x7f291eff6aea] (--> /lib64/libpthread.so.0(+0x7e65)[0x7f292055fe65] (--> /lib64/libc.so.6(clone+0x6d)[0x7f291fe2588d] ))))) 0-glusterfs-fuse: writing to fuse device failed: No such file or directory

Unfortunately, I have no idea how to reproduce this or what exactly causes this.

Comment 8 Arkadiy Night 2019-12-30 21:30:17 UTC
I just updated one of the clients to 7.1 and this problem happens there as well:

[2019-12-30 21:18:03.778578] E [fuse-bridge.c:227:check_and_dump_fuse_W] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x13a)[0x7f336e6de8ea] (--> /usr/lib64/glusterfs/7.1/xlator/mount/fuse.so(+0x8221)[0x7f3365a8c221] (--> /usr/lib64/glusterfs/7.1/xlator/mount/fuse.so(+0x8b3a)[0x7f3365a8cb3a] (--> /lib64/libpthread.so.0(+0x7e65)[0x7f336d51fe65] (--> /lib64/libc.so.6(clone+0x6d)[0x7f336cde588d] ))))) 0-glusterfs-fuse: writing to fuse device failed: No such file or directory

Comment 9 Arkadiy Night 2019-12-30 22:06:27 UTC
I rolled back the whole cluster and clients to 5.11 and the problem is gone.

