Description of problem:
========================
If I create a 1GB file from one client, the other client shows the file size as 250MB (i.e. only a quarter of the actual size, which equals the size of the file on one data brick).

I don't think caching is the only culprit: while the file is being created from c1, c2 shows the size it fetched at an older time (not the real-time size). However, once the file is completely created on c1, a stat on c2 updates the size immediately, but instead of showing the correct size it shows only 1/4th of the actual size.

I was able to reproduce this consistently on my setup.

I have a 2x(4+2) EC volume mounted on nfs-ganesha clients. The setup is as follows:

node1: 10.70.35.37   VIP: 10.70.35.50
node2: 10.70.35.116  VIP: 10.70.35.60
node3: 10.70.35.239  VIP: 10.70.35.70
node4: 10.70.35.135  ==> not part of ganesha cluster
node5: 10.70.35.8    ==> not part of ganesha cluster
node6: 10.70.35.196  VIP: 10.70.35.80

I mounted the volume using ganesha type on two clients as below:

c1: rhs-client45 (el7.2):
10.70.35.50:/disperse on /mnt/disperse type nfs4 (rw,relatime,vers=4.0,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=10.70.36.69,local_lock=none,addr=10.70.35.50)

c2: dhcp35-107 (el6.8):
10.70.35.80:disperse on /mnt/disperse type nfs (rw,vers=4,addr=10.70.35.80,clientaddr=10.70.35.107)

The same inconsistency can be observed even with md5sum.

Volume Name: disperse
Type: Distributed-Disperse
Volume ID: ea0b329a-f2ff-462d-a3a2-457a80b35e08
Status: Started
Snapshot Count: 0
Number of Bricks: 2 x (4 + 2) = 12
Transport-type: tcp
Bricks:
Brick1: 10.70.35.37:/rhs/brick2/disperse
Brick2: 10.70.35.116:/rhs/brick2/disperse
Brick3: 10.70.35.239:/rhs/brick2/disperse
Brick4: 10.70.35.135:/rhs/brick2/disperse
Brick5: 10.70.35.8:/rhs/brick2/disperse
Brick6: 10.70.35.196:/rhs/brick2/disperse
Brick7: 10.70.35.37:/rhs/brick3/disperse
Brick8: 10.70.35.116:/rhs/brick3/disperse
Brick9: 10.70.35.239:/rhs/brick3/disperse
Brick10: 10.70.35.135:/rhs/brick3/disperse
Brick11: 10.70.35.8:/rhs/brick3/disperse
Brick12: 10.70.35.196:/rhs/brick3/disperse
Options Reconfigured:
performance.md-cache-timeout: 600
performance.cache-invalidation: on
performance.stat-prefetch: on
features.cache-invalidation-timeout: 600
ganesha.enable: on
features.cache-invalidation: off
transport.address-family: inet
performance.readdir-ahead: on
nfs.disable: on
nfs-ganesha: enable
cluster.enable-shared-storage: enable

[root@dhcp35-37 log]# gluster v status disperse
Status of volume: disperse
Gluster process                                        TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.35.37:/rhs/brick2/disperse                 49153     0          Y       21634
Brick 10.70.35.116:/rhs/brick2/disperse                49153     0          Y       17741
Brick 10.70.35.239:/rhs/brick2/disperse                49153     0          Y       8056
Brick 10.70.35.135:/rhs/brick2/disperse                49154     0          Y       31840
Brick 10.70.35.8:/rhs/brick2/disperse                  49154     0          Y       31773
Brick 10.70.35.196:/rhs/brick2/disperse                49152     0          Y       6865
Brick 10.70.35.37:/rhs/brick3/disperse                 49154     0          Y       21654
Brick 10.70.35.116:/rhs/brick3/disperse                49154     0          Y       17761
Brick 10.70.35.239:/rhs/brick3/disperse                49154     0          Y       8076
Brick 10.70.35.135:/rhs/brick3/disperse                49155     0          Y       31860
Brick 10.70.35.8:/rhs/brick3/disperse                  49155     0          Y       31793
Brick 10.70.35.196:/rhs/brick3/disperse                49153     0          Y       6885
Self-heal Daemon on localhost                          N/A       N/A        Y       21675
Self-heal Daemon on dhcp35-8.lab.eng.blr.redhat.com    N/A       N/A        Y       31815
Self-heal Daemon on dhcp35-196.lab.eng.blr.redhat.com  N/A       N/A        Y       6914
Self-heal Daemon on dhcp35-135.lab.eng.blr.redhat.com  N/A       N/A        Y       31881
Self-heal Daemon on dhcp35-116.lab.eng.blr.redhat.com  N/A       N/A        Y       17794
Self-heal Daemon on dhcp35-239.lab.eng.blr.redhat.com  N/A       N/A        Y       8121

Task Status of Volume disperse
------------------------------------------------------------------------------
There are no active volume tasks

Version-Release number of selected component (if applicable):
=====================================
[root@dhcp35-37 log]# rpm -qa|grep gluster
glusterfs-geo-replication-3.8.4-10.el7rhgs.x86_64
nfs-ganesha-gluster-2.4.1-3.el7rhgs.x86_64
glusterfs-3.8.4-10.el7rhgs.x86_64
python-gluster-3.8.4-10.el7rhgs.noarch
glusterfs-cli-3.8.4-10.el7rhgs.x86_64
glusterfs-rdma-3.8.4-10.el7rhgs.x86_64
glusterfs-api-3.8.4-10.el7rhgs.x86_64
glusterfs-server-3.8.4-10.el7rhgs.x86_64
glusterfs-api-devel-3.8.4-10.el7rhgs.x86_64
glusterfs-libs-3.8.4-10.el7rhgs.x86_64
glusterfs-devel-3.8.4-10.el7rhgs.x86_64
glusterfs-fuse-3.8.4-10.el7rhgs.x86_64
glusterfs-ganesha-3.8.4-10.el7rhgs.x86_64
glusterfs-client-xlators-3.8.4-10.el7rhgs.x86_64
glusterfs-events-3.8.4-10.el7rhgs.x86_64

Steps to Reproduce:
=====================
1. Create a distributed-disperse (EC) volume and mount it on two different clients using different VIPs of the ganesha cluster.
2. From c1, start creating a file of, say, 1GB.
3. While the file is being created, do a lookup on client c2.
4. Once c1 completes, immediately stat the file on c1; it will show the size as 1GB.
5. Now stat it on c2; it will show 250MB (instead of 1GB).

This is wrong, as it will lead to inconsistent data for applications relying on these details. Even an md5sum is calculated over the cached 250MB file, which is wrong.

Note that the file updates to the correct data after some time (hence I am not sure if the mdcaching time settings are playing a role here).
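As a sanity check on the numbers above (an illustration, not gluster code): in a (4+2) disperse volume each file is striped across 4 data bricks, so one brick's fragment is roughly a quarter of the file, which matches the ~250MB that c2 reports for a 1GB file.

```python
# Illustration: on a 2 x (4+2) disperse volume, each file is split into
# 4 data fragments (plus 2 redundancy fragments), so a single data
# brick holds roughly file_size / 4.
data_bricks = 4            # the "4" in "2 x (4 + 2)"
file_size_mb = 1024        # the 1GB file written from c1
fragment_mb = file_size_mb // data_bricks
print(fragment_mb)         # 256, i.e. the ~250MB size c2 reports
```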
After disabling the mdcache settings I no longer see this problem, so in all likelihood this is an mdcache issue.
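For reference, the mdcache-related options involved are the ones listed under "Options Reconfigured" in the volume info above; a sketch of reverting them (the values shown are illustrative, not the commands actually run):

```shell
# Sketch: turn off the md-cache related options that were enabled on
# the volume (option names as listed under "Options Reconfigured").
gluster volume set disperse performance.stat-prefetch off
gluster volume set disperse performance.cache-invalidation off
gluster volume set disperse performance.md-cache-timeout 1
```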
I am seeing this issue even with the fuse protocol, hence changing the title to something more generic (removing ganesha). This is seen with mdcache settings only.

Also, note the reason this can escape observation with the fuse protocol: whenever we do a lookup (say an `ll`) of the file while the dd is happening from the other client, the fuse mount returns up-to-date information. Only once the dd completes does the `ll` reflect just the size of the file on one brick. Whereas with ganesha, the `ll` does not always serve updated size information, probably due to ganesha's own caching, so the discrepancy is easier for the human eye to catch.
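The staleness window described above can be modeled as a simple timeout-based stat cache (a toy sketch of md-cache-style behavior, not actual gluster code; all names here are made up):

```python
class StatCache:
    """Toy model of a timeout-based stat cache (md-cache-like)."""

    def __init__(self, timeout):
        self.timeout = timeout   # seconds, cf. performance.md-cache-timeout: 600
        self.entry = None        # (size, time_cached)

    def lookup(self, fetch_real_size, now):
        # Within the timeout window, serve the cached (possibly stale)
        # size instead of consulting the bricks.
        if self.entry and now - self.entry[1] < self.timeout:
            return self.entry[0]
        size = fetch_real_size()
        self.entry = (size, now)
        return size

cache = StatCache(timeout=600)
cache.lookup(lambda: 250, now=0)            # stat during the dd caches 250MB
print(cache.lookup(lambda: 1024, now=10))   # 250: stale size served from cache
print(cache.lookup(lambda: 1024, now=700))  # 1024: cache expired, refetched
```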
Upstream mainline patch http://review.gluster.org/16329 has been posted for review.
https://code.engineering.redhat.com/gerrit/#/c/94317/
QATP:
Ran the testcase which was mentioned while raising the bug; the case passed.

Other cases:
Saw a problem with truncate of a file setting the atime to the epoch start time; raised a new bug:
https://bugzilla.redhat.com/show_bug.cgi?id=1411277 - truncate setting access time of a file (atime) to unix epoch beginning time (01-01-1970)

Also noticed some atime inconsistencies like the following on the ganesha setup with mdcache settings enabled: when we access the file, the atime changes on the local machine but is not saved on the brick side, hence the atime is lost later. I will discuss and raise a new bug accordingly.

Moving to verified.

[root@dhcp35-37 ~]# gluster v status disperse
Status of volume: disperse
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.35.37:/rhs/brick2/disperse      49153     0          Y       20857
Brick 10.70.35.116:/rhs/brick2/disperse     49153     0          Y       18063
Brick 10.70.35.239:/rhs/brick2/disperse     49153     0          Y       18140
Brick 10.70.35.135:/rhs/brick2/disperse     49154     0          Y       4732
Brick 10.70.35.8:/rhs/brick2/disperse       49154     0          Y       4737
Brick 10.70.35.196:/rhs/brick2/disperse     49152     0          Y       17851
Brick 10.70.35.37:/rhs/brick3/disperse      49154     0          Y       20877
Brick 10.70.35.116:/rhs/brick3/disperse     49154     0          Y       18083
Brick 10.70.35.239:/rhs/brick3/disperse     49154     0          Y       18160
Brick 10.70.35.135:/rhs/brick3/disperse     49155     0          Y       4752
Brick 10.70.35.8:/rhs/brick3/disperse       49155     0          Y       4757
Brick 10.70.35.196:/rhs/brick3/disperse     49153     0          Y       17871
Self-heal Daemon on localhost               N/A       N/A        Y       8229
Self-heal Daemon on 10.70.35.135            N/A       N/A        Y       5022
Self-heal Daemon on 10.70.35.196            N/A       N/A        Y       5010
Self-heal Daemon on 10.70.35.239            N/A       N/A        Y       5265
Self-heal Daemon on 10.70.35.8              N/A       N/A        Y       5042
Self-heal Daemon on 10.70.35.116            N/A       N/A        Y       5273

Task Status of Volume disperse
------------------------------------------------------------------------------
There are no active volume tasks

[root@dhcp35-37 ~]# gluster v info disperse

Volume Name: disperse
Type: Distributed-Disperse
Volume ID: 53f82e48-d215-4301-be59-5f3e58f8b846
Status: Started
Snapshot Count: 0
Number of Bricks: 2 x (4 + 2) = 12
Transport-type: tcp
Bricks:
Brick1: 10.70.35.37:/rhs/brick2/disperse
Brick2: 10.70.35.116:/rhs/brick2/disperse
Brick3: 10.70.35.239:/rhs/brick2/disperse
Brick4: 10.70.35.135:/rhs/brick2/disperse
Brick5: 10.70.35.8:/rhs/brick2/disperse
Brick6: 10.70.35.196:/rhs/brick2/disperse
Brick7: 10.70.35.37:/rhs/brick3/disperse
Brick8: 10.70.35.116:/rhs/brick3/disperse
Brick9: 10.70.35.239:/rhs/brick3/disperse
Brick10: 10.70.35.135:/rhs/brick3/disperse
Brick11: 10.70.35.8:/rhs/brick3/disperse
Brick12: 10.70.35.196:/rhs/brick3/disperse
Options Reconfigured:
cluster.shd-max-threads: 5
performance.md-cache-timeout: 600
performance.cache-invalidation: on
performance.stat-prefetch: on
features.cache-invalidation-timeout: 600
ganesha.enable: on
features.cache-invalidation: off
transport.address-family: inet
performance.readdir-ahead: on
nfs.disable: on
nfs-ganesha: enable
cluster.enable-shared-storage: enable

[root@dhcp35-37 ~]# rpm -qa|egrep "ganesha|gluster"
glusterfs-events-3.8.4-11.el7rhgs.x86_64
glusterfs-rdma-3.8.4-11.el7rhgs.x86_64
glusterfs-api-3.8.4-11.el7rhgs.x86_64
glusterfs-server-3.8.4-11.el7rhgs.x86_64
nfs-ganesha-gluster-2.4.1-3.el7rhgs.x86_64
nfs-ganesha-debuginfo-2.4.1-3.el7rhgs.x86_64
glusterfs-libs-3.8.4-11.el7rhgs.x86_64
glusterfs-cli-3.8.4-11.el7rhgs.x86_64
nfs-ganesha-2.4.1-3.el7rhgs.x86_64
glusterfs-3.8.4-11.el7rhgs.x86_64
glusterfs-fuse-3.8.4-11.el7rhgs.x86_64
glusterfs-debuginfo-3.8.4-11.el7rhgs.x86_64
glusterfs-client-xlators-3.8.4-11.el7rhgs.x86_64
python-gluster-3.8.4-11.el7rhgs.noarch
glusterfs-ganesha-3.8.4-11.el7rhgs.x86_64
glusterfs-geo-replication-3.8.4-11.el7rhgs.x86_64
[root@dhcp35-37 ~]#
TestCaseId: RHG3-11784
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHSA-2017-0486.html