+++ This bug was initially created as a clone of Bug #1379935 +++ Description of problem: ******************************* On running test cases related to AFR self heal on creating multiple files and directories , there are log errors related to gf_uuid_is_null in upcall_cache_invalidate. The errors occurs where there is file create/write operation. [2016-09-27 06:35:20.639063] E [upcall-internal.c:512:upcall_cache_invalidate] (-->/usr/lib64/glusterfs/3.8.3/xlator/features/access-control.so(+0xad09) [0x7fd90aca4d09] -->/usr/lib64/glusterfs/3.8.3/xlator/features/locks.so(+0xd4f2) [0x7fd90aa834f2] -->/usr/lib64/glusterfs/3.8.3/xlator/features/upcall.so(+0x6cae) [0x7fd90a238cae] ) 0-upcall_cache_invalidate: invalid argument: !(gf_uuid_is_null (up_inode_ctx->gfid)) [Invalid argument] [2016-09-27 06:35:22.717196] E [upcall-internal.c:512:upcall_cache_invalidate] (-->/usr/lib64/glusterfs/3.8.3/xlator/features/access-control.so(+0xad09) [0x7fd90aca4d09] -->/usr/lib64/glusterfs/3.8.3/xlator/features/locks.so(+0xd4f2) [0x7fd90aa834f2] -->/usr/lib64/glusterfs/3.8.3/xlator/features/upcall.so(+0x6cae) [0x7fd90a238cae] ) 0-upcall_cache_invalidate: invalid argument: !(gf_uuid_is_null (up_inode_ctx->gfid)) [Invalid argument] [2016-09-27 06:36:20.976194] E [upcall-internal.c:512:upcall_cache_invalidate] (-->/usr/lib64/glusterfs/3.8.3/xlator/features/access-control.so(+0xad09) [0x7fd90aca4d09] -->/usr/lib64/glusterfs/3.8.3/xlator/features/locks.so(+0xd4f2) [0x7fd90aa834f2] -->/usr/lib64/glusterfs/3.8.3/xlator/features/upcall.so(+0x6cae) [0x7fd90a238cae] ) 0-upcall_cache_invalidate: invalid argument: !(gf_uuid_is_null (up_inode_ctx->gfid)) [Invalid argument] Volume Name: vol1 Type: Distributed-Replicate Volume ID: afb432f1-55d4-463c-bd50-ba19a63561e3 Status: Started Snapshot Count: 0 Number of Bricks: 2 x 2 = 4 Transport-type: tcp Bricks: Brick1: 10.70.47.64:/mnt/brick/vol1/b1 Brick2: 10.70.47.66:/mnt/brick/vol1/b2 Brick3: 10.70.47.64:/mnt/brick/vol1/b3 Brick4: 10.70.47.66:/mnt/brick/vol1/b4 Options Reconfigured: nfs.disable: on performance.readdir-ahead: on transport.address-family: inet storage.batch-fsync-delay-usec: 0 server.allow-insecure: on performance.stat-prefetch: on features.cache-invalidation: on features.cache-invalidation-timeout: 600 performance.cache-samba-metadata: on performance.cache-invalidation: on performance.md-cache-timeout: 600 cluster.metadata-self-heal: off cluster.data-self-heal: off cluster.entry-self-heal: off cluster.self-heal-daemon: on cluster.data-self-heal-algorithm: full Version-Release number of selected component (if applicable): glusterfs-3.8.3-0.39.git97d1dde.el7.x86_64 How reproducible: Always Steps to Reproduce: 1. Create 2x2 volume, se vol options for md-cache , mount it on cifs 2. Start creating directories and files and run afr cases with brick down 3. observe logs for any errors Actual results: Lots of error messages related to upcall_cache_invalidate on create/write. Expected results: These type of error messages shouldn't be there. Additional info: --- Additional comment from Prasad Desala on 2016-10-14 09:34:05 EDT --- Same issue is seen with private glusterfs build: 3.8.4-2.26.git0a405a4.el7rhgs.x86_64 This issue is specific to md-cache upcall, with the setup in same state disabled below md-cache options (see gluster v info ouput for more info) and ERROR messages were not seen in brick logs. Enabling the md-cache started spamming the brick logs with the error messages. Steps that were performed: 1. Create a distributed replica volume and started it. 2. Enabled md-cache supported options to the volume. Please see below gluster v info for the more details on the md-cache enabled options. 3. Mounted volume on multiple clients. Simultaneosuly, from one client touch 10000 files and from another client create 10000 hardlinks for the same file. 4. Add few bricks and start rebalance. 5. Once the rebalance is completed, remove all the files on the mount point using "rm -rf". 6. Check for brick logs, for the invalid argument messages. --- Additional comment from Prasad Desala on 2016-10-14 09:34:58 EDT --- [root@dhcp42-185 ~]# gluster v status Status of volume: distrep Gluster process TCP Port RDMA Port Online Pid ------------------------------------------------------------------------------ Brick 10.70.42.185:/bricks/brick0/b0 49152 0 Y 16587 Brick 10.70.43.152:/bricks/brick0/b0 49152 0 Y 19074 Brick 10.70.42.39:/bricks/brick0/b0 49152 0 Y 19263 Brick 10.70.42.84:/bricks/brick0/b0 49152 0 Y 19630 Brick 10.70.42.185:/bricks/brick1/b1 49153 0 Y 16607 Brick 10.70.43.152:/bricks/brick1/b1 49153 0 Y 19094 Brick 10.70.42.39:/bricks/brick1/b1 49153 0 Y 19283 Brick 10.70.42.84:/bricks/brick1/b1 49153 0 Y 19650 Brick 10.70.42.185:/bricks/brick2/b2 49154 0 Y 16627 Brick 10.70.43.152:/bricks/brick2/b2 49154 0 Y 19114 Brick 10.70.42.39:/bricks/brick2/b2 49154 0 Y 19303 Brick 10.70.42.84:/bricks/brick2/b2 49154 0 Y 19670 Brick 10.70.42.185:/bricks/brick3/b3 49155 0 Y 19472 Brick 10.70.43.152:/bricks/brick3/b3 49155 0 Y 19380 NFS Server on localhost N/A N/A N N/A Self-heal Daemon on localhost N/A N/A Y 19493 NFS Server on 10.70.42.39 N/A N/A N N/A Self-heal Daemon on 10.70.42.39 N/A N/A Y 19588 NFS Server on 10.70.42.84 N/A N/A N N/A Self-heal Daemon on 10.70.42.84 N/A N/A Y 19979 NFS Server on 10.70.43.152 N/A N/A N N/A Self-heal Daemon on 10.70.43.152 N/A N/A Y 19401 Task Status of Volume distrep ------------------------------------------------------------------------------ Task : Rebalance ID : 19b1127e-246e-4afd-b59b-9690b9569122 Status : completed [root@dhcp42-185 ~]# gluster v info Volume Name: distrep Type: Distributed-Replicate Volume ID: 4ad479e4-fa01-4d91-8743-4e1510ba2c13 Status: Started Snapshot Count: 0 Number of Bricks: 7 x 2 = 14 Transport-type: tcp Bricks: Brick1: 10.70.42.185:/bricks/brick0/b0 Brick2: 10.70.43.152:/bricks/brick0/b0 Brick3: 10.70.42.39:/bricks/brick0/b0 Brick4: 10.70.42.84:/bricks/brick0/b0 Brick5: 10.70.42.185:/bricks/brick1/b1 Brick6: 10.70.43.152:/bricks/brick1/b1 Brick7: 10.70.42.39:/bricks/brick1/b1 Brick8: 10.70.42.84:/bricks/brick1/b1 Brick9: 10.70.42.185:/bricks/brick2/b2 Brick10: 10.70.43.152:/bricks/brick2/b2 Brick11: 10.70.42.39:/bricks/brick2/b2 Brick12: 10.70.42.84:/bricks/brick2/b2 Brick13: 10.70.42.185:/bricks/brick3/b3 Brick14: 10.70.43.152:/bricks/brick3/b3 Options Reconfigured: performance.md-cache-timeout: 600 performance.cache-invalidation: on performance.stat-prefetch: on features.cache-invalidation-timeout: 600 features.cache-invalidation: on transport.address-family: inet performance.readdir-ahead: on --- Additional comment from Atin Mukherjee on 2016-10-15 06:35:05 EDT --- (In reply to Prasad Desala from comment #1) > Same issue is seen with private glusterfs build: > 3.8.4-2.26.git0a405a4.el7rhgs.x86_64 Prasad - this is an upstream bug and no reference of downstream build should be mentioned here. If required, please file a downstream bug to track the issue. > > This issue is specific to md-cache upcall, with the setup in same state > disabled below md-cache options (see gluster v info ouput for more info) and > ERROR messages were not seen in brick logs. Enabling the md-cache started > spamming the brick logs with the error messages. > > Steps that were performed: > > 1. Create a distributed replica volume and started it. > 2. Enabled md-cache supported options to the volume. Please see below > gluster v info for the more details on the md-cache enabled options. > 3. Mounted volume on multiple clients. Simultaneosuly, from one client touch > 10000 files and from another client create 10000 hardlinks for the same file. > 4. Add few bricks and start rebalance. > 5. Once the rebalance is completed, remove all the files on the mount point > using "rm -rf". > 6. Check for brick logs, for the invalid argument messages. --- Additional comment from Poornima G on 2016-10-19 01:27:30 EDT --- The fix for this would be to reduce the loglevel from Error to Debug, this bug is not introduced as a part of md-cache changes. --- Additional comment from Worker Ant on 2016-11-04 06:47:12 EDT --- REVIEW: http://review.gluster.org/15777 (upcall: Fix a log level) posted (#1) for review on master by Poornima G (pgurusid)
COMMIT: http://review.gluster.org/15777 committed in master by Jeff Darcy (jdarcy) ------ commit 68d1480e6056d1be91cde5129a6809642eeee857 Author: Poornima G <pgurusid> Date: Fri Nov 4 15:56:42 2016 +0530 upcall: Fix a log level In upcall_cache_invalidation(), the gfid can be NULL in certain valid test cases(eg: entry for ".." in readdirp), hence change the log level from WARNING to DEBUG. Change-Id: Ic90167a0e2076694e9131913114460df7b939b30 BUG: 1392167 Signed-off-by: Poornima G <pgurusid> Reviewed-on: http://review.gluster.org/15777 NetBSD-regression: NetBSD Build System <jenkins.org> CentOS-regression: Gluster Build System <jenkins.org> Smoke: Gluster Build System <jenkins.org> Reviewed-by: Rajesh Joseph <rjoseph> Reviewed-by: Jeff Darcy <jdarcy>
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.10.0, please open a new bug report. glusterfs-3.10.0 has been announced on the Gluster mailinglists [1], packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailinglist [2] and the update infrastructure for your distribution. [1] http://lists.gluster.org/pipermail/gluster-users/2017-February/030119.html [2] https://www.gluster.org/pipermail/gluster-users/