Bug 1392167 - SMB[md-cache Private Build]:Error messages in brick logs related to upcall_cache_invalidate gf_uuid_is_null
Summary: SMB[md-cache Private Build]:Error messages in brick logs related to upcall_cache_invalidate gf_uuid_is_null
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: upcall
Version: mainline
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: bugs@gluster.org
QA Contact:
URL:
Whiteboard:
Depends On: 1379935
Blocks: 1389422 1394186 1394187 1394188
 
Reported: 2016-11-05 13:27 UTC by Poornima G
Modified: 2017-03-06 17:32 UTC
CC: 5 users

Fixed In Version: glusterfs-3.10.0
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1379935
Cloned As: 1394186
Environment:
Last Closed: 2017-03-06 17:32:36 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Poornima G 2016-11-05 13:27:03 UTC
+++ This bug was initially created as a clone of Bug #1379935 +++

Description of problem:
*******************************

While running AFR self-heal test cases that create multiple files and directories, the brick logs show errors related to gf_uuid_is_null in upcall_cache_invalidate.

The errors occur wherever there is a file create/write operation.

[2016-09-27 06:35:20.639063] E [upcall-internal.c:512:upcall_cache_invalidate] (-->/usr/lib64/glusterfs/3.8.3/xlator/features/access-control.so(+0xad09) [0x7fd90aca4d09] -->/usr/lib64/glusterfs/3.8.3/xlator/features/locks.so(+0xd4f2) [0x7fd90aa834f2] -->/usr/lib64/glusterfs/3.8.3/xlator/features/upcall.so(+0x6cae) [0x7fd90a238cae] ) 0-upcall_cache_invalidate: invalid argument: !(gf_uuid_is_null (up_inode_ctx->gfid)) [Invalid argument]
[2016-09-27 06:35:22.717196] E [upcall-internal.c:512:upcall_cache_invalidate] (-->/usr/lib64/glusterfs/3.8.3/xlator/features/access-control.so(+0xad09) [0x7fd90aca4d09] -->/usr/lib64/glusterfs/3.8.3/xlator/features/locks.so(+0xd4f2) [0x7fd90aa834f2] -->/usr/lib64/glusterfs/3.8.3/xlator/features/upcall.so(+0x6cae) [0x7fd90a238cae] ) 0-upcall_cache_invalidate: invalid argument: !(gf_uuid_is_null (up_inode_ctx->gfid)) [Invalid argument]
[2016-09-27 06:36:20.976194] E [upcall-internal.c:512:upcall_cache_invalidate] (-->/usr/lib64/glusterfs/3.8.3/xlator/features/access-control.so(+0xad09) [0x7fd90aca4d09] -->/usr/lib64/glusterfs/3.8.3/xlator/features/locks.so(+0xd4f2) [0x7fd90aa834f2] -->/usr/lib64/glusterfs/3.8.3/xlator/features/upcall.so(+0x6cae) [0x7fd90a238cae] ) 0-upcall_cache_invalidate: invalid argument: !(gf_uuid_is_null (up_inode_ctx->gfid)) [Invalid argument]
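
For context, the message comes from the argument-validation path in upcall_cache_invalidate(). Below is a minimal sketch of that path (simplified and approximate, not the exact source; GF_VALIDATE_OR_GOTO is the libglusterfs macro that logs "invalid argument: <expr>" at ERROR level when the expression is false):

void
upcall_cache_invalidate (call_frame_t *frame, xlator_t *this,
                         client_t *client, inode_t *inode, uint32_t flags,
                         struct iatt *stbuf, struct iatt *p_stbuf,
                         struct iatt *oldp_stbuf)
{
        upcall_inode_ctx_t *up_inode_ctx = NULL;

        /* ... the upcall context is looked up for this inode ... */

        /* If the per-inode upcall context still carries a null gfid,
         * this macro logs the "invalid argument: !(gf_uuid_is_null
         * (up_inode_ctx->gfid))" line seen above and jumps to out. */
        GF_VALIDATE_OR_GOTO ("upcall_cache_invalidate",
                             !(gf_uuid_is_null (up_inode_ctx->gfid)), out);

        /* ... cache-invalidation notifications are sent to clients ... */
out:
        return;
}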

Volume Name: vol1
Type: Distributed-Replicate
Volume ID: afb432f1-55d4-463c-bd50-ba19a63561e3
Status: Started
Snapshot Count: 0
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: 10.70.47.64:/mnt/brick/vol1/b1
Brick2: 10.70.47.66:/mnt/brick/vol1/b2
Brick3: 10.70.47.64:/mnt/brick/vol1/b3
Brick4: 10.70.47.66:/mnt/brick/vol1/b4
Options Reconfigured:
nfs.disable: on
performance.readdir-ahead: on
transport.address-family: inet
storage.batch-fsync-delay-usec: 0
server.allow-insecure: on
performance.stat-prefetch: on
features.cache-invalidation: on
features.cache-invalidation-timeout: 600
performance.cache-samba-metadata: on
performance.cache-invalidation: on
performance.md-cache-timeout: 600
cluster.metadata-self-heal: off
cluster.data-self-heal: off
cluster.entry-self-heal: off
cluster.self-heal-daemon: on
cluster.data-self-heal-algorithm: full



Version-Release number of selected component (if applicable):
glusterfs-3.8.3-0.39.git97d1dde.el7.x86_64


How reproducible:
Always 

Steps to Reproduce:
1. Create a 2x2 volume, set the md-cache volume options, and mount it over CIFS (example commands after these steps)
2. Start creating directories and files, and run AFR cases with a brick down
3. Observe the brick logs for any errors
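
The md-cache options from step 1 can be set with the standard gluster volume set command; a minimal example for the vol1 volume (the full option set is in the Options Reconfigured list above):

gluster volume set vol1 performance.stat-prefetch on
gluster volume set vol1 performance.cache-samba-metadata on
gluster volume set vol1 performance.cache-invalidation on
gluster volume set vol1 performance.md-cache-timeout 600
gluster volume set vol1 features.cache-invalidation on
gluster volume set vol1 features.cache-invalidation-timeout 600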

Actual results:
Lots of error messages related to upcall_cache_invalidate appear on create/write.


Expected results:
Error messages of this kind should not appear.

Additional info:

--- Additional comment from Prasad Desala on 2016-10-14 09:34:05 EDT ---

The same issue is seen with the private glusterfs build: 3.8.4-2.26.git0a405a4.el7rhgs.x86_64

This issue is specific to the md-cache upcall. With the setup in the same state, the md-cache options below were disabled (see the gluster v info output for more details), and the ERROR messages were not seen in the brick logs. Enabling md-cache again started spamming the brick logs with the error messages.

Steps that were performed:

1. Created a distributed-replicate volume and started it.
2. Enabled the md-cache supported options on the volume. Please see the gluster v info output below for more details on the enabled md-cache options.
3. Mounted the volume on multiple clients. Simultaneously, from one client touch 10000 files and from another client create 10000 hardlinks for the same file.
4. Add a few bricks and start rebalance.
5. Once the rebalance is completed, remove all the files on the mount point using "rm -rf".
6. Check the brick logs for the invalid-argument messages.

--- Additional comment from Prasad Desala on 2016-10-14 09:34:58 EDT ---


[root@dhcp42-185 ~]# gluster v status
Status of volume: distrep
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.42.185:/bricks/brick0/b0        49152     0          Y       16587
Brick 10.70.43.152:/bricks/brick0/b0        49152     0          Y       19074
Brick 10.70.42.39:/bricks/brick0/b0         49152     0          Y       19263
Brick 10.70.42.84:/bricks/brick0/b0         49152     0          Y       19630
Brick 10.70.42.185:/bricks/brick1/b1        49153     0          Y       16607
Brick 10.70.43.152:/bricks/brick1/b1        49153     0          Y       19094
Brick 10.70.42.39:/bricks/brick1/b1         49153     0          Y       19283
Brick 10.70.42.84:/bricks/brick1/b1         49153     0          Y       19650
Brick 10.70.42.185:/bricks/brick2/b2        49154     0          Y       16627
Brick 10.70.43.152:/bricks/brick2/b2        49154     0          Y       19114
Brick 10.70.42.39:/bricks/brick2/b2         49154     0          Y       19303
Brick 10.70.42.84:/bricks/brick2/b2         49154     0          Y       19670
Brick 10.70.42.185:/bricks/brick3/b3        49155     0          Y       19472
Brick 10.70.43.152:/bricks/brick3/b3        49155     0          Y       19380
NFS Server on localhost                     N/A       N/A        N       N/A  
Self-heal Daemon on localhost               N/A       N/A        Y       19493
NFS Server on 10.70.42.39                   N/A       N/A        N       N/A  
Self-heal Daemon on 10.70.42.39             N/A       N/A        Y       19588
NFS Server on 10.70.42.84                   N/A       N/A        N       N/A  
Self-heal Daemon on 10.70.42.84             N/A       N/A        Y       19979
NFS Server on 10.70.43.152                  N/A       N/A        N       N/A  
Self-heal Daemon on 10.70.43.152            N/A       N/A        Y       19401
 
Task Status of Volume distrep
------------------------------------------------------------------------------
Task                 : Rebalance           
ID                   : 19b1127e-246e-4afd-b59b-9690b9569122
Status               : completed           
 
[root@dhcp42-185 ~]# gluster v info
 
Volume Name: distrep
Type: Distributed-Replicate
Volume ID: 4ad479e4-fa01-4d91-8743-4e1510ba2c13
Status: Started
Snapshot Count: 0
Number of Bricks: 7 x 2 = 14
Transport-type: tcp
Bricks:
Brick1: 10.70.42.185:/bricks/brick0/b0
Brick2: 10.70.43.152:/bricks/brick0/b0
Brick3: 10.70.42.39:/bricks/brick0/b0
Brick4: 10.70.42.84:/bricks/brick0/b0
Brick5: 10.70.42.185:/bricks/brick1/b1
Brick6: 10.70.43.152:/bricks/brick1/b1
Brick7: 10.70.42.39:/bricks/brick1/b1
Brick8: 10.70.42.84:/bricks/brick1/b1
Brick9: 10.70.42.185:/bricks/brick2/b2
Brick10: 10.70.43.152:/bricks/brick2/b2
Brick11: 10.70.42.39:/bricks/brick2/b2
Brick12: 10.70.42.84:/bricks/brick2/b2
Brick13: 10.70.42.185:/bricks/brick3/b3
Brick14: 10.70.43.152:/bricks/brick3/b3
Options Reconfigured:
performance.md-cache-timeout: 600
performance.cache-invalidation: on
performance.stat-prefetch: on
features.cache-invalidation-timeout: 600
features.cache-invalidation: on
transport.address-family: inet
performance.readdir-ahead: on

--- Additional comment from Atin Mukherjee on 2016-10-15 06:35:05 EDT ---

(In reply to Prasad Desala from comment #1)
> The same issue is seen with the private glusterfs build:
> 3.8.4-2.26.git0a405a4.el7rhgs.x86_64

Prasad - this is an upstream bug and no reference of downstream build should be mentioned here. If required, please file a downstream bug to track the issue.
> 
> This issue is specific to the md-cache upcall. With the setup in the same
> state, the md-cache options below were disabled (see the gluster v info
> output for more details), and the ERROR messages were not seen in the brick
> logs. Enabling md-cache again started spamming the brick logs with the
> error messages.
> 
> Steps that were performed:
> 
> 1. Created a distributed-replicate volume and started it.
> 2. Enabled the md-cache supported options on the volume. Please see the
> gluster v info output below for more details on the enabled md-cache options.
> 3. Mounted the volume on multiple clients. Simultaneously, from one client
> touch 10000 files and from another client create 10000 hardlinks for the same file.
> 4. Add a few bricks and start rebalance.
> 5. Once the rebalance is completed, remove all the files on the mount point
> using "rm -rf".
> 6. Check the brick logs for the invalid-argument messages.

--- Additional comment from Poornima G on 2016-10-19 01:27:30 EDT ---

The fix for this would be to reduce the log level from Error to Debug; this bug was not introduced as part of the md-cache changes.
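
A minimal sketch of what such a change looks like (illustrative only; the actual patch is the review linked below, and the exact surrounding code may differ):

/* Before: the validation macro logs a null gfid at ERROR level. */
GF_VALIDATE_OR_GOTO ("upcall_cache_invalidate",
                     !(gf_uuid_is_null (up_inode_ctx->gfid)), out);

/* After (sketch): a null gfid is an expected case (e.g. the ".."
 * entry in readdirp), so log at DEBUG level and skip invalidation. */
if (gf_uuid_is_null (up_inode_ctx->gfid)) {
        gf_msg_debug (this->name, 0,
                      "gfid is NULL, skipping cache invalidation");
        goto out;
}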

--- Additional comment from Worker Ant on 2016-11-04 06:47:12 EDT ---

REVIEW: http://review.gluster.org/15777 (upcall: Fix a log level) posted (#1) for review on master by Poornima G (pgurusid)

Comment 1 Worker Ant 2016-11-08 20:44:51 UTC
COMMIT: http://review.gluster.org/15777 committed in master by Jeff Darcy (jdarcy) 
------
commit 68d1480e6056d1be91cde5129a6809642eeee857
Author: Poornima G <pgurusid>
Date:   Fri Nov 4 15:56:42 2016 +0530

    upcall: Fix a log level
    
    In upcall_cache_invalidation(), the gfid can be NULL in certain
    valid test cases(eg: entry for ".." in readdirp), hence change
    the log level from WARNING to DEBUG.
    
    Change-Id: Ic90167a0e2076694e9131913114460df7b939b30
    BUG: 1392167
    Signed-off-by: Poornima G <pgurusid>
    Reviewed-on: http://review.gluster.org/15777
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
    Smoke: Gluster Build System <jenkins.org>
    Reviewed-by: Rajesh Joseph <rjoseph>
    Reviewed-by: Jeff Darcy <jdarcy>

Comment 2 Shyamsundar 2017-03-06 17:32:36 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.10.0, please open a new bug report.

glusterfs-3.10.0 has been announced on the Gluster mailing lists [1], and packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/gluster-users/2017-February/030119.html
[2] https://www.gluster.org/pipermail/gluster-users/

