Bug 1368842

Summary: Applications not calling glfs_h_poll_upcall() have upcall events cached for no use
Product: [Community] GlusterFS
Reporter: Niels de Vos <ndevos>
Component: libgfapi
Assignee: Niels de Vos <ndevos>
Status: CLOSED CURRENTRELEASE
QA Contact: Sudhir D <sdharane>
Severity: medium
Priority: medium
Version: mainline
CC: bugs, pgurusid
Keywords: Triaged
Hardware: Unspecified
OS: Unspecified
Fixed In Version: glusterfs-3.9.0
Clone Of: 1368841
Clones: 1368843 (view as bug list)
Last Closed: 2017-03-27 18:19:57 UTC
Type: Bug
Bug Blocks: 1211863, 1368841, 1368843

Description Niels de Vos 2016-08-21 19:38:30 UTC
+++ This bug was initially created as a clone of Bug #1368841 +++

Description of problem:
When a volume has upcalls (features.cache-invalidation) enabled, but the application does not call glfs_h_poll_upcall(), the upcall events accumulate in a list that is tied to the 'struct glfs'. The application might not be interested in the events at all, yet the list keeps growing indefinitely. This manifests as a memory leak.
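
For context, an application that does consume events drains the queue roughly like the sketch below. This is a minimal illustration, assuming the glfs_upcall API of current libgfapi releases (3.7-era releases passed a 'struct callback_arg' instead); the handling shown is hypothetical.

    #include <stdio.h>
    #include <glusterfs/api/glfs.h>
    #include <glusterfs/api/glfs-handles.h>

    /* Drain any pending upcall events; link with -lgfapi. */
    static void drain_upcalls(glfs_t *fs)
    {
            struct glfs_upcall *up = NULL;

            /* polling is what signals interest in upcall events */
            while (glfs_h_poll_upcall(fs, &up) == 0 && up != NULL) {
                    printf("upcall reason: %d\n",
                           (int)glfs_upcall_get_reason(up));
                    glfs_upcall_release(up);
                    up = NULL;
            }
    }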

Version-Release number of selected component (if applicable):
3.7 and later

How reproducible:
100%

Steps to Reproduce:
1. install glusterfs-coreutils and open a connection like
      # gfcli glfs://localhost/test-volume
2. mount the volume over FUSE
3. on the FUSE mount, continuously append to a file:
      # while sleep 0.01 ; do date >> TIME ; done
4. in gfcli, run a regular 'tail TIME'
5. in a third terminal, watch the memory of gfcli grow with 'top -p $(pidof gfcli)'

Actual results:
Memory usage of gfcli grows slowly but constantly.

Expected results:
gfcli should have stable memory usage.

Additional info:
Reported in relation to md-cache improvements by Poornima.

Comment 1 Vijay Bellur 2016-08-21 19:41:41 UTC
REVIEW: http://review.gluster.org/15191 (gfapi: do not cache upcalls if the application is not interested) posted (#2) for review on master by Niels de Vos (ndevos)

Comment 2 Worker Ant 2016-08-25 20:35:47 UTC
COMMIT: http://review.gluster.org/15191 committed in master by Niels de Vos (ndevos) 
------
commit 218c9b033fa44eacbc27d87491abd830548b362e
Author: Niels de Vos <ndevos>
Date:   Wed Aug 17 16:44:55 2016 +0200

    gfapi: do not cache upcalls if the application is not interested
    
    When the volume option 'features.cache-invalidation' is enabled, upcall
    events are sent from the brick process to the client. Even if the client
    is not interested in upcall events itself, md-cache or other xlators may
    benefit from them.
    
    By adding a new 'cache_upcalls' boolean in the 'struct glfs', we can
    enable the caching of upcalls when the application called
    glfs_h_poll_upcall(). NFS-Ganesha sets up a thread for handling upcalls
    in the initialization phase, and calls glfs_h_poll_upcall() before any
    NFS-client accesses the NFS-export.
    
    In the future there will be a more flexible registration API for
    enabling certain kinds of upcall events. Until that is available, this
    should work just fine.
    
    Verification of this change is not trivial within our current regression
    test framework. The bug report contains a description of how to reliably
    reproduce the problem with the glusterfs-coreutils.
    
    Change-Id: I818595c92db50e6e48f7bfe287ee05103a4a30a2
    BUG: 1368842
    Signed-off-by: Niels de Vos <ndevos>
    Reviewed-on: http://review.gluster.org/15191
    Smoke: Gluster Build System <jenkins.org>
    Reviewed-by: Poornima G <pgurusid>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
    Reviewed-by: soumya k <skoduri>
    Reviewed-by: Kaleb KEITHLEY <kkeithle>
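
The gist of the patch is a drop-unless-interested gate: events that arrive before the application has polled at least once are discarded instead of queued. The self-contained sketch below mimics that behaviour with hypothetical names; it is not the actual glusterfs source.

    #include <stdbool.h>
    #include <stdio.h>
    #include <stdlib.h>

    struct event {
            int reason;
            struct event *next;
    };

    struct fs_sketch {
            bool cache_upcalls;  /* mirrors the new 'cache_upcalls' flag */
            struct event *queue; /* stand-in for the per-glfs upcall list */
    };

    /* producer: invoked for every upcall notification from a brick */
    static void upcall_arrived(struct fs_sketch *fs, int reason)
    {
            if (!fs->cache_upcalls)
                    return; /* nobody polled yet: drop, do not queue */

            struct event *ev = malloc(sizeof(*ev));
            if (!ev)
                    return;
            ev->reason = reason;
            ev->next = fs->queue;
            fs->queue = ev;
    }

    /* consumer: the first poll marks the application as interested */
    static struct event *poll_upcall(struct fs_sketch *fs)
    {
            fs->cache_upcalls = true;
            struct event *ev = fs->queue;
            if (ev)
                    fs->queue = ev->next;
            return ev;
    }

    int main(void)
    {
            struct fs_sketch fs = { false, NULL };
            struct event *ev;

            upcall_arrived(&fs, 1); /* dropped: no interest registered */
            ev = poll_upcall(&fs);
            printf("before polling: %s\n", ev ? "queued" : "nothing cached");

            upcall_arrived(&fs, 2); /* queued: interest registered */
            ev = poll_upcall(&fs);
            if (ev) {
                    printf("after polling: reason %d\n", ev->reason);
                    free(ev);
            }
            return 0;
    }

Built standalone, this prints "nothing cached" for the event that arrived before the first poll, and the reason of the one queued afterwards: with the gate in place, an application that never polls no longer accumulates events.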

Comment 3 Shyamsundar 2017-03-27 18:19:57 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.9.0, please open a new bug report.

glusterfs-3.9.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/gluster-users/2016-November/029281.html
[2] https://www.gluster.org/pipermail/gluster-users/