Description of problem:
There is a large memory leak in glusterd when the gluster get-state command is issued while a geo-replication session is configured.

Version-Release number of selected component (if applicable): 3.4.0

How reproducible: Always

Steps to Reproduce:
1. Configure a geo-replication session.
2. Run gluster get-state 1000 times.

glusterd's memory usage keeps increasing.

Before executing the commands, glusterd's memory is 13360 KB.
After executing get-state 1000 times:

  PID USER  PR NI VIRT   RES   SHR  S %CPU %MEM TIME+   COMMAND
 9095 root  20  0 669264 36296 7400 S  0.7  3.6 0:39.33 glusterd

Actual results: glusterd's memory usage keeps increasing.

Expected results: glusterd's memory usage should not increase.

The setup has 2 volumes:

[root@server3 rhs-glusterfs]# gluster v info

Volume Name: master
Type: Distributed-Replicate
Volume ID: c63b71a9-c3c7-4fa7-8077-646556f70edc
Status: Started
Snapshot Count: 0
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: server3:/root/bricks/brick1
Brick2: server3:/root/bricks/brick2
Brick3: server3:/root/bricks/brick3
Brick4: server3:/root/bricks/brick4
Options Reconfigured:
performance.client-io-threads: off
nfs.disable: on
transport.address-family: inet
geo-replication.indexing: on
geo-replication.ignore-pid-check: on
changelog.changelog: on

Volume Name: slave
Type: Distributed-Replicate
Volume ID: 8d81cfad-508e-4e14-8754-63d3345e26ca
Status: Started
Snapshot Count: 0
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: server3:/root/bricks/brick5
Brick2: server3:/root/bricks/brick6
Brick3: server3:/root/bricks/brick7
Brick4: server3:/root/bricks/brick8
Options Reconfigured:
performance.client-io-threads: off
nfs.disable: on
transport.address-family: inet

[root@server3 rhs-glusterfs]# gluster v geo-rep status

MASTER NODE    MASTER VOL    MASTER BRICK           SLAVE USER    SLAVE                   SLAVE NODE    STATUS    CRAWL STATUS    LAST_SYNCED
--------------------------------------------------------------------------------------------------------------------------------------------
server3        master        /root/bricks/brick1    root          ssh://server3::slave    N/A           Faulty    N/A             N/A
server3        master        /root/bricks/brick2    root          ssh://server3::slave    N/A           Faulty    N/A             N/A
server3        master        /root/bricks/brick3    root          ssh://server3::slave    N/A           Faulty    N/A             N/A
server3        master        /root/bricks/brick4    root          ssh://server3::slave    N/A           Faulty    N/A             N/A
[root@server3 rhs-glusterfs]#
Build: 3.12.2-15

1. Created two clusters of 3 nodes each.
2. Set up a geo-replication session with 2x3 volumes.
3. Triggered I/O on the master and ran gluster get-state continuously 30,000 times on the master node.

Before the fix, glusterd's memory grew by 20 MB per 1000 runs of gluster get-state. After the fix, 30,000 runs produced an increase of only 20 MB, which is within the acceptable range. Hence marking it as verified.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2018:2607