Bug 761775 (GLUSTER-43)

Summary: client memory leak
Product: [Community] GlusterFS
Reporter: Basavanagowda Kanur <gowda>
Component: core
Assignee: Anand Avati <aavati>
Status: CLOSED CURRENTRELEASE
QA Contact:
Severity: low
Docs Contact:
Priority: low
Version: mainline
CC: chrisw, gluster-bugs, gowda
Target Milestone: ---
Target Release: ---
Hardware: All
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed:
Type: ---
Regression: RTNR
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:

Description Basavanagowda Kanur 2009-06-24 04:16:54 UTC
[Migrated from Savannah BTS] - bug []
Thu 05 Feb 2009 08:17:41 PM GMT, original submission:

I see a memory increase in the newest glusterfs client, glusterfs-2.0.0rc1:

The leak is about 1 MB per minute during usage. For example:
# ps aux|grep glusterfs
root 16674 1.8 2.6 246676 216344 ? Ssl 10:25 5:06 glusterfs -f client.vol /mnt/gluster/
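
A minimal shell sketch along these lines (the one-minute interval and the log file name glusterfs-mem.log are assumptions, not from the report) can sample the client's memory periodically to confirm the growth rate:

# sample the glusterfs client's VSZ/RSS once a minute into a log file
while true; do
    date '+%F %T'
    ps -o pid,vsz,rss,etime,cmd -C glusterfs
    sleep 60
done >> glusterfs-mem.log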

-------------------------------------------------------------
client config:
cat client.vol

volume home1
type protocol/client
option transport-type tcp/client
option remote-host xx.xx.xx.xx # IP address of the remote brick
option remote-subvolume home # name of the remote volume
option transport-timeout 30
end-volume

volume home2
type protocol/client
option transport-type tcp/client
option remote-host xx.xx.xx.xx # IP address of the remote brick
option remote-subvolume home # name of the remote volume
option transport-timeout 10
end-volume

volume home-ha
type cluster/ha
subvolumes home1 home2
end-volume
----------------------------------------------------------------
The client machine is 64-bit, kernel 2.6.25.4.
FUSE is compiled into the kernel.

The leak is present with both standard fuse 2.7.4 and your custom fuse.

I logged at the WARNING level; there is nothing in the log file except:
2009-02-05 10:02:45 E [client-protocol.c:263:call_bail] home2: activating bail-out. pending frames = 1. last sent = 2009-02-05 10:02:29. last received = 2009-02-05 10:02:29. transport-timeout = 10
2009-02-05 10:02:45 C [client-protocol.c:298:call_bail] home2: bailing transport
2009-02-05 10:02:45 E [saved-frames.c:148:saved_frames_unwind] home2: forced unwinding frame type(1) op(OPEN)
2009-02-05 10:25:34 W [fuse-bridge.c:2526:fuse_thread_proc] fuse: unmounting /mnt/gluster/
2009-02-05 10:25:34 W [glusterfsd.c:775:cleanup_and_exit] glusterfs: shutting down
-------------------------------------------------------------
server1 configs:

volume posix
type storage/posix
option directory /home2
end-volume

volume brick
type features/posix-locks
subvolumes posix
end-volume

volume krishna
type protocol/client
option transport-type tcp/client
option remote-host xx.xx.xx.xx
option remote-subvolume brick
end-volume

volume rama
type protocol/client
option transport-type tcp/client
option remote-host xx.xx.xxx.xx
option remote-subvolume brick
end-volume

volume home
type cluster/afr
option read-subvolume krishna
subvolumes krishna rama
end-volume

volume server
type protocol/server
option transport-type tcp/server
# solaire tornado
option auth.ip.brick.allow *
option auth.ip.home.allow xxx.xx.xxx.xxx,xx.xx.xxx.xx
subvolumes brick home
end-volume

----------------------------------------------------------------
server2 config:
volume posix
type storage/posix
option directory /home2
end-volume

volume brick
type features/posix-locks
subvolumes posix
end-volume

#volume brick
# type performance/io-threads
# option thread-count 8
# subvolumes locks
#end-volume

volume krishna
type protocol/client
option transport-type tcp/client
option remote-host 64.88.252.51
option remote-subvolume brick
end-volume

volume rama
type protocol/client
option transport-type tcp/client
option remote-host 64.88.252.15
option remote-subvolume brick
end-volume

volume home
type cluster/afr
option read-subvolume rama
subvolumes krishna rama
end-volume

volume server
type protocol/server
option transport-type tcp/server
option auth.ip.brick.allow xx.xx.xx.xx
option auth.ip.home.allow xx.xx.xx.xx
subvolumes brick home
end-volume

The glusterfs server memory is at 100 MB VSZ, although it started at 33 MB. Maybe the servers also leak something, but not as much.

Let me know if you need any other info.

Hrvoje
--------------------------------------------------------------------------------
Mon 09 Feb 2009 06:36:15 AM GMT, comment #1 by Raghavendra <raghavendra>:

What operations were being run on the glusterfs mount point while the client was leaking memory?

--------------------------------------------------------------------------------


Mon 09 Feb 2009 10:09:57 AM GMT, comment #2 by Hrvoje <hrvoje>:

Copying files with the cp command.

A big number of small jpg files in a tree that runs 3 levels deep:

/xxx/yyy/zzz/some.jpg

Thousands of them.

Copied from the hard drive to the GlusterFS mount point.

Hrvoje
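
A rough reproducer sketch based on the description above (the source tree /data/jpgs and the mount point /mnt/gluster are assumed names, not taken from the report):

# record the client's memory, copy a deep tree of thousands of small
# jpg files onto the mount point, then record the memory again
ps -o vsz,rss -C glusterfs
cp -a /data/jpgs /mnt/gluster/jpgs-copy
ps -o vsz,rss -C glusterfs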
--------------------------------------------------------------------------------

Sat 21 Mar 2009 10:21:20 PM GMT, comment #3 by Hrvoje <hrvoje>:

The same error is happening with 2.0.0rc4 as well.

Even without afr, the simplest setup also leaks:

----------------------------------------------
# SERVER:
volume posix
type storage/posix
option directory /home2
end-volume

volume brick
type features/posix-locks
subvolumes posix
end-volume

volume server
type protocol/server
option transport-type tcp/server
option auth.addr.brick.allow x
subvolumes brick
end-volume

-----------------------------------------
# CLIENT

volume home2
type protocol/client
option transport-type tcp/client
option remote-host 64.88.252.51
option remote-subvolume brick
option transport-timeout 10
end-volume

This is a busy server that is doing other things, but this is the situation after running for 90 minutes, copying files the whole time:

root 12501 1.9 0.5 72892 42364 ? Ssl 16:36 2:00 glusterfs -f client-afr.vol /mnt/gluster

--------------------------------------------------------------------------------
Thu 09 Apr 2009 11:14:06 AM GMT, comment #4 by Krzysztof Strasburger <strasbur>:

I observe a similar client memory leak with the older 1.3.12 release.
It is sufficient to run du on a big directory tree.
My setup is quite complicated (6 pairs of replicated data; the volume is distributed over these 6 pairs).

--------------------------------------------------------------------------------
Wed 22 Apr 2009 06:16:49 AM GMT, comment #5 by Krzysztof Strasburger <strasbur>:

Memory still leaks with the 2.0.0rc8 client. While running "du" on a freshly mounted, quite big volume (/home with about 14 GB of files), the memory consumption increases steadily from 15 to 130 MB. My setup is as follows: 6 afr (2x) volumes, with unify over these 6 volumes. This leak is probably not setup- or even translator-dependent, as "stripe" leaks memory as well as "unify over afr" (I tried that out). The servers are all OK.
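
A sketch of this du-based check (the mount point /mnt/gluster/home is an assumed path, not taken from the comment): record the client's memory right after mounting, walk the tree with du, then record it again.

ps -o vsz,rss -C glusterfs     # right after mounting (about 15 MB above)
du -sh /mnt/gluster/home
ps -o vsz,rss -C glusterfs     # after the full tree walk (about 130 MB above)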