Red Hat Bugzilla – Bug 109368
rsync (all versions to 2.5.6) not returning memory properly
Last modified: 2014-08-31 19:25:37 EDT
Description of problem:
rsync (up to and including 2.5.6) does not seem to release memory properly once a
task is completed. See the following example:
free (before rsync)
      total   used   free  shared  buffers  cached
Mem:    501    279    221       0       39     177
rsync -rpa /home /tmp/test
(for the sake of the test, /home was 453MB)
free (after rsync)
      total   used   free  shared  buffers  cached
Mem:    501    490     10       0       43     383
Here is where things get interesting: if the destination directory is local rather
than on a remote machine, part of the used memory is returned when you delete
the destination directory. For example:
rm -rf /tmp/test
free (after removing destination directory)
                    total   used   free  shared  buffers  cached
Mem:                  501    294    207       0       43     184
-/+ buffers/cache:            65    436
If you do not remove the destination directory, the memory does not seem to get
returned at all. If you are syncing directories to a remote machine, the memory is
used and not returned until reboot.
Version-Release number of selected component (if applicable):
Tested on RH versions 8.0 and 9.0 with rsync 2.5.5 and 2.5.6.
Tested on a single-processor Athlon 1700 with 512MB, a Dell Precision 650 with dual
Xeon 2.7 and 3GB RAM, and a single-processor Pentium IV with 1GB RAM.
reproducible every time for me
Steps to Reproduce:
1. free -m, cat /proc/meminfo
2. rsync -rpva <decent sized directory> /<destination directory>
3. free -m (available memory much lower)
memory is used and not returned
I would expect that the memory should be recycled to the system once the rsync
task is completed.
I have tested this on various platforms and the results seem to be consistent;
however, I doubt that I am the only one who would notice such an issue. Other
platforms tested are Solaris, BSD and OS X, none of which consumed memory in
this manner when running rsync. Memory usage was monitored with vmstat, free
and by watching /proc/meminfo.
Difference in used RAM (from the data above): (490 - 279) = 211
Difference in cache sizes (from the data above): (383 - 177) + (43 - 39) = 210
It didn't swallow your RAM, it just moved most of the free RAM to
somewhere useful - in this case, the buffer/inode/dentry caches.
Additionally, internal kernel data structures (vm, socket structures,
cache entries, etc.) are allocated and freed all the time - even
running 'free' affects them.
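For anyone who wants to redo the arithmetic, here is a small sketch that recomputes the deltas from the two `free` snapshots in the report. The numbers are copied from the report (in MB); the dictionary layout and variable names are just for illustration.

```python
# Values in MB, copied from the two `free` snapshots in the report.
before = {"used": 279, "free": 221, "buffers": 39, "cached": 177}
after = {"used": 490, "free": 10, "buffers": 43, "cached": 383}


def delta(field):
    """Change in one `free` column between the two snapshots."""
    return after[field] - before[field]


used_growth = delta("used")                         # 490 - 279 = 211
free_drop = -delta("free")                          # 221 - 10  = 211
cache_growth = delta("buffers") + delta("cached")   # 4 + 206   = 210

print("used grew by", used_growth, "MB")
print("free shrank by", free_drop, "MB")
print("buffers + cached grew by", cache_growth, "MB")
print("not accounted for by caches:", used_growth - cache_growth, "MB")
```

Nearly all of the "missing" memory shows up as growth in the buffers and cached columns; the remaining 1 MB is within what rounding to whole megabytes could produce.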
I probably won't explain this perfectly, but here it comes:
The data is cached based on the principle "Well, you just accessed
this file's data, so I'm going to hold on to it in case you need it
again ...". When you access new files or start other applications, I
believe the least-recently-used cache data gets freed to make space
for the incoming data/application.
The reason the 'free' RAM 'jumps' when you remove the directory is
simple: When you unlink (remove) a file, cache data about that file is
freed because it is no longer needed - if the file doesn't exist,
you certainly won't be using its data any time soon. In this case,
you're unlinking a lot of files. This means that a lot of cache
information (that the system thought you might need) is freed.
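To make the two behaviours described above concrete (least-recently-used data being dropped to make room, and entries being dropped when a file is unlinked), here is a toy model. It is only a sketch of the principle: the class `ToyPageCache`, its methods and the capacity counted in entries are all made up for illustration, and the kernel's real reclaim logic is considerably more involved.

```python
from collections import OrderedDict


class ToyPageCache:
    """Toy LRU cache keyed by file name; capacity is counted in entries."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.entries = OrderedDict()  # oldest entry first, newest last

    def access(self, name):
        """Record an access and return the names evicted to make room."""
        evicted = []
        if name in self.entries:
            # Already cached: just mark it as most recently used.
            self.entries.move_to_end(name)
            return evicted
        while len(self.entries) >= self.capacity:
            oldest, _ = self.entries.popitem(last=False)
            evicted.append(oldest)
        self.entries[name] = True
        return evicted

    def unlink(self, name):
        """Drop the entry for a removed file; return True if it was cached."""
        return self.entries.pop(name, None) is not None


cache = ToyPageCache(capacity=3)
for name in ["a", "b", "c"]:
    cache.access(name)
cache.access("a")              # "a" becomes the most recently used entry
evicted = cache.access("d")    # cache is full, so "b" (the oldest) is dropped
removed = cache.unlink("c")    # removing a file frees its entry immediately

print(evicted, removed, list(cache.entries))
```

In this model, removing many files at once frees many entries at once, which is the same shape as the jump in free memory seen after `rm -rf /tmp/test`.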
FYI - Running "cp -a <src> <dest>" should have a similar effect as
running rsync.
Clarification - rsync doesn't allocate/remove cache entries; this is
one of the things the kernel does.
Your explanation makes sense to me, thanks for that. (I pretty much
knew that if there were a problem it would have been noticed some time
ago.)
This does, however, raise a question which has been put to me
numerous times. According to your description, the least-recently-used
cached data gets freed to make room for new applications. That being
said, if it seemed as though 95% of memory was used and an application
such as Matlab was launched and run, then this least-recently-used
cache memory would be reallocated to the Matlab process, which would
explain why the swap space did not seem to be utilized.
This is all making perfect sense now, thanks for the explanation.
Indeed, Lon's behaviour explains things.
The kernel will always try to keep files cached in RAM, because if a
file was rsynced by the user there's a chance the user might want to
use that file later on and it would be much faster if the file was
already in RAM and didn't need to be read from disk.
If an application needs the RAM, we can always reclaim it very easily.
The data is already on disk, so we just forget that we had it in the
cache and use the RAM for something else ...
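Since cache memory can be reclaimed on demand, a rough way to judge how much memory is really available is to add buffers and cache to the free figure, which is what the "-/+ buffers/cache" row of `free` reports. The sketch below does that for /proc/meminfo-style text. The function names and the sample values are made up for illustration, and the result is only an estimate, since not everything counted as cached can necessarily be dropped at once.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style 'Key:  value kB' lines into a dict of kB values."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            info[key.strip()] = int(fields[0])
    return info


def effectively_free_kb(info):
    """Free memory plus buffers and page cache, in kB (a rough estimate)."""
    return info["MemFree"] + info["Buffers"] + info["Cached"]


# Hypothetical sample values, not taken from the report.
sample = """MemTotal:  513024 kB
MemFree:    10240 kB
Buffers:    44032 kB
Cached:    392192 kB
"""

info = parse_meminfo(sample)
print(effectively_free_kb(info), "kB effectively free")  # 446464 kB

# On a live Linux system the same functions can be fed the real file:
#     with open("/proc/meminfo") as f:
#         print(effectively_free_kb(parse_meminfo(f.read())))
```

With the sample values only 10240 kB is nominally free, yet about 446464 kB could be handed to an application, which is why a large program can start without pushing the system into swap.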