Description of problem:
We make use of sftp over a persistent SSH tunnel to move files from one node to another. We've noticed that, over time, the sftp client process will grow to several hundred megabytes in size, so we need to restart this service about weekly to keep it from consuming all available memory.

Version-Release number of selected component (if applicable):
3.6.1p2-33.30.3

How reproducible:
Always

I've attached an strace of one iteration of the daemon.
Created attachment 111994 [details] strace of the problem
I cannot reproduce this memory leak here. But it's possible that I don't use the same commands as you do.
Created attachment 114822 [details] Patch
Can you try this patch and see whether it helps? It removes some memory leaks on certain error paths.
We'll roll this into our server today. It'll take a few days to see whether the problem goes away, of course; there's no easy way to accelerate the "leak rate" on our end. As for what commands we use, one entire iteration of what the daemon does is in the strace - that exact command set is run once per minute. The only thing that changes is the quantity of files received.
I can see the commands in the strace, but the command lines aren't complete because strace truncates them. And there seem to be many failed commands in the strace, which I don't suppose is the case in your real environment. The patch is backported from 4.0p1 and contains fixes for only a few error paths in the code, so I wouldn't be very surprised if it didn't help.
Actually, it's a capture from our production environment. :) A lot of distinct files are inserted here, so there are a lot of gets that simply have no file: because of the directory size, it's not feasible to stat the directory (its contents can be more than a megabyte when transferred, even though it's rotated every 60 seconds or so). So it's quite possible that the leaks *are* occurring only on errors. Either way, we should know definitively after I've given the patch a few days to run.
So did the patch help?
Actually yes! Significantly! The memory footprints of the instances are now within a meg of each other (most within a few K). Compare this to several hundred megabytes of discrepancy prior to this patch, and it's obviously a significant improvement. I can't say for certain that *all* leaks have been caught, but certainly 99% of them were covered in the patch.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2005-550.html