Bug 175134 - Memory leak in nanny.c
Status: CLOSED CURRENTRELEASE
Product: Red Hat Cluster Suite
Classification: Red Hat
Component: piranha
Version: 3
Hardware: All Linux
Priority: medium, Severity: high
Assigned To: Stanko Kupcevic
QA Contact: Cluster QE
Depends On:
Blocks:
Reported: 2005-12-06 16:14 EST by Lon Hohberger
Modified: 2009-04-16 16:13 EDT (History)

See Also:
Fixed In Version: 0.8.3-1
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2007-05-10 14:57:18 EDT
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Description Lon Hohberger 2005-12-06 16:14:33 EST
+++ This bug was initially created as a clone of Bug #174315 +++

From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.8) Gecko/20050524
Fedora/1.0.4-4 Firefox/1.0.4

Description of problem:
When using an external send program (-e), nanny fails to deallocate the result
buffer and leaks memory upon every external invocation.



Version-Release number of selected component (if applicable):
piranha-0.7.0 through piranha-0.8.1

How reproducible:
Always

Steps to Reproduce:
1. start a nanny instance with a verbose external check program executed often (1s):

nanny -c -h 192.168.0.0 -p 1234 -e /sbin/lspci -x BLAH -q -t 1 --lvs&

2. watch the process memory footprint (PID is the process ID of the nanny instance)

watch -n1 "cat /proc/$PID/status"
  

Actual Results:  The memory footprint continues to grow indefinitely.

Expected Results:  The memory footprint should stabilize.
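
The footprint check in step 2 can also be done programmatically. The following is a minimal sketch, assuming a Linux /proc filesystem that reports a "VmRSS:" line in /proc/<pid>/status; parse_vmrss_kb() and read_vmrss_kb() are hypothetical helper names and are not part of piranha:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: extract the VmRSS value (in kB) from the text of
 * /proc/<pid>/status. Returns -1 if no VmRSS line is found. */
long parse_vmrss_kb(const char *status_text)
{
    const char *line = status_text;

    while (line != NULL && *line != '\0') {
        long kb;

        /* Each field starts at the beginning of a line. */
        if (strncmp(line, "VmRSS:", 6) == 0 &&
            sscanf(line + 6, "%ld", &kb) == 1)
            return kb;

        /* Advance to the start of the next line, if any. */
        line = strchr(line, '\n');
        if (line != NULL)
            line++;
    }
    return -1;
}

/* Hypothetical helper: read /proc/<pid>/status and return VmRSS in kB,
 * or -1 if the file cannot be read or has no VmRSS line. */
long read_vmrss_kb(long pid)
{
    char path[64];
    char buf[8192];
    size_t n;
    FILE *fp;

    snprintf(path, sizeof path, "/proc/%ld/status", pid);
    fp = fopen(path, "r");
    if (fp == NULL)
        return -1;

    n = fread(buf, 1, sizeof buf - 1, fp);
    fclose(fp);
    buf[n] = '\0';

    return parse_vmrss_kb(buf);
}
```

Sampling read_vmrss_kb() for the nanny PID once per second and comparing successive values should show steady growth on an affected build and a flat value once the leak is fixed.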

Additional info:

The actual leak is in nanny.c::external_check() which fails to deallocate the
"result" buffer allocated by getExecOutput() using strdup:

        result = getExecOutput (flags, argv, timeout);

        if (expect_str != NULL) {
                if (strcmp (expect_str, result) != 0) {
                        piranha_log (flags, (char *)
                                     "Trouble. Recieved results are not what we expected from (%s)\n",
                                     inet_ntoa (*remoteAddr));
                        return 1;
                } else {
                        return 0;
                }

        }

A patch will be available shortly.
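
For illustration, here is a minimal sketch of how the leak can be avoided by releasing the buffer on every return path. This is not the attached patch. It assumes the buffer returned by getExecOutput() is heap-allocated, owned by the caller, and may be NULL on failure; fake_get_exec_output() and external_check_sketch() are hypothetical stand-ins, and the return value when no expect string is given is also an assumption:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for getExecOutput(): returns a heap-allocated copy
 * of the output (as strdup would) that the caller must free, or NULL. */
char *fake_get_exec_output(const char *output)
{
    char *copy;
    size_t len;

    if (output == NULL)
        return NULL;

    len = strlen(output) + 1;
    copy = malloc(len);
    if (copy != NULL)
        memcpy(copy, output, len);
    return copy;
}

/* Returns 0 when the output matches expect_str (or when no expect string
 * is given, which is an assumption here), 1 otherwise. The result buffer
 * is freed before every return. */
int external_check_sketch(const char *expect_str, const char *simulated_output)
{
    char *result = fake_get_exec_output(simulated_output);
    int rc = 0;

    if (result == NULL)
        return 1;   /* nothing was allocated, so nothing to free */

    if (expect_str != NULL && strcmp(expect_str, result) != 0) {
        fprintf(stderr, "Trouble. Received results are not what we expected\n");
        rc = 1;
    }

    free(result);   /* single release point covers both outcomes */
    return rc;
}
```

Collecting the return code in a local variable and freeing once at the end avoids having to remember a free() before each early return.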

-- Additional comment from fmalita@gmail.com on 2005-11-27 14:04 EST --
Created an attachment (id=121522)
Fix for the nanny memory leak.


-- Additional comment from lhh@redhat.com on 2005-11-28 09:39 EST --
Patch looks correct.

-- Additional comment from lhh@redhat.com on 2005-11-28 15:51 EST --
Bump


-----------------------------
Duplicate copy for RHCS3
Comment 1 Lon Hohberger 2006-01-04 13:05:15 EST
patch already in CVS
Comment 2 Albert Graham 2006-02-04 23:10:28 EST
This bug is also present in RHEL 4 as well as RHEL 3. Can you make sure this fix
and the others mentioned below are included in the next release?

The whole system hangs (but is still pingable) soon after starting on every
system I've tried. I think this could be an IP_VS bug. There is a patch, but I
don't think it has been applied to the latest RHEL 4 updates. It is
available here:

http://lkml.org/lkml/2004/11/24/375

This hang is a show stopper, as it locks up the primary and then locks up the
failover server, and that is with just a few users.

My tests included 500 connections/s, which worked fine, but that was from a
single source IP address. I think the problem kicks in when there are many
different source IP addresses, e.g. on a web server.

I was hoping to replace a few hardware load balancers and have been working on
trying to get piranha to work reliably for weeks without success.

Could I also bring your attention to the following messages on the piranha
mailing list, which describe this "hang" in more detail:

https://www.redhat.com/archives/piranha-list/2005-December/thread.html

Please note, one suggested solution was to use the UP kernel, since
IP_VS/ipvsadm or piranha might not be SMP safe. However, I can confirm the same
problems on both UP and SMP kernels (2.6.9-22.0.2.EL). I've tried all previous
kernels with the same results.








