Bug 552211 - dhcpd memory leak when failover configured
Summary: dhcpd memory leak when failover configured
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: dhcp
Version: 5.4
Hardware: All
OS: Linux
Priority: urgent
Severity: high
Target Milestone: rc
: ---
Assignee: Jiri Popelka
QA Contact: Alexander Todorov
URL:
Whiteboard:
Depends On: 534117
Blocks:
Reported: 2010-01-04 11:29 UTC by RHEL Program Management
Modified: 2010-01-14 10:12 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2010-01-14 10:12:16 UTC
Target Upstream Version:
Embargoed:


Attachments
valgrind.log (73.20 KB, text/plain)
2010-01-05 13:56 UTC, Alexander Todorov
valgrind.log with dhcpd -f (95.81 KB, text/plain)
2010-01-05 15:50 UTC, Alexander Todorov


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHBA-2010:0042 | 0 | normal | SHIPPED_LIVE | dhcp bug fix update | 2010-01-14 10:12:13 UTC

Description RHEL Program Management 2010-01-04 11:29:08 UTC
This bug has been copied from bug #534117 and has been proposed
to be backported to 5.4 z-stream (EUS).

Comment 5 Alexander Todorov 2010-01-04 18:49:06 UTC
Jiri,
I've tested with dhcp-3.0.5-21.el5_4.1 (from brew) on RHEL5.4 with 3 virtual systems on an isolated network.

1 primary DHCP server, 2 secondary server and one client. 

On the client I have a script which runs dhclient several hundred times, removing any .leases files between each run. On the primary server I'm continuously running top and monitoring the free memory field. It is decreasing steadily with every loop of the test case and is several megabytes off from the initial value after 2 executions of the script below. 

Is this the correct NVR which fixes this for 5.4.z? It looks so because the patch is applied in the spec file. Is dhcpd leaking memory somewhere else?


Test script:
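#!/bin/sh
# Repeatedly force dhclient to request a lease from scratch.

for i in $(seq 200); do
   echo "----- $i -----"
   # Stop any running client, then remove its stored lease files
   killall -9 dhclient
   sleep 1
   /bin/rm -f /var/lib/dhclient/dhclient*
   dhclient
   sleep 1
done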
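
The free memory field in top covers the whole system, so a per-process reading may make the trend easier to attribute to dhcpd. A minimal sketch, assuming a Linux /proc filesystem and that pidof is available; the function names rss_kb and sample_dhcpd are hypothetical and not part of the original test:

```shell
#!/bin/sh

# Hypothetical helper: print the resident set size in kB from a
# /proc/<pid>/status-style file given as the first argument.
rss_kb() {
    awk '/^VmRSS:/ { print $2 }' "$1"
}

# Hypothetical helper: print "<epoch seconds> <RSS in kB>" for dhcpd
# every INTERVAL seconds, COUNT times (defaults: 10 s, 180 samples).
sample_dhcpd() {
    interval=${1:-10}
    count=${2:-180}
    pid=$(pidof -s dhcpd) || { echo "dhcpd is not running" >&2; return 1; }
    n=0
    while [ "$n" -lt "$count" ]; do
        echo "$(date +%s) $(rss_kb "/proc/$pid/status")"
        sleep "$interval"
        n=$((n + 1))
    done
}
```

Running `sample_dhcpd 10 180 > dhcpd-rss.log` on the server while the client loop executes would give roughly 30 minutes of samples; the output file name is only an example.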

Comment 6 Jiri Popelka 2010-01-05 09:32:45 UTC
Alexander,
your script looks good to me.
Dhcpd is leaking memory for each DHCPDISCOVER packet received.
The client broadcasts a DHCPDISCOVER message when it is trying to discover available DHCP servers.
Running dhclient and then removing any .leases files shortly afterwards should be sufficient to test this problem (a client that holds a valid lease skips the DHCPDISCOVER message and sends a DHCPREQUEST message directly).

dhcp-3.0.5-21.el5_4.1 is the correct NVR.

Comment 7 Alexander Todorov 2010-01-05 13:56:34 UTC
Created attachment 381758 [details]
valgrind.log

memory leak test from
# valgrind -v --log-file=valgrind.log --tool=memcheck --trace-children=yes --leak-check=full --show-reachable=yes /usr/sbin/dhcpd

there are 110 references to omapi related functions which makes me think that we're missing another patch from upstream:

Changes since 3.1.1b1
 * A memory leak when using omapi has been fixed. 

This is with the hotfix from https://bugzilla.redhat.com/show_bug.cgi?id=534117#c1

I've tail -f'ed the valgrind.log file during execution of my test and there were no more errors reported. However the DHCP server was running and answering client requests.

Comment 8 Alexander Todorov 2010-01-05 15:50:35 UTC
Created attachment 381780 [details]
valgrind.log with dhcpd -f

memory leak test from
# valgrind -v --log-file=valgrind.log --tool=memcheck --trace-children=yes --leak-check=full --show-reachable=yes /usr/sbin/dhcpd -f

This starts dhcpd with -f so that it stays in the foreground. Now that we know how to find some leaks, I'll re-test with the older package version, which should have the bug present, and compare the logs.
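
One possible way to compare two such logs is to total the "definitely lost" figures from the memcheck leak summaries. A minimal sketch, assuming each log contains the usual summary lines of the form `definitely lost: N bytes in M blocks`; the function name and the file names in the usage note are hypothetical:

```shell
#!/bin/sh

# Hypothetical helper: sum the "definitely lost" byte counts found in a
# valgrind memcheck log. With --trace-children=yes a log may hold more
# than one leak summary, so every matching line is added up.
definitely_lost() {
    awk '/definitely lost:/ {
             gsub(",", "", $4)   # drop thousands separators, e.g. 1,024
             total += $4
         }
         END { print total + 0 }' "$1"
}
```

For example, `echo "old: $(definitely_lost valgrind-old.log) new: $(definitely_lost valgrind-new.log)"` would print both totals side by side.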

Comment 9 Alexander Todorov 2010-01-07 16:44:44 UTC
QE has performed some more testing and the results are:

The old version dhcp-3.0.5-21.el5 leaks approximately 20 MB in 30 minutes. 
The patched version dhcp-3.0.5-21.el5_4.1 appears to leak considerably less, and the leak differs between the primary and secondary server: the primary server leaked some 7 MB in 30 minutes, the secondary server leaked some 4 MB in 30 minutes in our test environment.
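
For comparison, these figures can be expressed as approximate leak rates. The sketch below is plain arithmetic on the numbers quoted in this comment, assuming 1 MB = 1024 kB and integer truncation; the function name is hypothetical:

```shell
#!/bin/sh

# Hypothetical helper: convert "MB leaked over N minutes" into kB per
# minute, truncated to an integer by shell arithmetic.
kb_per_min() {
    echo $(( $1 * 1024 / $2 ))
}

kb_per_min 20 30   # old package:         682
kb_per_min 7 30    # patched, primary:    238
kb_per_min 4 30    # patched, secondary:  136
```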

We believe that the leak in load_balance_mine() function has been properly fixed and we're moving this bug to VERIFIED. 

However, there may be other memory leaks depending on your setup, and patches for them may not have been pulled from upstream. If you experience further issues, please file a separate bug.

Comment 12 errata-xmlrpc 2010-01-14 10:12:16 UTC
An advisory has been issued which should help resolve the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2010-0042.html

