Red Hat Bugzilla – Bug 585212
Recent updates to the collector caused a memory leak.
Last modified: 2010-10-20 07:30:07 EDT
Description of problem:
The collector leaks memory when it handles invalidate ads.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Have a large pool with dynamic slots
2. Run a lot of jobs
3. Watch memory usage of the collector
Actual results:
Steady increase in memory usage.
Expected results:
Memory usage should stay constant.
Fixed in 7.4.3-0.11
How to quickly reproduce:
- configure a cluster of at least two condor instances (1 Central Manager, >=1 Execute nodes)
- enable Dynamic Slots on both (one big slot for each machine)
- increase the number of "generated" slots with NUM_CPUS (at least 32)
- submit a huge number of simple jobs (for example, a job description file (JDF) that queues 15000 instances of "uname -a", resubmitted every 30 minutes)
- watch memory (RSS) used by collector on CM
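
The job description file used for the test is not attached to this bug. As a rough sketch of what the submission step could look like, the Python helper below builds a submit description that queues many instances of "uname -a". The function name, the vanilla universe, and the /bin/uname path are assumptions for illustration, not the reporter's actual file.

```python
def build_submit_description(n_jobs=15000):
    """Return the text of a submit description file that queues n_jobs
    instances of "uname -a".

    Assumptions (not taken from the bug report): vanilla universe and
    uname installed at /bin/uname.
    """
    lines = [
        "universe = vanilla",
        "executable = /bin/uname",
        "arguments = -a",
        "queue %d" % n_jobs,
    ]
    return "\n".join(lines) + "\n"


def write_submit_description(path, n_jobs=15000):
    """Write the submit description to path so it can be passed to condor_submit."""
    with open(path, "w") as f:
        f.write(build_submit_description(n_jobs))
```

The resulting file would then be handed to condor_submit, for example from a cron entry that runs every 30 minutes.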
With a simple cluster of two machines, the RSS memory used by the collector in 7.4.3-0.10 increases quickly (within one or two hours), while it stays constant with condor-7.4.4-0.4 after one week of uninterrupted job processing.
Verified on RHEL5.5, i386/x86_64.
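
For anyone repeating the verification, one way to watch the collector's RSS over time is to sample VmRSS from /proc. The following is a minimal sketch for Linux only. The function names and the 60 second interval are illustrative assumptions, and the collector PID has to be looked up separately (for instance with "pidof condor_collector").

```python
import sys
import time


def parse_vmrss_kb(status_text):
    """Return the VmRSS value in kB from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            # The line looks like "VmRSS:     20480 kB".
            return int(line.split()[1])
    raise ValueError("no VmRSS line found")


def read_rss_kb(pid):
    """Read the current resident set size of a process in kB (Linux only)."""
    with open("/proc/%d/status" % pid) as f:
        return parse_vmrss_kb(f.read())


def monitor(pid, interval_s=60):
    """Print a timestamped RSS sample every interval_s seconds until interrupted.

    delta_kb is the change relative to the first sample, so a leak shows up
    as a delta that keeps growing instead of levelling off.
    """
    first = read_rss_kb(pid)
    while True:
        rss = read_rss_kb(pid)
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        print("%s rss_kb=%d delta_kb=%+d" % (stamp, rss, rss - first))
        sys.stdout.flush()
        time.sleep(interval_s)

# Example (hypothetical PID): monitor(1234)
```

With the leak present, delta_kb should climb steadily within the first hour or two. With a fixed build it should stay roughly flat.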