Bug 699716

Summary: Avoid memory cleanup on exit for condor_job_server
Product: Red Hat Enterprise MRG Reporter: Matthew Farrellee <matt>
Component: condor-qmf Assignee: Matthew Farrellee <matt>
Status: CLOSED UPSTREAM QA Contact: MRG Quality Engineering <mrgqe-bugs>
Severity: low Docs Contact:
Priority: low    
Version: 1.3 CC: mkudlej, tstclair
Target Milestone: 2.0.1   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: condor-7.6.2-0.1 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2011-07-06 13:55:59 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:

Description Matthew Farrellee 2011-04-26 12:53:37 UTC
condor-qmf-7.6.1-0.1.el5, and probably every version before it

src/condor_contrib/mgmt/qmf/daemons/job_server_main.cpp, and probably the query_server as well:

--

void Stop()
{
   if (param_boolean("DUMP_STATE", false)) {
      Dump();
   }

   consumer->Reset();

   mirror->stop();

   delete schedd_oid; schedd_oid = NULL;
   delete job_server; job_server = NULL;
   delete singleton; singleton = NULL;
   delete mirror; mirror = NULL;

   DC_Exit(0);
}

//-------------------------------------------------------------

int main_shutdown_fast()
{
   dprintf(D_ALWAYS, "main_shutdown_fast() called\n");

   Stop();

   DC_Exit(0);
   return TRUE;   // to satisfy c++
}

//-------------------------------------------------------------

int main_shutdown_graceful()
{
   dprintf(D_ALWAYS, "main_shutdown_graceful() called\n");

   Stop();

   DC_Exit(0);
   return TRUE;   // to satisfy c++
}

--

On shutdown, the daemon attempts to free all of its memory. That is useful when running under a leak detector, but it can be very expensive (imagine a 19GB job_server that is partially in swap, where every page has to be faulted back in just to be freed). There should be a param to force the free()ing, defaulting to false.
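
A minimal, standalone sketch of the proposed behavior, not the actual daemon code. The knob name CLEANUP_ON_EXIT is the one used by the fix in comment 3; lookup_bool(), g_config, JobCache, stop() and demo() are hypothetical stand-ins (lookup_bool() only mimics the default-value behavior of param_boolean()).

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for the daemon's configuration table.
static std::map<std::string, bool> g_config;

// Hypothetical stand-in for param_boolean(): returns the configured
// value, or default_value when the knob is not set.
bool lookup_bool(const std::string &name, bool default_value)
{
    auto it = g_config.find(name);
    return it == g_config.end() ? default_value : it->second;
}

// Hypothetical stand-in for the large in-memory job state.
struct JobCache {
    std::vector<std::string> jobs;

    // Frees every entry; this is the expensive step on a large,
    // partially swapped-out process.
    void Reset()
    {
        jobs.clear();
        jobs.shrink_to_fit();
    }
};

// Runs the full cleanup only when explicitly requested.
// Returns true if the cleanup ran, false if it was skipped.
bool stop(JobCache &cache)
{
    if (lookup_bool("CLEANUP_ON_EXIT", false)) {
        cache.Reset();
        return true;
    }
    // Default: leave the memory alone; the OS reclaims it at exit.
    return false;
}

// Usage sketch: with the knob unset, stop() skips the cleanup.
void demo()
{
    JobCache cache;
    cache.jobs.assign(3, "job");
    bool cleaned = stop(cache);
    assert(!cleaned && cache.jobs.size() == 3);
}
```

Defaulting to false keeps normal shutdowns fast; a leak-detector run would set the knob to true so that everything still reachable at exit is freed and real leaks stand out.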

Comment 1 Matthew Farrellee 2011-04-26 12:55:24 UTC
"consumer->Reset();" is the primary offender.

Comment 2 Matthew Farrellee 2011-04-26 12:55:59 UTC
Workaround: send SIGKILL

Comment 3 Matthew Farrellee 2011-06-15 20:15:54 UTC
Upstream on V7_6-branch

commit 4dbb4724da668cfa2950ca0a57c147eae1038d00
Author: Matthew Farrellee <matt@redhat>
Date:   Wed Jun 15 16:12:20 2011 -0400

    In cases of high memory usage, full free/delete of all memory on exit introduces unnecessary delays

diff --git a/src/condor_contrib/mgmt/qmf/daemons/job_server_main.cpp b/src/condor_contrib/mgmt/qmf/daemons/job_server_main.cpp
index 4605689..27ffbd9 100644
--- a/src/condor_contrib/mgmt/qmf/daemons/job_server_main.cpp
+++ b/src/condor_contrib/mgmt/qmf/daemons/job_server_main.cpp
@@ -276,7 +276,9 @@ void Stop()
      Dump();
   }
 
-  consumer->Reset();
+  if (param_boolean("CLEANUP_ON_EXIT", false)) {
+     consumer->Reset();
+  }
 
   mirror->stop();