Red Hat Bugzilla – Bug 548090
RESERVED_SWAP doesn't default to 0 as stated in docs
Last modified: 2018-10-27 08:18:27 EDT
Description of problem: This is for the MRG 1.2 release version of condor. First, a goal: during schedd testing I decided it would be better to run without swap. The service latencies introduced by working with swap cause performance cliffs for other daemons/roles. It would be better to have processes like condor_shadow fail and be restarted under memory pressure than for the overall system to fall behind and never be able to recover. Per the manual entry for RESERVED_SWAP, the default should be 0: http://www.cs.wisc.edu/condor/manual/v7.4/3_3Configuration.html#SECTION00433000000000000000
The issue is that the shadow still has a default of 5 for RESERVED_SWAP.

Reproduce with:
1) swapoff -a
2) submit a job
3a) job stays idle
3b) ShadowLog shows:
    Not enough reserved swap space
    **** condor_shadow (condor_SHADOW) pid 15794 EXITING WITH STATUS 105

Expected behavior is that the job should run.
Both the condor_shadow and condor_shadow.std have this problem. The shadow.std does not contain the expected check to let RESERVED_SWAP=0 mean no restriction.
commit c002b816115f696f69f522f24602f31f0299c661
Author: Matthew Farrellee <matt@>
Date:   Fri Dec 18 08:55:02 2009 -0500

    RESERVED_SWAP did not default to 0, disabling swap tests. It defaulted to 5.

    condor_shadow and condor_shadow.std had a default of 5 for RESERVED_SWAP
    condor_shadow.std also did not honor RESERVED_SWAP=0 as a means to disable the swap tests

diff --git a/src/condor_shadow.V6.1/baseshadow.cpp b/src/condor_shadow.V6.1/baseshadow.cpp
index 7db4860..cf14213 100644
--- a/src/condor_shadow.V6.1/baseshadow.cpp
+++ b/src/condor_shadow.V6.1/baseshadow.cpp
@@ -997,7 +997,7 @@ BaseShadow::checkSwap( void )
 {
 	int	reserved_swap, free_swap;
 	// Reserved swap is specified in megabytes
-	reserved_swap = param_integer( "RESERVED_SWAP", 5 );
+	reserved_swap = param_integer( "RESERVED_SWAP", 0 );
 	reserved_swap *= 1024;
 
 	if( reserved_swap == 0 ) {
diff --git a/src/condor_shadow.std/shadow.cpp b/src/condor_shadow.std/shadow.cpp
index d115195..36d59b5 100644
--- a/src/condor_shadow.std/shadow.cpp
+++ b/src/condor_shadow.std/shadow.cpp
@@ -310,7 +310,7 @@ main(int argc, char *argv[] )
 	dprintf( D_ALWAYS, "** %s\n", CondorPlatform() );
 	dprintf( D_ALWAYS, "*******************************************\n" );
 
-	reserved_swap = param_integer("RESERVED_SWAP", 5);
+	reserved_swap = param_integer("RESERVED_SWAP", 0);
 	reserved_swap *= 1024;	/* megabytes -> kb */
 
 	bool use_sql_log = param_boolean("QUILL_USE_SQL_LOG", false);
@@ -320,7 +320,7 @@ main(int argc, char *argv[] )
 	dprintf( D_FULLDEBUG, "*** Reserved Swap = %d\n", reserved_swap );
 	dprintf( D_FULLDEBUG, "*** Free Swap = %d\n", free_swap );
 
-	if( free_swap < reserved_swap ) {
+	if( reserved_swap && free_swap < reserved_swap ) {
 		dprintf( D_ALWAYS, "Not enough reserved swap space\n" );
 		if(FILEObj) {
 			delete FILEObj;
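For readers who want the patched behavior in isolation, here is a minimal standalone sketch of the check as the commit describes it. It is not Condor code: the function name swap_check_passes and its parameters are hypothetical, and the real shadows read the setting through param_integer and obtain free swap from the system. The sketch assumes, as the patch comments indicate, that RESERVED_SWAP is given in megabytes and free swap is measured in kilobytes.

```cpp
// Hypothetical stand-in for the shadow's swap test; names are illustrative only.
//   reserved_swap_mb : value of RESERVED_SWAP, in megabytes
//   free_swap_kb     : free swap reported by the system, in kilobytes
// Returns true when the shadow would be allowed to proceed.
bool swap_check_passes(int reserved_swap_mb, int free_swap_kb) {
    // Convert megabytes to kilobytes, as both shadows do.
    int reserved_swap_kb = reserved_swap_mb * 1024;

    // RESERVED_SWAP=0 means "no restriction": skip the test entirely.
    if (reserved_swap_kb == 0) {
        return true;
    }

    // Otherwise require at least the reserved amount of free swap.
    return free_swap_kb >= reserved_swap_kb;
}
```

With swap turned off (free swap of 0 KB), the old default of 5 gives swap_check_passes(5, 0) == false, which corresponds to the "Not enough reserved swap space" exit, while the new default of 0 gives swap_check_passes(0, 0) == true and the job can run.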
Existing workaround: Explicitly set RESERVED_SWAP=0 in configuration.
Fixed in 7.4.2-0.1
RESERVED_SWAP now defaults to 0 and job can be executed when swap is disabled. Tested on RHEL4.8/5.5, i386/x86_64, condor-7.4.3-0.11 Changing the status to VERIFIED.
Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team. New Contents: This update allows for teh scheduler daemon to run without swap.
Technical note updated. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team. Diffed Contents:
@@ -1 +1 @@
-This update allows for teh scheduler daemon to run without swap.
+This update allows for the scheduler daemon to run without swap.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2010-0773.html