Bug 803359 - [RFE]change UNHIBERNATE default value to not wake up all the machines
Status: CLOSED ERRATA
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: condor-wallaby-base-db
Version: 2.1
Hardware: All
OS: All
Priority: low
Severity: low
Target Milestone: 2.3
Target Release: ---
Assigned To: Robert Rati
QA Contact: Lubos Trilety
Keywords: FutureFeature
Depends On: 756096
Blocks:
Reported: 2012-03-14 10:29 EDT by Timothy St. Clair
Modified: 2013-03-06 13:42 EST

See Also:
Fixed In Version: condor-wallaby-base-db-1.24-1
Doc Type: Enhancement
Doc Text:
C: The UNHIBERNATE parameter in the remote configuration base database was too loosely defined
C: Hibernated machines would be woken up unnecessarily
C: Tightened the definition of UNHIBERNATE
R: Not all hibernated machines will be woken up
Story Points: ---
Clone Of: 756096
Environment:
Last Closed: 2013-03-06 13:42:56 EST


Description Timothy St. Clair 2012-03-14 10:29:39 EDT
+++ This bug was initially created as a clone of Bug #756096 +++

Description of problem:
The out of the box definition of UNHIBERNATE is:
UNHIBERNATE = MachineLastMatchTime =!= UNDEFINED
Thus all the sleeping machines are woken up after each ROOSTER_INTERVAL (because ROOSTER_UNHIBERNATE=Offline && Unhibernate).

I understand that this value will probably need tuning in every cluster, but I think a more reasonable default would be something like:

Unhibernate = CurrentTime - MachineLastMatchTime < 1200

as described here: 
https://lists.cs.wisc.edu/archive/condor-users/2010-April/msg00063.shtml
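
To make the difference between the two definitions concrete, here is a minimal Python sketch. It is not HTCondor code: it only mimics how the two expressions would evaluate for a few offline machine ads, under the assumption that an expression involving an UNDEFINED attribute evaluates to UNDEFINED and that Rooster only selects ads for which ROOSTER_UNHIBERNATE is TRUE. The machine names and timestamps are hypothetical.

```python
from typing import Optional

UNDEFINED = None  # stand-in for the ClassAd UNDEFINED value (assumption of this sketch)


def old_unhibernate(last_match: Optional[int]) -> bool:
    # Old default: MachineLastMatchTime =!= UNDEFINED
    return last_match is not UNDEFINED


def new_unhibernate(now: int, last_match: Optional[int], window: int = 1200) -> Optional[bool]:
    # Proposed default: CurrentTime - MachineLastMatchTime < 1200
    if last_match is UNDEFINED:
        return UNDEFINED  # arithmetic on UNDEFINED stays UNDEFINED
    return now - last_match < window


def rooster_selects(offline: bool, unhibernate: Optional[bool]) -> bool:
    # ROOSTER_UNHIBERNATE = Offline && Unhibernate; only a TRUE result selects the ad
    return offline and unhibernate is True


now = 100_000  # hypothetical CurrentTime, in seconds
machines = {   # hypothetical offline ads: name -> MachineLastMatchTime
    "never_matched": UNDEFINED,
    "matched_yesterday": now - 86_400,
    "matched_5_min_ago": now - 300,
}

old_woken = sorted(n for n, t in machines.items() if rooster_selects(True, old_unhibernate(t)))
new_woken = sorted(n for n, t in machines.items() if rooster_selects(True, new_unhibernate(now, t)))

print("old default wakes:", old_woken)
print("new default wakes:", new_woken)
```

In this sketch the old definition selects every offline machine that has ever been matched, however long ago, while the proposed one selects only the machine matched within the last 1200 seconds.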

Maybe this could be changed upstream.

Version-Release number of selected component:
condor-7.6.5-0.7

--- Additional comment from tstclair@redhat.com on 2012-03-14 10:14:56 EDT ---

Added config default as mentioned.  Tracking in V7_6-build-branch
Comment 2 Robert Rati 2012-03-28 15:30:17 EDT
Changed UNHIBERNATE on PowerManagementSubnetManager to the above value.

Updated on master.
Comment 3 Robert Rati 2012-04-02 11:02:39 EDT
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
C: The UNHIBERNATE parameter in the remote configuration base database was too loosely defined
C: Hibernated machines would be woken up unnecessarily
C: Tightened the definition of UNHIBERNATE
R: Not all hibernated machines will be woken up
Comment 6 Robert Rati 2012-10-03 16:06:03 EDT
Updated UNHIBERNATE to the above value for both the feature and the param.

PowerManagementSubnetManager
  name:  "PowerManagementSubnetManager"
  params:  {"ROOSTER"=>"$(LIBEXEC)/condor_rooster", "ROOSTER_UNHIBERNATE_RANK"=>"Mips*Cpus", "ROOSTER_MAX_UNHIBERNATE"=>"0", "ROOSTER_INTERVAL"=>"300", "ROOSTER_WAKEUP_CMD"=>"\"$(BIN)/condor_power -d -i -s $(ROOSTER_SUBNET_MASK)\"", "ROOSTER_UNHIBERNATE"=>"Offline && Unhibernate", "DAEMON_LIST"=>">= ROOSTER", "UNHIBERNATE"=>"CurrentTime - MachineLastMatchTime < 1200", "ROOSTER_SUBNET_MASK"=>""}
  depends:  []
  conflicts:  ["PowerManagementNode"]
  included_features:  []

UNHIBERNATE
  kind:  "String"
  default:  "CurrentTime - MachineLastMatchTime < 1200"
  description:  "A boolean expression that specifies when an offline machine should be woken up"
  must_change:  false
  requires_restart:  false
  visibility_level:  0
  depends:  []
  conflicts:  []

Fixed upstream on branch:
BZ803359-UNHIBERNATE-value
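
As a rough illustration of how the 1200 second window relates to ROOSTER_INTERVAL=300 in the feature above, the following Python sketch counts how many Rooster runs would still see a given match as recent. It assumes Rooster runs at a strictly fixed period, which is a simplification; `polls_in_window` and `first_poll_offset` are hypothetical names used only for this example.

```python
def polls_in_window(first_poll_offset: int, interval: int = 300, window: int = 1200) -> int:
    """Count Rooster runs for which CurrentTime - MachineLastMatchTime < window holds.

    first_poll_offset is the number of seconds between the match and the
    first Rooster run after it (hypothetical input for this sketch).
    """
    count = 0
    elapsed = first_poll_offset
    while elapsed < window:
        count += 1           # this run still sees the match as recent
        elapsed += interval  # the next run happens one interval later
    return count


print(polls_in_window(0))
print(polls_in_window(299))
```

Under these assumptions a match stays eligible for four consecutive Rooster runs; a cluster that wants a shorter or longer wake-up window would change the 1200 in UNHIBERNATE accordingly.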
Comment 7 Lubos Trilety 2012-12-18 05:15:23 EST
tested with:
condor-wallaby-base-db-1.25-1

tested on:
RHEL6 i386,x86_64
RHEL5 i386,x86_64

- !ruby/object:Mrg::Grid::SerializedConfigs::Feature
  annotation: Enables power management wake up for a subnet
  conflicts:
  - PowerManagementNode
  depends: []

  included: []

  name: PowerManagementSubnetManager
  params:
    ROOSTER_WAKEUP_CMD: "\"$(BIN)/condor_power -d -i -s $(ROOSTER_SUBNET_MASK)\""
    UNHIBERNATE: CurrentTime - MachineLastMatchTime < 1200
    ROOSTER_UNHIBERNATE_RANK: Mips*Cpus
    ROOSTER_MAX_UNHIBERNATE: "0"
    ROOSTER_SUBNET_MASK: 0
    ROOSTER: $(LIBEXEC)/condor_rooster
    ROOSTER_INTERVAL: "300"
    ROOSTER_UNHIBERNATE: Offline && Unhibernate
    DAEMON_LIST: ">= ROOSTER"

- !ruby/object:Mrg::Grid::SerializedConfigs::Parameter
  annotation: ""
  conflicts: []

  default_val: CurrentTime - MachineLastMatchTime < 1200
  depends: []

  description: A boolean expression that specifies when an offline machine should be woken up
  kind: String
  level: 0
  must_change: false
  name: UNHIBERNATE
  needs_restart: false


# wallaby show-feature PowerManagementSubnetManager
Console Connection Established...
PowerManagementSubnetManager
  name:  "PowerManagementSubnetManager"
  params:  {"UNHIBERNATE"=>"CurrentTime - MachineLastMatchTime < 1200", "ROOSTER_INTERVAL"=>"300", "DAEMON_LIST"=>">= ROOSTER", "ROOSTER_MAX_UNHIBERNATE"=>"0", "ROOSTER"=>"$(LIBEXEC)/condor_rooster", "ROOSTER_WAKEUP_CMD"=>"\"$(BIN)/condor_power -d -i -s $(ROOSTER_SUBNET_MASK)\"", "ROOSTER_SUBNET_MASK"=>"", "ROOSTER_UNHIBERNATE_RANK"=>"Mips*Cpus", "ROOSTER_UNHIBERNATE"=>"Offline && Unhibernate"}
  depends:  []
  conflicts:  ["PowerManagementNode"]
  included_features:  []
  annotation:  "Enables power management wake up for a subnet"

# wallaby show-param UNHIBERNATE
Console Connection Established...
UNHIBERNATE
  kind:  "String"
  default:  "CurrentTime - MachineLastMatchTime < 1200"
  description:  "A boolean expression that specifies when an offline machine should be woken up"
  must_change:  false
  requires_restart:  false
  visibility_level:  0
  depends:  []
  conflicts:  []
  annotation:  ""


>>> verified
Comment 9 errata-xmlrpc 2013-03-06 13:42:56 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2013-0564.html
