Bug 663779 - The config/classad variables WallabyFeatures and WallabyGroups are not being populated on config update
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: condor-wallaby-tools
Version: Development
Hardware: All
OS: All
Priority: medium
Severity: medium
Target Milestone: 1.3.2
Assignee: Robert Rati
QA Contact: Lubos Trilety
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2010-12-16 20:04 UTC by Erik Erlandson
Modified: 2011-02-15 13:02 UTC (History)

Fixed In Version: condor-wallaby-client-3.8-6
Doc Type: Enhancement
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-02-15 13:02:09 UTC
Target Upstream Version:
Embargoed:


Attachments
Dump of the wallaby store used during the scenario (11.74 KB, application/x-gzip)
2010-12-16 20:08 UTC, Erik Erlandson
tarball of my config.d from scenario (2.93 KB, application/x-gzip)
2010-12-16 20:13 UTC, Erik Erlandson

Description Erik Erlandson 2010-12-16 20:04:04 UTC
Description of problem:
When a new configuration is activated for a node, WallabyFeatures and WallabyGroups are not populated with the node's new features and groups in wallaby_node.config. Both remain empty strings.

Version-Release number of selected component (if applicable):
condor-wallaby-tools-3.8-4

How reproducible:
100%

Steps to Reproduce:
# Begin with the condor pool in a clean state, where the node has an empty wallaby config:

[eje@rorschach utscale]$ condor_configure_pool --load-snapshot grid_scale_2010/12/15_14:50:37_pretest
Snapshot loaded
[eje@rorschach utscale]$ condor_configure_pool --activate
Activating configuration.  This may take a while, please be patient
Configuration activated
[eje@rorschach utscale]$ condor_configure_pool -v -l -n rorschach
Node "rorschach":
Last Check-in Time: Thu Dec 16 12:45:00 2010
Group Memberships:
  Internal Default Group
Features Applied:
Explicitly Set Parameters:
Configuration:
  WALLABY_CONFIG_VERSION = 1292528644056361



# start condor up, with no wallaby_node.config file
# Log output from configd startup:
12/16 12:44:48 INFO: Starting Up
12/16 12:44:49 INFO: Hostname is "rorschach"
12/16 12:44:49 DEBUG: "QMF_BROKER_PORT" is not defined. Using default (5672)
12/16 12:44:49 DEBUG: "QMF_BROKER_AUTH_MECHANISM" is not defined. Using defaults
12/16 12:44:49 DEBUG: Writing configuration file to "/usr/local/condor/local/wallaby_node.config"
12/16 12:44:49 DEBUG: Connected to broker "localhost.localdomain:5672"
12/16 12:44:49 DEBUG: Looking for the store agent
12/16 12:45:00 DEBUG: Found the store agent
12/16 12:45:00 DEBUG: Retrieved node object from store
12/16 12:45:00 DEBUG: Checking version of configuration
12/16 12:45:00 DEBUG: Performing a checkin with the store
12/16 12:45:01 DEBUG: Checked in with the store
12/16 12:45:01 INFO: Retrieving configuration version "1292528644056361" from the store
12/16 12:45:02 ERROR: Store: 'DAEMON_LIST'
12/16 12:45:02 WARNING: Failed to retrieve subsystem list.  Configuration could break restart/reconfig functionality
12/16 12:45:02 INFO: Retrieved configuration from the store
12/16 12:45:02 DEBUG: Daemons to restart: []
12/16 12:45:02 DEBUG: Daemons to reconfig: []



[root@rorschach log]$ more /usr/local/condor/local/wallaby_node.config 
WALLABY_CONFIG_VERSION = 1292528644056361
WallabyFeatures = ""
WallabyGroups = ""
MASTER_ATTRS = $(MASTER_ATTRS), WallabyFeatures, WallabyGroups
STARTD_ATTRS = $(STARTD_ATTRS), WallabyFeatures, WallabyGroups
QMF_BROKER_HOST = localhost.localdomain
QMF_CONFIGD = /usr/sbin/condor_configd
QMF_CONFIGD_ARGS = -d
QMF_CONFIGD_LOG = $(LOG)/ConfigLog
MAX_QMF_CONFIGD_LOG = 1000000
DAEMON_LIST = $(DAEMON_LIST), QMF_CONFIGD
QMF_CONFIGD_CHECK_INTERVAL = 600
ALLOW_ADMINISTRATOR = $(ALLOW_ADMINISTRATOR), $(FULL_HOSTNAME)
SEC_DEFAULT_AUTHENTICATION_METHODS = $(SEC_DEFAULT_AUTHENTICATION_METHODS), FS, NTLM, CLAIMTOBE
SEC_CLIENT_AUTHENTICATION_METHODS = $(SEC_CLIENT_AUTHENTICATION_METHODS), FS, NTLM, CLAIMTOBE
MASTER.SEC_ADMINISTRATOR_AUTHENTICATION_METHODS = $(MASTER.SEC_ADMINISTRATOR_AUTHENTICATION_METHODS), FS, NTLM, CLAIMTOBE
#WINDOWS_SOFTKILL = $(SBIN)\pidkill.bat
SHUTDOWN_FAST_TIMEOUT = 5
#QMF_CONFIGD_WIN_INTERVAL = 3
LOCAL_CONFIG_FILE = $(LOCAL_DIR)/wallaby_node.config
REQUIRE_LOCAL_CONFIG_FILE = FALSE



# Now load up test configuration and activate it:
[eje@rorschach utscale]$ condor_configure_pool --load-snapshot grid_scale_2010/12/15_15:04:17_micro_test
Snapshot loaded
[eje@rorschach utscale]$ condor_configure_pool --activate
Activating configuration.  This may take a while, please be patient
Configuration activated
[eje@rorschach utscale]$ condor_configure_pool -v -l -n rorschach
Node "rorschach":
Last Check-in Time: Thu Dec 16 12:50:55 2010
Group Memberships:
  GridScaleTestMicro
  Internal Default Group
Features Applied:
  GridScaleTestMicro
Explicitly Set Parameters:
Configuration:
  GRID_SCALE_TEST_RESTART_TAG = 1292450676.8
  WALLABY_CONFIG_VERSION = 1292529034714247



# Here is the configd log from the activation:
12/16 12:50:35 DEBUG: Received a NodeUpdatedNotice
12/16 12:50:35 DEBUG: The event is for this node
12/16 12:50:36 DEBUG: Checking version of configuration
12/16 12:50:36 DEBUG: Performing a checkin with the store
12/16 12:50:36 DEBUG: Checked in with the store
12/16 12:50:36 INFO: Retrieving configuration version "1292529034714247" from the store
12/16 12:50:37 ERROR: Store: 'DAEMON_LIST'
12/16 12:50:37 WARNING: Failed to retrieve subsystem list.  Configuration could break restart/reconfig functionality
12/16 12:50:37 INFO: Retrieved configuration from the store
12/16 12:50:37 DEBUG: Daemons to restart: [u'master']
12/16 12:50:37 DEBUG: Daemons to reconfig: []
12/16 12:50:38 DEBUG: Sending command "condor_restart" to subsystem "master"
12/16 12:50:38 DEBUG: Shutting down
12/16 12:50:38 DEBUG: Closing QMF connections
12/16 12:50:38 DEBUG: Lost connection to the configuration store
12/16 12:50:38 DEBUG: Closed QMF connections
12/16 12:50:38 DEBUG: Setting stop flag
12/16 12:50:38 DEBUG: Sent command "condor_restart" to subsystem "master"
12/16 12:50:38 INFO: Exiting
12/16 12:50:42 INFO: Starting Up
12/16 12:50:42 INFO: Hostname is "rorschach"
12/16 12:50:42 DEBUG: "QMF_BROKER_PORT" is not defined. Using default (5672)
12/16 12:50:42 DEBUG: "QMF_BROKER_AUTH_MECHANISM" is not defined. Using defaults
12/16 12:50:42 DEBUG: Writing configuration file to "/usr/local/condor/local/wallaby_node.config"
12/16 12:50:42 DEBUG: Connected to broker "localhost.localdomain:5672"
12/16 12:50:42 DEBUG: Looking for the store agent
12/16 12:50:50 DEBUG: Found the store agent
12/16 12:50:50 DEBUG: Retrieved node object from store
12/16 12:50:54 DEBUG: Checking version of configuration
12/16 12:50:55 DEBUG: Performing a checkin with the store
12/16 12:50:55 DEBUG: Checked in with the store
12/16 12:50:55 DEBUG: The system is already running configuration version "1292529034714247"


# Here is wallaby_node.config -- WallabyFeatures and WallabyGroups were not populated:
[root@rorschach log]$ more /usr/local/condor/local/wallaby_node.config 
GRID_SCALE_TEST_RESTART_TAG = 1292450676.8
WALLABY_CONFIG_VERSION = 1292529034714247
WallabyFeatures = ""
WallabyGroups = ""
MASTER_ATTRS = $(MASTER_ATTRS), WallabyFeatures, WallabyGroups
STARTD_ATTRS = $(STARTD_ATTRS), WallabyFeatures, WallabyGroups
QMF_BROKER_HOST = localhost.localdomain
QMF_CONFIGD = /usr/sbin/condor_configd
QMF_CONFIGD_ARGS = -d
QMF_CONFIGD_LOG = $(LOG)/ConfigLog
MAX_QMF_CONFIGD_LOG = 1000000
DAEMON_LIST = $(DAEMON_LIST), QMF_CONFIGD
QMF_CONFIGD_CHECK_INTERVAL = 600
ALLOW_ADMINISTRATOR = $(ALLOW_ADMINISTRATOR), $(FULL_HOSTNAME)
SEC_DEFAULT_AUTHENTICATION_METHODS = $(SEC_DEFAULT_AUTHENTICATION_METHODS), FS, NTLM, CLAIMTOBE
SEC_CLIENT_AUTHENTICATION_METHODS = $(SEC_CLIENT_AUTHENTICATION_METHODS), FS, NTLM, CLAIMTOBE
MASTER.SEC_ADMINISTRATOR_AUTHENTICATION_METHODS = $(MASTER.SEC_ADMINISTRATOR_AUTHENTICATION_METHODS), FS, NTLM, CLAIMTOBE
#WINDOWS_SOFTKILL = $(SBIN)\pidkill.bat
SHUTDOWN_FAST_TIMEOUT = 5
#QMF_CONFIGD_WIN_INTERVAL = 3
LOCAL_CONFIG_FILE = $(LOCAL_DIR)/wallaby_node.config
REQUIRE_LOCAL_CONFIG_FILE = FALSE

  
Actual results:
WallabyFeatures and WallabyGroups are empty

Expected results:
Should see:
WallabyFeatures = "GridScaleTestMicro"
WallabyGroups = "GridScaleTestMicro"
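A quick way to check the actual versus expected results is to parse wallaby_node.config and report which of the two variables are still empty. The following is only a minimal sketch: it assumes the simple `NAME = value` line format shown in the dumps above, and the helper names `parse_config` and `unpopulated` are hypothetical, not part of condor or wallaby.

```python
import re

# Matches simple `NAME = value` lines as they appear in the dumps above.
# Dotted names such as MASTER.SEC_... are accepted as well.
_ASSIGNMENT = re.compile(r'^\s*([A-Za-z_][\w.]*)\s*=\s*(.*?)\s*$')


def parse_config(text):
    """Return a dict of parameter name -> raw value, skipping comments and blanks."""
    params = {}
    for line in text.splitlines():
        if not line.strip() or line.lstrip().startswith('#'):
            continue
        match = _ASSIGNMENT.match(line)
        if match:
            params[match.group(1)] = match.group(2)
    return params


def unpopulated(text, names=("WallabyFeatures", "WallabyGroups")):
    """Return the names that are missing or set to an empty quoted string."""
    params = parse_config(text)
    return [n for n in names if params.get(n, '""').strip('"') == ""]


# Excerpts copied from the report: the observed file and the expected values.
sample_bad = (
    'WALLABY_CONFIG_VERSION = 1292529034714247\n'
    'WallabyFeatures = ""\n'
    'WallabyGroups = ""\n'
)
sample_good = (
    'WallabyFeatures = "GridScaleTestMicro"\n'
    'WallabyGroups = "GridScaleTestMicro"\n'
)

print(unpopulated(sample_bad))   # both names are reported as empty
print(unpopulated(sample_good))  # nothing is reported
```

In practice the text would be read from /usr/local/condor/local/wallaby_node.config (the path used in this scenario) after the activation has completed.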

Additional info:

Comment 1 Erik Erlandson 2010-12-16 20:08:23 UTC
Created attachment 469211
Dump of the wallaby store used during the scenario

Comment 2 Erik Erlandson 2010-12-16 20:13:33 UTC
Created attachment 469212
tarball of my config.d from scenario

Comment 3 Erik Erlandson 2010-12-16 23:34:40 UTC
I added some log output to WallabyHelpers.get_node_features(), and it looks like node.memberships reflects the "current" config rather than the incoming one.

# when I activate a config including group GridScaleTestMicro:

12/16 16:23:14 INFO: Retrieving configuration version "1292541793631614" from the store
12/16 16:23:15 INFO: in get_node_features
12/16 16:23:15 INFO: id_name= +++1d1676d34b812e185c99321e43602092
12/16 16:23:15 INFO: group_list= [u'+++1d1676d34b812e185c99321e43602092', '+++DEFAULT']
12/16 16:23:15 INFO:     list= []
12/16 16:23:15 INFO:     list= []
12/16 16:23:15 INFO: list= []


# next, when I activate a config that does *not* include group GridScaleTestMicro:

12/16 16:24:25 INFO: Retrieving configuration version "1292541863832308" from the store
12/16 16:24:25 INFO: in get_node_features
12/16 16:24:25 INFO: id_name= +++1d1676d34b812e185c99321e43602092
12/16 16:24:25 INFO: group_list= [u'+++1d1676d34b812e185c99321e43602092', u'GridScaleTestMicro', '+++DEFAULT']
12/16 16:24:25 INFO:     list= []
12/16 16:24:25 INFO:     list= []
12/16 16:24:26 INFO:     list= []
12/16 16:24:26 INFO: list= []

Comment 4 Robert Rati 2010-12-22 21:55:38 UTC
The issue is that the groups and features are retrieved from an attribute on the node object, but get_config did not guarantee that the local copy of the node object had been updated first. get_config now calls node.update before it accesses any attributes on the node object.
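The stale-read pattern described in this comment can be illustrated with a small self-contained sketch. This is not the condor-wallaby-client code: `FakeStore`, `NodeProxy`, `get_config_stale` and `get_config_fixed` are hypothetical stand-ins, and only the idea of calling `node.update` before reading `node.memberships` comes from the comments above.

```python
class FakeStore(object):
    """Hypothetical stand-in for the remote store; holds the authoritative state."""

    def __init__(self):
        self.memberships = []


class NodeProxy(object):
    """Hypothetical local proxy that caches attributes until update() is called."""

    def __init__(self, store):
        self._store = store
        self.memberships = list(store.memberships)

    def update(self):
        # Refresh the cached attributes from the store.
        self.memberships = list(self._store.memberships)


def get_config_stale(node):
    # Behaviour described as the bug: read the cached attribute without refreshing.
    return list(node.memberships)


def get_config_fixed(node):
    # Behaviour described as the fix: refresh first, then read.
    node.update()
    return list(node.memberships)


store = FakeStore()
node = NodeProxy(store)

# A new configuration is activated on the store side after the proxy was created.
store.memberships = ["GridScaleTestMicro"]

stale = get_config_stale(node)
fresh = get_config_fixed(node)
print(stale, fresh)
```

Here the stale read still returns the empty membership list from before the activation, while the read that follows `update()` returns the new group. That matches the one-activation lag visible in the group_list output of Comment 3.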

Comment 6 Lubos Trilety 2011-01-20 13:40:30 UTC
Unable to reproduce on the previous version, because this is a new feature. Also unable to reproduce on the development versions of condor dated 22.11. through 22.12., because those builds do not work correctly.

Works fine on version:
condor-wallaby-client-3.8-8

Tested on:
RHEL5 i386, x86_64 - passed

>>> VERIFIED

