Description of problem:
If absent nodes detection is enabled, the triggerd will throw an unhandled exception if QMF_BROKER_HOST isn't set to a valid, reachable host.

How reproducible:
100%
Repro with the following entries, in addition to the other normal triggerd config:

QMF_BROKER_HOST=host1
ENABLE_ABSENT_NODES_DETECTION = TRUE
The setup of the qpid/qmfv2 connections didn't handle exceptions, which are thrown when the broker isn't reachable. Fixed upstream on V7_6-branch.
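For illustration only, the kind of guard described above might look like the following sketch. It is not the upstream patch: connect_to_broker and init_absent_nodes_detection are hypothetical names, and the connect function is a stand-in that always throws rather than a real qpid/QMFv2 call. The point is that a failed connection is caught, logged, and turns the feature off instead of aborting the daemon.

```cpp
#include <functional>
#include <iostream>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the real qpid/QMFv2 connection setup.
// It always throws, simulating an unreachable broker.
inline void connect_to_broker(const std::string& host) {
    throw std::runtime_error("cannot reach broker on '" + host + "'");
}

// Hypothetical init step: returns true if absent nodes detection stays
// enabled, false if the broker could not be contacted. The connect
// function is passed in so the failure path can be exercised.
inline bool init_absent_nodes_detection(
        const std::string& host,
        const std::function<void(const std::string&)>& connect) {
    try {
        connect(host);
        return true;
    } catch (const std::exception&) {
        // Log and degrade instead of letting the exception escape and
        // terminate the daemon.
        std::cerr << "Triggerd Error: Failed to contact AMQP broker on host '"
                  << host << "'. Absent nodes detection disabled\n";
        return false;
    }
}
```

With this shape, init_absent_nodes_detection("host1", connect_to_broker) returns false and the daemon can continue with the rest of its configuration.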
Reproduced on:
$CondorVersion: 7.6.0 Mar 30 2011 BuildID: RH-7.6.0-0.4.el5 PRE-RELEASE-GRID $
$CondorPlatform: X86_64-Redhat_5.6 $

Config:
QMF_BROKER_HOST=host1
ALL_DEBUG=D_FULLDEBUG
STARTD_CRON_NAME = TRIGGER_DATA
STARTD_CRON_AUTOPUBLISH = If_Changed
TRIGGER_DATA_JOBLIST = GetData
TRIGGER_DATA_GETDATA_PREFIX = Triggerd
TRIGGER_DATA_GETDATA_EXECUTABLE = $(BIN)/get_trigger_data
TRIGGER_DATA_GETDATA_PERIOD = 5m
TRIGGER_DATA_GETDATA_RECONFIG = FALSE
DAEMON_LIST = $(DAEMON_LIST), TRIGGERD
ENABLE_ABSENT_NODES_DETECTION=True
DC_DAEMON_LIST = $(DC_DAEMON_LIST), TRIGGERD

MasterLog:
05/30/11 14:59:16 DaemonCore: No more children processes to reap.
05/30/11 14:59:16 The TRIGGERD (pid 8464) died due to signal 6 (Aborted)
05/30/11 14:59:16 ProcAPI::buildFamily failed: parent 8464 not found on system.
05/30/11 14:59:16 Sending obituary for "/usr/sbin/condor_triggerd"
05/30/11 14:59:16 Forking Mailer process...
05/30/11 14:59:16 restarting /usr/sbin/condor_triggerd in 11 seconds
Retested over all supported platforms RHEL5,RHEL6/x86,x86_64 with:
condor-7.6.1-0.6

TriggerLog:
05/30/11 15:06:35 main_init() called
05/30/11 15:06:35 Triggerd::Triggerd called
05/30/11 15:06:35 Triggerd::init called
05/30/11 15:06:36 Triggerd Error: Failed to contact AMQP broker on host 'host1'. Absent nodes detection disabled
05/30/11 15:06:36 Triggerd::config called
05/30/11 15:06:36 Triggerd::SetInterval called
05/30/11 15:06:36 Triggerd: Registered PerformQueries() to evaluate triggers every 10 seconds

MasterLog:
05/30/11 15:06:35 ::RealStart; TRIGGERD on_hold=0
05/30/11 15:06:35 Create_Process: using fast clone() to create child process.
05/30/11 15:06:35 SharedPortEndpoint: Inside destructor.
05/30/11 15:06:35 Started DaemonCore process "/usr/sbin/condor_triggerd -f", pid and pgroup = 17291

No such crash found in logs.

>>> VERIFIED