Bug 698782 - triggerd excepts if bad QMF_BROKER_HOST
Summary: triggerd excepts if bad QMF_BROKER_HOST
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: condor-qmf
Version: Development
Hardware: All
OS: Linux
Priority: high
Severity: high
Target Milestone: 2.0
Target Release: ---
Assignee: Robert Rati
QA Contact: Tomas Rusnak
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2011-04-21 18:38 UTC by Robert Rati
Modified: 2011-06-27 15:32 UTC (History)

Fixed In Version: condor-7.6.1-0.3
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-06-27 15:32:29 UTC
Target Upstream Version:



Description Robert Rati 2011-04-21 18:38:42 UTC
Description of problem:
If absent nodes detection is enabled, the triggerd will EXCEPT (abort) if QMF_BROKER_HOST isn't set to a reachable broker host.

Version-Release number of selected component (if applicable):


How reproducible:
100%

Steps to Reproduce:
1.
2.
3.
  
Actual results:


Expected results:


Additional info:

Comment 1 Robert Rati 2011-04-21 18:42:00 UTC
Reproduce with the following entries, in addition to the other normal triggerd config:
QMF_BROKER_HOST=host1
ENABLE_ABSENT_NODES_DETECTION = TRUE

Comment 2 Robert Rati 2011-04-21 20:52:05 UTC
The setup of the qpid/QMFv2 connections didn't handle the exceptions that are thrown when the broker isn't reachable.

Fixed upstream on V7_6-branch
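
For illustration only, the failure mode and the shape of the fix can be sketched in Python. This is not the actual triggerd code (which is not shown in this bug). BrokerUnreachable, connect_to_broker and init_absent_nodes_detection are hypothetical stand-ins for the real qpid/QMFv2 connection setup, and the set of reachable hosts is made up for the example. The point is only that the connection error is caught and the feature is disabled instead of the daemon aborting.

```python
class BrokerUnreachable(Exception):
    """Hypothetical stand-in for the error a messaging client raises."""


def connect_to_broker(host, reachable_hosts):
    """Hypothetical stand-in for the qpid/QMFv2 connection setup."""
    if host not in reachable_hosts:
        raise BrokerUnreachable(host)
    return {"host": host}


def init_absent_nodes_detection(host, enabled, reachable_hosts, log):
    """Return (connection, enabled) without letting a connection error escape."""
    if not enabled:
        return None, False
    try:
        conn = connect_to_broker(host, reachable_hosts)
    except BrokerUnreachable:
        # Degrade gracefully: log the problem and turn the feature off.
        log.append(
            "Triggerd Error: Failed to contact AMQP broker on host '%s'.  "
            "Absent nodes detection disabled" % host
        )
        return None, False
    return conn, True


log = []
# "broker.example.com" is an invented reachable host; "host1" is not reachable.
conn, enabled = init_absent_nodes_detection(
    "host1", True, {"broker.example.com"}, log
)
print(enabled)
print(log[0])
```

With an unreachable host the sketch keeps running and reports that absent nodes detection is disabled, rather than propagating the exception.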

Comment 3 Tomas Rusnak 2011-05-30 12:00:59 UTC
Reproduced on:

$CondorVersion: 7.6.0 Mar 30 2011 BuildID: RH-7.6.0-0.4.el5 PRE-RELEASE-GRID $
$CondorPlatform: X86_64-Redhat_5.6 $

Config:
QMF_BROKER_HOST=host1
ALL_DEBUG=D_FULLDEBUG

STARTD_CRON_NAME = TRIGGER_DATA
STARTD_CRON_AUTOPUBLISH = If_Changed
TRIGGER_DATA_JOBLIST = GetData
TRIGGER_DATA_GETDATA_PREFIX = Triggerd
TRIGGER_DATA_GETDATA_EXECUTABLE = $(BIN)/get_trigger_data
TRIGGER_DATA_GETDATA_PERIOD = 5m
TRIGGER_DATA_GETDATA_RECONFIG = FALSE

DAEMON_LIST = $(DAEMON_LIST), TRIGGERD
ENABLE_ABSENT_NODES_DETECTION=True
DC_DAEMON_LIST = $(DC_DAEMON_LIST), TRIGGERD

MasterLog:
05/30/11 14:59:16 DaemonCore: No more children processes to reap.
05/30/11 14:59:16 The TRIGGERD (pid 8464) died due to signal 6 (Aborted)
05/30/11 14:59:16 ProcAPI::buildFamily failed: parent 8464 not found on system.
05/30/11 14:59:16 Sending obituary for "/usr/sbin/condor_triggerd"
05/30/11 14:59:16 Forking Mailer process...
05/30/11 14:59:16 restarting /usr/sbin/condor_triggerd in 11 seconds

Comment 4 Tomas Rusnak 2011-05-30 12:15:41 UTC
Retested on all supported platforms (RHEL5 and RHEL6, x86 and x86_64) with:

condor-7.6.1-0.6

TriggerLog:
05/30/11 15:06:35 main_init() called
05/30/11 15:06:35 Triggerd::Triggerd called
05/30/11 15:06:35 Triggerd::init called
05/30/11 15:06:36 Triggerd Error: Failed to contact AMQP broker on host 'host1'.  Absent nodes detection disabled
05/30/11 15:06:36 Triggerd::config called
05/30/11 15:06:36 Triggerd::SetInterval called
05/30/11 15:06:36 Triggerd: Registered PerformQueries() to evaluate triggers every 10 seconds

MasterLog:
05/30/11 15:06:35 ::RealStart; TRIGGERD on_hold=0
05/30/11 15:06:35 Create_Process: using fast clone() to create child process.
05/30/11 15:06:35 SharedPortEndpoint: Inside destructor.
05/30/11 15:06:35 Started DaemonCore process "/usr/sbin/condor_triggerd -f", pid and pgroup = 17291

No such crash was found in the logs.

>>> VERIFIED
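
The log check above could be automated along these lines. This is a minimal sketch, assuming the crash always appears in the MasterLog in the form quoted in comment 3. The pattern and the find_triggerd_crashes helper are illustrative and not part of Condor, and the sample lines are copied from the comments in this bug.

```python
import re

# Pattern modelled on the MasterLog line quoted in comment 3 (an assumption
# about the log format, not a documented Condor interface).
CRASH_RE = re.compile(r"The TRIGGERD \(pid (\d+)\) died due to signal (\d+)")


def find_triggerd_crashes(lines):
    """Return (pid, signal) pairs for every TRIGGERD death found in the lines."""
    crashes = []
    for line in lines:
        match = CRASH_RE.search(line)
        if match:
            crashes.append((int(match.group(1)), int(match.group(2))))
    return crashes


before_fix = [
    "05/30/11 14:59:16 DaemonCore: No more children processes to reap.",
    "05/30/11 14:59:16 The TRIGGERD (pid 8464) died due to signal 6 (Aborted)",
    '05/30/11 14:59:16 Sending obituary for "/usr/sbin/condor_triggerd"',
]
after_fix = [
    "05/30/11 15:06:35 ::RealStart; TRIGGERD on_hold=0",
    "05/30/11 15:06:35 Create_Process: using fast clone() to create child process.",
]

print(find_triggerd_crashes(before_fix))  # [(8464, 6)]
print(find_triggerd_crashes(after_fix))   # []
```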

