Description of problem:
Condor was running on RHEL5 with plugins enabled via PLUGIN_DIR, then reconfigured to use <subsys>.PLUGINS and restarted. The master seems to coredump on shutdown. This was seen on multiple nodes of different configurations (HA CM only, HA Schedd only, execute node).

Version-Release number of selected component (if applicable):
7.1.4-0.1

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:
10/31 09:49:22 DaemonCore: Command Socket at <10.16.32.110:49238>
10/31 09:49:22 Failed to load plugin: /usr/libexec/condor/MgmtScheddPlugin-plugin.so reason: /usr/libexec/condor/MgmtScheddPlugin-plugin.so: undefined symbol: _ZTI16ClassAdLogPlugin
10/31 09:49:22 MasterPlugin registration succeeded
10/31 09:49:22 Successfully loaded plugin: /usr/libexec/condor/MgmtMasterPlugin-plugin.so
10/31 09:49:22 Failed to load plugin: /usr/libexec/condor/MgmtNegotiatorPlugin-plugin.so reason: /usr/libexec/condor/MgmtNegotiatorPlugin-plugin.so: undefined symbol: _ZTI16NegotiatorPlugin
10/31 09:49:22 Failed to load plugin: /usr/libexec/condor/MgmtCollectorPlugin-plugin.so reason: /usr/libexec/condor/MgmtCollectorPlugin-plugin.so: undefined symbol: _ZTI15CollectorPlugin
10/31 09:49:22 MgmtMasterPlugin initializing...
10/31 09:49:22 Started DaemonCore process "/usr/sbin/condor_startd", pid and pgroup = 12940
10/31 10:10:02 Got SIGTERM. Performing graceful shutdown.
10/31 10:10:02 Sent SIGTERM to STARTD (pid 12940)
10/31 10:10:02 The STARTD (pid 12940) exited with status 0
10/31 10:10:02 All daemons are gone. Exiting.
10/31 10:10:02 **** condor_master (condor_MASTER) EXITING WITH STATUS 0
Stack dump for process 12934 at timestamp 1225465802 (12 frames)
condor_master(dprintf_dump_stack+0xc0)[0x4c13cd]
condor_master[0x4c16a2]
/lib64/libpthread.so.0[0x375c40de70]
/lib64/libc.so.6(gsignal+0x35)[0x3fc0030155]
/lib64/libc.so.6(abort+0x110)[0x3fc0031bf0]
/lib64/libc.so.6(__assert_fail+0xf6)[0x3fc00295d6]
/usr/libexec/condor/MgmtMasterPlugin-plugin.so(_ZN4qpid3sys5Mutex4lockEv+0x54)[0x2b33021a8552]
/usr/lib64/libqpidclient.so.0(_ZN4qpid6client10Dispatcher3runEv+0x9d0)[0x2b330285dc30]
/usr/lib64/libqmfagent.so.0(_ZN4qpid10management19ManagementAgentImpl16ConnectionThread3runEv+0x43d)[0x2b3302ac903d]
/usr/lib64/libqpidcommon.so.0[0x2b3302525eea]
/lib64/libpthread.so.0[0x375c4062f7]
/lib64/libc.so.6(clone+0x6d)[0x3fc00d1b6d]

Expected results:

Additional info:
This is fixed in condor-7.1.4-0.4.

The testing procedure is the same as for BZ470167, except run the condor_master in place of the qmf-agent example.

Plugins currently live in /usr/libexec/condor and can be configured with:

PLUGIN_DIR = /usr/libexec/condor

Plugins must be enabled and loaded for the test to be meaningful. Verify that plugins are loaded by looking at the log output; successful loading is reported as part of daemon startup. Running the master will also start the other daemons, which previously failed with an error as well.

To see if any daemons are crashing:

grep -i stack `condor_config_val LOG`/*

Alternatively, you can check the MasterLog for any nonzero exit values.
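For anyone who wants to automate the log checks described above, here is a minimal Python sketch. It is not part of Condor or the RHTS test. The function name scan_log and the three regular expressions are assumptions derived only from the log excerpts in this report, so other Condor versions may word these messages differently.

```python
import re

# Patterns inferred from the log lines quoted in this bug report (assumption:
# the message wording is stable for the Condor version under test).
PLUGIN_FAIL_RE = re.compile(r"Failed to load plugin: (\S+)")
STACK_RE = re.compile(r"Stack dump for process (\d+)")
# Matches both "exited with status N" and "EXITING WITH STATUS N".
STATUS_RE = re.compile(r"\bexit(?:ed|ing) with status (\d+)", re.IGNORECASE)


def scan_log(lines):
    """Collect plugin load failures, stack dumps, and nonzero exits.

    `lines` is any iterable of log lines (for example an open log file).
    Returns a dict with three lists; all three are empty for a clean log.
    """
    findings = {"plugin_failures": [], "stack_dumps": [], "nonzero_exits": []}
    for line in lines:
        match = PLUGIN_FAIL_RE.search(line)
        if match:
            # Record the path of the plugin that failed to load.
            findings["plugin_failures"].append(match.group(1))
        match = STACK_RE.search(line)
        if match:
            # Record the pid of the process that dumped its stack.
            findings["stack_dumps"].append(int(match.group(1)))
        for match in STATUS_RE.finditer(line):
            if int(match.group(1)) != 0:
                # Keep the whole line so the daemon name stays visible.
                findings["nonzero_exits"].append(line.strip())
    return findings
```

To use it, open each file in the directory printed by condor_config_val LOG and pass the file object to scan_log. On the fixed package, all three lists would be expected to come back empty for the MasterLog.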
RHTS test qpid_test_qmf_agent_bz470167 validates that this issue has been fixed. ->VERIFIED
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2009-0036.html