Bug 853945 - RHHAv2 collector daemon SEGFAULT
Status: CLOSED NOTABUG
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: condor
Version: 2.2
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: 2.2
Target Release: ---
Assigned To: Timothy St. Clair
QA Contact: MRG Quality Engineering
Depends On:
Blocks: 799474 852537
Reported: 2012-09-03 07:36 EDT by Tomas Rusnak
Modified: 2012-09-25 10:48 EDT
CC List: 3 users

Doc Type: Bug Fix
Last Closed: 2012-09-04 13:08:41 EDT
Type: Bug


Attachments
collector daemon strace (17.41 KB, text/plain)
2012-09-03 07:41 EDT, Tomas Rusnak

Description Tomas Rusnak 2012-09-03 07:36:27 EDT
Description of problem:

Condor was upgraded from 7.6.5-0.18.el6 to 7.6.5-0.22.el6 on an RHHAv2 setup with multiple schedulers. After a condor restart, the collector fails to start and logs the following:

09/03/12 11:28:33 ******************************************************
09/03/12 11:28:33 ** condor_collector (CONDOR_COLLECTOR) STARTING UP
09/03/12 11:28:33 ** /usr/sbin/condor_collector
09/03/12 11:28:33 ** SubsystemInfo: name=COLLECTOR type=COLLECTOR(3) class=DAEMON(1)
09/03/12 11:28:33 ** Configuration: subsystem:COLLECTOR local:<NONE> class:DAEMON
09/03/12 11:28:33 ** $CondorVersion: 7.6.5 Aug 30 2012 BuildID: RH-7.6.5-0.22.el6 $
09/03/12 11:28:33 ** $CondorPlatform: X86_64-RedHat_6.3 $
09/03/12 11:28:33 ** PID = 25162
09/03/12 11:28:33 ** Log last touched 9/3 11:27:20
09/03/12 11:28:33 ******************************************************
09/03/12 11:28:33 Using config source: /etc/condor/condor_config
09/03/12 11:28:33 Using local config sources: 
09/03/12 11:28:33    /etc/condor/config.d/00personal_condor.config
09/03/12 11:28:33    /etc/condor/config.d/50ha.config
09/03/12 11:28:33    /etc/condor/config.d/60condor-qmf.config
09/03/12 11:28:33    /etc/condor/config.d/61aviary.config
09/03/12 11:28:33    /etc/condor/config.d/99configd.config
09/03/12 11:28:33    /var/lib/condor/wallaby_node.config
09/03/12 11:28:33 DaemonCore: command socket at <10.34.1.106:9618>
09/03/12 11:28:33 DaemonCore: private command socket at <10.34.1.106:9618>
09/03/12 11:28:33 Setting maximum accepts per cycle 8.
09/03/12 11:28:33 In ViewServer::Init()
09/03/12 11:28:33 In CollectorDaemon::Init()
09/03/12 11:28:33 In ViewServer::Config()
09/03/12 11:28:33 In CollectorDaemon::Config()
09/03/12 11:28:33 OfflineCollectorPlugin::configure: no persistent store was defined for off-line ads.
09/03/12 11:28:33 enable: Creating stats hash table
09/03/12 11:28:33 Enabling CCB Server.
09/03/12 11:28:33 Plugin registration succeeded
09/03/12 11:28:33 Successfully loaded plugin: /usr/lib64/condor/plugins/MgmtCollectorPlugin-plugin.so
09/03/12 11:28:33 WARNING: forward resolution of localhost.localdomain doesn't match 6a01220a!
Stack dump for process 25162 at timestamp 1346664513 (17 frames)
condor_collector(dprintf_dump_stack+0x63)[0x4e87b3]
condor_collector[0x527252]
/lib64/libpthread.so.0(+0xf500)[0x7fd5f52de500]
/lib64/libc.so.6(__nss_hostname_digits_dots+0x49)[0x7fd5f50390e9]
/lib64/libc.so.6(gethostbyname+0x90)[0x7fd5f503e990]
condor_collector(_Z18verify_name_has_ipPc7in_addr+0x31)[0x4a2f91]
condor_collector(_ZN8IpVerify6VerifyE12DCpermissionPK11sockaddr_inPKcP8MyStringS7_+0x4f4)[0x4a5484]
condor_collector(_ZN10DaemonCore6VerifyEPKc12DCpermissionPK11sockaddr_inS1_+0x85)[0x47b8a5]
condor_collector(_ZN10DaemonCore9HandleReqEP6StreamS1_+0xc21)[0x48a131]
condor_collector(_ZN10DaemonCore24CallSocketHandler_workerEibP6Stream+0x73d)[0x48d5bd]
condor_collector(_ZN10DaemonCore35CallSocketHandler_worker_demarshallEPv+0x1a)[0x48d5fa]
condor_collector(_ZN13CondorThreads8pool_addEPFvPvES0_PiPKc+0x40)[0x524e50]
condor_collector(_ZN10DaemonCore17CallSocketHandlerERib+0x135)[0x483095]
condor_collector(_ZN10DaemonCore6DriverEv+0x2012)[0x487d62]
condor_collector(main+0x116b)[0x47660b]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x7fd5f4f5acdd]
condor_collector[0x461879]
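
Note on the WARNING line just before the stack dump: the value 6a01220a looks like a raw IPv4 address printed as a hex word. The sketch below is only an assumption about how that value was formatted (a network-order in_addr printed as an integer on a little-endian x86_64 host); it is not taken from the condor sources. Under that assumption the value decodes to the same address the collector reports for its command socket. The helper names decode_hex_addr and forward_resolution_matches are made up for this example, and the second helper only illustrates what a forward-resolution comparison amounts to.

```python
import socket
import struct


def decode_hex_addr(hex_word):
    """Decode a hex word as an IPv4 address.

    Assumption: the word is a network-order in_addr that was printed
    as a 32-bit integer on a little-endian host.
    """
    value = int(hex_word, 16)
    # Packing the integer little-endian recovers the original byte order.
    return socket.inet_ntoa(struct.pack("<I", value))


def forward_resolution_matches(name, expected_ip, resolver=socket.gethostbyname_ex):
    """Return True if `name` forward-resolves to `expected_ip`.

    `resolver` is injectable so the check can be exercised without DNS.
    """
    try:
        _canonical, _aliases, addresses = resolver(name)
    except OSError:
        # Resolution failures (gaierror, herror) count as a mismatch.
        return False
    return expected_ip in addresses


if __name__ == "__main__":
    print(decode_hex_addr("6a01220a"))
```

Calling forward_resolution_matches("localhost.localdomain", "10.34.1.106") on the affected node would show whether the name in the warning resolves to the address the collector is bound to.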

Version-Release number of selected component (if applicable):

$CondorVersion: 7.6.5 Aug 30 2012 BuildID: RH-7.6.5-0.22.el6 $
$CondorPlatform: X86_64-RedHat_6.3 $

How reproducible:
100%

Steps to Reproduce:
1. Install or upgrade to 7.6.5-0.22
2. Set up multiple schedulers on 3 nodes with 1 central manager
3. Restart condor
4. Check /var/log/condor/CollectorLog
  
Actual results:
The collector crashes during startup and is not running.

Expected results:
The collector starts without crashing and stays available.

Additional info:

No core dump was generated even with:

ALL_DEBUG="D_FULLDEBUG"
ABORT_ON_EXCEPTION = True
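
One thing that may be worth ruling out when no core file appears is the core size limit of the environment the daemons are started from. This is a generic check, not something confirmed as the cause here. The sketch below reads RLIMIT_CORE for the current process only, so it would need to be run from the same environment that launches condor. The function name core_limit_status is made up for this example.

```python
import resource


def core_limit_status(limits=None):
    """Describe the soft core-file size limit.

    `limits` is an optional (soft, hard) pair; when omitted, the limits
    of the current process are read. This does not inspect a running
    daemon, only the process that calls it.
    """
    if limits is None:
        limits = resource.getrlimit(resource.RLIMIT_CORE)
    soft, _hard = limits
    if soft == 0:
        return "disabled"
    if soft == resource.RLIM_INFINITY:
        return "unlimited"
    return "limited to %d bytes" % soft


if __name__ == "__main__":
    print("core files:", core_limit_status())
```

A result of "disabled" would mean the kernel will not write a core file for processes inheriting this limit, regardless of the condor settings above.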

Related packages:

condor-classads-7.6.5-0.22.el6.x86_64
condor-wallaby-tools-4.1.3-1.el6.noarch
python-condorutils-1.5-4.el6.noarch
condor-7.6.5-0.22.el6.x86_64
condor-qmf-7.6.5-0.22.el6.x86_64
condor-wallaby-client-4.1.3-1.el6.noarch
condor-aviary-7.6.5-0.22.el6.x86_64
condor-cluster-resource-agent-7.6.5-0.22.el6.x86_64
condor-wallaby-base-db-1.23-1.el6.noarch
wallaby-utils-0.12.5-10.el6.noarch
condor-wallaby-tools-4.1.3-1.el6.noarch
ruby-wallaby-0.12.5-10.el6.noarch
python-wallabyclient-4.1.3-1.el6.noarch
condor-wallaby-client-4.1.3-1.el6.noarch
wallaby-0.12.5-10.el6.noarch
condor-wallaby-base-db-1.23-1.el6.noarch

qpid-cpp-server-xml-0.14-21.el6_3.x86_64
qpid-java-client-0.18-1.el6.noarch
qpid-cpp-client-0.14-21.el6_3.x86_64
qpid-cpp-server-0.14-21.el6_3.x86_64
python-qpid-qmf-0.14-14.el6_3.x86_64
qpid-cpp-server-store-0.14-21.el6_3.x86_64
qpid-java-example-0.18-1.el6.noarch
qpid-cpp-client-devel-docs-0.14-21.el6_3.noarch
qpid-java-common-0.18-1.el6.noarch
qpid-cpp-client-devel-0.14-21.el6_3.x86_64
qpid-cpp-server-devel-0.14-21.el6_3.x86_64
ruby-qpid-qmf-0.14-14.el6_3.x86_64
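
Since the problem appeared after an upgrade, a quick way to confirm that no package from the old build was left behind is to group the list above by version-release. The sketch below is a minimal example that assumes every entry has the plain name-version-release.arch form shown in the list; the helper names split_nvra and group_by_build are made up for this example.

```python
from collections import defaultdict


def split_nvra(package):
    """Split 'name-version-release.arch' into its four parts."""
    nvr, arch = package.rsplit(".", 1)
    name, version, release = nvr.rsplit("-", 2)
    return name, version, release, arch


def group_by_build(packages):
    """Map 'version-release' to the sorted package names carrying it."""
    groups = defaultdict(set)
    for package in packages:
        name, version, release, _arch = split_nvra(package)
        groups[version + "-" + release].add(name)
    return {build: sorted(names) for build, names in groups.items()}


if __name__ == "__main__":
    sample = [
        "condor-7.6.5-0.22.el6.x86_64",
        "condor-classads-7.6.5-0.22.el6.x86_64",
        "condor-qmf-7.6.5-0.22.el6.x86_64",
        "condor-wallaby-client-4.1.3-1.el6.noarch",
    ]
    for build, names in sorted(group_by_build(sample).items()):
        print(build, names)
```

For the packages listed in this report, the condor, condor-classads, condor-qmf, condor-aviary and condor-cluster-resource-agent entries all fall under 7.6.5-0.22.el6.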
Comment 1 Tomas Rusnak 2012-09-03 07:41:10 EDT
Created attachment 609338
collector daemon strace
