Bug 607596 - condor_startd SEGV apparently on dynamic slot invalidation
Summary: condor_startd SEGV apparently on dynamic slot invalidation
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: condor
Version: 1.2
Hardware: All
OS: Linux
Priority: high
Severity: high
Target Milestone: 1.3
Assignee: Matthew Farrellee
QA Contact: Martin Kudlej
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2010-06-24 13:05 UTC by Jon Thomas
Modified: 2018-10-27 11:58 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2010-09-03 13:31:05 UTC
Target Upstream Version:
Embargoed:


Description Jon Thomas 2010-06-24 13:05:56 UTC
Condor version is:
condor-7.4.3-0.17.el5
classads-1.0.6-1.el5


Core was generated by `condor_startd -f'.
Program terminated with signal 11, Segmentation fault.
#0  0x00000035cef52a88 in main_arena () from /lib64/libc.so.6
(gdb) bt
#0  0x00000035cef52a88 in main_arena () from /lib64/libc.so.6
#1  0x0000000000530713 in HashTable<YourString, AttrListElem*>::lookup (this=0xd1c150, index=...,
   value=@0x7fffffffe2e8) at ../condor_c++_util/HashTable.h:329
#2  0x000000000052c8e4 in AttrList::Lookup (this=0x35cef52a38, name=0x593f1d "FetchWorkDelay")
   at attrlist.cpp:1003
#3  0x000000000052d18c in AttrList::EvalInteger (this=0x35cef52a38,
   name=0x593f1d "FetchWorkDelay", target=0x4c215be7, value=@0x7fffffffe384) at attrlist.cpp:1303
#4  0x0000000000485ab8 in Resource::evalNextFetchWorkDelay (this=0xc36e40) at Resource.cpp:2377
#5  0x0000000000485de8 in Resource::tryFetchWork (this=0xc36e40) at Resource.cpp:2408
#6  0x0000000000484f28 in ResState::eval (this=0xdd0f40) at ResState.cpp:419
#7  0x000000000047f1a1 in ResMgr::walk (this=<value optimized out>, memberfunc=
   (void (Resource::*)(Resource *)) 0x4838a0 <Resource::eval_state()>) at ResMgr.cpp:1085
#8  0x000000000047fb1f in ResMgr::update_all (this=0x7fffffffe2f0) at ResMgr.cpp:1475
#9  0x00000000004c1165 in TimerManager::Timeout (this=0x7f12a0) at timer_manager.cpp:419
#10 0x00000000004a9018 in DaemonCore::Driver (this=0xc29280) at daemon_core.cpp:2933
#11 0x00000000004bc7a8 in main (argc=1, argv=0x7fffffffed18) at daemon_core_main.cpp:2277
(gdb) up
#1  0x0000000000530713 in HashTable<YourString, AttrListElem*>::lookup (this=0xd1c150, index=...,
   value=@0x7fffffffe2e8) at ../condor_c++_util/HashTable.h:329
329  int idx = (int)(hashfcn(index) % tableSize);
(gdb)
#2  0x000000000052c8e4 in AttrList::Lookup (this=0x35cef52a38, name=0x593f1d "FetchWorkDelay")
   at attrlist.cpp:1003
1003 hash->lookup(name, tmpNode);
(gdb)
#3  0x000000000052d18c in AttrList::EvalInteger (this=0x35cef52a38,
   name=0x593f1d "FetchWorkDelay", target=0x4c215be7, value=@0x7fffffffe384) at attrlist.cpp:1303
1303 tree = Lookup(name);
(gdb)
#4  0x0000000000485ab8 in Resource::evalNextFetchWorkDelay (this=0xc36e40) at Resource.cpp:2377
2377 if (r_classad->EvalInteger(ATTR_FETCH_WORK_DELAY, job_ad, value) == 0) {
(gdb)
#5  0x0000000000485de8 in Resource::tryFetchWork (this=0xc36e40) at Resource.cpp:2408
2408 evalNextFetchWorkDelay();
(gdb)
#6  0x0000000000484f28 in ResState::eval (this=0xdd0f40) at ResState.cpp:419
419 rip->tryFetchWork();
(gdb) list
414 // If we're compiled to support fetching work
415 // automatically and configured to do so, check now if we
416 // should try to fetch more work.  Even if we're in the
417 // owner state, we can still see if the expressions allow
418 // any fetched work at this point.
419 rip->tryFetchWork();
420 #endif /* HAVE_JOB_HOOKS */
421
422 break;
423
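
Reading the trace above: the AttrList `this` pointer in frames #2 and #3 (0x35cef52a38) sits just below the frame #0 address inside libc's main_arena (0x35cef52a88), so r_classad apparently no longer points at a heap-allocated ClassAd. That would be consistent with the Resource at 0xc36e40 being evaluated after it was freed, although the trace alone does not prove it. The following is only a minimal sketch of one defensive pattern (mark a slot as invalidated, skip it during the walk, free it afterwards). Slot, SlotMgr, invalidate and reap are hypothetical names for illustration; this is not condor code and not necessarily how the bug was fixed.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a startd slot (not the real Resource class).
struct Slot {
    std::string name;
    bool invalidated = false;  // set instead of deleting mid-walk
    int evals = 0;             // counts eval_state() calls, for illustration

    explicit Slot(std::string n) : name(std::move(n)) {}
    void eval_state() { ++evals; }
};

// Hypothetical stand-in for the resource manager that owns the slots.
class SlotMgr {
public:
    Slot* add(const std::string& name) {
        slots_.push_back(std::make_unique<Slot>(name));
        return slots_.back().get();
    }

    // Only marks the slot; its memory is released later in reap().
    void invalidate(Slot* s) { s->invalidated = true; }

    // Calls fn on every live slot, then frees the invalidated ones.
    void walk(void (Slot::*fn)()) {
        assert(fn != nullptr);
        // Index-based loop: stays valid even if a callback adds a slot.
        for (std::size_t i = 0; i < slots_.size(); ++i) {
            Slot* s = slots_[i].get();
            if (!s->invalidated) {
                (s->*fn)();
            }
        }
        reap();
    }

    std::size_t size() const { return slots_.size(); }

private:
    // Destroys invalidated slots once no walk is touching them.
    void reap() {
        slots_.erase(
            std::remove_if(slots_.begin(), slots_.end(),
                           [](const std::unique_ptr<Slot>& s) {
                               return s->invalidated;
                           }),
            slots_.end());
    }

    std::vector<std::unique_ptr<Slot>> slots_;
};
```

With this shape, calling mgr.invalidate(slot) followed by mgr.walk(&Slot::eval_state) never evaluates the invalidated slot, and it is destroyed only after the walk has finished.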

Comment 4 Martin Kudlej 2010-08-11 12:19:13 UTC
How can I reproduce this issue, please?

