Bug 607596 - condor_startd SEGV apparently on dynamic slot invalidation
Status: CLOSED CURRENTRELEASE
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: condor
Version: 1.2
Hardware: All
OS: Linux
Priority: high
Severity: high
Target Milestone: 1.3
Target Release: ---
Assigned To: Matthew Farrellee
QA Contact: Martin Kudlej
Reported: 2010-06-24 09:05 EDT by Jon Thomas
Modified: 2010-10-23 11:42 EDT
Doc Type: Bug Fix
Last Closed: 2010-09-03 09:31:05 EDT
Description Jon Thomas 2010-06-24 09:05:56 EDT
Condor version is:
condor-7.4.3-0.17.el5
classads-1.0.6-1.el5


Core was generated by `condor_startd -f'.
Program terminated with signal 11, Segmentation fault.
#0  0x00000035cef52a88 in main_arena () from /lib64/libc.so.6
(gdb) bt
#0  0x00000035cef52a88 in main_arena () from /lib64/libc.so.6
#1  0x0000000000530713 in HashTable<YourString, AttrListElem*>::lookup (this=0xd1c150, index=...,
   value=@0x7fffffffe2e8) at ../condor_c++_util/HashTable.h:329
#2  0x000000000052c8e4 in AttrList::Lookup (this=0x35cef52a38, name=0x593f1d "FetchWorkDelay")
   at attrlist.cpp:1003
#3  0x000000000052d18c in AttrList::EvalInteger (this=0x35cef52a38,
   name=0x593f1d "FetchWorkDelay", target=0x4c215be7, value=@0x7fffffffe384) at attrlist.cpp:1303
#4  0x0000000000485ab8 in Resource::evalNextFetchWorkDelay (this=0xc36e40) at Resource.cpp:2377
#5  0x0000000000485de8 in Resource::tryFetchWork (this=0xc36e40) at Resource.cpp:2408
#6  0x0000000000484f28 in ResState::eval (this=0xdd0f40) at ResState.cpp:419
#7  0x000000000047f1a1 in ResMgr::walk (this=<value optimized out>, memberfunc=
   (void (Resource::*)(Resource *)) 0x4838a0 <Resource::eval_state()>) at ResMgr.cpp:1085
#8  0x000000000047fb1f in ResMgr::update_all (this=0x7fffffffe2f0) at ResMgr.cpp:1475
#9  0x00000000004c1165 in TimerManager::Timeout (this=0x7f12a0) at timer_manager.cpp:419
#10 0x00000000004a9018 in DaemonCore::Driver (this=0xc29280) at daemon_core.cpp:2933
#11 0x00000000004bc7a8 in main (argc=1, argv=0x7fffffffed18) at daemon_core_main.cpp:2277
(gdb) up
#1  0x0000000000530713 in HashTable<YourString, AttrListElem*>::lookup (this=0xd1c150, index=...,
   value=@0x7fffffffe2e8) at ../condor_c++_util/HashTable.h:329
329  int idx = (int)(hashfcn(index) % tableSize);
(gdb)
#2  0x000000000052c8e4 in AttrList::Lookup (this=0x35cef52a38, name=0x593f1d "FetchWorkDelay")
   at attrlist.cpp:1003
1003 hash->lookup(name, tmpNode);
(gdb)
#3  0x000000000052d18c in AttrList::EvalInteger (this=0x35cef52a38,
   name=0x593f1d "FetchWorkDelay", target=0x4c215be7, value=@0x7fffffffe384) at attrlist.cpp:1303
1303 tree = Lookup(name);
(gdb)
#4  0x0000000000485ab8 in Resource::evalNextFetchWorkDelay (this=0xc36e40) at Resource.cpp:2377
2377 if (r_classad->EvalInteger(ATTR_FETCH_WORK_DELAY, job_ad, value) == 0) {
(gdb)
#5  0x0000000000485de8 in Resource::tryFetchWork (this=0xc36e40) at Resource.cpp:2408
2408 evalNextFetchWorkDelay();
(gdb)
#6  0x0000000000484f28 in ResState::eval (this=0xdd0f40) at ResState.cpp:419
419 rip->tryFetchWork();
(gdb) list
414 // If we're compiled to support fetching work
415 // automatically and configured to do so, check now if we
416 // should try to fetch more work.  Even if we're in the
417 // owner state, we can still see if the expressions allow
418 // any fetched work at this point.
419 rip->tryFetchWork();
420 #endif /* HAVE_JOB_HOOKS */
421
422 break;
423
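
One detail in the backtrace is worth noting: the AttrList `this` pointer in frames #2 and #3 (0x35cef52a38) lies 0x50 bytes below the main_arena address in frame #0 (0x35cef52a88). That suggests r_classad pointed into libc's malloc bookkeeping rather than at a live ad, which would be consistent with a slot being freed while a walk over the slots could still reach it. This is an inference from the addresses, not a confirmed root cause. The sketch below is not Condor code. It only illustrates one common way to avoid that class of problem: mark a slot as invalidated and free it later, outside the walk. All names in it (SketchAd, SketchSlot, SketchSlotManager) are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a slot ad: attribute name -> integer value.
struct SketchAd {
    std::unordered_map<std::string, int> attrs;

    // Returns true and stores the attribute in `value` when it exists.
    bool evalInteger(const std::string& name, int& value) const {
        auto it = attrs.find(name);
        if (it == attrs.end()) return false;
        value = it->second;
        return true;
    }
};

// Hypothetical stand-in for a slot that owns its ad.
struct SketchSlot {
    std::unique_ptr<SketchAd> ad;
    bool invalidated = false;
};

class SketchSlotManager {
public:
    // Creates a slot whose ad carries the given FetchWorkDelay.
    SketchSlot* add(int fetchWorkDelay) {
        auto slot = std::make_unique<SketchSlot>();
        slot->ad = std::make_unique<SketchAd>();
        slot->ad->attrs["FetchWorkDelay"] = fetchWorkDelay;
        slots_.push_back(std::move(slot));
        return slots_.back().get();
    }

    // Marks a slot for removal. Nothing is freed here, so a walk that is
    // already in progress never dereferences freed memory.
    void invalidate(SketchSlot* slot) { slot->invalidated = true; }

    // Visits every live slot and collects its FetchWorkDelay, using
    // `fallback` when the attribute is missing.
    std::vector<int> walk(int fallback) const {
        std::vector<int> delays;
        for (const auto& slot : slots_) {
            if (slot->invalidated || !slot->ad) continue;  // skip dead slots
            int value = 0;
            delays.push_back(
                slot->ad->evalInteger("FetchWorkDelay", value) ? value : fallback);
        }
        return delays;
    }

    // Frees invalidated slots. Call only when no walk is running.
    void reap() {
        slots_.erase(
            std::remove_if(slots_.begin(), slots_.end(),
                           [](const std::unique_ptr<SketchSlot>& s) {
                               return s->invalidated;
                           }),
            slots_.end());
    }

    std::size_t size() const { return slots_.size(); }

private:
    std::vector<std::unique_ptr<SketchSlot>> slots_;
};
```

In this sketch an invalidated slot stays allocated until reap() runs, so the walk can safely skip it instead of evaluating an ad that no longer exists.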
Comment 4 Martin Kudlej 2010-08-11 08:19:13 EDT
How can I reproduce this issue, please?
