Bug 472599

Summary: missing slot_type_x config causes dump in startd
Product: Red Hat Enterprise MRG
Component: grid
Version: 1.0
Hardware: All
OS: Linux
Status: CLOSED ERRATA
Severity: medium
Priority: medium
Target Milestone: 1.1.1
Reporter: Robert Rati <rrati>
Assignee: Will Benton <willb>
QA Contact: Kim van der Riet <kim.vdriet>
CC: jsarenik, matt
Doc Type: Bug Fix
Last Closed: 2009-04-21 16:19:12 UTC

Description Robert Rati 2008-11-21 22:25:57 UTC
Description of problem:
Adding:

SLOT_TYPE_1_PARTITIONABLE = TRUE
NUM_SLOTS = 1
NUM_SLOTS_TYPE_1 = 1

to Condor's configuration file in an attempt to get dynamic provisioning to work causes the startd to crash with a stack dump. Trace from StartLog:

11/21 16:25:48 "/usr/sbin/condor_starter.std -classad" did not produce any output, ignoring
Stack dump for process 11476 at timestamp 1227306348 (12 frames)
condor_startd(dprintf_dump_stack+0xc0)[0x4e83d3]
condor_startd[0x4e86a6]
/lib64/libc.so.6[0x328f0301b0]
condor_startd(_ZN4ListIcE6RewindEv+0xc)[0x48f2d2]
condor_startd(_ZN10StringList6rewindEv+0x19)[0x48f2f9]
condor_startd(_ZN6ResMgr9buildSlotEiP10StringListib+0x93)[0x48bd8d]
condor_startd(_ZN6ResMgr13buildCpuAttrsEiPib+0xff)[0x48c935]
condor_startd(_ZN6ResMgr14init_resourcesEv+0x1a8)[0x48d8a0]
condor_startd(_Z9main_initiPPc+0x27a)[0x4a353e]
condor_startd(main+0x17df)[0x4e00fb]
/lib64/libc.so.6(__libc_start_main+0xf4)[0x328f01d8b4]
condor_startd(__gxx_personality_v0+0x479)[0x47dee9]



Comment 1 Matthew Farrellee 2008-11-21 22:30:40 UTC
This can be reproduced with:

NUM_SLOTS = 1
NUM_SLOTS_TYPE_1 = 1

ResMgr.cpp:780 does not check whether the list is NULL.

This has nothing to do with Dynamic Provisioning.
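
For illustration, the kind of guard that is missing can be sketched as follows. This is a minimal sketch and not the actual patch: the StringList class here is a hypothetical stand-in for Condor's class of the same name, countSlotTypeEntries is an invented helper, and returning -1 is just one possible way to signal the error. The trace shows the crash inside StringList::rewind() called from ResMgr::buildSlot(), which is consistent with rewind() being called through a NULL list pointer when NUM_SLOTS_TYPE_1 is set but no SLOT_TYPE_1 is defined.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for Condor's StringList; NOT the real class.
class StringList {
public:
    explicit StringList(std::vector<std::string> items)
        : items_(std::move(items)) {}

    // Reset iteration to the first entry.
    void rewind() { pos_ = 0; }

    // Return the next entry, or nullptr once the list is exhausted.
    const char* next() {
        return pos_ < items_.size() ? items_[pos_++].c_str() : nullptr;
    }

private:
    std::vector<std::string> items_;
    std::size_t pos_ = 0;
};

// Hypothetical helper: count the entries of a slot type description.
// A null list models a NUM_SLOTS_TYPE_<N> setting that has no matching
// SLOT_TYPE_<N> definition. In that case return -1 instead of
// dereferencing the pointer, so the caller can report a config error.
int countSlotTypeEntries(StringList* list) {
    if (list == nullptr) {
        return -1;
    }
    list->rewind();
    int count = 0;
    while (list->next() != nullptr) {
        ++count;
    }
    return count;
}
```

With a guard like this, a missing slot type definition can be reported as a configuration error rather than taking down the daemon.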

Comment 2 Will Benton 2009-01-27 18:09:41 UTC
I have a fix for this in a branch at UW -- the SHA is 54997e6e05.  Once it gets reviewed, it should go into the 7.2 branch, but we can pull the patch sooner if need be.

Comment 3 Will Benton 2009-01-28 00:21:28 UTC
Matt reviewed this and it is now upstream in the 7.2 branch:  the SHA is 185c6ae41.

Comment 5 Jan Sarenik 2009-03-04 13:20:49 UTC
Reproduced on RHEL 5.3 i386, condor-7.1.4-0.3.el5.i386.rpm.

Verified that it works as expected on both RHEL 5.3 and RHEL 4.7,
i386 and x86_64 each.

Comment 7 errata-xmlrpc 2009-04-21 16:19:12 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHEA-2009-0434.html