Bug 808142 - condor_startd reconfig from static slots to partitionable slots causes ERROR and exit if job is running
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: condor
Version: 1.3
Hardware: All
OS: All
Priority: low
Severity: low
Target Milestone: ---
Target Release: ---
Assignee: grid-maint-list
QA Contact: MRG Quality Engineering
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2012-03-29 17:10 UTC by Matthew Farrellee
Modified: 2016-05-26 19:57 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-05-26 19:57:07 UTC
Target Upstream Version:
Embargoed:



Description Matthew Farrellee 2012-03-29 17:10:34 UTC
Description of problem:

The condor_startd allows a reconfig to change its slot configuration. If a job
is running during that reconfiguration, the startd will ERROR and exit.


Version-Release number of selected component (if applicable):

Likely all, definitely 7.6.7-0.8.


How reproducible:

100%


Steps to Reproduce:
1. service condor start

2. printf 'cmd=/bin/sleep\nargs=1d\nqueue\n' | condor_submit

3. Wait for job to start running

4. Enable partitionable slots

 SLOT_TYPE_1=cpus=100%
 NUM_SLOTS=1
 NUM_SLOTS_TYPE_1 = 1
 SLOT_TYPE_1_PARTITIONABLE=true

5. condor_reconfig
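
Steps 4 and 5 can be scripted. This is a minimal sketch, not part of the original report: the function name write_pslot_config is hypothetical, and the config file path is an assumption that depends on which local config file your installation reads.

```shell
# Hypothetical helper: append the partitionable-slot settings from step 4
# to the config file given as $1. The caller chooses the path; nothing here
# assumes a particular install layout.
write_pslot_config() {
    config_file="$1"
    cat >> "$config_file" <<'EOF'
SLOT_TYPE_1 = cpus=100%
NUM_SLOTS = 1
NUM_SLOTS_TYPE_1 = 1
SLOT_TYPE_1_PARTITIONABLE = true
EOF
}
```

For example, `write_pslot_config /etc/condor/config.d/99-pslot.config && condor_reconfig`, where the config.d path is assumed rather than taken from the report.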


Actual results:

In MasterLog -

 Sent SIGHUP to STARTD (pid 10522)
 The STARTD (pid 10522) exited with status 0

In StartLog -

 ERROR "Unknown state in ResState::leave_action" at line 604 in file .../src/condor_startd.V6/ResState.cpp
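
To check for this failure after a reconfig, the StartLog can be searched for the error text above. This is a minimal sketch: the function name startd_hit_leave_action_error is hypothetical, and the log path must be supplied by the caller.

```shell
# Hypothetical helper: succeed (exit status 0) if the log file given as $1
# contains the ERROR message quoted in this report, fail otherwise.
startd_hit_leave_action_error() {
    log_file="$1"
    grep -q 'Unknown state in ResState::leave_action' "$log_file"
}
```

For example, `startd_hit_leave_action_error "$(condor_config_val STARTD_LOG)" && echo "startd hit the bug"`, assuming STARTD_LOG is defined in the running configuration.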


Expected results:

No ERROR, no exit.

Comment 2 Anne-Louise Tangring 2016-05-26 19:57:07 UTC
MRG-Grid is in maintenance and only customer escalations will be considered. This issue can be reopened if a customer escalation associated with it occurs.

