Bug 754733 - shadow fails to shut down, slot stays Claimed/Idle
Keywords:
Status: CLOSED WORKSFORME
Alias: None
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: condor
Version: 2.1
Hardware: All
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: 2.1
Target Release: ---
Assignee: Timothy St. Clair
QA Contact: MRG Quality Engineering
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2011-11-17 15:26 UTC by Timothy St. Clair
Modified: 2011-11-17 17:19 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-11-17 17:15:29 UTC
Target Upstream Version:



Description Timothy St. Clair 2011-11-17 15:26:20 UTC
Description of problem:
Upstream has observed a timing condition during the transition at the end of a job: the shadow can hang instead of shutting down, and the slot stays Claimed/Idle.

Version-Release number of selected component (if applicable):
condor-7.6.5-0.6

How reproducible:
100%

Steps to Reproduce:
In the configuration of the execute machine:

   MachineMaxVacateTime = 0
   KILLING_TIMEOUT = 0

In the submit file:

   periodic_remove = JobStatus == 2
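
For reference, JobStatus 2 is the code for a running job, so the periodic_remove expression above becomes true as soon as the job starts. The sketch below only illustrates that mapping; the table name and the helper function are hypothetical and are not part of condor.

```python
# Standard JobStatus codes from the job ClassAd.
JOB_STATUS = {
    1: "Idle",
    2: "Running",
    3: "Removed",
    4: "Completed",
    5: "Held",
}


def periodic_remove_fires(job_status):
    """Mimic the submit-file expression: periodic_remove = JobStatus == 2.

    Hypothetical helper for illustration only.
    """
    return job_status == 2


for code, name in JOB_STATUS.items():
    print(code, name, periodic_remove_fires(code))
```

Only the Running row evaluates to true, which is why the removal always races with a job that has just started.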
  
Actual results:
When the shadow deactivates the claim, the startd hard-kills the starter, so there is no final update. The shadow then waits around indefinitely for a final update that will never arrive.
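
The hang described above can be modelled as a receiver blocking on a message whose sender has already been killed. The following is a minimal sketch of that pattern with a bounded wait added, not the condor implementation and not the upstream fix; the queue, the function name, and the update payload are all assumptions made for illustration.

```python
import queue


def wait_for_final_update(updates, timeout_s):
    """Return the final update, or None if none arrives within timeout_s.

    Hypothetical stand-in for the shadow's wait. Calling updates.get()
    with no timeout would block forever when the sender is gone, which
    mirrors the behaviour reported in this bug.
    """
    try:
        return updates.get(timeout=timeout_s)
    except queue.Empty:
        return None


# Case 1: the starter was hard-killed, so nothing is ever sent.
killed_starter = queue.Queue()
result_killed = wait_for_final_update(killed_starter, timeout_s=0.1)

# Case 2: the starter shuts down normally and sends its final update.
clean_starter = queue.Queue()
clean_starter.put({"final": True})  # placeholder payload
result_clean = wait_for_final_update(clean_starter, timeout_s=0.1)

print(result_killed, result_clean)
```

With a bounded wait the receiver can tell the two cases apart and proceed with shutdown in both; with an unbounded wait the first case never returns.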

Expected results:
The job exits cleanly and JobStatus is updated correctly.

Additional info:
https://condor-wiki.cs.wisc.edu/index.cgi/tktview?tn=2591

Comment 3 Timothy St. Clair 2011-11-17 17:19:53 UTC
This appears on the upstream patch set only.

