Bug 754733

Summary: shadow fails to shut down, slot stays Claimed/Idle
Product: Red Hat Enterprise MRG
Reporter: Timothy St. Clair <tstclair>
Component: condor
Assignee: Timothy St. Clair <tstclair>
Status: CLOSED WORKSFORME
QA Contact: MRG Quality Engineering <mrgqe-bugs>
Severity: urgent
Priority: urgent
Version: 2.1
CC: matt
Target Milestone: 2.1
Hardware: All
OS: Linux
Doc Type: Bug Fix
Last Closed: 2011-11-17 17:15:29 UTC

Description Timothy St. Clair 2011-11-17 15:26:20 UTC
Description of problem:
Upstream has observed a timing condition in which the shadow can hang during the transition at the end of a job, leaving the job idle.

Version-Release number of selected component (if applicable):
condor-7.6.5-0.6

How reproducible:
100%

Steps to Reproduce:
In the configuration of the execute machine:

   MachineMaxVacateTime = 0
   KILLING_TIMEOUT = 0

In the submit file:

   periodic_remove = JobStatus == 2
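
For reference, a complete submit file for this reproducer might look like the sketch below. Only the periodic_remove line comes from this report; the universe, executable, arguments, and log file name are assumptions chosen for illustration. JobStatus == 2 means the job is Running, so the expression asks for the job to be removed as soon as it starts.

   # Hypothetical test job; only periodic_remove is taken from this report.
   universe        = vanilla
   executable      = /bin/sleep
   arguments       = 600
   log             = bz754733.log
   # JobStatus == 2 is Running: remove the job once it is running.
   periodic_remove = JobStatus == 2
   queue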
  
Actual results:
When the shadow deactivates the claim, the startd hard-kills the starter, so no final update is sent. The shadow then waits indefinitely for a final update that will never arrive.

Expected results:
The job exits cleanly and JobStatus is updated correctly.

Additional info:
https://condor-wiki.cs.wisc.edu/index.cgi/tktview?tn=2591

Comment 3 Timothy St. Clair 2011-11-17 17:19:53 UTC
The problem appears on the upstream patch set only.