CLAIM_WORKLIFE specifies how long a claim can be reused.
SHADOW_WORKLIFE specifies how long a shadow can be reused.

Great scale and performance gains come from SHADOW_WORKLIFE. It eliminates the need for the Schedd to constantly spawn and reap shadows; instead, the Schedd can pass new jobs to an existing shadow. As of 7.6.1-0.2 and earlier, the Schedd will only pass a shadow jobs that fit the claim the shadow was spawned with. This means that when the claim expires, the shadow appears to have no more work and exits as well. Thus CLAIM_WORKLIFE bounds the worklife of the shadow.

Scheduler::RecycleShadow() could search for a new claim to give to an existing shadow.
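The interaction described above can be illustrated with a small model. This is only a sketch, not the schedd's actual logic: the function names are hypothetical, ages and worklifes are assumed to be in seconds, and the shadow is assumed to be spawned at the same moment its claim begins.

```python
def can_recycle_shadow(shadow_age, claim_age, shadow_worklife, claim_worklife):
    """Hypothetical model: may an existing shadow be handed another job?

    The shadow is only reused while it is within its own worklife AND
    the claim it was spawned with is still within its worklife.
    """
    shadow_ok = shadow_age < shadow_worklife
    claim_ok = claim_age < claim_worklife
    return shadow_ok and claim_ok


def effective_shadow_worklife(shadow_worklife, claim_worklife):
    """Assuming shadow and claim start together, the shorter limit wins."""
    return min(shadow_worklife, claim_worklife)


if __name__ == "__main__":
    # Illustrative values only: a 3600 s shadow worklife, a 1200 s claim worklife.
    print(effective_shadow_worklife(3600, 1200))
    print(can_recycle_shadow(600, 600, 3600, 1200))
    print(can_recycle_shadow(1500, 1500, 3600, 1200))
```

In this model, raising the shadow worklife above the claim worklife buys nothing, which is the bound described above.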
Now tracking upstream: https://condor-wiki.cs.wisc.edu/index.cgi/tktview?tn=2229
Dev notes: The original patch would not work because of the sequence of events the schedd follows when starting a claim. If CLAIM_WORKLIFE expires, the schedd has no way to block while it requests a new claim for a given match_rec. Presently, it enqueues a claim_request for a given match and then waits for a response. Once the response has been received, it triggers another timer to start the job. Everything is 1:1 as it comes in from negotiation. I will consult with upstream, but it appears that doing this would require rearchitecting how we handle claim_request and how jobs are spun up. It does, however, seem possible to reuse the shadow for other transient error conditions (107 & 108).
Dev notes: Consulted with DanB, and he agrees with the assessment. There is little to be gained from trying to recycle a shadow once a claim has expired, because you would need to request a new claim, which is a high-latency event. Will pursue the option of reusing the shadow for other error conditions.