Bug 575777

Summary: scheduler universe jobs can start during schedd shutdown
Product: Red Hat Enterprise MRG
Reporter: Matthew Farrellee <matt>
Component: condor
Assignee: Matthew Farrellee <matt>
Status: CLOSED ERRATA
QA Contact: Martin Kudlej <mkudlej>
Severity: high
Docs Contact:
Priority: urgent
Version: 1.2
CC: fnadge, mkudlej, tao
Target Milestone: 1.3
Target Release: ---
Hardware: All
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Previously, directed acyclic graph (DAG) jobs were restarted during scheduler daemon shutdown until the shutdown time finally expired. With this update, dagman jobs are no longer restarted during scheduler shutdown.
Last Closed: 2010-10-14 16:00:53 UTC

Description Matthew Farrellee 2010-03-22 12:11:33 UTC
Description of problem:

During schedd shutdown, DAG jobs are restarted as soon as they are terminated.

Steps to reproduce:

1.  Submit a thousand dagman jobs to a schedd
2.  condor_off -master
3.  condor_schedd never dies. The condor_dagman child processes are removed and restarted during shutdown.

How reproducible:

100%

See upstream: http://condor-wiki.cs.wisc.edu/index.cgi/tktview?tn=1299
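
Step 3 above can be checked with a bounded wait instead of watching the process list by hand. The following is a minimal sketch, not taken from the original report: wait_for_pid_exit is a hypothetical helper, the 600-second limit in the usage comment is an arbitrary assumption, and the check should run as root so that kill -0 is permitted to probe the daemon.

```shell
# Poll until the process with the given PID exits or the timeout elapses.
# Returns 0 if the process exited, 1 if it was still alive at the timeout.
wait_for_pid_exit() {
    pid=$1
    timeout=$2      # seconds
    elapsed=0
    # kill -0 sends no signal; it only tests whether the PID can be signalled.
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}

# Example usage (assumes a single condor_schedd on the host):
#   schedd_pid=$(pgrep -x condor_schedd)
#   condor_off -master
#   wait_for_pid_exit "$schedd_pid" 600 || echo "condor_schedd still running"
```

With the bug present, the wait would be expected to time out, since the schedd keeps restarting condor_dagman instead of exiting.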

Comment 1 Matthew Farrellee 2010-03-22 14:05:53 UTC
Fixed in 7.4.3-0.6

Comment 3 Martin Kudlej 2010-04-07 12:06:49 UTC
Tested on RHEL 5.5 i386 with condor-7.4.1-0.7.1.el5 (a build from before the fix) and it doesn't work: the problem reproduces.

Tested on RHEL 5.5 x86_64/i386 with condor-7.4.3-0.8.el5 and it works.


Set SCHEDD_INTERVAL=5 in the configuration.

for i in `seq 20`; do condor_submit_dag diamond$i.dag; done

where each diamond$i.dag contains:
JOB  A  /root/dag/diamond_job.condor
JOB  B  /root/dag/diamond_job.condor
JOB  C  /root/dag/diamond_job.condor
JOB  D  /root/dag/diamond_job.condor
PARENT A CHILD B C
PARENT B C CHILD D

and diamond_job.condor:
executable   = /bin/sleep
arguments    = 1h
universe     = vanilla
notification = NEVER
queue

After submission, I ran as root:
condor_off -master
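
The test setup above can also be scripted. The following is a minimal sketch, not part of the original test run: write_submit_file and write_diamond_dags are hypothetical helper names, and the target directory is passed as a parameter (the report used /root/dag).

```shell
# Write the shared submit description used by every DAG node.
write_submit_file() {
    # $1: directory that will hold the DAG and submit files
    cat > "$1/diamond_job.condor" <<'EOF'
executable   = /bin/sleep
arguments    = 1h
universe     = vanilla
notification = NEVER
queue
EOF
}

# Write diamond1.dag .. diamondN.dag, each a four-node diamond (A -> B,C -> D).
write_diamond_dags() {
    # $1: target directory, $2: number of DAG files to create
    dir=$1
    count=$2
    i=1
    while [ "$i" -le "$count" ]; do
        cat > "$dir/diamond$i.dag" <<EOF
JOB  A  $dir/diamond_job.condor
JOB  B  $dir/diamond_job.condor
JOB  C  $dir/diamond_job.condor
JOB  D  $dir/diamond_job.condor
PARENT A CHILD B C
PARENT B C CHILD D
EOF
        i=$((i + 1))
    done
}

# Example usage, matching the 20 DAGs submitted in this comment:
#   write_submit_file /root/dag
#   write_diamond_dags /root/dag 20
#   for i in `seq 20`; do condor_submit_dag /root/dag/diamond$i.dag; done
```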

Comment 4 Martin Kudlej 2010-06-11 09:54:15 UTC
Tested on RHEL5.5/48 x x86_64/i386 with condor-7.4.3-0.17 and it works. -->VERIFIED

Comment 5 Florian Nadge 2010-10-07 12:12:32 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
Previously, directed acyclic graph (dag) jobs were terminated and restarted during scheduled shutdown, until shutdown time finally expires. With this update, dag jobs work as expected.

Comment 6 Florian Nadge 2010-10-07 12:23:06 UTC
    Technical note updated. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    Diffed Contents:
@@ -1 +1 @@
-Previously, directed acyclic graph (dag) jobs were terminated and restarted during scheduled shutdown, until shutdown time finally expires. With this update, dag jobs work as expected.
+Previously, directed acyclic graph (dag) jobs were terminated and restarted during scheduled shutdown, until shutdown time finally expires. With this update, dag jobs behave as expected.

Comment 7 Florian Nadge 2010-10-07 12:28:13 UTC
    Technical note updated. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    Diffed Contents:
@@ -1 +1 @@
-Previously, directed acyclic graph (dag) jobs were terminated and restarted during scheduled shutdown, until shutdown time finally expires. With this update, dag jobs behave as expected.
+Previously, directed acyclic graph (dag) jobs were restarted during scheduler daemon shutdown, until, until shutdown time finally expires. With this update, dagman jobs do not anymore restart during scheduler shutdown.

Comment 8 Florian Nadge 2010-10-07 12:30:58 UTC
    Technical note updated. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    Diffed Contents:
@@ -1 +1 @@
-Previously, directed acyclic graph (dag) jobs were restarted during scheduler daemon shutdown, until, until shutdown time finally expires. With this update, dagman jobs do not anymore restart during scheduler shutdown.
+Previously, directed acyclic graph (dag) jobs were restarted during scheduler daemon shutdown,until shutdown time finally expires. With this update, dagman jobs do not anymore restart during scheduler shutdown.

Comment 10 errata-xmlrpc 2010-10-14 16:00:53 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2010-0773.html