Bug 575777 - scheduler universe jobs can start during schedd shutdown
Status: CLOSED ERRATA
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: condor
Version: 1.2
Hardware: All Linux
Priority: urgent
Severity: high
Target Milestone: 1.3
Target Release: ---
Assigned To: Matthew Farrellee
QA Contact: Martin Kudlej
Reported: 2010-03-22 08:11 EDT by Matthew Farrellee
Modified: 2018-10-27 10:50 EDT
Doc Type: Bug Fix
Doc Text:
Previously, directed acyclic graph (DAG) jobs were restarted during scheduler daemon shutdown until the shutdown time finally expired. With this update, DAGMan jobs are no longer restarted during scheduler shutdown.
Last Closed: 2010-10-14 12:00:53 EDT

External Trackers
Tracker: Red Hat Product Errata
ID: RHSA-2010:0773
Priority: normal
Status: SHIPPED_LIVE
Summary: Moderate: Red Hat Enterprise MRG Messaging and Grid Version 1.3
Last Updated: 2010-10-14 11:56:44 EDT

Description Matthew Farrellee 2010-03-22 08:11:33 EDT
Description of problem:

During schedd shutdown, DAG jobs are restarted as they are terminated.

Steps to reproduce:

1.  Submit a thousand dagman jobs to a schedd
2.  condor_off -master
3.  condor_schedd never dies. The condor_dagman child processes are removed and restarted during shutdown.

How reproducible:

100%

See upstream: http://condor-wiki.cs.wisc.edu/index.cgi/tktview?tn=1299
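
Whether condor_schedd "never dies" (step 3 above) can be checked mechanically. The following is a minimal sketch and not part of the original report: wait_for_pid_exit is a hypothetical helper that polls a PID with kill -0 and reports whether the process went away within a timeout. It assumes the caller is allowed to signal the process (for example, it runs as root) and that the PID is not reused while waiting. The 600-second timeout in the usage comment is an arbitrary placeholder.

```shell
#!/bin/sh
# wait_for_pid_exit PID TIMEOUT_SECONDS  (hypothetical helper, not a Condor tool)
# Returns 0 as soon as PID no longer exists, or 1 if it is still alive
# after roughly TIMEOUT_SECONDS seconds.
wait_for_pid_exit() {
    pid=$1
    remaining=$2
    # kill -0 delivers no signal; it only tests whether the process can be signalled.
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$remaining" -le 0 ]; then
            return 1
        fi
        sleep 1
        remaining=$((remaining - 1))
    done
    return 0
}

# Example use (commented out so the sketch has no side effects):
#   schedd_pid=$(pgrep -x condor_schedd)   # record the PID before shutdown
#   condor_off -master
#   wait_for_pid_exit "$schedd_pid" 600 || echo "condor_schedd still running"
```

On an affected build the helper would be expected to time out and return 1; on a fixed build it should return 0 once the schedd has exited.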
Comment 1 Matthew Farrellee 2010-03-22 10:05:53 EDT
Fixed in 7.4.3-0.6
Comment 3 Martin Kudlej 2010-04-07 08:06:49 EDT
Tested on RHEL 5.5 i386 with condor-7.4.1-0.7.1.el5 and it doesn't work (the bug reproduces).

Tested on RHEL 5.5 x86_64/i386 with condor-7.4.3-0.8.el5 and it works.


Set SCHEDD_INTERVAL=5 in the configuration.

for i in `seq 20`; do condor_submit_dag diamond$i.dag; done

where diamond$i.dag contains:
JOB  A  /root/dag/diamond_job.condor
JOB  B  /root/dag/diamond_job.condor
JOB  C  /root/dag/diamond_job.condor
JOB  D  /root/dag/diamond_job.condor
PARENT A CHILD B C
PARENT B C CHILD D

and diamond_job.condor contains:
executable   = /bin/sleep
arguments    = 1h
universe     = vanilla
notification = NEVER
queue

After submission, I ran as root:
condor_off -master
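
The recipe above needs 20 DAG files and one shared submit file. A small sketch that writes them, using the file contents quoted in this comment. The DAG_DIR and COUNT variables are assumptions introduced here for convenience; the default directory is $PWD/dag, which equals /root/dag when the script is run from /root as in the original test.

```shell
#!/bin/sh
# Sketch: generate diamond_job.condor and diamond1.dag .. diamondN.dag.
# DAG_DIR and COUNT are hypothetical knobs, not part of the original report.
DAG_DIR="${DAG_DIR:-$PWD/dag}"
COUNT="${COUNT:-20}"

mkdir -p "$DAG_DIR"

# Shared submit description: each node just sleeps for an hour.
cat > "$DAG_DIR/diamond_job.condor" <<'EOF'
executable   = /bin/sleep
arguments    = 1h
universe     = vanilla
notification = NEVER
queue
EOF

# One diamond-shaped DAG per file: A fans out to B and C, which join at D.
i=1
while [ "$i" -le "$COUNT" ]; do
    cat > "$DAG_DIR/diamond$i.dag" <<EOF
JOB  A  $DAG_DIR/diamond_job.condor
JOB  B  $DAG_DIR/diamond_job.condor
JOB  C  $DAG_DIR/diamond_job.condor
JOB  D  $DAG_DIR/diamond_job.condor
PARENT A CHILD B C
PARENT B C CHILD D
EOF
    i=$((i + 1))
done
```

The script only writes files. Submitting them with the condor_submit_dag loop and running condor_off -master remain manual steps, as described in the comment.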
Comment 4 Martin Kudlej 2010-06-11 05:54:15 EDT
Tested on RHEL 5.5/4.8, x86_64/i386, with condor-7.4.3-0.17 and it works. --> VERIFIED
Comment 5 Florian Nadge 2010-10-07 08:12:32 EDT
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
Previously, directed acyclic graph (dag) jobs were terminated and restarted during scheduled shutdown, until shutdown time finally expires. With this update, dag jobs work as expected.
Comment 6 Florian Nadge 2010-10-07 08:23:06 EDT
    Technical note updated. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    Diffed Contents:
@@ -1 +1 @@
-Previously, directed acyclic graph (dag) jobs were terminated and restarted during scheduled shutdown, until shutdown time finally expires. With this update, dag jobs work as expected.
+Previously, directed acyclic graph (dag) jobs were terminated and restarted during scheduled shutdown, until shutdown time finally expires. With this update, dag jobs behave as expected.
Comment 7 Florian Nadge 2010-10-07 08:28:13 EDT
    Technical note updated. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    Diffed Contents:
@@ -1 +1 @@
-Previously, directed acyclic graph (dag) jobs were terminated and restarted during scheduled shutdown, until shutdown time finally expires. With this update, dag jobs behave as expected.
+Previously, directed acyclic graph (dag) jobs were restarted during scheduler daemon shutdown, until, until shutdown time finally expires. With this update, dagman jobs do not anymore restart during scheduler shutdown.
Comment 8 Florian Nadge 2010-10-07 08:30:58 EDT
    Technical note updated. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    Diffed Contents:
@@ -1 +1 @@
-Previously, directed acyclic graph (dag) jobs were restarted during scheduler daemon shutdown, until, until shutdown time finally expires. With this update, dagman jobs do not anymore restart during scheduler shutdown.
+Previously, directed acyclic graph (dag) jobs were restarted during scheduler daemon shutdown,until shutdown time finally expires. With this update, dagman jobs do not anymore restart during scheduler shutdown.
Comment 10 errata-xmlrpc 2010-10-14 12:00:53 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2010-0773.html
