Red Hat Bugzilla – Bug 575777
scheduler universe jobs can start during schedd shutdown
Last modified: 2018-10-27 10:50:09 EDT
Description of problem:
During Schedd shutdown dag jobs are restarted as they are terminated.

Steps to reproduce:
1. Submit a thousand dagman jobs to a schedd
2. condor_off -master
3. condor_schedd never dies. The condor_dagman child processes are removed and restarted during shutdown.

How reproducible:
100%

See upstream: http://condor-wiki.cs.wisc.edu/index.cgi/tktview?tn=1299
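One way to observe step 3 is to poll for the daemon after requesting shutdown. The following is only a sketch: the helper name `wait_for_exit` and the timeout value are assumptions, not part of the original report, and it relies on `pgrep` from procps being installed.

```shell
#!/bin/sh
# wait_for_exit NAME TIMEOUT (hypothetical helper, not from the report):
# poll once per second until no process named exactly NAME remains.
# Returns 0 once the process is gone, 1 if TIMEOUT seconds elapse first.
wait_for_exit() {
    name=$1
    timeout=$2
    elapsed=0
    while pgrep -x "$name" >/dev/null 2>&1; do
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}

# Example use after step 2 (left commented out; the 600 s limit is an assumption):
#   condor_off -master
#   wait_for_exit condor_schedd 600 || echo "condor_schedd still running"
```

On an affected build the helper would be expected to time out, since condor_schedd never exits.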
Fixed in 7.4.3-0.6
Tested on RHEL 5.5 i386 with condor-7.4.1-0.7.1.el5 and it doesn't work (the bug reproduces).
Tested on RHEL 5.5 x86_64/i386 with condor-7.4.3-0.8.el5 and it works.

Set SCHEDD_INTERVAL=5 in the configuration.

    for i in `seq 20`; do condor_submit_dag diamond$i.dag; done

where diamond$i.dag is:

    JOB A /root/dag/diamond_job.condor
    JOB B /root/dag/diamond_job.condor
    JOB C /root/dag/diamond_job.condor
    JOB D /root/dag/diamond_job.condor
    PARENT A CHILD B C
    PARENT B C CHILD D

and diamond_job.condor is:

    executable = /bin/sleep
    arguments = 1h
    universe = vanilla
    notification = NEVER
    queue

After submission I ran, as root:

    condor_off -master
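The test setup above can be scripted. This is a minimal sketch under stated assumptions: `WORKDIR` and `COUNT` are hypothetical variables introduced here (the report used /root/dag and 20 DAGs), and the submit step only runs if `condor_submit_dag` is found on PATH. It does not set SCHEDD_INTERVAL or run condor_off.

```shell
#!/bin/sh
# Sketch: generate the diamond DAG files described in the test comment and
# submit them when HTCondor's condor_submit_dag is available.
# WORKDIR and COUNT are hypothetical knobs, not part of the original report.
WORKDIR="${WORKDIR:-$(mktemp -d)}"
COUNT="${COUNT:-20}"

mkdir -p "$WORKDIR"

# Submit description shared by every DAG node (contents taken from the report).
cat > "$WORKDIR/diamond_job.condor" <<'EOF'
executable = /bin/sleep
arguments = 1h
universe = vanilla
notification = NEVER
queue
EOF

i=1
while [ "$i" -le "$COUNT" ]; do
    # Diamond shape: A fans out to B and C, which both feed D.
    cat > "$WORKDIR/diamond$i.dag" <<EOF
JOB A $WORKDIR/diamond_job.condor
JOB B $WORKDIR/diamond_job.condor
JOB C $WORKDIR/diamond_job.condor
JOB D $WORKDIR/diamond_job.condor
PARENT A CHILD B C
PARENT B C CHILD D
EOF
    i=$((i + 1))
done

if command -v condor_submit_dag >/dev/null 2>&1; then
    for dag in "$WORKDIR"/diamond*.dag; do
        condor_submit_dag "$dag"
    done
else
    echo "condor_submit_dag not found; files generated in $WORKDIR only"
fi
```

Once the DAGs are queued, running `condor_off -master` as root reproduces the shutdown scenario from the report.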
Tested on RHEL5.5/48 x x86_64/i386 with condor-7.4.3-0.17 and it works. -->VERIFIED
Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team. New Contents: Previously, directed acyclic graph (dag) jobs were terminated and restarted during scheduled shutdown, until shutdown time finally expires. With this update, dag jobs work as expected.
Technical note updated.
Diffed Contents:
@@ -1 +1 @@
-Previously, directed acyclic graph (dag) jobs were terminated and restarted during scheduled shutdown, until shutdown time finally expires. With this update, dag jobs work as expected.
+Previously, directed acyclic graph (dag) jobs were terminated and restarted during scheduled shutdown, until shutdown time finally expires. With this update, dag jobs behave as expected.
Technical note updated.
Diffed Contents:
@@ -1 +1 @@
-Previously, directed acyclic graph (dag) jobs were terminated and restarted during scheduled shutdown, until shutdown time finally expires. With this update, dag jobs behave as expected.
+Previously, directed acyclic graph (dag) jobs were restarted during scheduler daemon shutdown, until, until shutdown time finally expires. With this update, dagman jobs do not anymore restart during scheduler shutdown.
Technical note updated.
Diffed Contents:
@@ -1 +1 @@
-Previously, directed acyclic graph (dag) jobs were restarted during scheduler daemon shutdown, until, until shutdown time finally expires. With this update, dagman jobs do not anymore restart during scheduler shutdown.
+Previously, directed acyclic graph (dag) jobs were restarted during scheduler daemon shutdown,until shutdown time finally expires. With this update, dagman jobs do not anymore restart during scheduler shutdown.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.
http://rhn.redhat.com/errata/RHSA-2010-0773.html