Bug 712111 - condor doesn't control deferral_* numbers
Summary: condor doesn't control deferral_* numbers
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: condor
Version: 1.0
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: low
Target Milestone: 2.1
Target Release: ---
Assignee: Timothy St. Clair
QA Contact: Daniel Horák
URL:
Whiteboard:
Depends On:
Blocks: 743350
Reported: 2011-06-09 14:28 UTC by Lubos Trilety
Modified: 2012-01-23 17:27 UTC (History)

Fixed In Version: condor-7.6.4-0.5
Doc Type: Bug Fix
Doc Text:
When a negative or floating point number was entered into a deferral_* parameter during submission, jobs were placed on hold as a result. With this update, these parameters are checked before submission to ensure they contain a non-negative integer value, and only jobs with valid deferral_* parameters are allowed to run.
Clone Of:
Environment:
Last Closed: 2012-01-23 17:27:11 UTC
Target Upstream Version:
Embargoed:



Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHEA-2012:0045 | 0 | normal | SHIPPED_LIVE | Red Hat Enterprise MRG Grid 2.1 bug fix and enhancement update | 2012-01-23 22:22:58 UTC

Description Lubos Trilety 2011-06-09 14:28:21 UTC
Description of problem:
There are three parameters used for time scheduling for job execution:
deferral_time
deferral_window
deferral_prep_time
All of these parameters should accept non-negative integers only, but condor also accepts negative numbers and decimal numbers.
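
A minimal sketch of the intended check, written in Python for illustration only. It is not condor_submit's actual code, and the helper names is_non_negative_integer and check_deferral_params are hypothetical. It treats each deferral_* value as a literal and accepts only a string of decimal digits:

```python
import re

DEFERRAL_KEYS = ("deferral_time", "deferral_window", "deferral_prep_time")

# Only plain decimal digits: no sign, no decimal point, no expression.
_NON_NEGATIVE_INT = re.compile(r"[0-9]+")


def is_non_negative_integer(value: str) -> bool:
    """Return True if value is a literal non-negative integer."""
    return _NON_NEGATIVE_INT.fullmatch(value.strip()) is not None


def check_deferral_params(submit_text: str) -> list:
    """Return (key, value) pairs whose deferral_* value is not a non-negative integer."""
    invalid = []
    for line in submit_text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and commands without "=" such as "queue 1".
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        if key.lower() in DEFERRAL_KEYS and not is_non_negative_integer(value):
            invalid.append((key, value))
    return invalid
```

Under these assumptions, a submit description containing deferral_prep_time = -70 or deferral_time = 1307629710.9 would be reported as invalid, while deferral_window = 300 would pass.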

Version-Release number of selected component (if applicable):
condor-7.6.1-0.10

set:
SCHEDD_INTERVAL = 10

How reproducible:
100%

Steps to Reproduce:
1. submit job with negative deferral_prep_time
e.g.
# job.submit
universe = vanilla
cmd = /bin/sleep
args = 20
deferral_time = 1307629230
deferral_window = 300
deferral_prep_time = -70
queue 1

$ condor_submit job.submit
Submitting job(s).
1 job(s) submitted to cluster 8.

2. see when the job really starts
  
Actual results:
job starts about one minute after deferral_time

Expected results:
job should not be submitted successfully

Another scenario:
1. submit job with decimal deferral_time
e.g.
# job.submit
universe = vanilla
cmd = /bin/sleep
args = 20
deferral_time = 1307629710.9
deferral_window = 300
queue 1

$ condor_submit job.submit
Submitting job(s).
1 job(s) submitted to cluster 9.

2. run condor_q -bet
$ condor_q -bet
-- Submitter: hostname : <IP:38731> : hostname
---
009.000:  Request is held.
Hold reason: Job missed deferred execution time
  
Actual results:
job is moved directly to the hold state even though the deferral time is in the future

Expected results:
job should not be submitted successfully


Additional info:

Comment 1 Timothy St. Clair 2011-09-21 21:03:04 UTC
Updated logic to validate that the constants are non-negative integers only.

It does not evaluate expressions.

Comment 2 Timothy St. Clair 2011-09-23 20:28:55 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
C: Input negative or floating point numbers into deferral_* params during submission.
C: Jobs will be placed on hold.
F: Input check so deferral_* params have to be positive integers only.
R: Only jobs with valid deferral_* params are allowed to run.

Comment 4 Daniel Horák 2011-10-21 08:30:50 UTC
Reproduced on RHEL 5.7 i386: 
# rpm -q condor
  condor-7.6.3-0.3.el5

1st scenario: 
# cat /tmp/bz712111a.job 
  universe = vanilla
  cmd = /bin/sleep
  args = 20
  deferral_time = 1319118800
  deferral_window = 300
  deferral_prep_time = -70
  queue 1

# date +%s
  1319118706
# runuser -s /bin/bash -l condor -c "condor_submit /tmp/bz712111a.job"
  Submitting job(s).
  1 job(s) submitted to cluster 1.

# date +%s
  1319118734
# condor_q
  -- Submitter: HOSTNAME : <IP:45310> : HOSTNAME
   ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD               
     1.0   condor         10/20 15:52   0+00:00:00 I  0   0.0  sleep 20          
  1 jobs; 1 idle, 0 running, 0 held

# date +%s
  1319118872
# condor_q
  -- Submitter: HOSTNAME : <IP:45310> : HOSTNAME
   ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD               
     1.0   condor         10/20 15:52   0+00:00:01 R  0   0.0  sleep 20          
  1 jobs; 0 idle, 1 running, 0 held
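
The timestamps in this scenario can be lined up against deferral_time with a little arithmetic. The sketch below (Python, illustrative only) uses just the numbers from the transcript above. Reading the roughly 70-second delay as the negative deferral_prep_time being applied is an assumption, since the report does not describe the mechanism.

```python
deferral_time = 1319118800      # from /tmp/bz712111a.job
deferral_prep_time = -70        # from /tmp/bz712111a.job
submitted_at = 1319118706       # `date +%s` just before condor_submit
running_seen_at = 1319118872    # `date +%s` just before the second condor_q
run_time_shown = 1              # RUN_TIME 0+00:00:01 in that condor_q output

# date ran slightly before condor_q, so the start time is an approximation.
approx_start = running_seen_at - run_time_shown
delay_after_deferral = approx_start - deferral_time

print(deferral_time - submitted_at)                    # 94
print(delay_after_deferral)                            # 71
print(delay_after_deferral - abs(deferral_prep_time))  # 1
```

The job started about 71 seconds after deferral_time, which agrees with the "about one minute" in the original description and is within a second of the 70 seconds given, with the wrong sign, as deferral_prep_time.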

2nd scenario:
# cat /tmp/bz712111b.job 
  universe = vanilla
  cmd = /bin/sleep
  args = 20
  deferral_time = 1319119000.9
  deferral_window = 300
  queue 1

# runuser -s /bin/bash -l condor -c "condor_submit /tmp/bz712111b.job"
  Submitting job(s).
  1 job(s) submitted to cluster 2.

# condor_q -bet
  -- Submitter: HOSTNAME : <IP:45310> : HOSTNAME
  ---
  002.000:  Request is held.
  Hold reason: Job missed deferred execution time



Verified on RHEL 5.7 i386:
# rpm -q condor
  condor-7.6.4-0.8.el5

# cat bz712111.job 
  universe = vanilla
  cmd = /bin/sleep
  args = 20
  deferral_time = 1319183700
  deferral_window = 300
  deferral_prep_time = -70
  queue 1
# runuser -s /bin/bash -l condor -c "condor_submit /tmp/bz712111.job"
  Submitting job(s)
  ERROR: 'deferral_prep_time'='-70' is invalid, must eval to a non-negative integer.

# cat bz712111.job 
  universe = vanilla
  cmd = /bin/sleep
  args = 20
  deferral_time = 1319183700.8
  deferral_window = 300
  deferral_prep_time = 70
  queue 1
# runuser -s /bin/bash -l condor -c "condor_submit /tmp/bz712111.job"
  Submitting job(s)
  ERROR: 'deferral_time'='1319183700.8' is invalid, must eval to a non-negative integer.

Results for "deferral_time = -1319183700", "deferral_window = -300", "deferral_window = 300.8" and "deferral_prep_time = 70.8" are similar.


Verified on RHEL 5.7 x86_64, RHEL 6.1 i386, RHEL 6.1 x86_64 with similar output.

>>> VERIFIED
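
The invalid cases exercised in the verification above could be generated rather than edited by hand. A rough Python sketch follows; render_job and INVALID_CASES are hypothetical names, the base values are copied from the verification files, and running condor_submit on each generated file is deliberately left out.

```python
BASE = {
    "deferral_time": "1319183700",
    "deferral_window": "300",
    "deferral_prep_time": "70",
}

# One invalid value per case, as listed in the verification comment.
INVALID_CASES = [
    ("deferral_prep_time", "-70"),
    ("deferral_time", "1319183700.8"),
    ("deferral_time", "-1319183700"),
    ("deferral_window", "-300"),
    ("deferral_window", "300.8"),
    ("deferral_prep_time", "70.8"),
]


def render_job(overrides: dict) -> str:
    """Build a submit description with BASE values replaced by overrides."""
    params = {**BASE, **overrides}
    lines = ["universe = vanilla", "cmd = /bin/sleep", "args = 20"]
    lines += [f"{key} = {value}" for key, value in params.items()]
    lines.append("queue 1")
    return "\n".join(lines) + "\n"


# One submit description per invalid case; each should be rejected at submit time.
jobs = [render_job({key: value}) for key, value in INVALID_CASES]
```

Each entry in jobs differs from the valid base description in exactly one deferral_* value, so a rejection can be attributed to that parameter alone.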

Comment 5 Tomas Capek 2011-11-17 12:50:25 UTC
    Technical note updated. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    Diffed Contents:
@@ -1,4 +1 @@
-C: Input negative or floating point numbers into deferral_* params during submission.
+When a negative or floating point number was entered into a deferral_* parameter during submission, jobs placed on hold as a result. With this update, these parameters are checked to contain a positive integer value before submission and now, only jobs with valid deferral_* parameters are allowed to run.
-C: Jobs will be placed on hold.
-F: Input check so deferral_* params have to be positive integers only.
-R: Only jobs with valid deferral_* params are allowed to run.

Comment 6 errata-xmlrpc 2012-01-23 17:27:11 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHEA-2012-0045.html

