Bug 700784 - condor_q cannot get classads from scheduler
Summary: condor_q cannot get classads from scheduler
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: condor
Version: Development
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: 2.0
: ---
Assignee: Erik Erlandson
QA Contact: MRG Quality Engineering
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2011-04-29 11:41 UTC by Martin Kudlej
Modified: 2011-05-06 06:58 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-05-06 06:58:58 UTC
Target Upstream Version:
Embargoed:


Attachments
condor config, condor log files, condor_q log file (207.33 KB, application/x-gzip)
2011-04-29 11:41 UTC, Martin Kudlej

Description Martin Kudlej 2011-04-29 11:41:42 UTC
Created attachment 495760
condor config, condor log files, condor_q log file

Description of problem:
For some jobs, "condor_q -l" fails to return the job's classads.

Version-Release number of selected component (if applicable):
condor-7.6.1-0.4.el6.i686

How reproducible:
100%

Steps to Reproduce:
1. install condor pool with aviary, QMF, Dynamic slots
2. submit simple job via QMF
3. condor_q -l _clusterid_
  
Actual results:
It is not possible to get "long" information with "condor_q -l".

Expected results:
It should be possible to get "long" information with "condor_q -l".

Simple job:
  'universe' : 'vanilla',                                                     
  'executable' : '/bin/sleep',                                                
  'arguments' : '1',                                                          
  'error' : '/tmp/mrg_$(Cluster).$(Process).err',                             
  'output' : '/tmp/mrg_$(Cluster).$(Process).out',                            
  'log' : '/tmp/mrg_$(Cluster).$(Process).log',                               
  'iwd' : '/tmp',                                                             
  'requirements' : '(FileSystemDomain =!= UNDEFINED && Arch =!= UNDEFINED)',  
  'queue': '1',

Comment 1 Matthew Farrellee 2011-04-29 13:35:49 UTC
What is the frequency of this?

Comment 4 Matthew Farrellee 2011-04-29 13:55:10 UTC
I suspect an issue with evaluating the Error attribute, possibly a reserved word in ClassAds.

Side note - Output includes $(Cluster).$(Process); those should be evaluated during submission. The job has neither a Cluster nor a Process attribute.

FYI, the stderr file attribute is Err.

Comment 5 Martin Kudlej 2011-05-02 14:10:53 UTC
I submitted it via aviary/qmf.
This is the simple QMF job:
ad = {"cmd": "/bin/sleep",
    "args":  "1",
    "requirements": "(FileSystemDomain =!= UNDEFINED && Arch =!= UNDEFINED)",
    "iwd":  "/tmp",
    "owner":  "condor",
    'error' : '/tmp/mrg_$(Cluster).$(Process).err',
    'output' : '/tmp/mrg_$(Cluster).$(Process).out',
    'log' : '/tmp/mrg_$(Cluster).$(Process).log',
    "!!descriptors":  {"requirements": "com.redhat.grid.Expression"}
    }                                                                 

Is there a list of reserved condor classad words that should not be submitted because they can raise an error?
How can I process variables like $(Cluster) via QMF and Aviary? How does condor_submit process them?

Comment 6 Erik Erlandson 2011-05-03 23:19:49 UTC
(In reply to comment #4)
> I suspect an issue with evaluating the Error attribute, possibly a reserved
> word in ClassAds.
> 
> FYI, the stderr file attribute is Err.

In support of that theory, condor_submit appears to forbid including 'error' as a variable:

$ cjs -n 1 -dur 600 -a '+error = "/tmp/error.txt"'
using temp dir /tmp/cjs_jsub_ReyEv2 for jsub files
preparing 1 jobs in submission file: /tmp/cjs_jsub_ReyEv2/cjs.jsub
submitting 1 jobs via jsub file /tmp/cjs_jsub_ReyEv2/cjs.jsub
Submitting job(s)
ERROR: Parse error in expression: 
	error = "/tmp/error.txt"
	^^^
Error in submit file
WARNING! submit failed with code 1


It works OK if you use something not exactly 'error':

[eje@rorschach hfs_func_tests]$ cjs -n 1 -dur 600 -a '+erro = "/tmp/error.txt"'
using temp dir /tmp/cjs_jsub_QNJXVR for jsub files
preparing 1 jobs in submission file: /tmp/cjs_jsub_QNJXVR/cjs.jsub
submitting 1 jobs via jsub file /tmp/cjs_jsub_QNJXVR/cjs.jsub
Submitting job(s).
1 job(s) submitted to cluster 15.
submit was successful


I assume the QMF/aviary connection is that they somehow bypass this check on submission, so the problem goes unnoticed until 'condor_q -l' hits it.

Comment 7 Erik Erlandson 2011-05-03 23:27:03 UTC
(In reply to comment #5)
                                       

> How can I process variables like $(Cluster) via QMF and Aviary? How does
> condor_submit process them?


I was under the impression that $(cluster) works correctly.  There is also a '$$()' syntax that evaluates at execution time:

See http://www.cs.wisc.edu/condor/manual/v7.6/2_5Submitting_Job.html#SECTION00356100000000000000

A special-purpose Machine Ad substitution macro can be used in string attributes in the submit description file. The macro has the form

  $$(MachineAdAttribute)

The $$() informs Condor to substitute the requested MachineAdAttribute from the machine where the job will be executed.

Comment 8 Erik Erlandson 2011-05-03 23:29:26 UTC
(In reply to comment #5)

> Is there a list of reserved condor classad words that should not be submitted
> because they can raise an error?

I notice that the condor doc appears to be wrong:

http://www.cs.wisc.edu/condor/manual/v7.6/2_5Submitting_Job.html#SECTION00351200000000000000

  ####################     
  #                       
  # Example 2: demonstrate use of multiple     
  # directories for data organization.      
  #                                        
  ####################                    
                                         
  Executable = mathematica          
  Universe   = vanilla                   
  input      = test.data                
  output     = loop.out                
  error      = loop.error             
  Log        = loop.log                                                    
                                  
  Initialdir = run_1         
  Queue                         
                               
  Initialdir = run_2      
  Queue

Comment 9 Matthew Farrellee 2011-05-04 00:42:05 UTC
FYI

$ echo 'cmd=/bin/sleep\nargs=1d\nqueue' | condor_submit 
Submitting job(s).
1 job(s) submitted to cluster 14.
$ condor_q -l | grep ClusterId
ClusterId = 14
$ condor_qedit 14.0 error 1  
Set attribute "error".
$ condor_q -l | grep ClusterId

-- Failed to fetch ads from: <127.0.0.1:54561> : eeyore.local

Comment 10 Luigi Toscano 2011-05-04 08:05:00 UTC
(In reply to comment #8)
> (In reply to comment #5)
> 
> > Is there a list of reserved condor classad words that should not be submitted
> > because they can raise an error?
> 
> I notice that the condor doc appears to be wrong:
> 
> http://www.cs.wisc.edu/condor/manual/v7.6/2_5Submitting_Job.html#SECTION00351200000000000000
> 

It is not: you can specify error and output in the job submission file; they are remapped in the resulting classad:

....
output=/tmp/foobar.out
error=/tmp/foobar.err
TransferOutputRemaps = "_condor_stdout=/tmp/foobar.out;_condor_stderr=/tmp/foobar.err"

Comment 11 Erik Erlandson 2011-05-04 22:26:39 UTC
FYI, list of reserved words for classads according to classad 2.4 doc:

The following words are reserved, meaning that they may not be used as attribute names.

    error false is isnt parent true undefined

Recognition of reserved words is independent of case. For example, false, FALSE, and False are all reserved words. 

http://www.cs.wisc.edu/condor/classad/refman/node3.html#SECTION00031000000000000000
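A minimal sketch of how a submitting client could check an ad for these words before sending it. The helper name `find_reserved_attribute_names` is hypothetical and is not part of QMF, Aviary, or Condor; the word list is the one quoted from the ClassAd reference above, and the comparison is case-insensitive as that reference describes.

```python
# Reserved words as listed in the ClassAd reference quoted above.
CLASSAD_RESERVED_WORDS = frozenset(
    ["error", "false", "is", "isnt", "parent", "true", "undefined"]
)


def find_reserved_attribute_names(ad):
    """Return the attribute names in `ad` that collide with a reserved word.

    Hypothetical client-side helper; `ad` is assumed to be a dict keyed by
    attribute-name strings. Matching ignores case, so 'Error' and 'ERROR'
    are reported as well as 'error'.
    """
    return sorted(name for name in ad if name.lower() in CLASSAD_RESERVED_WORDS)


if __name__ == "__main__":
    job_ad = {
        "cmd": "/bin/sleep",
        "args": "1",
        "error": "/tmp/job.err",
        "output": "/tmp/job.out",
    }
    print(find_reserved_attribute_names(job_ad))
```

Run against the QMF ad from comment 5, this kind of check would flag only the 'error' key.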

Comment 13 Erik Erlandson 2011-05-05 20:21:14 UTC
I've confirmed with Pete that neither qmf nor aviary filters 'error' in the same way that condor_submit does.

So this construction should use 'err' instead of 'error':

ad = {"cmd": "/bin/sleep",
    "args":  "1",
    "requirements": "(FileSystemDomain =!= UNDEFINED && Arch =!= UNDEFINED)",
    "iwd":  "/tmp",
    "owner":  "condor",
    'error' : '/tmp/mrg_$(Cluster).$(Process).err',
    'output' : '/tmp/mrg_$(Cluster).$(Process).out',
    'log' : '/tmp/mrg_$(Cluster).$(Process).log',
    "!!descriptors":  {"requirements": "com.redhat.grid.Expression"}
    }
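
For comparison, a sketch of the same ad with only that one key renamed. It assumes the QMF/Aviary submission path accepts 'err' as written, which this report does not show being tested, and it leaves the $(Cluster) and $(Process) macros exactly as they were in the original ad.

```python
# Same job ad as above; the only change is 'error' -> 'err',
# so the ad no longer uses the reserved ClassAd word as an attribute name.
ad = {"cmd": "/bin/sleep",
    "args":  "1",
    "requirements": "(FileSystemDomain =!= UNDEFINED && Arch =!= UNDEFINED)",
    "iwd":  "/tmp",
    "owner":  "condor",
    'err' : '/tmp/mrg_$(Cluster).$(Process).err',
    'output' : '/tmp/mrg_$(Cluster).$(Process).out',
    'log' : '/tmp/mrg_$(Cluster).$(Process).log',
    "!!descriptors":  {"requirements": "com.redhat.grid.Expression"}
    }
```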


I posted an RFE to have aviary catch the use of classad keywords as attribute names:
Bug 702489

Pete also mentioned that $() and $$() aren't currently handled, which also has an RFE:
Bug 702492
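
Until that is handled, a client would have to expand such macros itself. The following is only an illustrative sketch of the substitution, not how condor_submit or Aviary implements it: the function name `expand_submit_macros` is hypothetical, it assumes the cluster and process IDs are already known to the caller, and it assumes macro names are matched without regard to case. The $$() form is deliberately left untouched, since it is meant to be resolved at execution time.

```python
import re

# Matches $(Name) but not $$(Name): the lookbehind rejects a preceding '$'.
_SUBMIT_MACRO = re.compile(r"(?<!\$)\$\((\w+)\)")


def expand_submit_macros(value, cluster, process):
    """Replace $(Cluster) and $(Process) in `value` with the given IDs.

    Hypothetical helper for illustration. Unknown $(...) macros and all
    $$(...) macros are returned unchanged.
    """
    known = {"cluster": str(cluster), "process": str(process)}

    def substitute(match):
        # Fall back to the original text when the macro name is not known.
        return known.get(match.group(1).lower(), match.group(0))

    return _SUBMIT_MACRO.sub(substitute, value)


if __name__ == "__main__":
    print(expand_submit_macros("/tmp/mrg_$(Cluster).$(Process).err", 14, 0))
```

The cluster ID is normally assigned at submission, so a helper like this is only usable where the caller can learn the ID before the file paths are set.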

Comment 15 Martin Kudlej 2011-05-06 06:58:58 UTC
Because of comment #13, I am closing this bug as NOTABUG.

