Bug 546736 - Schedd performs unnecessary file operations on SPOOL, targeting mpp.X.Y files
Status: CLOSED ERRATA
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: condor
Version: 1.2
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: 1.3
Assigned To: Matthew Farrellee
QA Contact: Luigi Toscano
Reported: 2009-12-11 15:11 EST by Matthew Farrellee
Modified: 2010-10-14 12:12 EDT
Doc Type: Bug Fix
Doc Text:
Previously, the condor_schedd daemon attempted to access a SPOOL/mpp.ClusterId.ProcId file for every job when the job left the queue. With this update, the mpp file is accessed only if it is specifically used.
Last Closed: 2010-10-14 12:12:51 EDT


External Trackers
Tracker: Red Hat Product Errata
ID: RHSA-2010:0773
Priority: normal
Status: SHIPPED_LIVE
Summary: Moderate: Red Hat Enterprise MRG Messaging and Grid Version 1.3
Last Updated: 2010-10-14 11:56:44 EDT

Description Matthew Farrellee 2009-12-11 15:11:01 EST
All relevant versions, including 7.4.1-0.7

The condor_schedd will attempt to access a SPOOL/mpp.ClusterId.ProcId file for every job when the job leaves the queue, either by rm or successful completion. The rm example is given below. There should only be traffic on the mpp file if it is specifically used, e.g. by providing -password to condor_submit or specifying +MyProxyPassword in a submit file.

Actual output:

$ echo "cmd=/bin/sleep\nargs=1h\nnotification=never\nqueue 3" | condor_submit
Submitting job(s)...
3 job(s) submitted to cluster 7.
$ condor_rm 7
Cluster 7 has been marked for removal.
$ echo "cmd=/bin/sleep\nargs=1h\nnotification=never\nqueue 3" | condor_submit -password downwithmpp
Submitting job(s)...
3 job(s) submitted to cluster 8.
$ condor_rm 8
Cluster 8 has been marked for removal.

# strace -p $(pidof condor_schedd) 2>&1 | grep spool | grep mpp
stat64("/var/lib/condor/spool/mpp.7.0", 0xbfbbd664) = -1 ENOENT (No such file or directory)
stat64("/var/lib/condor/spool/mpp.7.1", 0xbfbbd664) = -1 ENOENT (No such file or directory)
stat64("/var/lib/condor/spool/mpp.7.2", 0xbfbbd664) = -1 ENOENT (No such file or directory)
stat64("/var/lib/condor/spool/mpp.8.0", 0xbfbbd2c4) = -1 ENOENT (No such file or directory)
open("/var/lib/condor/spool/mpp.8.0", O_WRONLY|O_CREAT|O_EXCL|O_LARGEFILE, 0600) = 14
stat64("/var/lib/condor/spool/mpp.8.2", 0xbfbbd664) = -1 ENOENT (No such file or directory)
stat64("/var/lib/condor/spool/mpp.8.0", {st_mode=S_IFREG|0600, st_size=11, ...}) = 0
unlink("/var/lib/condor/spool/mpp.8.0") = 0


Expected output:

$ echo "cmd=/bin/sleep\nargs=1h\nshould_transfer_files=no\nnotification=never\nqueue 3" | condor_submit
Submitting job(s)...
3 job(s) submitted to cluster 9.
$ condor_rm 9
Cluster 9 has been marked for removal.
$ echo "cmd=/bin/sleep\nargs=1h\nshould_transfer_files=no\nnotification=never\nqueue 3" | condor_submit -password downwithmpp
Submitting job(s)...
3 job(s) submitted to cluster 10.
$ condor_rm 10
Cluster 10 has been marked for removal.

# strace -p $(pidof condor_schedd) 2>&1 | grep spool | grep mpp
stat64("/var/lib/condor/spool/mpp.10.0", 0xbff8a494) = -1 ENOENT (No such file or directory)
open("/var/lib/condor/spool/mpp.10.0", O_WRONLY|O_CREAT|O_EXCL|O_LARGEFILE, 0600) = 14
stat64("/var/lib/condor/spool/mpp.10.0", {st_mode=S_IFREG|0600, st_size=11, ...}) = 0
unlink("/var/lib/condor/spool/mpp.10.0") = 0
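
To compare the two traces without reading them line by line, the strace output can be grouped by cluster. The following is only a rough sketch: the function name and the regular expression are illustrative assumptions, and it expects lines in the exact shape shown above (no PID prefix).

```python
import re
from collections import defaultdict

# Captures the syscall name and the ClusterId.ProcId suffix of an mpp path,
# e.g. stat64("/var/lib/condor/spool/mpp.7.0", ...)
MPP_CALL = re.compile(r'^(\w+)\(".*/mpp\.(\d+)\.(\d+)"')

def mpp_calls_by_cluster(strace_lines):
    """Group the syscalls that touch mpp files by cluster id."""
    calls = defaultdict(list)
    for line in strace_lines:
        match = MPP_CALL.match(line.strip())
        if match:
            syscall, cluster, _proc = match.groups()
            calls[int(cluster)].append(syscall)
    return dict(calls)

# Three lines taken from the "Actual output" trace.
sample = [
    'stat64("/var/lib/condor/spool/mpp.7.0", 0xbfbbd664) = -1 ENOENT (No such file or directory)',
    'open("/var/lib/condor/spool/mpp.8.0", O_WRONLY|O_CREAT|O_EXCL|O_LARGEFILE, 0600) = 14',
    'unlink("/var/lib/condor/spool/mpp.8.0") = 0',
]
summary = mpp_calls_by_cluster(sample)
print(summary)
```

A cluster submitted without -password should not appear in the summary at all once the fix is in place.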
Comment 1 Matthew Farrellee 2009-12-15 15:55:56 EST
Initial change for this is to have the Schedd set attribute MyProxyPasswordExists to TRUE whenever the MyProxyPassword attribute is encountered and written to the mpp file. The deletion of the mpp file is guarded by a test on MyProxyPasswordExists.
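
The guard described here can be illustrated roughly as follows. This is not the Schedd code: the job ad is modeled as a plain dictionary, the spool is a temporary directory, and the function names store_password and remove_job are made up for the illustration.

```python
import os
import tempfile

def mpp_path(spool, job):
    """Return SPOOL/mpp.ClusterId.ProcId for a job ad (a dict in this sketch)."""
    return os.path.join(spool, "mpp.%d.%d" % (job["ClusterId"], job["ProcId"]))

def store_password(spool, job, password):
    """Write the mpp file and record its existence in the job ad."""
    # O_EXCL mirrors the open() flags visible in the strace output.
    fd = os.open(mpp_path(spool, job), os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(password)
    job["MyProxyPasswordExists"] = True

def remove_job(spool, job):
    """Touch the mpp file only if the job ad says one was written.

    Returns True when the spool was accessed, False otherwise.
    """
    if not job.get("MyProxyPasswordExists", False):
        return False  # no stat, no unlink
    path = mpp_path(spool, job)
    if os.path.exists(path):
        os.unlink(path)
    return True

with tempfile.TemporaryDirectory() as spool:
    plain = {"ClusterId": 9, "ProcId": 0}
    secret = {"ClusterId": 10, "ProcId": 0}
    store_password(spool, secret, "downwithmpp")
    touched_plain = remove_job(spool, plain)
    touched_secret = remove_job(spool, secret)
    leftover = os.listdir(spool)
```

Because a missing attribute is treated as "no file", a job that never carried a password causes no spool traffic when it leaves the queue.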

Upgrade impact: Any existing job with an mpp file will have it left in the filesystem when the job leaves the queue. The mpp files are not protected from PREEN, so they will not be leaked forever. To avoid this problem, after upgrade condor_qedit can be run to set MyProxyPasswordExists on all jobs. It will result in extra attempts to stat the mpp file for jobs that do not have one, which is no worse than the current situation.
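
One possible way to script the condor_qedit step is sketched below. It assumes the -constraint form of condor_qedit is available in the installed version and that a constraint of TRUE selects every job; the wrapper function mark_all_jobs is hypothetical and only builds the command unless dry_run is disabled.

```python
import shlex
import subprocess

def mark_all_jobs(dry_run=True):
    """Build (and optionally run) a condor_qedit call for every queued job."""
    command = [
        "condor_qedit",
        "-constraint", "TRUE",            # assumed to match all jobs
        "MyProxyPasswordExists", "TRUE",  # attribute name and new value
    ]
    if not dry_run:
        # Requires condor_qedit on PATH and permission to edit the jobs.
        subprocess.run(command, check=True)
    return command

command = mark_all_jobs()
print(" ".join(shlex.quote(arg) for arg in command))
```

Running it with dry_run left at its default only prints the command, which makes it easy to review before touching the queue.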
Comment 2 Matthew Farrellee 2009-12-15 15:58:12 EST
It is not desirable to entirely avoid the upgrade issue by having the absence of MyProxyPasswordExists mean the file may exist. Doing so means that condor_submit (and all submitters) must set the attribute to avoid the unnecessary file operations on SPOOL.
Comment 3 Matthew Farrellee 2009-12-15 16:00:45 EST
Upstream ticket: http://condor-wiki.cs.wisc.edu/index.cgi/tktview?tn=1061
Comment 4 Matthew Farrellee 2010-01-04 13:19:01 EST
Fixed in 7.4.2-0.1
Comment 5 Luigi Toscano 2010-06-25 13:52:30 EDT
strace still shows calls to stat() when no password is specified.
condor-7.4.3-0.21, RHEL4.8/5.5, i386/x86_64.
Comment 6 Matthew Farrellee 2010-06-28 09:26:43 EDT
UPSTREAM-7.5.1-BZ546736-spool-mpp-files was mislabeled as 7.4.1 and never made it into the build. Definitely included in 7.4.4-0.3.
Comment 7 Luigi Toscano 2010-07-27 13:31:04 EDT
Access to mpp* files happens only when a password is specified.

Verified on condor-7.4.4-0.4, RHEL4.8/5.5, i386/x86_64.
Comment 8 Florian Nadge 2010-10-07 13:44:13 EDT
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
Previously, the condor_scheduler daemon attempted to access a SPOOL/mpp.ClusterId.ProcId file for every job when the job left the queue. With this update, traffic takes place only on the mpp file if it is specifically used, by providing -password to condor_submit or specifying +MyProxyPassword in a submit file.
Comment 9 Florian Nadge 2010-10-07 13:44:35 EDT
    Technical note updated. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    Diffed Contents:
@@ -1 +1 @@
-Previously, the condor_scheduler daemon attempted to access a SPOOL/mpp.ClusterId.ProcId file for every job when the job left the queue. With this update, traffic takes place only on the mpp file if it is specifically used, by providing -password to condor_submit or specifying +MyProxyPassword in a submit file.
+Previously, the condor_scheduler daemon attempted to access a SPOOL/mpp.ClusterId.ProcId file for every job when the job left the queue. With this update, traffic takes place only on the mpp file if it is specifically used.
Comment 11 errata-xmlrpc 2010-10-14 12:12:51 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2010-0773.html
