Bug 753822 - Make condor_job_server default submission publisher
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: condor-qmf
Version: 2.1
Hardware: All
OS: Linux
Priority: low
Severity: medium
Target Milestone: 2.3
Target Release: ---
Assignee: Pete MacKinnon
QA Contact: Lubos Trilety
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2011-11-14 15:30 UTC by Pete MacKinnon
Modified: 2013-03-06 18:39 UTC (History)

Fixed In Version: condor-7.8.2-0.1
Doc Type: Bug Fix
Doc Text:
Release Note: "The default configuration of the QMF Job Server used by cumin has been changed. The previous default was to use an embedded QMF Job Server object managed within a schedd plug-in component. The default configuration now launches a standalone daemon (condor_job_server) and disables the submission publishing from within the plug-in's embedded Job Server. Grid users using the old default should note that with the new default configuration, job details will now be available for jobs that have been completed or removed from the queue."
Clone Of:
Environment:
Last Closed: 2013-03-06 18:39:42 UTC
Target Upstream Version:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2013:0564 0 normal SHIPPED_LIVE Low: Red Hat Enterprise MRG Grid 2.3 security update 2013-03-06 23:37:09 UTC

Description Pete MacKinnon 2011-11-14 15:30:52 UTC
The condor_job_server should be preferred by users for job submission data and job history (which is otherwise unavailable through the schedd plugin submission publishing). The plugin publisher can only provide full job data for live jobs still in the queue.

Thus, we should ensure that the default QMF condor config activates the JS OOTB.

Comment 1 Pete MacKinnon 2011-12-02 14:54:16 UTC
Customer considerations and impact:

The current OOTB default is provided in the QMF schedd plug-in code as
QMF_PUBLISH_SUBMISSIONS = True
Also, the base configuration does not add the JOB_SERVER to the DAEMON_LIST 
although it is defined.

Recommend the following:

1) Code change within the condor_job_server to halt activation if
QMF_PUBLISH_SUBMISSIONS = True
This will avoid confusion from having two *redundant* job servers active 
in the same QMF object space.

2) Modify 60condor-qmf.config to add/uncomment:
	QMF_PUBLISH_SUBMISSIONS = False
	DAEMON_LIST = $(DAEMON_LIST), JOB_SERVER

3) Release Note
	"The default configuration of the QMF Job Server used by cumin has 
	been changed. The previous default was to use an embedded QMF Job 
	Server object managed within a schedd plug-in component. The default 
	configuration now launches a standalone daemon (condor_job_server) and
	disables the submission publishing from within the plug-in's embedded 
	Job Server. Grid users using the old default should note that with the 
	new default configuration, job details will now be available for jobs 
	that have been completed or removed from the queue."

4) Possibly revise section of MCIG <4.1.1. Job Server Configuration> to note
or emphasize the new default.
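
A minimal sketch of how recommendations 1 and 2 interact, written in Python purely for illustration. It is not the condor_job_server code and not Condor's real configuration parser: the helper names (parse_config, job_server_may_start, daemon_list) and the base DAEMON_LIST value are assumptions, and only a self-reference such as $(DAEMON_LIST) is expanded.

```python
def parse_config(text):
    """Parse 'NAME = value' lines; later assignments override earlier ones.

    Only a self-reference like $(DAEMON_LIST) is expanded, using the value
    assigned so far. This is a simplification of Condor's macro rules.
    """
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, value = (part.strip() for part in line.split("=", 1))
        value = value.replace("$(%s)" % name, values.get(name, ""))
        values[name] = value
    return values


def job_server_may_start(values):
    """Recommendation 1: halt while the plug-in still publishes submissions.

    The plug-in default described above is True, so an unset value
    is treated as True here.
    """
    return values.get("QMF_PUBLISH_SUBMISSIONS", "True").strip().lower() != "true"


def daemon_list(values):
    """Return the configured daemons as a list of names."""
    return [d.strip() for d in values.get("DAEMON_LIST", "").split(",") if d.strip()]


# Assumed base configuration, for illustration only.
OLD_DEFAULT = """
DAEMON_LIST = MASTER, SCHEDD
"""

# The two lines from recommendation 2 appended to the assumed base.
NEW_DEFAULT = OLD_DEFAULT + """
QMF_PUBLISH_SUBMISSIONS = False
DAEMON_LIST = $(DAEMON_LIST), JOB_SERVER
"""

old = parse_config(OLD_DEFAULT)
new = parse_config(NEW_DEFAULT)
print(job_server_may_start(old), daemon_list(old))
print(job_server_may_start(new), daemon_list(new))
```

Under the assumed old default the guard refuses to start the standalone job server, so the two publishers never coexist. With the two lines from recommendation 2 applied, JOB_SERVER is in the daemon list and the guard allows it to start.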

Comment 2 Pete MacKinnon 2011-12-02 17:34:08 UTC
No base-db changes required since there is no feature that specifically activates the current default of QMF_PUBLISH_SUBMISSIONS = True.

Comment 5 Pete MacKinnon 2012-03-14 16:36:54 UTC
Validation techniques:
- 1 jobserver object listed via qpid-tool where only 1 schedd has been deployed
- jobserver QMF object contains a "jobserver" prefix to schedd host, not "scheduler"
- full job ads can be retrieved using QMF from a job that has been committed to history (confirmed using condor_history)

Comment 6 Pete MacKinnon 2012-03-14 17:19:01 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
Release Note:
"The default configuration of the QMF Job Server used by cumin has been changed. The previous default was to use an embedded QMF Job Server object managed within a schedd plug-in component. The default configuration now launches a standalone daemon (condor_job_server) and disables the submission publishing from within the plug-in's embedded Job Server. Grid users using the old default should note that with the new default configuration, job details will now be available for jobs that have been completed or removed from the queue."

Comment 9 Lubos Trilety 2013-01-09 19:21:59 UTC
(In reply to comment #5)
> Validation techniques:
> - 1 jobserver object listed via qpid-tool where only 1 schedd has been
> deployed
> - jobserver QMF object contains a "jobserver" prefix to schedd host, not
> "scheduler"
> - full job ads can be retrieved using QMF from a job that has been committed
> to history (confirmed using condor_history)

Everything seems OK. The one thing I don't understand is what it means that the
jobserver QMF object contains a "jobserver" prefix to schedd host, not "scheduler".
I compared the schema in qpid-tool on the old version with the schema on the new version, and they look the same to me.

Comment 10 Pete MacKinnon 2013-01-16 16:49:10 UTC
Clarification and elaboration of comment #5:

The QMF implementation allows the schedd plugin to act as both a "scheduler" and a "jobserver". In that scenario, its QMF object id should contain the string "scheduler" somewhere. In the opposite scenario, where the standalone condor_job_server publishes the jobserver object, you will not see that string.

%qpid-tool
Management Tool for QPID
qpid: list jobserver
Management Object Types:
    ObjectType       Active  Deleted
    ==================================
    com.redhat.grid:jobserver  1       0
qpid: show com.redhat.grid:jobserver
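
The first validation point from comment #5 (exactly one jobserver object per deployed schedd) could be checked against captured output along these lines. This is only a sketch in Python: it assumes the "list" output has the three-column layout shown in the session above, and active_counts is a hypothetical helper, not part of qpid-tool.

```python
def active_counts(list_output):
    """Map object type -> active count from captured qpid-tool 'list' output."""
    counts = {}
    for line in list_output.splitlines():
        fields = line.split()
        # Data rows look like: <package:class> <active> <deleted>
        if len(fields) == 3 and ":" in fields[0] and fields[1].isdigit():
            counts[fields[0]] = int(fields[1])
    return counts


# Output copied from the session above.
SAMPLE = """\
Management Object Types:
    ObjectType       Active  Deleted
    ==================================
    com.redhat.grid:jobserver  1       0
"""

counts = active_counts(SAMPLE)
deployed_schedds = 1  # assumption: a single schedd, as in the validation note
print(counts.get("com.redhat.grid:jobserver", 0) == deployed_schedds)
```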

Comment 12 Lubos Trilety 2013-01-21 17:04:42 UTC
Tested with:
condor-qmf-7.8.8-0.4

Tested on:
RHEL6 i386,x86_64
RHEL5 i386,x86_64

Verification of the second point: the scheduler name is not present in the output of the show command for the jobserver process.

SCHEDD_NAME = scheduler@$(FULL_HOSTNAME)

schedd plugin:
qpid: show 214
...
    Name                              scheduler@host

jobserver process:
qpid: show 236
...
    Name                              host

>>> verified

Comment 14 errata-xmlrpc 2013-03-06 18:39:42 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2013-0564.html

