Created attachment 489281
Test job_queue.log

Definitely an issue in 7.6.0-0.4, presumed issue in 1.3.2.

Using the attached job_queue.log:

(a) Query the JobServer for the details of 246.0. A failure does not return attributes found only in the 0246.-1 ad, for instance Owner. Success returns all attributes, with the values for duplicate attributes coming from the 246.0 ad, for instance JobPrio = -1 (not 0).

(b) Also look at the Submission, eeyore.local#246. A failure will show Owner = Unknown. Success is Owner = matt.

For (a), the code that iterates over the job ad for details does not iterate over the job's parent ad.

For (b), the issue is eager creation of the Job in JobServerJobLogConsumer. If the parent (cluster) ad has not already been created, a placeholder is made. When that placeholder is populated, the Job is never given a chance to update the Submission it comes from with the proper owner.
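As a rough illustration of the expected result in (a), the lookup can be thought of as the proc ad layered over its parent cluster ad. This is a minimal Python sketch, not the JobServer code: the function name full_job_ad is hypothetical, and the attribute values are assumed from the description above rather than read from the attachment.

```python
from collections import ChainMap

# Illustrative values only; the real ads come from the attached job_queue.log.
cluster_ad = {"Owner": "matt", "JobPrio": 0}  # stands in for the 0246.-1 (cluster) ad
proc_ad = {"JobPrio": -1}                     # stands in for the 246.0 (proc) ad


def full_job_ad(proc_ad, cluster_ad):
    """Return all attributes, with proc ad values overriding cluster ad values."""
    # ChainMap searches its maps left to right, so the proc ad wins on duplicates
    # and attributes found only in the cluster ad are still returned.
    return dict(ChainMap(proc_ad, cluster_ad))


ad = full_job_ad(proc_ad, cluster_ad)
print(ad["Owner"], ad["JobPrio"])
```

With these assumed values the merged ad carries Owner from the cluster ad and JobPrio from the proc ad, which matches the success case in (a).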
aviary: fixed at FH V7_6-aviary-branch 55b2689cd12f0c2c755f1df5952595e6d662c465 qmf: still needs work...
Even better fixes... :-) aviary: FH at 3658ae5 V7_6-aviary-branch qmf: UW at 1083848, 93798c5 V7_6-branch
Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
Cause: Proc entry appears before cluster entry in job queue log.
Consequence: No opportunity to update the internal SubmissionObject once ATTR_OWNER set from log before the submission name.
Fix: Code changes so that the internal Job object updates its associated SubmissionObject with an owner or name regardless of order.
Result: What now happens when the actions or circumstances above occur.
Technical note updated. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

Diffed Contents:
@@ -1,4 +1,4 @@
 Cause: Proc entry appears before cluster entry in job queue log.
 Consequence: No opportunity to update the internal SubmissionObject once ATTR_OWNER set from log before the submission name.
 Fix: Code changes so that the internal Job object updates its associated SubmissionObject with an owner or name regardless of order.
-Result: What now happens when the actions or circumstances above occur.
+Result: Submission name and owner data appear correctly through QMF and Aviary queries.
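The Fix in the technical note can be pictured with a small sketch. This is not the actual patch: the Job and Submission classes, set_attr, and replay below are hypothetical stand-ins, the cluster/proc hierarchy is left out, and log entries are reduced to (attribute, value) pairs. The only point shown is that the owner reaches the submission whichever attribute is seen first.

```python
class Submission:
    """Hypothetical stand-in for the internal SubmissionObject."""

    def __init__(self, name):
        self.name = name
        self.owner = "Unknown"  # default shown in the failure case


class Job:
    """Hypothetical stand-in for the internal Job object."""

    def __init__(self, key):
        self.key = key
        self.attrs = {}
        self.submission = None

    def set_attr(self, attr, value, submissions):
        self.attrs[attr] = value
        if attr == "Submission":
            # Attach to the named submission, creating it on first sight.
            if value not in submissions:
                submissions[value] = Submission(value)
            self.submission = submissions[value]
        # Re-sync after every change, so the order of Owner and
        # Submission in the log does not matter.
        owner = self.attrs.get("Owner")
        if self.submission is not None and owner is not None:
            self.submission.owner = owner


def replay(entries):
    """Apply (attribute, value) pairs to one job; return the submissions seen."""
    submissions = {}
    job = Job("246.0")
    for attr, value in entries:
        job.set_attr(attr, value, submissions)
    return submissions


owner_first = replay([("Owner", "matt"), ("Submission", "eeyore.local#246")])
name_first = replay([("Submission", "eeyore.local#246"), ("Owner", "matt")])
print(owner_first["eeyore.local#246"].owner, name_first["eeyore.local#246"].owner)
```

Both orderings end with the same owner on the submission, which is the behavior the Result line describes.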
Reproduced on RHEL5/x86_64 with:
$CondorVersion: 7.6.0 Mar 30 2011 BuildID: RH-7.6.0-0.4.el5 PRE-RELEASE-GRID $
$CondorPlatform: X86_64-Redhat_5.6 $

Qpid-tool is broken in this version, so the jobserver must be queried from Python code directly:

qpid: call 102 GetJobAd 246.0
qpid: invalid conversion: Variant is not a string; use asString() if conversion is required. (qpid/types/Variant.cpp:569) (7) - {}
qpid: call 102 GetJobAd "246.0"
qpid: Invalid Job Id (65536) - {}

From my script I found the relevant info:
JobPrio = 0
Owner = Unknown
Retested on all supported platforms (x86 and x86_64, RHEL5 and RHEL6) with:
condor-7.6.1-0.4

The job_queue.log was read as expected by the jobserver:

qpid: call 220 GetJobAd "246.0"
.... 'Owner': 'matt' 'JobPrio': -1 ....

>>> VERIFIED
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHEA-2011-0889.html