Bug 476779

Summary: Job Router doesn't recognize routed job complete
Product: Red Hat Enterprise MRG
Reporter: Robert Rati <rrati>
Component: grid
Assignee: Matthew Farrellee <matt>
Status: CLOSED ERRATA
QA Contact: Jeff Needle <jneedle>
Severity: urgent
Priority: urgent
Version: 1.0
CC: matt
Target Milestone: 1.1   
Target Release: ---   
Hardware: All   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2009-02-04 16:04:15 UTC

Description Robert Rati 2008-12-17 02:51:25 UTC
Description of problem:
When a cluster of 10 jobs is submitted, the jobs all run to completion (the routed jobs shut down), but the job router never recognizes the routed jobs as completed, so the source jobs never complete or go through the finalization process.

Version-Release number of selected component (if applicable):
7.2.0-0.13
hooks-1.0-8


Comment 1 Matthew Farrellee 2008-12-18 18:02:22 UTC
Fix will be present after 7.2.0-0.13

commit 915980142bb380619c6a7677f44b04f3277058b7
Author: Matthew Farrellee <matt>
Date:   Thu Dec 18 11:19:30 2008 -0600

    Fixed a dirty write bug in the JobRouter's Status Hook
    
    The JobRouter would get a copy of an ad from the ClassAdCollection and
    spawn the hook. While the hook was executing the JobRouter would
    update the ad the hook was working on. When the hook returned the
    stale ad would be used for updates. The result was stale data would
    overwrite current data.
    
    The fix is to sync the ad with the ClassAdCollection when the hook
    returns.
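
The dirty write described in the commit message can be sketched in a few lines. This is only an illustration under stated assumptions, not the JobRouter's actual code: `AdCollection`, `finish_hook_buggy`, `finish_hook_fixed`, `simulate`, and the attribute names `JobStatus` and `HookSeen` are hypothetical stand-ins, and the report does not say which attribute was overwritten.

```python
import copy


class AdCollection:
    """Hypothetical stand-in for the ClassAdCollection: job id -> attribute dict."""

    def __init__(self):
        self._ads = {}

    def put(self, job_id, ad):
        # Store a private copy so callers cannot mutate the stored ad directly.
        self._ads[job_id] = dict(ad)

    def get_copy(self, job_id):
        # Callers always receive a copy, as the commit message describes.
        return copy.deepcopy(self._ads[job_id])

    def update(self, job_id, attrs):
        self._ads[job_id].update(attrs)


def finish_hook_buggy(collection, job_id, stale_ad, hook_attrs):
    # Bug: the copy taken before the hook ran is written back wholesale,
    # so any update made while the hook was executing is lost.
    stale_ad.update(hook_attrs)
    collection.put(job_id, stale_ad)


def finish_hook_fixed(collection, job_id, stale_ad, hook_attrs):
    # Fix: sync with the collection when the hook returns, then apply
    # only the hook's own output on top of the current ad.
    fresh_ad = collection.get_copy(job_id)
    fresh_ad.update(hook_attrs)
    collection.put(job_id, fresh_ad)


def simulate(finish_hook):
    collection = AdCollection()
    collection.put("1.0", {"JobStatus": "RUNNING"})

    # The router takes a copy of the ad and spawns the status hook with it.
    snapshot = collection.get_copy("1.0")

    # While the hook is executing, the router updates the same ad.
    collection.update("1.0", {"JobStatus": "COMPLETED"})

    # The hook returns; its output is merged back.
    finish_hook(collection, "1.0", snapshot, {"HookSeen": True})
    return collection.get_copy("1.0")


if __name__ == "__main__":
    print("buggy:", simulate(finish_hook_buggy))
    print("fixed:", simulate(finish_hook_fixed))
```

With the buggy merge the completed status is overwritten by the stale copy, which matches the symptom in the description: the routed job finishes, but the router never sees it as complete. With the fixed merge the status recorded during the hook's run survives.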

Comment 4 errata-xmlrpc 2009-02-04 16:04:15 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2009-0036.html