Bug 568502

Summary: Collector should advertise itself immediately
Product: Red Hat Enterprise MRG
Reporter: Robert Rati <rrati>
Component: condor
Assignee: Robert Rati <rrati>
Status: CLOSED ERRATA
QA Contact: Lubos Trilety <ltrilety>
Severity: medium
Priority: low
Version: 1.2
CC: iboverma, ltrilety
Target Milestone: 1.3
Target Release: ---
Hardware: All
OS: Linux
Doc Type: Bug Fix
Doc Text:
The collector did not advertise itself until it had been running for the number of seconds specified in the 'COLLECTOR_UPDATE_INTERVAL' variable. With this update, the collector advertises itself immediately on startup and then every 'COLLECTOR_UPDATE_INTERVAL' seconds.
Last Closed: 2010-10-14 15:58:23 UTC

Description Robert Rati 2010-02-25 20:51:06 UTC
Description of problem:
The collector doesn't advertise itself until it has been running for COLLECTOR_UPDATE_INTERVAL seconds.  This is contrary to other daemons, which advertise themselves to the collector as soon as they start.  The collector should advertise itself immediately on startup, then every COLLECTOR_UPDATE_INTERVAL seconds.
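
To make the difference concrete, here is a minimal sketch of the two schedules. The helper `advertiseTimes` and its parameters are hypothetical and are not taken from the Condor source; it only lists the times, in seconds since startup, at which an advertisement would go out.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper, not part of Condor: returns the times (seconds since
// startup) at which a daemon would advertise itself, up to and including
// `horizon`. With `immediate` set, the first ad goes out at t = 0 (the
// requested behavior); otherwise the first ad waits one full interval
// (the reported behavior).
std::vector<int> advertiseTimes(int interval, int horizon, bool immediate) {
    assert(interval > 0);
    std::vector<int> times;
    for (int t = immediate ? 0 : interval; t <= horizon; t += interval) {
        times.push_back(t);
    }
    return times;
}
```

With an illustrative interval of 900 seconds, the reported behavior leaves the collector invisible for the first 900 seconds after startup, whereas the requested behavior produces an advertisement at time 0.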

Comment 1 Robert Rati 2010-04-16 15:16:34 UTC
The issue is that the collector won't advertise itself when it doesn't have any startds in its hashtable.  The comments in the code seem to indicate there's an issue with people running collectors on every node.  The offending code in collector.cpp:

    // compute machine information
    machinesTotal = 0;
    machinesUnclaimed = 0;
    machinesClaimed = 0;
    machinesOwner = 0;
    ustatsAccum.Reset( );
    if (!collector.walkHashTable (STARTD_AD, reportMiniStartdScanFunc)) {
        dprintf (D_ALWAYS, "Error making collector ad (startd scan) \n");
    }

    // If we don't have any machines, then bail out. You oftentimes
    // see people run a collector on each machine in their pool. Duh.
    if(machinesTotal == 0) {
        return 1;
    }

Comment 2 Robert Rati 2010-06-08 18:25:50 UTC
Moved the check for machinesTotal to after the point where the collector has registered with local collectors.  This allows collectors to register locally, but not with the UW pool, when they have no startds.

Fixed in next build of condor.
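
The reordering described in this comment can be sketched roughly as follows. This is an illustration only: `CollectorSketch`, `advertiseBefore`, `advertiseAfter`, and the `"local"`/`"UW"` destination labels are hypothetical names and do not appear in collector.cpp. Only the position of the `machinesTotal == 0` check relative to the local registration is taken from the comment.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the collector's state; names are illustrative
// and are not taken from collector.cpp.
struct CollectorSketch {
    int machinesTotal = 0;            // number of startd ads currently known
    std::vector<std::string> sentTo;  // destinations that received our ad

    // Before the fix: the empty-pool check comes first, so a collector
    // with no startds returns without advertising anywhere.
    int advertiseBefore() {
        assert(machinesTotal >= 0);
        if (machinesTotal == 0) {
            return 1;
        }
        sentTo.push_back("local");
        sentTo.push_back("UW");
        return 0;
    }

    // After the fix: register with local collectors first, then apply the
    // check, so only the report to the UW pool is skipped.
    int advertiseAfter() {
        assert(machinesTotal >= 0);
        sentTo.push_back("local");
        if (machinesTotal == 0) {
            return 1;
        }
        sentTo.push_back("UW");
        return 0;
    }
};
```

In this sketch a collector with no startds still returns 1 after the fix, but its ad has already reached the local collectors by that point.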

Comment 3 Lubos Trilety 2010-08-31 09:49:55 UTC
Tested with (version):
condor-7.4.4-0.9

Tested on:
RHEL5 i386,x86_64  - passed
RHEL4 i386,x86_64  - passed

>>> VERIFIED

Comment 4 Martin Prpič 2010-10-07 14:45:40 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
The collector did not advertise itself until it had been running for the number of seconds specified in the 'COLLECTOR_UPDATE_INTERVAL' variable. With this update, the collector advertises itself immediately on startup and then every 'COLLECTOR_UPDATE_INTERVAL' seconds.

Comment 6 errata-xmlrpc 2010-10-14 15:58:23 UTC
An advisory has been issued which should resolve the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2010-0773.html