Bug 706977 - HGQ sanity check on recursive allocation is checking wrong variable
Status: CLOSED ERRATA
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: condor
Version: 2.0
Hardware: Unspecified Unspecified
Priority: medium Severity: low
Target Milestone: 2.0.1
Target Release: ---
Assigned To: Erik Erlandson
QA Contact: Jan Sarenik
Depends On:
Blocks: 723887
Reported: 2011-05-23 12:37 EDT by Erik Erlandson
Modified: 2011-09-07 12:43 EDT

See Also:
Fixed In Version: condor-7.6.2-0.1
Doc Type: Bug Fix
Doc Text:
Cause: Sanity check on group quota surplus allocation was examining wrong variable. Consequence: A spurious warning message could be output when there was remaining surplus, even though allocation was correct. Fix: Update sanity check to examine the correct variable. Result: Spurious warning no longer appears.
Last Closed: 2011-09-07 12:43:04 EDT


Attachments
Simple reproducer/verification script (3.04 KB, application/x-gzip)
2011-07-27 13:53 EDT, Jan Sarenik


External Trackers
Tracker ID Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2011:1249 normal SHIPPED_LIVE Moderate: Red Hat Enterprise MRG Grid 2.0 security, bug fix and enhancement update 2011-09-07 12:40:45 EDT

Description Erik Erlandson 2011-05-23 12:37:02 EDT
Description of problem:

This should be checking fabs(s), not fabs(surplus):

            double s = hgq_allocate_surplus(groups[j], allocated[j]);
            if (fabs(surplus) > 0.00001) {
                dprintf(D_ALWAYS, "group quotas: WARNING: allocate-surplus (3): surplus= %g\n", s);
            }


In cases where there is remaining surplus, this will output the warning message for no good reason.
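The difference between the two checks can be sketched in isolation. This is only an illustration, not Condor code: check_buggy and check_fixed are hypothetical helper names, printf stands in for Condor's dprintf, and each function returns 1 when it would emit the warning. The 0.00001 tolerance and the message text are taken from the snippet above.

```c
#include <assert.h>
#include <math.h>
#include <stdio.h>

/* Tolerance used by the sanity check in the report. */
#define SURPLUS_EPSILON 0.00001

/* Hypothetical stand-in for the check as reported: it tests `surplus`
 * (the amount that was available) instead of `s` (the value returned
 * by the recursive allocation). Returns 1 if a warning is printed. */
int check_buggy(double surplus, double s) {
    if (fabs(surplus) > SURPLUS_EPSILON) {
        printf("group quotas: WARNING: allocate-surplus (3): surplus= %g\n", s);
        return 1;
    }
    return 0;
}

/* Hypothetical stand-in for the corrected check: it tests `s`, the
 * same value that the message reports. */
int check_fixed(double surplus, double s) {
    (void)surplus; /* unused here; kept so both helpers share a signature */
    if (fabs(s) > SURPLUS_EPSILON) {
        printf("group quotas: WARNING: allocate-surplus (3): surplus= %g\n", s);
        return 1;
    }
    return 0;
}
```

With surplus = 3.0 and s = 0.0, check_buggy prints a warning that reports "surplus= 0", while check_fixed stays silent. This matches the symptom described above: a warning whose own reported value is zero.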
Comment 1 Erik Erlandson 2011-06-22 14:51:02 EDT
upstream: https://condor-wiki.cs.wisc.edu/index.cgi/tktview?tn=2262
Comment 2 Erik Erlandson 2011-06-22 14:52:19 EDT
Fix upstream on V7_6-branch
Comment 3 Erik Erlandson 2011-06-22 14:54:23 EDT
repro/test

Use this config:

CLAIM_WORKLIFE = 0
NEGOTIATOR_CONSIDER_PREEMPTION = FALSE
NEGOTIATOR_DEBUG = D_FULLDEBUG

NEGOTIATOR_USE_SLOT_WEIGHTS = FALSE

GROUP_QUOTA_MAX_ALLOCATION_ROUNDS = 1

NEGOTIATOR_INTERVAL = 30
SCHEDD_INTERVAL = 15

NUM_CPUS = 20

GROUP_NAMES = a, b, a.a1
GROUP_QUOTA_a = 10
GROUP_QUOTA_b = 10
GROUP_QUOTA_a.a1 = 5

GROUP_ACCEPT_SURPLUS = TRUE

Submit the following job, which draws on surplus for "a.a1" but does not use all of it:

universe = vanilla
cmd = /bin/sleep
args = 60
should_transfer_files = if_needed
when_to_transfer_output = on_exit
+AccountingGroup="a.a1.user"
queue 7

Before fix: the sanity check looks at the wrong variable and emits a warning message even though the allocation worked correctly:

$ tail -f NegotiatorLog | grep WARNING
06/22/11 11:20:14 group quotas: WARNING: allocate-surplus (3): surplus= 0

After fix: the spurious warning will not appear.
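
For an automated check, the grep above could be approximated by a small helper that counts occurrences of the warning text in captured log output. This is a sketch under assumptions: count_surplus_warnings is a hypothetical name, the log text is assumed to be already loaded into a NUL-terminated string, and the match string is taken from the log line shown above.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: counts occurrences of the allocate-surplus
 * warning in a NUL-terminated buffer of NegotiatorLog text.
 * It counts occurrences of the substring, not distinct lines. */
int count_surplus_warnings(const char *log_text) {
    const char *needle = "WARNING: allocate-surplus";
    size_t step = strlen(needle);
    int count = 0;
    const char *p = log_text;

    while ((p = strstr(p, needle)) != NULL) {
        count++;
        p += step; /* continue searching after this match */
    }
    return count;
}
```

A result of 0 after running the reproducer corresponds to the "after fix" behavior; a nonzero result corresponds to the "before fix" behavior.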
Comment 4 Erik Erlandson 2011-06-22 14:54:23 EDT
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
Cause:
Sanity check on group quota surplus allocation was examining wrong variable.

Consequence:
A spurious warning message could be output when there was remaining surplus, even though allocation was correct.

Fix:
Update sanity check to examine the correct variable.

Result:
Spurious warning no longer appears.
Comment 6 Jan Sarenik 2011-07-27 11:20:39 EDT
Reproduced with condor-7.6.1-0.10.el5.x86_64 (MRG 2.0)

group quotas: WARNING: allocate-surplus (3): surplus= 0
Comment 7 Jan Sarenik 2011-07-27 13:50:38 EDT
Verified and reproduced with
  condor-7.6.3-0.2.el5.i386
  condor-7.6.3-0.2.el5.x86_64
  condor-7.6.3-0.2.el6.i686
  condor-7.6.3-0.2.el6.x86_64

Verification: no WARNING after running reproducer
Comment 8 Jan Sarenik 2011-07-27 13:53:08 EDT
Created attachment 515572 [details]
Simple reproducer/verification script

To reproduce, run
 include/clean.sh; MRG_REPO=stable ./runtest.sh

To verify, run
 include/clean.sh; unset MRG_REPO; ./runtest.sh
Comment 9 errata-xmlrpc 2011-09-07 12:43:04 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-1249.html
