Some customers would like a way to preserve the pre-HGQ legacy behavior, in which submitters in groups with quotas negotiate first (up to their quota) and all other, non-group submitters negotiate afterwards. Optionally, when "autoregroup" is enabled, submitters with accounting groups are also included in that second round.
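The ordering described above can be illustrated with a small toy model. This is only a sketch of the requested semantics, not HTCondor code: the function name `simulate_cycle` and its arguments are hypothetical, and it deliberately ignores GROUP_ACCEPT_SURPLUS, user priorities, and submitters that have no accounting group.

```python
def simulate_cycle(total_slots, quotas, requests, autoregroup):
    """Toy model of one negotiation cycle (illustrative only).

    quotas:   {group: static quota in slots}
    requests: {group: number of idle jobs submitted under that group}
    Returns {group: [matched under own group, matched under "<none>"]}.
    """
    free = total_slots
    matched = {}
    # Phase 1: each group with a quota negotiates first, capped at its quota.
    for group, quota in quotas.items():
        n = min(requests.get(group, 0), quota, free)
        matched[group] = [n, 0]
        free -= n
    # Phase 2 (autoregroup only): leftover jobs from group submitters
    # negotiate again under the root group "<none>", which goes last.
    if autoregroup:
        for group in quotas:
            leftover = requests.get(group, 0) - matched[group][0]
            n = min(leftover, free)
            matched[group][1] = n
            free -= n
    return matched


result = simulate_cycle(20, {"a": 5, "b": 5}, {"a": 10, "b": 10}, autoregroup=True)
print(result)  # {'a': [5, 5], 'b': [5, 5]}
```

With 20 slots, quotas of 5 for each group, and 10 jobs per group, each group matches 5 jobs under its own quota and the remaining 5 under "<none>".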
TESTING

Begin with the following configuration, which enables accept-surplus and does not enable autoregroup:

    NEGOTIATOR_DEBUG = D_FULLDEBUG
    NEGOTIATOR_USE_SLOT_WEIGHTS = FALSE
    NEGOTIATOR_INTERVAL = 30
    SCHEDD_INTERVAL = 15
    CLAIM_WORKLIFE = 0
    NUM_CPUS = 20
    # turn off round robin and multiple allocation rounds
    HFS_ROUND_ROBIN_RATE = 100000000
    HFS_MAX_ALLOCATION_ROUNDS = 1
    GROUP_NAMES = a, b
    # 10 slots are left over for root group "none"
    GROUP_QUOTA_a = 5
    GROUP_QUOTA_b = 5
    # autoregroup is off
    GROUP_ACCEPT_SURPLUS = TRUE
    GROUP_AUTOREGROUP = FALSE

Using the following submission:

    universe = vanilla
    cmd = /bin/sleep
    args = 60
    should_transfer_files = if_needed
    when_to_transfer_output = on_exit
    +AccountingGroup="a.user"
    queue 10
    +AccountingGroup="b.user"
    queue 10

Submitting the jobs verifies the current behavior:

    $ tailf NegotiatorLog | grep -e 'Started Negotiation' -e 'group quotas: group=.*quota=.*requested=.*' -e 'Group.* - BEGIN' -e 'Negotiating with .* at' -e 'autoregroup'
    12/08/11 20:37:02 ---------- Started Negotiation Cycle ----------
    12/08/11 20:37:03 group quotas: group= <none> quota= 10 requested= 0 allocated= 0 unallocated= 0
    12/08/11 20:37:03 group quotas: group= a quota= 5 requested= 10 allocated= 10 unallocated= 0
    12/08/11 20:37:03 group quotas: group= b quota= 5 requested= 10 allocated= 10 unallocated= 0
    12/08/11 20:37:03 Group a - BEGIN NEGOTIATION
    12/08/11 20:37:03 Negotiating with a.user@localdomain at <192.168.1.2:42279>
    12/08/11 20:37:03 Group b - BEGIN NEGOTIATION
    12/08/11 20:37:03 Negotiating with b.user@localdomain at <192.168.1.2:42279>

All 20 jobs negotiate in the above cycle, using proportional surplus allocation:

    $ qvhist AccountingGroup JobStatus
         10 a.user | 2
         10 b.user | 2
         20 total

To test the new autoregroup semantics, enable GROUP_AUTOREGROUP in the config above:

    GROUP_AUTOREGROUP = TRUE

Re-submit the jobs. This time all twenty jobs should again negotiate, but the 10 extra jobs from "a" and "b" should negotiate under "<none>", after the "a" and "b" groups negotiate:

    12/08/11 20:39:26 ---------- Started Negotiation Cycle ----------
    12/08/11 20:39:26 group quotas: autoregroup mode: appended 2 submitters to group <none> negotiation
    12/08/11 20:39:26 group quotas: autoregroup mode: proportional surplus allocation disabled
    12/08/11 20:39:26 group quotas: autoregroup mode: allocating 20 to group <none>
    12/08/11 20:39:26 group quotas: group= <none> quota= 10 requested= 20 allocated= 20 unallocated= 0
    12/08/11 20:39:26 group quotas: group= a quota= 5 requested= 10 allocated= 5 unallocated= 5
    12/08/11 20:39:26 group quotas: group= b quota= 5 requested= 10 allocated= 5 unallocated= 5
    12/08/11 20:39:26 group quotas: autoregroup mode: forcing group <none> to negotiate last
    12/08/11 20:39:26 Group a - BEGIN NEGOTIATION
    12/08/11 20:39:26 Negotiating with a.user@localdomain at <192.168.1.2:42279>
    12/08/11 20:39:26 Group b - BEGIN NEGOTIATION
    12/08/11 20:39:26 Negotiating with b.user@localdomain at <192.168.1.2:42279>
    12/08/11 20:39:26 Group <none> - BEGIN NEGOTIATION
    12/08/11 20:39:26 group quotas: autoregroup mode: negotiating with legacy mode for <none>
    12/08/11 20:39:26 Negotiating with b.user@localdomain at <192.168.1.2:42279>
    12/08/11 20:39:26 Negotiating with a.user@localdomain at <192.168.1.2:42279>
    12/08/11 20:39:27 Negotiating with a.user@localdomain at <192.168.1.2:42279>

    $ qvhist AccountingGroup JobStatus
         10 a.user | 2
         10 b.user | 2
         20 total
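The check performed by eye on the autoregroup log above could also be scripted. The following is a minimal sketch, assuming the NegotiatorLog line formats are exactly as shown in this report; the helper name `summarize_cycle` and the regular expressions are hypothetical and not part of HTCondor.

```python
import re

# Patterns inferred from the log excerpts in this report (an assumption).
QUOTA_LINE = re.compile(
    r"group quotas: group=\s*(\S+)\s+quota=\s*(\d+)\s+requested=\s*(\d+)"
    r"\s+allocated=\s*(\d+)\s+unallocated=\s*(\d+)"
)
BEGIN_LINE = re.compile(r"Group (\S+) - BEGIN NEGOTIATION")


def summarize_cycle(log_text):
    """Return (allocations, order) for the log lines of one negotiation cycle."""
    allocations = {}
    order = []
    for line in log_text.splitlines():
        m = QUOTA_LINE.search(line)
        if m:
            group = m.group(1)
            quota, requested, allocated, unallocated = map(int, m.groups()[1:])
            allocations[group] = {
                "quota": quota,
                "requested": requested,
                "allocated": allocated,
                "unallocated": unallocated,
            }
            continue
        m = BEGIN_LINE.search(line)
        if m:
            order.append(m.group(1))
    return allocations, order


SAMPLE = """\
12/08/11 20:39:26 group quotas: group= <none> quota= 10 requested= 20 allocated= 20 unallocated= 0
12/08/11 20:39:26 group quotas: group= a quota= 5 requested= 10 allocated= 5 unallocated= 5
12/08/11 20:39:26 group quotas: group= b quota= 5 requested= 10 allocated= 5 unallocated= 5
12/08/11 20:39:26 Group a - BEGIN NEGOTIATION
12/08/11 20:39:26 Group b - BEGIN NEGOTIATION
12/08/11 20:39:26 Group <none> - BEGIN NEGOTIATION
"""

allocations, order = summarize_cycle(SAMPLE)
# In autoregroup mode each group is held to its quota and <none> goes last.
assert all(allocations[g]["allocated"] == allocations[g]["quota"] for g in ("a", "b"))
assert order[-1] == "<none>"
print(order)  # ['a', 'b', '<none>']
```

Running the same function on the first (autoregroup off) log would instead show "a" and "b" each allocated 10 and no "<none>" negotiation.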
Tested on RHEL 5.9 and 6.4, on both i386 and x86_64, with condor-7.8.8-0.4.1, and it works. --> VERIFIED
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHSA-2013-0564.html