Description of problem:
The number of "group quota exceeded" messages has increased again. See bug 678590 for more information. This is probably caused by Bug 805448.

Version-Release number of selected component (if applicable):
condor-7.6.7-0.7

How reproducible:
100%

Steps to Reproduce:
NUM_CPUS = 100
GROUP_NAMES = a
GROUP_QUOTA_DYNAMIC_a = 1
GROUP_AUTOREGROUP_a = FALSE
#HFS_MAX_ALLOCATION_ROUNDS = 10
HFS_ROUND_ROBIN_RATE = 1

universe = vanilla
cmd = /bin/sleep
args = 10m
should_transfer_files = if_needed
when_to_transfer_output = on_exit
+AccountingGroup="a.u1"
queue 50
+AccountingGroup="a.u2"
queue 50
+AccountingGroup="a.u3"
queue 50
+AccountingGroup="a.u4"
queue 50
+AccountingGroup="a.u5"
queue 50
+AccountingGroup="a.u6"
queue 50
+AccountingGroup="a.u7"
queue 50
+AccountingGroup="a.u8"
queue 50

Wait for all slots to be used.

Actual results:
234 "group quota exceeded" messages in the negotiator log:
# grep -c "group quota exceeded" /var/log/condor/Neg*
234

Expected results:
The number of "group quota exceeded" messages should be smaller.

Additional info:
The fix for this bug and Bug 805448 is included in upstream ticket #2952.

The 'group quota exceeded' messages were misleading in two ways:
(1) The proper message is 'submitter limit exceeded' (the code was updated to reflect this).
(2) The messages were being output during the first negotiator pie spin, which allows submitter limits to be exceeded once in order to use up fractional remainders, and additionally if slot rank preemption allows it.

I tweaked the log output logic to avoid printing 'submitter limit exceeded' (formerly 'group quota exceeded') when submitter limits are being ignored in the inner negotiation loop.
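
For illustration only, the logging change described above can be sketched roughly as follows. This is not the negotiator's actual code: the function name, its parameters, and the cost model are hypothetical, and the sketch simplifies "limits may be exceeded once" to a single flag that is either on or off for the whole loop. It only shows the idea that the rejection message is emitted when the limit is actually enforced, and stays silent when the limit is being ignored.

```python
def negotiate(requests, submitter_limit, ignore_submitter_limit):
    """Hypothetical model of the inner negotiation loop.

    requests: list of (job_name, cost) tuples
    submitter_limit: resources the submitter may use in this round
    ignore_submitter_limit: True when the limit is allowed to be exceeded
    Returns (matched_job_names, log_lines).
    """
    used = 0.0
    matched = []
    log_lines = []
    for name, cost in requests:
        over_limit = used + cost > submitter_limit
        if over_limit and not ignore_submitter_limit:
            # The limit is enforced: reject the job and say so in the log.
            log_lines.append(f"Rejected {name}: submitter limit exceeded")
            continue
        # Either within the limit, or the limit is being ignored:
        # match the job and do not log a rejection.
        matched.append(name)
        used += cost
    return matched, log_lines


if __name__ == "__main__":
    jobs = [("j1", 1.0), ("j2", 1.0), ("j3", 1.0)]
    print(negotiate(jobs, 2.0, ignore_submitter_limit=False))
    print(negotiate(jobs, 2.0, ignore_submitter_limit=True))
```

With the limit enforced, the third job is rejected and one message is logged; with the limit ignored, all three jobs match and nothing is logged, which is the behaviour the fix aims for.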
Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
Cause: Log message output logic for negotiation rejections did not properly reflect the negotiation logic in the code.
Consequence: The rejection message 'group quota exceeded' was output instead of the correct 'submitter limit exceeded', and it was also output when submitter limits were allowed to be exceeded.
Fix: The message was changed to 'submitter limit exceeded', and the logic was updated to not output this message when limits were actually allowed to be exceeded.
Result: The log message is no longer misleading, and is only output when the limit is actually in effect.
Tested with:
condor-7.8.7-0.6

Tested on:
RHEL6 i386,x86_64
RHEL5 i386,x86_64

# grep -ci "exceed" /var/log/condor/Neg*
0

>>> verified
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHSA-2013-0564.html