Bug 850392
| Summary: | RFE Update Hunting+Splitting+Defaults algorithm | ||
|---|---|---|---|
| Product: | Red Hat Enterprise MRG | Reporter: | Timothy St. Clair <tstclair> |
| Component: | condor | Assignee: | Timothy St. Clair <tstclair> |
| Status: | CLOSED ERRATA | QA Contact: | Luigi Toscano <ltoscano> |
| Severity: | high | Docs Contact: | |
| Priority: | high | ||
| Version: | Development | CC: | dkinon, dryan, ltoscano, matt, mkudlej, rrati, tstclair |
| Target Milestone: | 2.3 | Keywords: | FutureFeature, TestOnly |
| Target Release: | --- | ||
| Hardware: | All | ||
| OS: | All | ||
| Whiteboard: | |||
| Fixed In Version: | condor-7.8.2-0.1 | Doc Type: | Rebase: Enhancements Only |
| Doc Text: |
Rebase package(s) to version:
condor-7.8 series
Highlights and notable enhancements:
The default settings have been updated to better allow for accurate matching and reuse of partitionable slots. The defaults have been changed for job submission, and execute resources. The enhancements enable more accurate memory tracking of jobs and higher resource utilization.
|
Story Points: | --- |
| Clone Of: | Environment: | ||
| Last Closed: | 2013-03-06 18:45:14 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | |||
| Bug Blocks: | 845292 | ||
|
Description
Timothy St. Clair
2012-08-21 14:21:54 UTC
The testing for this is quite involved. It requires testing the following state space, according to the version (old or new) of each component:

001 - old submit, old schedd, new exec
010 - old submit, new schedd, old exec
011 - old submit, new schedd, new exec
100 - new submit, old schedd, old exec (not realistic)
101 - new submit (remote), old schedd, new exec
110 - new submit, new schedd, old exec
111 - all new

++ It requires potential feedback from the field on examples of existing RequestMemory expressions to ensure compatibility.
++ MemoryUsage = expr referencing PSS

*** Bug 845569 has been marked as a duplicate of this bug. ***

Can you please provide a few examples (>=2) where the behavior changed (what changed, configuration, etc.)?

The request is to come up with the use-case differences from an administration and daemon perspective.

From the administrator's perspective, the behavior has changed in the following way:

When a job is initially submitted, everything appears much as it did before, with slight modifications to the auto-filled data (https://condor-wiki.cs.wisc.edu/index.cgi/tktview?tn=2835). It is advised that the user specify the Request* variables and not overwrite the Requirements expression. Once the job is matched, the startd evaluates the RequestMemory expression according to its MODIFY_REQUEST_EXPR_* settings and modifies the initial request to better carve the partitionable slot for reuse. In this way the behavior has changed, as RequestMemory is usually a lower bound on what will be supplied. Once the job has completed its run, it updates the MemoryUsage attribute with the value of RSS, which will be an upper bound. This MemoryUsage is referenced in the default Requirements expression.

Re comment #7, methods for verification could include:
1. After job submission, check the Requirements expression via condor_q -long and verify the differences between the two versions.
2. Once the jobs have been matched and have landed on a partitionable slot, run condor_status -long and verify that the slot has been split on quantized boundaries.
3. Once a job has completed, verify via condor_q -long that MemoryUsage has been updated with RSS.

Verified (thanks to mkudlej for most of the work) on supported configurations (RHEL 5.9/6.4 beta, i386/x86_64):
condor-7.8.8-0.4.1

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
http://rhn.redhat.com/errata/RHSA-2013-0564.html |
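
The second verification method above (checking that a slot was split on quantized boundaries) could be scripted roughly as follows. This is only a sketch under stated assumptions: the 128 MB step, the sample slot ad, and the helper names are hypothetical and chosen for illustration; the real step comes from whatever the pool's MODIFY_REQUEST_EXPR_* settings evaluate to, and the rounding here only mimics round-up-to-a-multiple behavior.

```python
import math

# Hypothetical quantization step in MB; the real value depends on the
# pool's MODIFY_REQUEST_EXPR_* configuration.
ASSUMED_STEP_MB = 128


def quantize(value: int, step: int) -> int:
    """Round value up to the nearest multiple of step."""
    return int(math.ceil(value / step)) * step


def parse_long_ad(text: str) -> dict:
    """Parse 'Attr = value' lines, the layout printed by the -long option."""
    ad = {}
    for line in text.splitlines():
        if " = " in line:
            key, _, val = line.partition(" = ")
            ad[key.strip()] = val.strip()
    return ad


def on_quantized_boundary(ad: dict, step: int) -> bool:
    """True if the slot's Memory attribute is a whole multiple of step."""
    return int(ad["Memory"]) % step == 0


# Made-up slot ad for illustration only, not captured from a real pool.
SAMPLE_SLOT_AD = """\
Name = "slot1_1@host.example.com"
DynamicSlot = true
Memory = 256
"""

if __name__ == "__main__":
    ad = parse_long_ad(SAMPLE_SLOT_AD)
    print(quantize(200, ASSUMED_STEP_MB))
    print(on_quantized_boundary(ad, ASSUMED_STEP_MB))
```

In practice the text passed to parse_long_ad would be the saved output of condor_status -long for a single dynamic slot; a job requesting 200 MB would then be expected to land on a slot whose Memory equals quantize(200, step).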