Bug 718265
Summary: | low-latency not expiring work | ||
---|---|---|---|
Product: | Red Hat Enterprise MRG | Reporter: | Robert Rati <rrati> |
Component: | condor-low-latency | Assignee: | Robert Rati <rrati> |
Status: | CLOSED ERRATA | QA Contact: | Lubos Trilety <ltrilety> |
Severity: | high | Docs Contact: | |
Priority: | medium | ||
Version: | 1.3 | CC: | jneedle, ltrilety, matt, mkudlej |
Target Milestone: | 2.0.1 | ||
Target Release: | --- | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | condor-low-latency-1.2-1 | Doc Type: | Bug Fix |
Doc Text: |
C: When a job causes the condor_starter to exit quickly, such as a job whose program the starter is unable to execute, the low-latency daemon does not expire the low-latency job.
C: The low-latency daemon does not allow the slot that ran the job, which should have been expired, to do any more work until the daemon has been restarted.
F: Fixed issues with message expiration
R: Messages are expired as expected
|
Story Points: | --- |
Clone Of: | Environment: | ||
Last Closed: | 2011-09-07 16:43:29 UTC | Type: | --- |
Bug Depends On: | 723971 | ||
Bug Blocks: | 723887 |
Description
Robert Rati
2011-07-01 15:42:18 UTC
There were 2 issues with message expiration:
1) Messages were never expired because the check for whether a slot was in use reset the access time.
2) Once a message was expired, it could not be removed from the work queue, which kept a slot "busy" that was actually empty.

Fixed on BZ718265-no-expiration

Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
C: When a job causes the condor_starter to exit quickly, such as a job whose program the starter is unable to execute, the low-latency daemon does not expire the low-latency job.
C: The low-latency daemon does not allow the slot that ran the job, which should have been expired, to do any more work until the daemon has been restarted.
F: Fixed issues with message expiration
R: Messages are expired as expected

Tested on RHEL5.6/6.1 x x86_64/i386 with condor-low-latency-1.1-3 and it doesn't work.

Job expired after some time and slot was released. (There is an issue with the exit hook, see bug 726761.)

Tested with: condor-low-latency-1.2-2
Tested on: RHEL6 x86_64,i386
RHEL5 x86_64,i386

>>> VERIFIED

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-1249.html
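
The two expiration issues from the description can be illustrated with a minimal Python sketch. This is not the condor-low-latency source: the `WorkQueue` class, its method names, and the `lease_seconds` parameter are hypothetical and only show the intended behavior, assuming work is tracked per slot together with a last-access time.

```python
import time


class WorkQueue:
    """Hypothetical stand-in for the daemon's table of in-flight work, keyed by slot."""

    def __init__(self, lease_seconds, clock=time.monotonic):
        self.lease_seconds = lease_seconds
        self._clock = clock
        self._work = {}  # slot name -> (message, last access time)

    def add(self, slot, message):
        """Assign a message to a slot and start its lease."""
        self._work[slot] = (message, self._clock())

    def touch(self, slot):
        """Record real activity on a slot; only this renews the lease."""
        message, _ = self._work[slot]
        self._work[slot] = (message, self._clock())

    def slot_in_use(self, slot):
        # Issue 1: this check must be read-only. If it renewed the access
        # time, polling the slot would keep its message alive forever.
        return slot in self._work

    def expire(self):
        """Remove and return the slots whose lease has run out."""
        now = self._clock()
        expired = [slot for slot, (_, last) in self._work.items()
                   if now - last >= self.lease_seconds]
        for slot in expired:
            # Issue 2: an expired message must actually leave the queue,
            # otherwise the empty slot keeps looking busy.
            del self._work[slot]
        return expired
```

With an injected clock, a slot that is only polled through `slot_in_use` still expires once `lease_seconds` has elapsed, and after `expire` runs the slot no longer reports as in use.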