Bug 488998
Summary: | carod cannot handle broker restart | ||
---|---|---|---|
Product: | Red Hat Enterprise MRG | Reporter: | Matthew Farrellee <matt> |
Component: | grid | Assignee: | Robert Rati <rrati> |
Status: | CLOSED ERRATA | QA Contact: | Martin Kudlej <mkudlej> |
Severity: | medium | Docs Contact: | |
Priority: | medium | ||
Version: | 1.1 | CC: | iboverma, lans.carstensen, lbrindle, mkudlej, tao |
Target Milestone: | 1.2 | ||
Target Release: | --- | ||
Hardware: | All | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: |
Grid bug fix
C: MRG Messaging Broker restarted while low-latency is running on a grid execute node
C: The low-latency daemon (carod) would stop processing jobs and crash
F: Fixed the daemon to check for disconnections and to attempt to reconnect
R: The daemon no longer crashes and will resume processing jobs once the broker is running again
If the MRG Messaging Broker was restarted while low-latency was running on a grid execute node, the low-latency daemon (carod) would stop processing jobs and crash. The daemon now checks for disconnections and attempts to reconnect, so it no longer crashes and resumes processing jobs once the broker is running again.
|
Story Points: | --- |
Clone Of: | Environment: | ||
Last Closed: | 2009-12-03 09:16:06 UTC | Type: | --- |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | 522467 | ||
Bug Blocks: | 527551 |
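Illustrative note on the fix described in the Doc Text ("check for disconnections and attempt to reconnect"). The following is a minimal sketch of that general pattern, not the actual carod change. The names `run_with_reconnect`, `connect`, and `process` are hypothetical, as are the use of `ConnectionError` to signal a lost broker and the exponential backoff between attempts.

```python
import time


def run_with_reconnect(connect, process, max_attempts=5, base_delay=1.0,
                       sleep=time.sleep):
    """Run process(conn), reconnecting when the connection is lost.

    connect() is assumed to return a fresh connection object, and
    process(conn) is assumed to raise ConnectionError when the broker
    goes away. Both are placeholders, not carod or broker APIs.
    """
    failures = 0
    while True:
        try:
            conn = connect()          # may fail while the broker is down
            return process(conn)      # may fail if the broker restarts mid-run
        except ConnectionError:
            failures += 1
            if failures >= max_attempts:
                raise                 # give up after max_attempts failures
            # Wait longer after each failure: base_delay, 2x, 4x, ...
            sleep(base_delay * (2 ** (failures - 1)))
```

In a long-running daemon, `process` would typically be the job-handling loop, so a broker restart sends control back to `connect()` instead of terminating the thread.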
Description
Matthew Farrellee
2009-03-06 17:11:19 UTC
Exception in thread Thread-2:
Traceback (most recent call last):
  File "/usr/lib64/python2.4/threading.py", line 442, in __bootstrap
    self.run()
  File "/usr/lib64/python2.4/threading.py", line 422, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/usr/sbin/carod", line 239, in lease_monitor
    item.unlock(False)
  File "/usr/sbin/carod", line 56, in unlock
    self.__access_lock__.release()
  File "/usr/lib64/python2.4/threading.py", line 113, in release
    assert self.__owner is me, "release() of un-acquire()d lock"
AssertionError: release() of un-acquire()d lock

Fixed in: condor-low-latency-1.0-18

Tested on RHEL5.4 condor-7.4.0-0.5 and RHEL4.8 condor-7.4.0-0.4 i386/x86_64 and with condor-low-latency-1.0-19 and it works --> VERIFIED

Release note added. If any revisions are required, please set the "requires_release_notes" flag to "?" and edit the "Release Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
Carod is no longer crashing when broker is restarted (488998)

Release note updated. If any revisions are required, please set the "requires_release_notes" flag to "?" and edit the "Release Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

Diffed Contents:
@@ -1 +1,8 @@
-Carod is no longer crashing when broker is restarted (488998)
+Grid bug fix
+
+C: MRG Messaging Broker restarted
+C: Carod would experience and exception and crash
+F:
+R: Carod no longer crashes.
+
+NEED FURTHER INFORMATION FOR RELNOTE.

C: MRG Messaging Broker restarted while low-latency is running on a grid execute node
C: The low-latency daemon (carod) would stop processing jobs and crash
F: Fixed the daemon to check for disconnections and to attempt to reconnect
R: The daemon no longer crashes and will resume processing jobs once the broker is running again

Release note updated. If any revisions are required, please set the "requires_release_notes" flag to "?" and edit the "Release Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

Diffed Contents:
@@ -1,8 +1,8 @@
 Grid bug fix
 
-C: MRG Messaging Broker restarted
-C: Carod would experience and exception and crash
-F:
-R: Carod no longer crashes.
+C: MRG Messaging Broker restarted while low-latency is running on a grid execute node
+C: The low-latency daemon (carod) would stop processing jobs and crash
+F: Fixed the daemon to check for disconnections and to attempt to reconnect
+R: The daemon no longer crashes and will resume processing jobs once the broker is running again
 
-NEED FURTHER INFORMATION FOR RELNOTE.
+If the MRG Messaging Broker was restarted while low-latency was running on a grid execute node, the low-latency daemon (carod) would stop processing jobs and crash. The daemon now checks for disconnections and attempts to reconnect. This prevents the daemon from crashing and will resume processing jobs once the broker is running again.

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHEA-2009-1633.html
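
Illustrative note on the traceback in the description: the AssertionError is raised because release() is called on a lock that the calling thread does not hold. The sketch below shows one way an unlock helper can tolerate that case. It is an assumption for illustration only and does not reproduce the carod source. `GuardedItem` and its methods are hypothetical names. On Python 3 the same condition surfaces as a RuntimeError rather than the Python 2.4 assertion shown above.

```python
import threading


class GuardedItem:
    """Hypothetical stand-in for the item object seen in the traceback."""

    def __init__(self):
        # Reentrant lock: only the thread that acquired it may release it.
        self._lock = threading.RLock()

    def lock(self, blocking=True):
        """Acquire the lock; returns True when it was acquired."""
        return self._lock.acquire(blocking)

    def unlock(self):
        """Release the lock.

        Returns True on success and False when the calling thread does
        not hold the lock, instead of letting the error kill the thread.
        """
        try:
            self._lock.release()
            return True
        except RuntimeError:
            # Python 3 raises RuntimeError for an un-acquired RLock;
            # Python 2.4 (as in the traceback) used an assert instead.
            return False
```

Swallowing the error keeps a monitor thread alive but can also hide a locking bug, so holding the lock through a `with` block is usually the safer design where the control flow allows it.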