Bug 822955
Summary: | Resource upgrade never disabled if agent started in offline mode | ||
---|---|---|---|
Product: | [Other] RHQ Project | Reporter: | Lukas Krejci <lkrejci> |
Component: | Plugin Container | Assignee: | RHQ Project Maintainer <rhq-maint> |
Status: | CLOSED CURRENTRELEASE | QA Contact: | Filip Brychta <fbrychta> |
Severity: | urgent | Docs Contact: | |
Priority: | urgent | ||
Version: | 4.4 | CC: | ccrouch, fbrychta, hrupp, jsanda |
Target Milestone: | --- | ||
Target Release: | JON 3.1.0 | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | |
Clone Of: | 822905 | Environment: | |
Last Closed: | 2013-09-03 15:19:19 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | 822905 | ||
Bug Blocks: | 782579 |
Description
Lukas Krejci
2012-05-18 15:41:50 UTC
Lukas From your description I would generally be +1 on including this. What tests can we put in place to show this doesn't introduce a regression or in fact make matters worse. Setting Assigned To correctly The agent and plugin container are designed to gracefully handle the loss of comms to the server during normal mode of operation. So I think the only test we'd want to add is that the agent enters the normal mode of operation in all circumstances that may occur during the startup (which are: primary server up, at least one of the servers up and no servers up). Because the communication layer is separate from the plugin container, this basically boils down to ensuring that resource upgrade is switched off regardless of the exceptions thrown during making the contact with the server (note that exceptions thrown during the upgrade of the individual resources are handled on per-resource basis by making sure that the resource that failed to upgrade never starts up during the plugin container lifetime and report an upgrade error to the server). The fix moves the switching logic into a finally block to make sure that the resource upgrade mode is always switched off regardless of the upgrade result. Our current test suite tests quite comprehensively the failures during the upgrade itself but doesn't test the behavior when there is an error in the agent-server communication - which is exactly what this BZ is about. I am going to add a test for that but I generally trust Java to handle the finally block right ;) I added a test (in master only) for the resource upgrade mode being inactive into the existing resource upgrade unit tests (that actually include a test for handling the server unavailability during upgrade, but did not catch this nuance). As "to show this doesn't introduce a regression or in fact make matters worse" - I think we just have to trust our current test suite that it would discover the side effects. As I wrote in the previous comments, I think the risk is low because the upgrade was always meant to only occur during inventory manager startup and the fix makes it so. Lukas, lets get the test and fix pushed to the release branch release/jon3.1.x http://git.fedorahosted.org/git/?p=rhq/rhq.git;a=commitdiff;h=19b728129a36cfaab4dcb7d8c7a9b52ff219a64a Author: Lukas Krejci <lkrejci> Date: Fri May 18 15:30:19 2012 +0200 [BZ 822905] - Make sure to always disable the resource upgrade after it has finished, even after a failure. (cherry picked from commit 5ebacc87cb292004186c0b8f6876e00649b18391) release/jon3.1.x http://git.fedorahosted.org/git/?p=rhq/rhq.git;a=commitdiff;h=47d0f7a9eb9f2d171948e3f4ecfbc649109ee083 Author: Lukas Krejci <lkrejci> Date: Tue May 22 12:07:02 2012 +0200 [BZ 822905] - modified the unit tests for checking that the resource upgrade is actually disabled after resource upgrade. (cherry picked from commit 5e5b9d9581e18b389046753b3c5e8d03067da42f) Setting this back to modified since its not available in a JON build yet, should be in ER5 Verified on 3.1.0.CR2 Bulk closing of old issues in VERIFIED state. |