Bug 608148
| Summary: | Stop operations on a Tomcat Connector fails after successful Restart operation of the Tomcat Server | ||
|---|---|---|---|
| Product: | [Other] RHQ Project | Reporter: | John Sefler <jsefler> |
| Component: | Operations | Assignee: | Jay Shaughnessy <jshaughn> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Corey Welton <cwelton> |
| Severity: | high | Docs Contact: | |
| Priority: | urgent | ||
| Version: | 3.0.0 | CC: | jshaughn |
| Target Milestone: | --- | ||
| Target Release: | --- | ||
| Hardware: | All | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | 2.4 | Doc Type: | Bug Fix |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2010-08-12 16:48:27 UTC | Type: | --- |
| Bug Blocks: | 593121 | ||
Description
John Sefler
2010-06-25 19:53:00 UTC
So this is an interesting bug that touches on an area that's been a problem for me in the past, and I've worked around it. This time I think the fix is in place for the root cause.

The issue is in the way the JMX plugin caches the component mbean. The caching is fine most of the time, but the bean's validity is only verified in the MBeanResourceComponent impl of getAvailability(). This is problematic in that there can be a significant window of time (minutes) between the bean becoming invalid and a call to getAvailability(). And if, as in the case of this issue report, getAvailability() is overridden without calling the super, the bean will never get refreshed short of an agent shutdown.

When the server restart operation happens, the TC server is shut down and restarted. The mbean connections are all lost at that point and the cached beans become invalid. They stay that way at least until the next availability check. Note that that check is scheduled by the plugin container; it is unrelated to the fact that the server has been restarted via the operation. So metric collection, operations, etc. are all going to be in trouble until the avail check. And TC connectors, due to the override, will not perform correctly after the restart.

The solution is to change the implementation of MBeanResourceComponent.getEmsBean(). This method typically returns the cached bean. I'm adding a (fast) check to ensure that the cached bean's connection matches the current emsConnection. If not, the bean is reset.

This has possible benefit to all JMX-based plugins. I am sure there must be other code paths, especially for plugins offering stop/start/restart capability, where this could have been a problem.

note - reviewed with mazz.

fix commit: c6a959a6fd636f15c76493bf20a9ad779e441175

In addition to verifying the scenario written up in this BZ, I would recommend that QA also attempt a similar test with an AS4 restart. After the restart, try an operation on some child service and not the server itself, analogous to the connector operation used for TC.

Tested against Tomcat5 and it seems to be fine. Will test against AS4.

QA Verified against Tomcat5 and EAP4.3. After performing a restart op on the server and then stop/start against some child resource, things look fine.

Mass-closure of verified bugs against JON.
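
For readers without the commit at hand, the getEmsBean() change described above can be sketched roughly as follows. This is a simplified illustration, not the actual RHQ code: `Connection`, `Bean`, `Component`, `reconnect()` and `loadCount` are hypothetical stand-ins for the plugin's real EMS types, and the only point is the cheap identity check between the connection the cached bean was loaded from and the component's current connection.

```java
public class StaleBeanCheckSketch {

    /** Hypothetical stand-in for a JMX connection handle (not the real EMS API). */
    static final class Connection {
        final String id;
        Connection(String id) { this.id = id; }
    }

    /** Hypothetical stand-in for a cached MBean proxy that remembers its source connection. */
    static final class Bean {
        final String name;
        final Connection loadedFrom;
        Bean(String name, Connection loadedFrom) {
            this.name = name;
            this.loadedFrom = loadedFrom;
        }
    }

    /** Hypothetical component that caches its bean, loosely modelled on the description above. */
    static class Component {
        private final String beanName;
        private Connection currentConnection;
        private Bean cachedBean;
        int loadCount = 0; // counts how often the bean had to be (re)loaded

        Component(String beanName, Connection initial) {
            this.beanName = beanName;
            this.currentConnection = initial;
        }

        /** Simulates the managed server being restarted: a new connection replaces the old one. */
        void reconnect(Connection fresh) {
            this.currentConnection = fresh;
        }

        /** Returns the cached bean, resetting it first if it belongs to an old connection. */
        Bean getBean() {
            // Fast check: a reference comparison, no remote call involved.
            if (cachedBean != null && cachedBean.loadedFrom != currentConnection) {
                cachedBean = null;
            }
            if (cachedBean == null) {
                cachedBean = new Bean(beanName, currentConnection);
                loadCount++;
            }
            return cachedBean;
        }
    }

    public static void main(String[] args) {
        Component connector = new Component("connector", new Connection("before-restart"));
        connector.getBean();
        connector.getBean();                                  // served from the cache
        connector.reconnect(new Connection("after-restart")); // server restart operation
        connector.getBean();                                  // stale bean is dropped and reloaded
        System.out.println("bean loads: " + connector.loadCount);
    }
}
```

Because the check lives in the accessor rather than in the availability check, a subclass that overrides getAvailability() without calling super still gets a fresh bean on its next operation or metric collection.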