Bug 1599625
Summary: | [GSS](6.4.z) Host controllers can not connect to domain after creating a rollout plan and restarting the master host controller | ||
---|---|---|---|
Product: | [JBoss] JBoss Enterprise Application Platform 6 | Reporter: | tmiyargi |
Component: | Domain Management | Assignee: | Jiri Ondrusek <jondruse> |
Status: | CLOSED CURRENTRELEASE | QA Contact: | Peter Mackay <pmackay> |
Severity: | high | Docs Contact: | |
Priority: | unspecified | ||
Version: | 6.4.21 | CC: | bmaxwell, brian.stansberry, dandread, dcihak, jbaesner, jondruse, pmackay, rstancel |
Target Milestone: | CR1 | ||
Target Release: | EAP 6.4.21 | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Last Closed: | 2019-08-19 12:45:38 UTC | Type: | Bug |
Bug Blocks: | 1567790 |
Description
tmiyargi
2018-07-10 08:34:39 UTC
What fails is a slave host controller (HC) reconnecting to the master after losing connectivity. A common cause is the master being reloaded, which is the specific case reported here. Anything else that forces a reconnection, e.g. a network outage that the slave detects and that is later resolved, results in the same problem.

There is a guard in the code that rejects a particular call path for updating rollout-plan resources unless the resource is in a kind of "initial" state, i.e. the state it would be in early in HC boot. When the slave HC reconnects, it syncs its local copy of the domain-wide model with what the master currently has, and in doing so it uses the call path that the guard rejects.

The best fix is probably to eliminate the guard, since the value it provides is basically theoretical: it is a check against EAP developers making a mistake that is hard to imagine anyone actually making. Trying to work around the call path that trips the guard would add complexity to already complex code.
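To make the failure mode concrete, here is a minimal, self-contained Java sketch of the kind of guard described above. It is not the EAP code: the class `RolloutPlanSyncSketch`, the nested `RolloutPlansResource`, and the methods `sync` and `reconnectSucceeds` are all hypothetical names, and the sketch assumes that the "initial" state simply means "no rollout plans stored locally yet".

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative model only; none of these names exist in EAP. */
final class RolloutPlanSyncSketch {

    /** Hypothetical stand-in for the slave HC's local rollout-plan resource. */
    static final class RolloutPlansResource {
        private final Map<String, String> plans = new HashMap<>();

        /**
         * Replaces the local plans with the master's copy.
         * With the guard enabled, the update is rejected unless the resource
         * is still in its assumed "initial" state (no plans stored yet).
         */
        void sync(Map<String, String> fromMaster, boolean guardEnabled) {
            if (guardEnabled && !plans.isEmpty()) {
                throw new IllegalStateException("rollout plans are not in the initial state");
            }
            plans.clear();
            plans.putAll(fromMaster);
        }
    }

    /** Simulates a boot-time sync followed by a reconnect sync on the same slave. */
    static boolean reconnectSucceeds(boolean guardEnabled, boolean masterHasPlan) {
        RolloutPlansResource local = new RolloutPlansResource();
        Map<String, String> master = new HashMap<>();
        if (masterHasPlan) {
            master.put("my-plan", "hypothetical plan content");
        }
        try {
            local.sync(master, guardEnabled); // initial sync during slave boot
            local.sync(master, guardEnabled); // re-sync after the master is reloaded
            return true;
        } catch (IllegalStateException e) {
            return false; // the slave fails to rejoin the domain
        }
    }

    public static void main(String[] args) {
        System.out.println("guard, plan exists:    " + reconnectSucceeds(true, true));
        System.out.println("guard, no plan:        " + reconnectSucceeds(true, false));
        System.out.println("no guard, plan exists: " + reconnectSucceeds(false, true));
    }
}
```

Under these assumptions the guarded reconnect fails only once a rollout plan exists, which matches the reported trigger, and removing the guard lets the same sync path succeed without restructuring how the reconnect calls it.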