Creating a rollout plan and then restarting the DC host prevents the other hosts from reconnecting to the master. The slave HC is unable to connect and reports the error JBAS014687: Resource is immutable. The DC shows many errors like:

JBAS012119: cancelled task by interrupting thread Thread[Host Controller Service Threads - 117,5,Host Controller Service Threads]

To reproduce, create a domain with a master and a slave, create a rollout plan, and restart the DC like this:

rollout-plan add --name=my-plan --content={rollout groupa^groupb}
/host=my-dc:reload
What doesn't work is a slave HC reconnecting to the master after a loss of connectivity. A common cause is a reload of the master, which is the specific case reported here. Other things that cause reconnection, e.g. a network outage detected by the slave and later resolved, would result in the same problem.

There is a guard in the code that rejects a particular call path for updating rollout-plan resources unless the resource is in a kind of "initial" state, i.e. the state it would be in early in HC boot. When the slave HC reconnects, it syncs its local copy of the domain-wide model with what the master currently has, and in doing so it uses the call path that is being rejected.

The best fix is probably to eliminate that guard, as the value it provides is basically theoretical: a check against EAP developers doing something wrong that is hard to imagine actually being done. Trying to work around the call path that trips the guard would add complexity to already complex code.
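To make the failure mode described above concrete, here is a rough, self-contained Java sketch. It is not the actual host controller code: the class PlanResource, the method writeFromSync, the pastInitialState flag, and the guardEnabled switch are all hypothetical names invented for illustration. It only models the idea that a write path accepted during the boot-time sync is rejected when the same path is used again on reconnect.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; none of these names come from the real HC code base.
public class RolloutPlanGuardSketch {

    /** Stand-in for the slave's local copy of the rollout-plan resources. */
    static final class PlanResource {
        private final boolean guardEnabled;
        private final Map<String, String> plans = new HashMap<>();
        private boolean pastInitialState = false;

        PlanResource(boolean guardEnabled) {
            this.guardEnabled = guardEnabled;
        }

        /** Assumed call path the sync uses to copy the master's model locally. */
        void writeFromSync(Map<String, String> masterPlans) {
            if (guardEnabled && pastInitialState) {
                // Borrows the wording of the reported error; the real check may differ.
                throw new IllegalStateException("Resource is immutable");
            }
            plans.clear();
            plans.putAll(masterPlans);
            pastInitialState = true;
        }
    }

    /** Runs a boot-time sync, then a reconnect sync; true if the second one is accepted. */
    static boolean reconnectSucceeds(boolean guardEnabled) {
        PlanResource local = new PlanResource(guardEnabled);
        Map<String, String> master = new HashMap<>();
        master.put("my-plan", "{rollout groupa^groupb}");

        local.writeFromSync(master); // initial sync during HC boot: always accepted
        try {
            local.writeFromSync(master); // sync after the master reloads and the slave reconnects
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("guard in place: reconnect sync ok = " + reconnectSucceeds(true));
        System.out.println("guard removed:  reconnect sync ok = " + reconnectSucceeds(false));
    }
}
```

In this toy model, dropping the guard is a one-line change, whereas keeping it would mean giving the reconnect sync a second write path, which mirrors the complexity concern raised above.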