Created attachment 1457720 [details]
Expand task

Description of problem:
When the expand cluster job is triggered, it imports only one node. After that the job is marked as finished, but the other node remains unmanaged and the expand option is no longer available. The node that remains unmanaged is the one that had its Host dashboard created before Cluster Expand was triggered, as described in BZ 1599630.

Version-Release number of selected component (if applicable):
tendrl-ansible-1.6.3-5.el7rhgs.noarch
tendrl-api-1.6.3-4.el7rhgs.noarch
tendrl-api-httpd-1.6.3-4.el7rhgs.noarch
tendrl-commons-1.6.3-8.el7rhgs.noarch
tendrl-grafana-plugins-1.6.3-6.el7rhgs.noarch
tendrl-grafana-selinux-1.5.4-2.el7rhgs.noarch
tendrl-monitoring-integration-1.6.3-6.el7rhgs.noarch
tendrl-node-agent-1.6.3-8.el7rhgs.noarch
tendrl-notifier-1.6.3-4.el7rhgs.noarch
tendrl-selinux-1.5.4-2.el7rhgs.noarch
tendrl-ui-1.6.3-6.el7rhgs.noarch

How reproducible:
100%

Steps to Reproduce:
1. Import a cluster with 4 nodes into WA.
2. Add 2 more nodes to the cluster and create a distributed-replicated volume.
3. Restart all nodes.
4. Expand the cluster in WA.

Actual results:
Only one node is imported.

Expected results:
All nodes should be imported.

Additional info:
With a distributed-replicated volume created, this happens every time. When the expand was tried after only a peer probe, without any volume, all nodes were imported.
Created attachment 1457721 [details] 1 - Node is expanded by task
Created attachment 1457722 [details] 2 - Node remains unmanaged
This is actually not an issue with the expand flow. Rather, in this scenario one of the nodes pending expansion got elected as the provisioner of the cluster and so was removed from the list of expansion nodes. Fixing it needs a change in the provisioner election logic: an additional check to make sure that nodes pending expansion do not participate in the election.
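To illustrate the additional check described above, here is a minimal sketch. It is not the actual node-agent code: the function name `elect_provisioner`, its parameters, and the "lowest node ID wins" rule are all assumptions made for illustration. The only point it shows is that nodes pending expansion are filtered out before a provisioner is chosen.

```python
from typing import Iterable, Optional, Set


def elect_provisioner(candidate_node_ids: Iterable[str],
                      pending_expansion_ids: Set[str]) -> Optional[str]:
    """Pick a provisioner, skipping nodes that are still pending expansion.

    Hypothetical helper: the names and the selection rule are assumptions,
    not the Tendrl implementation.
    """
    # Nodes waiting to be expanded must not take part in the election,
    # otherwise the winner drops out of the list of expansion nodes.
    eligible = sorted(
        node_id for node_id in candidate_node_ids
        if node_id not in pending_expansion_ids
    )
    # Deterministic choice for the sketch only: the lowest node ID wins.
    return eligible[0] if eligible else None


# Scenario from this report: 4 managed nodes, 2 nodes pending expansion.
managed = ["node-1", "node-2", "node-3", "node-4"]
pending = {"node-5", "node-6"}
provisioner = elect_provisioner(managed + sorted(pending), pending)
print(provisioner)
```

With this kind of filter, the two new nodes can never be chosen, so both stay in the expansion list until the expand job imports them.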
Added upstream issue and PR: https://github.com/Tendrl/node-agent/pull/841
Now, with the fix, nodes pending expansion cannot claim the provisioner tag, so a node should no longer get left out during cluster expansion.
This problem breaks expand RFE BZ 1516417.
Tested multiple times. Seems ok. --> VERIFIED

Tested with:
tendrl-ansible-1.6.3-5.el7rhgs.noarch
tendrl-api-1.6.3-4.el7rhgs.noarch
tendrl-api-httpd-1.6.3-4.el7rhgs.noarch
tendrl-commons-1.6.3-9.el7rhgs.noarch
tendrl-grafana-plugins-1.6.3-7.el7rhgs.noarch
tendrl-grafana-selinux-1.5.4-2.el7rhgs.noarch
tendrl-monitoring-integration-1.6.3-7.el7rhgs.noarch
tendrl-node-agent-1.6.3-9.el7rhgs.noarch
tendrl-notifier-1.6.3-4.el7rhgs.noarch
tendrl-selinux-1.5.4-2.el7rhgs.noarch
tendrl-ui-1.6.3-8.el7rhgs.noarch
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2018:2616