Description of problem:
Failed to scale out nodes in a pre-existing environment.

Version-Release number of selected component (if applicable):
atomic-openshift-utils-3.0.20-1.git.0.3703f1b.el7aos.noarch

How reproducible:
Always

Steps to Reproduce:
1. Install an environment of one master and two nodes using the quick installer.
2. Add a new node to the existing environment using atomic-openshift-installer.

Actual results:
TASK: [openshift_manage_node | Wait for Node Registration] ********************
failed: [10.x.x.158] => (item=openshift_nodes) => {"attempts": 20, "changed": true, "cmd": ["oc", "get", "node", "openshift_nodes"], "delta": "0:00:00.257345", "end": "2015-12-24 18:28:39.257526", "failed": true, "item": "openshift_nodes", "rc": 1, "start": "2015-12-24 18:28:39.000181", "warnings": []}
stderr: Error from server: node "openshift_nodes" not found
msg: Task failed as maximum retries was encountered

FATAL: all hosts have already failed -- aborting

Expected results:
The installation succeeds.

Additional info:
Adding nodes works in a pre-existing environment where the master and the node are on one host. Judging from the playbook, scaling out a multi-node environment requires a new_node group in the hosts file, but QE has not seen any new_node related configuration in the Ansible inventory after adding a new node.
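For illustration only, an inventory that carries such a group might look roughly like the sketch below. The host names are placeholders and the layout is an assumption, not the file the installer actually wrote; the group name new_nodes follows the g_new_nodes_group value quoted from the 3.0.20-1 playbook elsewhere in this report.

```ini
# Hypothetical inventory sketch; host names are placeholders, not taken from this report.
[OSEv3:children]
masters
nodes
new_nodes

[masters]
master.example.com

[nodes]
node1.example.com
node2.example.com

# Group the 3.0.20-1 scale-up playbook refers to (g_new_nodes_group: 'new_nodes').
[new_nodes]
node3.example.com
```

If the installer does not write a group like this, the scale-up playbook would have no hosts to target, which would be consistent with the failure above.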
Should be resolved by: https://github.com/openshift/openshift-ansible/pull/1143 which is currently being updated with the given suggestions.
In the end we decided to go with the approach in Comment #1. Can you test with the latest in openshift-ansible? We have some other PRs pending and haven't created a build yet.
Verified with the latest openshift-ansible. The new node group is added to the hosts file, and the installation succeeds.

However, the Ansible playbook has also changed slightly with respect to new_node. These are the variables in the playbook with version 3.0.20-1:

- include: ../../common/openshift-cluster/scaleup.yml
  vars:
    g_etcd_group: "{{ 'etcd' }}"
    g_masters_group: "{{ 'masters' }}"
    g_new_nodes_group: "{{ 'new_nodes' }}"
    g_lb_group: "{{ 'lb' }}"
    openshift_cluster_id: "{{ cluster_id | default('default') }}"
    openshift_debug_level: 2
    openshift_deployment_type: "{{ deployment_type }}"

And these are the variables in the latest playbook:

---
g_etcd_hosts: "{{ groups.etcd | default([]) }}"
g_lb_hosts: "{{ groups.lb | default([]) }}"
g_master_hosts: "{{ groups.masters | default([]) }}"
g_node_hosts: "{{ groups.nodes | default([]) }}"
g_nfs_hosts: "{{ groups.nfs | default([]) }}"
g_all_hosts: "{{ g_master_hosts | union(g_node_hosts) | union(g_etcd_hosts) | union(g_lb_hosts) | default([]) }}"

So this modification has no effect, even though the installation succeeds. Maybe we should make the relationship between quick-install and ansible-playbook clear.
Hi Brenton,

As described in Comment #3, the latest Ansible playbook has also changed how new nodes are added, so this commit seems useless if the playbook will not change any further in that respect. I would like to know which approach (new group or no new group) will be used eventually, so that the bug is not reproduced.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2016:0075