Bug 1294748 - Failed to scale out nodes in pre-existing env
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: high
Assigned To: Samuel Munilla
QA Contact: Ma xiaoqiang
Depends On:
Reported: 2015-12-30 00:59 EST by Gan Huang
Modified: 2016-07-03 20:46 EDT
CC List: 8 users

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Last Closed: 2016-01-27 14:43:52 EST
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
smunilla: needinfo-

Attachments: None
Description Gan Huang 2015-12-30 00:59:05 EST
Description of problem:
Failed to scale out nodes in pre-existing env

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Install an env of one master and two nodes via quick install.
2. Add a new node to the existing env using atomic-openshift-installer.

Actual results:
TASK: [openshift_manage_node | Wait for Node Registration] ******************** 
failed: [10.x.x.158] => (item=openshift_nodes) => {"attempts": 20, "changed": true, "cmd": ["oc", "get", "node", "openshift_nodes"], "delta": "0:00:00.257345", "end": "2015-12-24 18:28:39.257526", "failed": true, "item": "openshift_nodes", "rc": 1, "start": "2015-12-24 18:28:39.000181", "warnings": []}
stderr: Error from server: node "openshift_nodes" not found
msg: Task failed as maximum retries was encountered

FATAL: all hosts have already failed -- aborting
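
Note that the failed item is the literal string "openshift_nodes" rather than a hostname, so the command actually run was "oc get node openshift_nodes", which can never succeed. For illustration, a registration-wait task of this shape looks roughly like the following minimal sketch (task and variable names assumed, not the exact upstream task):

- name: Wait for Node Registration
  # Poll the API until a node object with this name exists
  command: oc get node {{ item }}
  register: node_get
  until: node_get.rc == 0
  retries: 20
  delay: 5
  # Each item should be a node hostname; if this resolves to the literal
  # string "openshift_nodes" instead of a list of hosts, every retry fails
  with_items: "{{ openshift_nodes | default([]) }}"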

Expected results:
Install successfully

Additional info:
Adding nodes works in a pre-existing env where the master and node are on the same host.
Judging from the playbook, a new_nodes group is needed in the hosts file when scaling out a multi-node env, but QE hasn't seen any new_nodes-related configuration in the Ansible inventory after adding a new node.
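
For reference, the scaleup playbook expects the host being added under a new_nodes group in the inventory. A hedged sketch of such a hosts file (hostnames illustrative):

[OSEv3:children]
masters
nodes
new_nodes

[masters]
master.example.com

[nodes]
node1.example.com
node2.example.com

# The group quick-install would need to write for the added host
[new_nodes]
node3.example.com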
Comment 1 Samuel Munilla 2016-01-11 09:43:15 EST
Should be resolved by: https://github.com/openshift/openshift-ansible/pull/1143
which is currently being updated with the given suggestions.
Comment 2 Brenton Leanhardt 2016-01-12 17:03:17 EST
In the end we decided to go with the approach in Comment #1.  Can you test with the latest in openshift-ansible?  We have some other PRs pending and haven't created a build yet.
Comment 3 Gan Huang 2016-01-13 04:42:19 EST
Verified with the latest openshift-ansible. A new node group is added to the hosts file and the install succeeds, but the ansible playbook has also changed slightly with regard to new_nodes.
This is the variables in ansible playbook with version 3.0.20-1
- include: ../../common/openshift-cluster/scaleup.yml
  vars:
    g_etcd_group: "{{ 'etcd' }}"
    g_masters_group: "{{ 'masters' }}"
    g_new_nodes_group: "{{ 'new_nodes' }}"
    g_lb_group: "{{ 'lb' }}"
    openshift_cluster_id: "{{ cluster_id | default('default') }}"
    openshift_debug_level: 2
    openshift_deployment_type: "{{ deployment_type }}"

And this is the variables in ansible playbook with latest
g_etcd_hosts:   "{{ groups.etcd | default([]) }}"
g_lb_hosts:     "{{ groups.lb | default([]) }}"
g_master_hosts: "{{ groups.masters | default([]) }}"
g_node_hosts:   "{{ groups.nodes | default([]) }}"
g_nfs_hosts:   "{{ groups.nfs | default([]) }}"
g_all_hosts:    "{{ g_master_hosts | union(g_node_hosts) | union(g_etcd_hosts)
                    | union(g_lb_hosts) | default([]) }}"
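
Note that no new_nodes group is referenced here; if the scaleup path consumed one under this style, it would presumably need a matching entry along these lines (variable name assumed, following the g_*_hosts pattern above):

g_new_node_hosts: "{{ groups.new_nodes | default([]) }}"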

So although the install succeeds, this modification does not actually take effect against the latest playbooks.
Maybe we should make the relationship between quick-install and the ansible playbooks clear.
Comment 4 Gan Huang 2016-01-13 23:57:18 EST
Hi Brenton,
As described in Comment #3, the latest ansible playbook has also changed with respect to adding new nodes, so this commit seems useless if the playbook will not change any further in that area. I want to know which way (new group or no new group) will be used eventually, so that this bug won't be reproduced again.
Comment 6 errata-xmlrpc 2016-01-27 14:43:52 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

