Bug 1294748 - Failed to scale out nodes in pre-existing env
Summary: Failed to scale out nodes in pre-existing env
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 3.1.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: high
Target Milestone: ---
Assignee: Samuel Munilla
QA Contact: Ma xiaoqiang
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2015-12-30 05:59 UTC by Gan Huang
Modified: 2016-07-04 00:46 UTC
CC: 8 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-01-27 19:43:52 UTC
Target Upstream Version:
smunilla: needinfo-




Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2016:0075 0 normal SHIPPED_LIVE Red Hat OpenShift Enterprise atomic-openshift-utils bug fix update 2016-01-28 00:42:22 UTC

Description Gan Huang 2015-12-30 05:59:05 UTC
Description of problem:
Failed to scale out nodes in pre-existing env


Version-Release number of selected component (if applicable):
atomic-openshift-utils-3.0.20-1.git.0.3703f1b.el7aos.noarch

How reproducible:
Always

Steps to Reproduce:
1. Install an env of one master and two nodes by quick install
2. Add a new node to the existing env using atomic-openshift-installer

Actual results:
TASK: [openshift_manage_node | Wait for Node Registration] ******************** 
failed: [10.x.x.158] => (item=openshift_nodes) => {"attempts": 20, "changed": true, "cmd": ["oc", "get", "node", "openshift_nodes"], "delta": "0:00:00.257345", "end": "2015-12-24 18:28:39.257526", "failed": true, "item": "openshift_nodes", "rc": 1, "start": "2015-12-24 18:28:39.000181", "warnings": []}
stderr: Error from server: node "openshift_nodes" not found
msg: Task failed as maximum retries was encountered

FATAL: all hosts have already failed -- aborting

Expected results:
Install successfully

Additional info:
Adding nodes works in a pre-existing env where the master and node are on the same host.
Judging from the playbook, scaling out a multi-node env requires a new_nodes group in the hosts file, but QE has not seen any new_nodes-related configuration in the Ansible inventory after adding a new node.
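For context, the scaleup playbook expects newly added hosts in a dedicated new_nodes group in the Ansible inventory. A minimal sketch of what such a hosts file might look like follows; the group names follow openshift-ansible conventions, but the hostnames and variable values are placeholders, not taken from this report:

```ini
# Hypothetical inventory sketch for scaling out (hostnames are placeholders,
# not from this bug report).
[OSEv3:children]
masters
nodes
new_nodes

[OSEv3:vars]
deployment_type=openshift-enterprise

[masters]
master1.example.com

[nodes]
node1.example.com
node2.example.com

# The scaleup playbook only acts on hosts listed in this group.
[new_nodes]
node3.example.com
```

Without the [new_nodes] group, the scaleup play has no hosts to act on, which is consistent with the "node not found" failure in the Actual results above.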

Comment 1 Samuel Munilla 2016-01-11 14:43:15 UTC
Should be resolved by: https://github.com/openshift/openshift-ansible/pull/1143
which is currently being updated with the given suggestions.

Comment 2 Brenton Leanhardt 2016-01-12 22:03:17 UTC
In the end we decided to go with the approach in Comment #1.  Can you test with the latest in openshift-ansible?  We have some other PRs pending and haven't created a build yet.

Comment 3 Gan Huang 2016-01-13 09:42:19 UTC
Verified with the latest openshift-ansible. A new node group is now added to the hosts file and the install succeeds. However, the ansible playbook has also changed slightly with respect to new_nodes.
These are the variables in the ansible playbook with version 3.0.20-1:
- include: ../../common/openshift-cluster/scaleup.yml
  vars:
    g_etcd_group: "{{ 'etcd' }}"
    g_masters_group: "{{ 'masters' }}"
    g_new_nodes_group: "{{ 'new_nodes' }}"
    g_lb_group: "{{ 'lb' }}"
    openshift_cluster_id: "{{ cluster_id | default('default') }}"
    openshift_debug_level: 2
    openshift_deployment_type: "{{ deployment_type }}"

And these are the variables in the latest ansible playbook:
---
g_etcd_hosts:   "{{ groups.etcd | default([]) }}"
g_lb_hosts:     "{{ groups.lb | default([]) }}"
g_master_hosts: "{{ groups.masters | default([]) }}"
g_node_hosts:   "{{ groups.nodes | default([]) }}"
g_nfs_hosts:   "{{ groups.nfs | default([]) }}"
g_all_hosts:    "{{ g_master_hosts | union(g_node_hosts) | union(g_etcd_hosts)
                    | union(g_lb_hosts) | default([]) }}"

So this modification does not take effect, although the install succeeds.
Maybe we should clarify the relationship between quick-install and the ansible playbooks.

Comment 4 Gan Huang 2016-01-14 04:57:18 UTC
Hi, Brenton
As described in Comment #3, the latest ansible playbook has also changed with respect to adding new nodes, so this commit seems useless if the playbook will not change any further in this area. I would like to know which approach (new group or no new group) will be used eventually, so that the bug does not reproduce.

Comment 6 errata-xmlrpc 2016-01-27 19:43:52 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2016:0075

