| Summary: | Failed to scale out nodes in pre-existing env | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Gan Huang <ghuang> |
| Component: | Installer | Assignee: | Samuel Munilla <smunilla> |
| Status: | CLOSED ERRATA | QA Contact: | Ma xiaoqiang <xiama> |
| Severity: | high | Docs Contact: | |
| Priority: | medium | | |
| Version: | 3.1.0 | CC: | aos-bugs, bleanhar, cryan, gpei, jokerman, mmccomas, smunilla, xtian |
| Target Milestone: | --- | Flags: | smunilla: needinfo- |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2016-01-27 19:43:52 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
Should be resolved by https://github.com/openshift/openshift-ansible/pull/1143, which is currently being updated with the given suggestions. In the end we decided to go with the approach in Comment #1.

Can you test with the latest in openshift-ansible? We have some other PRs pending and haven't created a build yet.

Verified with the latest in openshift-ansible. A new node group is added to the hosts file, and the install succeeds. However, the ansible playbook has also changed with regard to new_node handling.
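For reference, a minimal sketch of what an inventory with the new node group looks like, assuming the standard BYO inventory layout; hostnames and variable values here are illustrative assumptions, not taken from the original report:

```ini
# Hypothetical inventory after adding a node; hostnames are placeholders.
[OSEv3:children]
masters
nodes
new_nodes

[masters]
master.example.com

[nodes]
master.example.com
node1.example.com
node2.example.com

# The scaleup playbook picks up the hosts to add from this group.
[new_nodes]
node3.example.com
```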
These are the variables in the ansible playbook shipped with version 3.0.20-1:
```yaml
- include: ../../common/openshift-cluster/scaleup.yml
  vars:
    g_etcd_group: "{{ 'etcd' }}"
    g_masters_group: "{{ 'masters' }}"
    g_new_nodes_group: "{{ 'new_nodes' }}"
    g_lb_group: "{{ 'lb' }}"
    openshift_cluster_id: "{{ cluster_id | default('default') }}"
    openshift_debug_level: 2
    openshift_deployment_type: "{{ deployment_type }}"
```
And these are the variables in the latest ansible playbook:
```yaml
---
g_etcd_hosts: "{{ groups.etcd | default([]) }}"
g_lb_hosts: "{{ groups.lb | default([]) }}"
g_master_hosts: "{{ groups.masters | default([]) }}"
g_node_hosts: "{{ groups.nodes | default([]) }}"
g_nfs_hosts: "{{ groups.nfs | default([]) }}"
g_all_hosts: "{{ g_master_hosts | union(g_node_hosts) | union(g_etcd_hosts)
    | union(g_lb_hosts) | default([]) }}"
```
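Note that this latest layout builds its host lists directly from inventory groups and never references a new_nodes group. For comparison, if a separate scale-up group were wired into this layout, one might expect something like the following hypothetical addition (a sketch only, not present in the actual playbook):

```yaml
# Hypothetical sketch -- NOT in the playbook above. Shows where a
# new_nodes group would have to be wired in for the scale-up group
# to take effect under the latest variable layout.
g_new_node_hosts: "{{ groups.new_nodes | default([]) }}"
g_all_hosts: "{{ g_master_hosts | union(g_node_hosts) | union(g_new_node_hosts)
    | union(g_etcd_hosts) | union(g_lb_hosts) | default([]) }}"
```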
So this modification does not take effect, even though the install succeeds. Maybe we should make the relationship between the quick installer and the ansible playbook clear.
Hi Brenton, see the description in Comment #3. Because the latest ansible playbook has also been changed with respect to adding new nodes, this commit seems useless if the playbook will not change any further regarding new nodes. I want to know which approach (new group or no new group) will be used eventually, so that this bug is not reproduced.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2016:0075
Description of problem:
Failed to scale out nodes in a pre-existing env.

Version-Release number of selected component (if applicable):
atomic-openshift-utils-3.0.20-1.git.0.3703f1b.el7aos.noarch

How reproducible:
Always

Steps to Reproduce:
1. Install an env of one master and two nodes by quick install.
2. Add a new node to the existing env using atomic-openshift-installer.

Actual results:

```
TASK: [openshift_manage_node | Wait for Node Registration] ********************
failed: [10.x.x.158] => (item=openshift_nodes) => {"attempts": 20, "changed": true, "cmd": ["oc", "get", "node", "openshift_nodes"], "delta": "0:00:00.257345", "end": "2015-12-24 18:28:39.257526", "failed": true, "item": "openshift_nodes", "rc": 1, "start": "2015-12-24 18:28:39.000181", "warnings": []}
stderr: Error from server: node "openshift_nodes" not found
msg: Task failed as maximum retries was encountered

FATAL: all hosts have already failed -- aborting
```

Expected results:
Install successfully.

Additional info:
It works when adding nodes to a pre-existing env where the master and node are on one host. Judging from the playbook, a new_nodes group is needed in the hosts file when scaling out in a multi-node env, but QE haven't seen any new_nodes-related configuration in the ansible inventory after adding a new node.
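For context, the failing "Wait for Node Registration" step is essentially a retry loop around `oc get node <hostname>`. Below is a simplified sketch of such a task, not the exact source of the openshift_manage_node role; the `node_hostnames` variable is hypothetical. The log above shows the literal string "openshift_nodes" being used where a node hostname was expected, which is why every retry failed.

```yaml
# Simplified sketch of a node-registration wait task (not the exact
# openshift_manage_node role source). Each item must be a real node
# hostname; in the failure above, the literal group name
# "openshift_nodes" was passed through instead, so
# "oc get node openshift_nodes" could never succeed.
- name: Wait for Node Registration
  command: oc get node {{ item }}
  register: omd_get_node
  until: omd_get_node.rc == 0
  retries: 20
  delay: 5
  with_items: "{{ node_hostnames }}"  # hypothetical list of real hostnames
```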