Bug 1566424

Summary: [RFE] Scale out Overcloud VM-based OCP on OSP with Ansible
Product: OpenShift Container Platform
Reporter: Ramon Acedo <racedoro>
Component: Installer
Assignee: Tomas Sedovic <tsedovic>
Status: CLOSED ERRATA
QA Contact: Jon Uriarte <juriarte>
Severity: unspecified
Docs Contact:
Priority: medium
Version: 3.11.0
CC: aos-bugs, jokerman, kdube, mmccomas, wjiang
Target Milestone: ---
Keywords: FutureFeature, Triaged
Target Release: 3.11.z
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Enhancement
Doc Text:
Feature: Add scale out Ansible playbooks for the OpenStack deployed clusters.

Reason: When installing OpenShift on top of OpenStack with the OpenStack provisioning playbooks (`playbooks/openstack/openshift-cluster/provision_install.yml`), scaling the cluster out required several manual steps such as writing the inventory by hand and running two extra playbooks. This was more brittle, required more complex documentation and did not match the initial deployment experience.

Result: To scale out OpenShift on OpenStack, the user can now simply change the desired number of nodes and run one of the following playbooks (depending on whether they want to scale the worker or master nodes):

playbooks/openstack/openshift-cluster/node-scaleup.yml
playbooks/openstack/openshift-cluster/master-scaleup.yml
Story Points: ---
Clone Of:
Environment:
Last Closed: 2019-01-10 09:03:57 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1566010, 1566927    

Description Ramon Acedo 2018-04-12 09:52:31 UTC
Write a playbook that takes the number of new VM-based OpenShift app nodes as input. It should provision the new nodes without touching any of the existing ones and perform the necessary steps to add them to the OCP cluster.

Comment 1 Tomas Sedovic 2018-06-28 10:56:33 UTC
This is a first step:

https://github.com/openshift/openshift-ansible/pull/9008

It's not fully automated -- the user must explicitly mark which nodes are the newly-added ones -- but it works, it's simple to understand and will serve as a basis for a fully-automated scaleup solution.

Comment 2 Tomas Sedovic 2018-07-18 11:42:01 UTC
This PR adds scaling for all node types (master, infra, compute):

https://github.com/openshift/openshift-ansible/pull/9243

Comment 3 Tomas Sedovic 2018-07-30 10:25:20 UTC
The pull requests have now been merged.

Comment 4 Scott Dodson 2018-08-02 12:41:22 UTC
In openshift-ansible-3.11.0-0.10.0

Comment 5 N. Harrison Ripps 2018-09-21 20:15:51 UTC
Per OCP program call on 21-SEP-2018 we are deferring Kuryr-related bugs to 3.11.z

Comment 6 weiwei jiang 2018-12-18 14:09:51 UTC
Checked with openshift-ansible-3.11.57-1 and both master and node scaleup work well.

master (from 2 to 3):
ansible-playbook --private-key ~/.ssh/libra-new.pem --user openshift \
  -i openshift-ansible/playbooks/openstack/scaleup_inventory.py \
  -i inventory \
  openshift-ansible/playbooks/openstack/openshift-cluster/master-scaleup.yml -vv

node (infra from 1 to 2 and compute from 2 to 3):
ansible-playbook --private-key ~/.ssh/libra-new.pem --user openshift \
  -i openshift-ansible/playbooks/openstack/scaleup_inventory.py \
  -i inventory \
  openshift-ansible/playbooks/openstack/openshift-cluster/node-scaleup.yml -vv
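
For anyone scripting these runs, the two invocations above differ only in the playbook path, so they can be assembled from one helper. The following is a minimal sketch, not part of openshift-ansible: the function name `build_scaleup_command` and its parameters are hypothetical, and the defaults (`openshift-ansible` checkout directory, `inventory` path, `openshift` user) simply mirror the commands in this comment. The node counts themselves are set in the inventory, which this sketch does not touch.

```python
import shlex

# Playbook paths as listed in this bug's Doc Text.
SCALEUP_PLAYBOOKS = {
    "master": "playbooks/openstack/openshift-cluster/master-scaleup.yml",
    "node": "playbooks/openstack/openshift-cluster/node-scaleup.yml",
}


def build_scaleup_command(kind, repo_dir="openshift-ansible",
                          inventory="inventory", user="openshift",
                          private_key=None):
    """Return the ansible-playbook argv for a scale-up run.

    Hypothetical helper: the defaults are assumptions copied from the
    commands used in this comment, not values required by the playbooks.
    """
    if kind not in SCALEUP_PLAYBOOKS:
        raise ValueError(f"unknown scale-up kind: {kind!r}")

    argv = ["ansible-playbook"]
    if private_key:
        argv += ["--private-key", private_key]
    argv += [
        "--user", user,
        # Same inventory order as the commands above: the scale-up
        # inventory script first, then the deployment's own inventory.
        "-i", f"{repo_dir}/playbooks/openstack/scaleup_inventory.py",
        "-i", inventory,
        f"{repo_dir}/{SCALEUP_PLAYBOOKS[kind]}",
    ]
    return argv


if __name__ == "__main__":
    # Print a shell-ready command line for the master scale-up.
    print(shlex.join(build_scaleup_command("master")))
```

Returning a list rather than a string lets the result be passed directly to `subprocess.run` without shell quoting concerns.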


[openshift@master-0 ~]$ oc get pods -o wide -l run=h
NAME        READY     STATUS    RESTARTS   AGE       IP           NODE                                  NOMINATED NODE
h-1-4s5w5   1/1       Running   0          2m        10.129.2.3   infra-node-1.wjiang-ocp.example.com   <none>
h-1-5kt5z   1/1       Running   0          2m        10.131.2.5   master-2.wjiang-ocp.example.com       <none>
h-1-7d9qn   1/1       Running   0          1m        10.131.0.8   infra-node-0.wjiang-ocp.example.com   <none>
h-1-7mnlk   1/1       Running   0          1m        10.130.2.4   app-node-2.wjiang-ocp.example.com     <none>
h-1-cxsrr   1/1       Running   0          2m        10.128.2.3   app-node-1.wjiang-ocp.example.com     <none>
h-1-dfzhx   1/1       Running   0          1m        10.128.2.4   app-node-1.wjiang-ocp.example.com     <none>
h-1-fmzcb   1/1       Running   0          2m        10.130.2.3   app-node-2.wjiang-ocp.example.com     <none>
h-1-gp8x9   1/1       Running   0          2m        10.128.2.2   app-node-1.wjiang-ocp.example.com     <none>
h-1-k45zg   1/1       Running   0          2m        10.130.2.2   app-node-2.wjiang-ocp.example.com     <none>
h-1-kl575   1/1       Running   0          2m        10.131.0.7   infra-node-0.wjiang-ocp.example.com   <none>
h-1-ktfgh   1/1       Running   0          1m        10.129.2.5   infra-node-1.wjiang-ocp.example.com   <none>
h-1-l6vp8   1/1       Running   0          2m        10.130.0.2   app-node-0.wjiang-ocp.example.com     <none>
h-1-rlrzj   1/1       Running   0          2m        10.129.2.4   infra-node-1.wjiang-ocp.example.com   <none>
h-1-wmrsp   1/1       Running   0          2m        10.130.0.3   app-node-0.wjiang-ocp.example.com     <none>
[openshift@master-0 ~]$ oc get nodes  -o wide 
NAME                                  STATUS    ROLES     AGE       VERSION           INTERNAL-IP     EXTERNAL-IP    OS-IMAGE                                      KERNEL-VERSION              CONTAINER-RUNTIME
app-node-0.wjiang-ocp.example.com     Ready     compute   3h        v1.11.0+d4cacc0   192.168.99.12   10.8.248.117   Red Hat Enterprise Linux Server 7.6 (Maipo)   3.10.0-957.1.3.el7.x86_64   docker://1.13.1
app-node-1.wjiang-ocp.example.com     Ready     compute   1h        v1.11.0+d4cacc0   192.168.99.16   10.8.252.236   Red Hat Enterprise Linux Server 7.6 (Maipo)   3.10.0-957.1.3.el7.x86_64   docker://1.13.1
app-node-2.wjiang-ocp.example.com     Ready     compute   42m       v1.11.0+d4cacc0   192.168.99.10   10.8.252.159   Red Hat Enterprise Linux Server 7.6 (Maipo)   3.10.0-957.1.3.el7.x86_64   docker://1.13.1
infra-node-0.wjiang-ocp.example.com   Ready     infra     3h        v1.11.0+d4cacc0   192.168.99.15   10.8.249.246   Red Hat Enterprise Linux Server 7.6 (Maipo)   3.10.0-957.1.3.el7.x86_64   docker://1.13.1
infra-node-1.wjiang-ocp.example.com   Ready     infra     42m       v1.11.0+d4cacc0   192.168.99.13   10.8.245.246   Red Hat Enterprise Linux Server 7.6 (Maipo)   3.10.0-957.1.3.el7.x86_64   docker://1.13.1
master-0.wjiang-ocp.example.com       Ready     master    3h        v1.11.0+d4cacc0   192.168.99.7    10.8.250.138   Red Hat Enterprise Linux Server 7.6 (Maipo)   3.10.0-957.1.3.el7.x86_64   docker://1.13.1
master-1.wjiang-ocp.example.com       Ready     master    3h        v1.11.0+d4cacc0   192.168.99.8    10.8.244.173   Red Hat Enterprise Linux Server 7.6 (Maipo)   3.10.0-957.1.3.el7.x86_64   docker://1.13.1
master-2.wjiang-ocp.example.com       Ready     master    20m       v1.11.0+d4cacc0   192.168.99.18   10.8.245.157   Red Hat Enterprise Linux Server 7.6 (Maipo)   3.10.0-957.1.3.el7.x86_64   docker://1.13.1
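
The node listing above can also be checked mechanically against the expected counts after a scale-up (3 masters, 2 infra, 3 compute in this run). This is a rough sketch that assumes the default column layout of `oc get nodes` shown above (NAME, STATUS, ROLES as the first three whitespace-separated columns); the function name `count_ready_nodes_by_role` is hypothetical, and the sample text is a trimmed copy of the output in this comment.

```python
from collections import Counter

# First three columns of the `oc get nodes` output shown in this comment.
SAMPLE = """\
NAME                                  STATUS    ROLES
app-node-0.wjiang-ocp.example.com     Ready     compute
app-node-1.wjiang-ocp.example.com     Ready     compute
app-node-2.wjiang-ocp.example.com     Ready     compute
infra-node-0.wjiang-ocp.example.com   Ready     infra
infra-node-1.wjiang-ocp.example.com   Ready     infra
master-0.wjiang-ocp.example.com       Ready     master
master-1.wjiang-ocp.example.com       Ready     master
master-2.wjiang-ocp.example.com       Ready     master
"""


def count_ready_nodes_by_role(text):
    """Count nodes whose STATUS is exactly 'Ready', grouped by ROLES."""
    lines = [line for line in text.splitlines() if line.strip()]
    counts = Counter()
    for line in lines[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 3:
            continue  # not a node row
        _name, status, roles = fields[:3]
        # Combined statuses such as "Ready,SchedulingDisabled" are
        # deliberately not counted as ready here.
        if status == "Ready":
            counts[roles] += 1
    return dict(counts)


if __name__ == "__main__":
    expected = {"master": 3, "infra": 2, "compute": 3}
    actual = count_ready_nodes_by_role(SAMPLE)
    print("OK" if actual == expected else f"mismatch: {actual}")
```

In practice the text would come from running `oc get nodes` on a master rather than from a hard-coded sample.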

Comment 8 errata-xmlrpc 2019-01-10 09:03:57 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0024