Bug 1503733 - [RFE] Support Load Balancer for Multi-infra deployment for Openshift-on-OpenStack
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: RFE
Version: unspecified
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 3.10.0
Assignee: Tomas Sedovic
QA Contact: Jon Uriarte
URL:
Whiteboard: DFG:OpenShiftonOpenStack
Depends On:
Blocks:
 
Reported: 2017-10-18 15:36 UTC by Tzu-Mainn Chen
Modified: 2018-12-20 21:41 UTC
CC: 6 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-12-20 21:41:32 UTC
Target Upstream Version:
Embargoed:



Description Tzu-Mainn Chen 2017-10-18 15:36:32 UTC
Description of problem:

The playbooks already spin up a load balancer for a multi-master deployment, but the same should happen when there are multiple infra nodes. When we run multiple masters and multiple infra nodes, the same load balancer VM should be configured for both.

Expected results:

Comment 1 Bogdan Dobrelya 2017-10-19 13:33:39 UTC
This is blocked with the status: "the use case requires clarification and DoD (definition of done) expectations adjusted".

More details on that:

What is the use case for an LB node with multiple infra nodes? I can't find any docs or details on that. Which configuration is the LB node expected to get for multiple infra nodes? What is the test case for it?

Should the LB node's haproxy config load balance the docker registry when deploying multiple infra nodes?
I can see only a single logical frontend/backend definition stanza, for the master LB:

```
frontend  atomic-openshift-api
    bind *:8443
    default_backend atomic-openshift-api
    mode tcp
    option tcplog

backend atomic-openshift-api
    balance source
    mode tcp
    server      master0 192.168.99.12:8443 check
```
So this is blocked until we know more about the supported LB use cases for the registry and the routers, and whatever else is hosted on the infra nodes.
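For illustration only, a hypothetical stanza for router traffic in the same style (the `infra-node*` names and addresses below are made up, not taken from any existing playbook) might look like:

```
frontend  atomic-openshift-routers
    bind *:443
    default_backend atomic-openshift-routers
    mode tcp
    option tcplog

backend atomic-openshift-routers
    balance source
    mode tcp
    server      infra-node0 192.168.99.13:443 check
    server      infra-node1 192.168.99.14:443 check
```

with a matching frontend/backend pair for port 80.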

Comment 2 Bogdan Dobrelya 2017-11-13 15:25:12 UTC
Info needed from openshift-ansible and/or reference-architecture folks.

Comment 3 Tomas Sedovic 2017-11-13 18:04:19 UTC
The load balancer deployed by openshift-ansible seems to support only master node load balancing. I don't think we should spend time working on that.

Let's limit the scope of this RFE to Octavia only. That is, when Neutron LBaaSv2/Octavia is used, we should set up load balancing on ports 80 and 443 for the infra nodes **in addition** to the masters.
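For a rough idea of what that translates to in Octavia terms, here is a minimal CLI sketch (the resource names, `<subnet>`, and `<infra-node-N-ip>` values are placeholders, not anything the playbooks actually create):

```
# Sketch only: a TCP load balancer in front of the infra nodes' router ports.
openstack loadbalancer create --name router-lb --vip-subnet-id <subnet>

# One listener + pool per port (repeat both commands for port 443).
openstack loadbalancer listener create --name router-80 \
    --protocol TCP --protocol-port 80 router-lb
openstack loadbalancer pool create --name router-pool-80 \
    --lb-algorithm SOURCE_IP --listener router-80 --protocol TCP

# Add each infra node as a pool member.
openstack loadbalancer member create --subnet-id <subnet> \
    --address <infra-node-0-ip> --protocol-port 80 router-pool-80
openstack loadbalancer member create --subnet-id <subnet> \
    --address <infra-node-1-ip> --protocol-port 80 router-pool-80
```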

The reference architecture doesn't use the LB node set up by openshift-ansible either; since there are some issues with LBaaS on OSP 10, they create their own load balancer and configure it themselves:

https://access.redhat.com/documentation/en-us/reference_architectures/2017/html-single/deploying_and_managing_red_hat_openshift_container_platform_3.6_on_red_hat_openstack_platform_10/#haproxy

If we want to do that as well (for cases where Octavia is unavailable), that should be a separate RFE.

Comment 4 Tzu-Mainn Chen 2018-02-19 17:34:02 UTC
Fuller description:

Description of problem:

OpenShift on OpenStack should support load balancing for multi-master/multi-infra deployments. This RFE adds an option to the openshift-ansible playbooks that uses Octavia to create a load balancer for that purpose.

Steps to Reproduce:
1. Configure the inventory as documented in https://github.com/openshift/openshift-ansible/blob/master/playbooks/openstack/advanced-configuration.md
2. Run the playbooks as documented in https://github.com/openshift/openshift-ansible/blob/master/playbooks/openstack/README.md

Actual results:
Not implemented

Expected results:
Validation steps are documented in https://github.com/openshift/openshift-ansible/blob/master/playbooks/openstack/advanced-configuration.md


Comment 5 Tomas Sedovic 2018-05-04 16:53:19 UTC
How to test:

Prerequisites:

1. Tenant access to an OpenStack cloud with Octavia deployed


Steps:
1. Configure the inventory as described in: https://bugzilla.redhat.com/show_bug.cgi?id=1503667#c2
2. Add `openshift_openstack_use_lbaas_load_balancer: true` to your inventory/group_vars/all.yml
3. Set `openshift_openstack_num_infra: 2` in all.yml (see the combined excerpt after these steps)
4. Run the provision_install playbook
   * The playbook will print out `openshift_openstack_public_router_ip` at the end
   * Note the IP address
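Putting steps 2 and 3 together, the relevant inventory/group_vars/all.yml excerpt would be:

```
# inventory/group_vars/all.yml (excerpt)
openshift_openstack_use_lbaas_load_balancer: true
openshift_openstack_num_infra: 2
```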


Validation:
1. The playbook must finish without any errors
2. The `router_lb` load balancer was created: `openstack loadbalancer list`
3. The `openshift_openstack_public_router_ip` is NOT an IP address of any of the servers in `openstack server list` but it corresponds to a floating IP address attached to a port of the load balancer
4. Log in to the cluster: oc login
5. Create a new project: oc new-project test
6. Launch an openshift app: oc new-app --template=cakephp-mysql-example
7. Wait for all pods to be running: oc status --suggest
8. Update your DNS or /etc/hosts so that the app route resolves to the `openshift_openstack_public_router_ip` (see the example entry after this list)
9. Verify that the app is accessible at its route/url
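For step 8, a hypothetical /etc/hosts entry would look like this (the route hostname follows the default `<app>-<project>.<apps suffix>` convention and is made up here; use the hostname reported by `oc get route`):

```
# /etc/hosts -- hypothetical entry; substitute the real IP and route hostname
<openshift_openstack_public_router_ip>   cakephp-mysql-example-test.apps.example.com
```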

Comment 6 Tomas Sedovic 2018-06-26 11:56:49 UTC
Okay, I don't think steps 5 and onwards are really necessary. They test the end-to-end flow, but that's not what this RFE is about anyway.

Let's do this instead:

5. SSH into the master VM
6. Run: oc get pod -n default -o wide | grep router
   - there should be two router pods Running, one on each Infra node
7. Run: oc describe svc router -n default
   - the router service should exist
   - note its IP value (not Endpoints)
8. From the ansible host, run: openstack loadbalancer list
   - there should be a load balancer called `default/router`
   - its `vip_address` should be equal to the router svc IP (a combined check is sketched below)
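Steps 7 and 8 can be collapsed into a quick check along these lines (a sketch only, assuming a single host where both the `oc` and `openstack` CLIs are already logged in):

```
# Sketch: compare the router service's ClusterIP with the Octavia LB's VIP.
svc_ip=$(oc get svc router -n default -o jsonpath='{.spec.clusterIP}')
lb_vip=$(openstack loadbalancer show default/router -f value -c vip_address)
if [ "$svc_ip" = "$lb_vip" ]; then
    echo "OK: default/router vip_address matches the router svc IP ($svc_ip)"
else
    echo "MISMATCH: svc=$svc_ip lb=$lb_vip"
fi
```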

Comment 7 Jon Uriarte 2018-06-26 14:49:26 UTC
Verified in openshift-ansible-3.10.0-0.67.0 over OSP 13 2018-05-23.1 puddle with Octavia.

Verification steps:
1. Deploy OpenStack (OSP13) with Octavia
2. Deploy an Ansible-host and a DNS server on the overcloud
3. Get the OCP openshift-ansible downstream RPM
4. Configure OSP (all.yml) and OCP (OSEv3.yml) inventory files
     Set:
         - 'openshift_openstack_use_lbaas_load_balancer: true' and
         - 'openshift_openstack_num_infra: 2'
     in inventory/group_vars/all.yml
5. Run from the Ansible-host:
ansible-playbook --user openshift -i /usr/share/ansible/openshift-ansible/playbooks/openstack/inventory.py -i inventory /usr/share/ansible/openshift-ansible/playbooks/openstack/openshift-cluster/prerequisites.yml

ansible-playbook --user openshift -i /usr/share/ansible/openshift-ansible/playbooks/openstack/inventory.py -i inventory /usr/share/ansible/openshift-ansible/playbooks/openstack/openshift-cluster/provision.yml

ansible-playbook --user openshift -i /usr/share/ansible/openshift-ansible/playbooks/openstack/inventory.py -i inventory red-hat-ca.yml

ansible-playbook --user openshift -i /usr/share/ansible/openshift-ansible/playbooks/openstack/inventory.py -i inventory /usr/share/ansible/openshift-ansible/playbooks/openstack/openshift-cluster/repos.yml

ansible-playbook --user openshift -i /usr/share/ansible/openshift-ansible/playbooks/openstack/inventory.py -i inventory /usr/share/ansible/openshift-ansible/playbooks/openstack/openshift-cluster/install.yml

6. Check that the installer finishes without errors, and note the `openshift_openstack_public_router_ip` at the end of the playbook print-out
TASK [Print the OpenShift Router Public IP Address] ***************************************************************************************************************************************************************
ok: [localhost] => {
    "openshift_openstack_public_router_ip": "172.20.0.234"
}

7. Check the VMs deployed in the overcloud
(shiftstack) [cloud-user@ansible-host ~]$ openstack server list
+--------------------------------------+------------------------------------+--------+-------------------------------------------------------------------------+--------+-----------+
| ID                                   | Name                               | Status | Networks                                                                | Image  | Flavor    |
+--------------------------------------+------------------------------------+--------+-------------------------------------------------------------------------+--------+-----------+
| 40b470c7-5229-4fba-a8fc-6c5e885450ba | infra-node-1.openshift.example.com | ACTIVE | openshift-ansible-openshift.example.com-net=192.168.99.19, 172.20.0.222 | rhel75 | m1.node   |
| b6c0d43b-14d8-46a0-8a6d-9fdf062aad74 | infra-node-0.openshift.example.com | ACTIVE | openshift-ansible-openshift.example.com-net=192.168.99.14, 172.20.0.240 | rhel75 | m1.node   |
| af6b6db0-c9bf-4111-a347-7d61d63e7975 | master-0.openshift.example.com     | ACTIVE | openshift-ansible-openshift.example.com-net=192.168.99.16, 172.20.0.236 | rhel75 | m1.master |
| eb1eb7ae-5a76-407a-b546-ac0281af5cd0 | app-node-1.openshift.example.com   | ACTIVE | openshift-ansible-openshift.example.com-net=192.168.99.15, 172.20.0.237 | rhel75 | m1.node   |
| 3f877066-2083-4fa8-ad2b-ab90de38c4b5 | app-node-0.openshift.example.com   | ACTIVE | openshift-ansible-openshift.example.com-net=192.168.99.11, 172.20.0.235 | rhel75 | m1.node   |
+--------------------------------------+------------------------------------+--------+-------------------------------------------------------------------------+--------+-----------+

8. Check the `router_lb` load balancer was created (`openstack loadbalancer list`)
(shiftstack) [cloud-user@ansible-host ~]$ openstack loadbalancer list
+--------------------------------------+------------------------------------------------+----------------------------------+----------------+---------------------+----------+
| id                                   | name                                           | project_id                       | vip_address    | provisioning_status | provider |
+--------------------------------------+------------------------------------------------+----------------------------------+----------------+---------------------+----------+
| b97a6f5d-f8ab-4ff0-9ae9-bebe0d24a5d9 | openshift-ansible-openshift.example.com-api-lb | a02185177ac246529e69bb252f021683 | 172.30.0.1     | ACTIVE              | octavia  |
| b096546e-6d94-42bd-a3f0-aa827ba54435 | openshift-cluster-router_lb-4f53nds4cg75       | a02185177ac246529e69bb252f021683 | 192.168.99.6   | ACTIVE              | octavia  |
| 4e340890-1075-4123-bf1d-9b1f9a3ecafc | default/router                                 | a02185177ac246529e69bb252f021683 | 172.30.108.168 | ACTIVE              | octavia  |
| 37f2f27b-478c-4287-ad72-6c62a720ce91 | default/docker-registry                        | a02185177ac246529e69bb252f021683 | 172.30.247.20  | ACTIVE              | octavia  |
| 9c8d6e8e-09aa-4729-9641-de2ee71706dd | default/registry-console                       | a02185177ac246529e69bb252f021683 | 172.30.217.98  | ACTIVE              | octavia  |
+--------------------------------------+------------------------------------------------+----------------------------------+----------------+---------------------+----------+

9. Check the `openshift_openstack_public_router_ip` is NOT an IP address of any of the servers in `openstack server list` but it corresponds to a floating IP address attached to a port of the load balancer
(shiftstack) [cloud-user@ansible-host ~]$ openstack floating ip list | grep 192.168.99.6                                                                                                                           
| 23d04d26-f760-43c2-b2d6-ed8b9dccf429 | 172.20.0.234        | 192.168.99.6     | 9ac772da-69b4-47a8-8c56-daf3ac4fec4d | dd5a700a-a0bf-4e18-b6db-a59f4063f7b4 | a02185177ac246529e69bb252f021683 |
These are the LB's fixed IP (192.168.99.6) and floating IP (172.20.0.234). The floating IP is the one shown in the playbook print-out.

10. SSH into the master VM
(shiftstack) [cloud-user@ansible-host ~]$ ssh -o "UserKnownHostsFile=/dev/null" -o "StrictHostKeyChecking=no" openshift@172.20.0.236

11. Run: oc get pod -n default -o wide | grep router
   - there should be two router pods Running, one on each Infra node
[openshift@master-0 ~]$ oc get pod -n default -o wide | grep router
router-1-7htzk             1/1       Running   0          3h        192.168.99.14   infra-node-0.openshift.example.com
router-1-7qlcx             1/1       Running   0          3h        192.168.99.19   infra-node-1.openshift.example.com

12. Run: oc describe svc router -n default
   - the router service should exist
   - note its IP value (not Endpoints)

[openshift@master-0 ~]$ oc describe svc router -n default
Name:              router
Namespace:         default
Labels:            router=router
Annotations:       openstack.org/kuryr-lbaas-spec={"versioned_object.data": {"ip": "172.30.108.168", "lb_ip": null, "ports": [{"versioned_object.data": {"name": "80-tcp", "port": 80, "protocol": "TCP"}, "versioned_objec...
                   prometheus.openshift.io/password=Oha11kU2EC
                   prometheus.openshift.io/username=admin
                   service.alpha.openshift.io/serving-cert-secret-name=router-metrics-tls
                   service.alpha.openshift.io/serving-cert-signed-by=openshift-service-serving-signer@1530010068
Selector:          router=router
Type:              ClusterIP
IP:                172.30.108.168
Port:              80-tcp  80/TCP
TargetPort:        80/TCP
Endpoints:         192.168.99.14:80,192.168.99.19:80
Port:              443-tcp  443/TCP
TargetPort:        443/TCP
Endpoints:         192.168.99.14:443,192.168.99.19:443
Port:              1936-tcp  1936/TCP
TargetPort:        1936/TCP
Endpoints:         192.168.99.14:1936,192.168.99.19:1936
Session Affinity:  None
Events:            <none>

13. From the ansible host, run: openstack loadbalancer list
   - there should be a load balancer called `default/router`
   - its `vip_address` should be equal to the router svc IP

(shiftstack) [cloud-user@ansible-host ~]$ openstack loadbalancer list                                                                                                                                              
+--------------------------------------+------------------------------------------------+----------------------------------+----------------+---------------------+----------+
| id                                   | name                                           | project_id                       | vip_address    | provisioning_status | provider |
+--------------------------------------+------------------------------------------------+----------------------------------+----------------+---------------------+----------+
| b97a6f5d-f8ab-4ff0-9ae9-bebe0d24a5d9 | openshift-ansible-openshift.example.com-api-lb | a02185177ac246529e69bb252f021683 | 172.30.0.1     | ACTIVE              | octavia  |
| b096546e-6d94-42bd-a3f0-aa827ba54435 | openshift-cluster-router_lb-4f53nds4cg75       | a02185177ac246529e69bb252f021683 | 192.168.99.6   | ACTIVE              | octavia  |
| 4e340890-1075-4123-bf1d-9b1f9a3ecafc | default/router                                 | a02185177ac246529e69bb252f021683 | 172.30.108.168 | ACTIVE              | octavia  |
| 37f2f27b-478c-4287-ad72-6c62a720ce91 | default/docker-registry                        | a02185177ac246529e69bb252f021683 | 172.30.247.20  | ACTIVE              | octavia  |
| 9c8d6e8e-09aa-4729-9641-de2ee71706dd | default/registry-console                       | a02185177ac246529e69bb252f021683 | 172.30.217.98  | ACTIVE              | octavia  |
+--------------------------------------+------------------------------------------------+----------------------------------+----------------+---------------------+----------+

The `vip_address` 172.30.108.168 indeed matches the router svc IP.

