Bug 1874504 - load balancers offline after installation over OSP16.1 with OVN and amphora
Summary: load balancers offline after installation over OSP16.1 with OVN and amphora
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.6
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: urgent
Target Milestone: ---
Target Release: 4.6.0
Assignee: Maysa Macedo
QA Contact: GenadiC
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-09-01 13:57 UTC by rlobillo
Modified: 2020-10-27 16:36 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-10-27 16:36:42 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift kuryr-kubernetes pull 336 | 0 | None | closed | Bug 1874504: Ensure LB sg is in sync with backend Pods | 2020-12-03 13:02:36 UTC
Github | openshift kuryr-kubernetes pull 342 | 0 | None | closed | Bug 1876571: Ensure updated lb sgs is used on the CRD | 2020-12-03 13:02:36 UTC
Red Hat Product Errata | RHBA-2020:4196 | 0 | None | None | None | 2020-10-27 16:36:57 UTC

Description rlobillo 2020-09-01 13:57:00 UTC
Description of problem:

Installing OCP 4.6 on OSP 16.1 with only the amphora load balancer provider enabled fails with the following error:

time="2020-09-01T07:03:28-04:00" level=error msg="Cluster operator authentication Degraded is True with OAuthServiceCheckEndpointAccessibleController_SyncError: OAuthServiceCheckEndpointAccessibleControllerDegraded: Get \"https://172.30.27.143:443/healthz\": dial tcp 172.30.27.143:443: connect: connection refused"

The authentication clusteroperator appears as degraded:

(overcloud) [stack@undercloud-0 ~]$ oc get clusteroperators
NAME                                       VERSION                             AVAILABLE   PROGRESSING   DEGRADED   SINCE
authentication                             4.6.0-0.nightly-2020-09-01-042030   False       False         True       150m
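
When the operator list is long, the unhealthy entries can be filtered out of the `oc get clusteroperators` output. The following is a minimal sketch, not part of the original report: the function name `unhealthy_operators` is hypothetical, and it assumes the six-column layout shown above (NAME, VERSION, AVAILABLE, PROGRESSING, DEGRADED, SINCE) with a non-empty VERSION column.

```shell
# Hypothetical helper: list cluster operators that are unavailable or degraded.
# Reads `oc get clusteroperators` output on stdin.
# Assumes column 3 is AVAILABLE and column 5 is DEGRADED (VERSION must be non-empty).
unhealthy_operators() {
    awk 'NR > 1 && ($3 != "True" || $5 != "False") { print $1 }'
}

# Usage against a live cluster (requires the oc CLI):
#   oc get clusteroperators | unhealthy_operators
```

For the output above, this would print only `authentication`.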

The listener linked to the loadbalancer created for that clusteroperator service appears as OFFLINE:

$ oc get services -n openshift-authentication
NAME              TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)   AGE
oauth-openshift   ClusterIP   172.30.27.143   <none>        443/TCP   3h23m


(overcloud) [stack@undercloud-0 ~]$ openstack loadbalancer listener show 8b8eae2a-6f19-43be-8dae-cfb296142ad9
+-----------------------------+--------------------------------------------------+
| Field                       | Value                                            |
+-----------------------------+--------------------------------------------------+
| admin_state_up              | False                                            |
| connection_limit            | -1                                               |
| created_at                  | 2020-09-01T10:51:16                              |
| default_pool_id             | c3f24812-e0e8-4d4a-9944-0f73bd93b8ef             |
| default_tls_container_ref   | None                                             |
| description                 |                                                  |
| id                          | 8b8eae2a-6f19-43be-8dae-cfb296142ad9             |
| insert_headers              | None                                             |
| l7policies                  |                                                  |
| loadbalancers               | 357ba937-b4ed-479e-b4d0-d6111ca83e2e             |
| name                        | openshift-authentication/oauth-openshift:TCP:443 |
| operating_status            | OFFLINE                                          |
| project_id                  | 182c1effd5144181ba930cd2950d7eb9                 |
| protocol                    | TCP                                              |
| protocol_port               | 443                                              |
| provisioning_status         | ACTIVE                                           |
| sni_container_refs          | []                                               |
| timeout_client_data         | 50000                                            |
| timeout_member_connect      | 5000                                             |
| timeout_member_data         | 50000                                            |
| timeout_tcp_inspect         | 0                                                |
| updated_at                  | 2020-09-01T10:51:53                              |
| client_ca_tls_container_ref | None                                             |
| client_authentication       | NONE                                             |
| client_crl_container_ref    | None                                             |
| allowed_cidrs               | None                                             |
+-----------------------------+--------------------------------------------------+
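
The two fields that matter in the table above are admin_state_up and operating_status. As a rough sketch (the helper name `table_field` is hypothetical and not part of the openstack CLI), a single field can be pulled out of the ASCII table like this, assuming the two-column `| Field | Value |` layout shown above:

```shell
# Hypothetical helper: print the value of one field from an
# `openstack ... show` ASCII table read on stdin.
table_field() {
    awk -F'|' -v want="$1" '
        NF >= 3 {                                  # skip the +----+ border lines
            key = $2; val = $3
            gsub(/^[ \t]+|[ \t]+$/, "", key)       # trim padding around the field name
            gsub(/^[ \t]+|[ \t]+$/, "", val)       # trim padding around the value
            if (key == want) print val
        }
    '
}

# Usage against a live cloud (requires the openstack CLI with the Octavia plugin):
#   openstack loadbalancer listener show 8b8eae2a-6f19-43be-8dae-cfb296142ad9 | table_field operating_status
```

The same information may also be available directly through the `-c` and `-f value` options used elsewhere in this report, which avoids parsing the table at all.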

Furthermore, the same is happening for most of the listeners (38 of 69 have admin_state_up set to False):

(overcloud) [stack@undercloud-0 ~]$ openstack loadbalancer listener list -c name -c admin_state_up -f value | grep True  | wc -l
31
(overcloud) [stack@undercloud-0 ~]$ openstack loadbalancer listener list -c name -c admin_state_up -f value | grep False  | wc -l
38
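
The two counts above can also be gathered in one pass. This is a minimal sketch, assuming each line of the `-f value` output ends with the admin_state_up value and that listener names contain no spaces; the function name `count_listener_states` is hypothetical. Matching on the last field, rather than using grep on the whole line, avoids miscounting a listener whose name happens to contain "True" or "False".

```shell
# Hypothetical helper: tally listeners by admin_state_up.
# Reads "<name> <admin_state_up>" lines on stdin, one listener per line.
count_listener_states() {
    awk '
        $NF == "True"  { up++ }
        $NF == "False" { down++ }
        END { printf "up=%d down=%d\n", up, down }
    '
}

# Usage against a live cloud (requires the openstack CLI with the Octavia plugin):
#   openstack loadbalancer listener list -c name -c admin_state_up -f value | count_listener_states
```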



Version-Release number of selected component (if applicable):
OCP4.6.0-0.nightly-2020-09-01-042030
RHOS-16.1-RHEL-8-20200821.n.0

How reproducible: Always


Steps to Reproduce:
1. Install OSP 16.1 with OVN, disabling ovn-octavia on the controllers and enabling only the amphora provider.
2. Install OCP4.6.

Actual results: Installation fails.


Expected results: Installation succeeds and OCP cluster is fully operative.


Additional info:

Comment 4 rlobillo 2020-09-10 14:04:56 UTC
Verified on OCP4.6.0-0.nightly-2020-09-10-082657 over OSP 16.1 (RHOS-16.1-RHEL-8-20200831.n.1) with Amphora provider enabled.

OCP successfully installed:

(overcloud) [stack@undercloud-0 ~]$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.6.0-0.nightly-2020-09-10-082657   True        False         88s     Cluster version is 4.6.0-0.nightly-2020-09-10-082657
(overcloud) [stack@undercloud-0 ~]$ oc get clusteroperator
NAME                                       VERSION                             AVAILABLE   PROGRESSING   DEGRADED   SINCE
authentication                             4.6.0-0.nightly-2020-09-10-082657   True        False         False      114s
cloud-credential                           4.6.0-0.nightly-2020-09-10-082657   True        False         False      47m
cluster-autoscaler                         4.6.0-0.nightly-2020-09-10-082657   True        False         False      27m
config-operator                            4.6.0-0.nightly-2020-09-10-082657   True        False         False      40m
console                                    4.6.0-0.nightly-2020-09-10-082657   True        False         False      10m
csi-snapshot-controller                    4.6.0-0.nightly-2020-09-10-082657   True        False         False      27m
dns                                        4.6.0-0.nightly-2020-09-10-082657   True        False         False      37m
etcd                                       4.6.0-0.nightly-2020-09-10-082657   True        False         False      38m
image-registry                             4.6.0-0.nightly-2020-09-10-082657   True        False         False      13m
ingress                                    4.6.0-0.nightly-2020-09-10-082657   True        False         False      13m
insights                                   4.6.0-0.nightly-2020-09-10-082657   True        False         False      27m
kube-apiserver                             4.6.0-0.nightly-2020-09-10-082657   True        False         False      37m
kube-controller-manager                    4.6.0-0.nightly-2020-09-10-082657   True        False         False      32m
kube-scheduler                             4.6.0-0.nightly-2020-09-10-082657   True        False         False      36m
kube-storage-version-migrator              4.6.0-0.nightly-2020-09-10-082657   True        False         False      13m
machine-api                                4.6.0-0.nightly-2020-09-10-082657   True        False         False      23m
machine-approver                           4.6.0-0.nightly-2020-09-10-082657   True        False         False      35m
machine-config                             4.6.0-0.nightly-2020-09-10-082657   True        False         False      27m
marketplace                                4.6.0-0.nightly-2020-09-10-082657   True        False         False      27m
monitoring                                 4.6.0-0.nightly-2020-09-10-082657   True        False         False      11m
network                                    4.6.0-0.nightly-2020-09-10-082657   True        False         False      41m
node-tuning                                4.6.0-0.nightly-2020-09-10-082657   True        False         False      40m
openshift-apiserver                        4.6.0-0.nightly-2020-09-10-082657   True        False         False      27m
openshift-controller-manager               4.6.0-0.nightly-2020-09-10-082657   True        False         False      26m
openshift-samples                          4.6.0-0.nightly-2020-09-10-082657   True        False         False      26m
operator-lifecycle-manager                 4.6.0-0.nightly-2020-09-10-082657   True        False         False      38m
operator-lifecycle-manager-catalog         4.6.0-0.nightly-2020-09-10-082657   True        False         False      38m
operator-lifecycle-manager-packageserver   4.6.0-0.nightly-2020-09-10-082657   True        False         False      18m
service-ca                                 4.6.0-0.nightly-2020-09-10-082657   True        False         False      40m
storage                                    4.6.0-0.nightly-2020-09-10-082657   True        False         False      39m


(overcloud) [stack@undercloud-0 ~]$ openstack loadbalancer listener list -c name -c admin_state_up -f value | grep True  | wc -l
65
(overcloud) [stack@undercloud-0 ~]$ openstack loadbalancer listener list -c name -c admin_state_up -f value | grep False  | wc -l
0

Comment 6 errata-xmlrpc 2020-10-27 16:36:42 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4196

