Bug 2227199

Summary: [osp][octavia lb] LB in degraded status with ovn-octavia
Product: Red Hat OpenStack Reporter: Jon Uriarte <juriarte>
Component: python-ovn-octavia-provider Assignee: Fernando Royo <froyo>
Status: MODIFIED --- QA Contact: Eran Kuris <ekuris>
Severity: high Docs Contact:
Priority: medium    
Version: 17.1 (Wallaby) CC: bcafarel, froyo, gthiemon, jlibosva, mdulko, pgrist, tweining
Target Milestone: z1 Keywords: Triaged
Target Release: 17.1   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: python-ovn-octavia-provider-1.0.3-17.1.20230720161142.9eea0b9.el9osttrunk Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Jon Uriarte 2023-07-28 10:00:32 UTC
Description of problem:

In an OCP 4.13 on OSP 17.1 deployment, when a UDP LB type service is created in OCP with the ovn-octavia provider and monitors
enabled, the corresponding LB is created in OpenStack but ends up in DEGRADED operating_status (as does its load balancer pool),
and the traffic from the LB is sent to all the members instead of only to the ONLINE ones.

$ openstack loadbalancer show d2922cf3-e9a7-4230-821a-7dcab7190192
/usr/lib/python3.9/site-packages/osc_lib/utils/__init__.py:448: DeprecationWarning: The usage of formatter functions is now discouraged. Consider using cliff.columns.FormattableColumn instead. See reviews linked with bug 1687955 for more detail.
  warnings.warn(
+---------------------+--------------------------------------------------------------------------------------------+
| Field               | Value                                                                                      |
+---------------------+--------------------------------------------------------------------------------------------+
| admin_state_up      | True                                                                                       |
| availability_zone   | None                                                                                       |
| created_at          | 2023-07-28T07:57:12                                                                        |
| description         | Kubernetes external service udp-lb-etplocal-ns/udp-lb-etplocal-svc from cluster kubernetes |
| flavor_id           | None                                                                                       |
| id                  | d2922cf3-e9a7-4230-821a-7dcab7190192                                                       |
| listeners           | 82a581a9-7b42-4962-a685-c897db5c6b9b                                                       |
| name                | kube_service_kubernetes_udp-lb-etplocal-ns_udp-lb-etplocal-svc                             |
| operating_status    | DEGRADED                                                                                   |
| pools               | bf4dc1b0-d7bb-4319-bc2f-74c3c6597f04                                                       |
| project_id          | 3674d08c0f4546b495677e0bbf046bd8                                                           |
| provider            | ovn                                                                                        |
| provisioning_status | ACTIVE                                                                                     |
| updated_at          | 2023-07-28T07:57:35                                                                        |
| vip_address         | 10.196.1.148                                                                               |
| vip_network_id      | ca0fed66-e69c-492e-99fd-d50aea240e6f                                                       |
| vip_port_id         | 11778e21-f2dc-4c78-a05b-2409753f92ff                                                       |
| vip_qos_policy_id   | None                                                                                       |
| vip_subnet_id       | 11f3cf7b-1bb5-4098-bc77-8717ef0727fe                                                       |
| tags                | kube_service_kubernetes_udp-lb-etplocal-ns_udp-lb-etplocal-svc                             |
+---------------------+--------------------------------------------------------------------------------------------+

$ openstack loadbalancer pool show bf4dc1b0-d7bb-4319-bc2f-74c3c6597f04
/usr/lib/python3.9/site-packages/osc_lib/utils/__init__.py:448: DeprecationWarning: The usage of formatter functions is now discouraged. Consider using cliff.columns.FormattableColumn instead. See reviews linked with bug 1687955 for more detail.
  warnings.warn(
+----------------------+-----------------------------------------------------------------------+
| Field                | Value                                                                 |
+----------------------+-----------------------------------------------------------------------+
| admin_state_up       | True                                                                  |
| created_at           | 2023-07-28T07:57:13                                                   |
| description          |                                                                       |
| healthmonitor_id     | 9dda0ee8-7db1-41d6-888c-2b173bc28a1c                                  |
| id                   | bf4dc1b0-d7bb-4319-bc2f-74c3c6597f04                                  |
| lb_algorithm         | SOURCE_IP_PORT                                                        |
| listeners            | 82a581a9-7b42-4962-a685-c897db5c6b9b                                  |
| loadbalancers        | d2922cf3-e9a7-4230-821a-7dcab7190192                                  |
| members              | 13ffa095-c8df-4e68-8f75-5fba2af97bc7                                  |
|                      | 2771ee5c-9eab-4c33-8110-4823e00fad55                                  |
|                      | 400768e1-9df8-42e4-a6d1-c8aaa8815e99                                  |
|                      | 8eb654ab-8334-4d1c-9413-1e07f5d9c0f7                                  |
|                      | ae708565-761d-46d9-9514-8b3954e488cc                                  |
|                      | da714d19-fef0-43a4-9d0c-69e1ce040b90                                  |
| name                 | pool_0_kube_service_kubernetes_udp-lb-etplocal-ns_udp-lb-etplocal-svc |
| operating_status     | DEGRADED                                                              |
| project_id           | 3674d08c0f4546b495677e0bbf046bd8                                      |
| protocol             | UDP                                                                   |
| provisioning_status  | ACTIVE                                                                |
| session_persistence  | None                                                                  |
| updated_at           | 2023-07-28T07:57:35                                                   |
| tls_container_ref    | None                                                                  |
| ca_tls_container_ref | None                                                                  |
| crl_container_ref    | None                                                                  |
| tls_enabled          | False                                                                 |
| tls_ciphers          | None                                                                  |
| tls_versions         | None                                                                  |
| tags                 |                                                                       |
| alpn_protocols       | None                                                                  |
+----------------------+-----------------------------------------------------------------------+

$ openstack loadbalancer healthmonitor show 9dda0ee8-7db1-41d6-888c-2b173bc28a1c
/usr/lib/python3.9/site-packages/osc_lib/utils/__init__.py:448: DeprecationWarning: The usage of formatter functions is now discouraged. Consider using cliff.columns.FormattableColumn instead. See reviews linked with bug 1687955 for more detail.
  warnings.warn(
+---------------------+-----------------------------------------------------------------------------+
| Field               | Value                                                                       |
+---------------------+-----------------------------------------------------------------------------+
| project_id          | 3674d08c0f4546b495677e0bbf046bd8                                            |
| name                | monitor_8082_kube_service_kubernetes_udp-lb-etplocal-ns_udp-lb-etplocal-svc |
| admin_state_up      | True                                                                        |
| pools               | bf4dc1b0-d7bb-4319-bc2f-74c3c6597f04                                        |
| created_at          | 2023-07-28T07:57:13                                                         |
| provisioning_status | ACTIVE                                                                      |
| updated_at          | 2023-07-28T07:57:15                                                         |
| delay               | 5                                                                           |
| expected_codes      | None                                                                        |
| max_retries         | 2                                                                           |
| http_method         | None                                                                        |
| timeout             | 5                                                                           |
| max_retries_down    | 3                                                                           |
| url_path            | None                                                                        |
| type                | UDP-CONNECT                                                                 |
| id                  | 9dda0ee8-7db1-41d6-888c-2b173bc28a1c                                        |
| operating_status    | ONLINE                                                                      |
| http_version        | None                                                                        |
| domain_name         | None                                                                        |
| tags                |                                                                             |
+---------------------+-----------------------------------------------------------------------------+


$ openstack loadbalancer member list bf4dc1b0-d7bb-4319-bc2f-74c3c6597f04
+--------------------------------------+-----------------------------+----------------------------------+---------------------+--------------+---------------+------------------+--------+
| id                                   | name                        | project_id                       | provisioning_status | address      | protocol_port | operating_status | weight |
+--------------------------------------+-----------------------------+----------------------------------+---------------------+--------------+---------------+------------------+--------+
| 13ffa095-c8df-4e68-8f75-5fba2af97bc7 | ostest-c7zn4-master-1       | 3674d08c0f4546b495677e0bbf046bd8 | ACTIVE              | 10.196.0.143 |         32664 | ERROR            |      1 |
| 2771ee5c-9eab-4c33-8110-4823e00fad55 | ostest-c7zn4-worker-0-hpjlx | 3674d08c0f4546b495677e0bbf046bd8 | ACTIVE              | 10.196.3.225 |         32664 | ERROR            |      1 |
| 400768e1-9df8-42e4-a6d1-c8aaa8815e99 | ostest-c7zn4-worker-0-5mldk | 3674d08c0f4546b495677e0bbf046bd8 | ACTIVE              | 10.196.0.93  |         32664 | ONLINE           |      1 |
| 8eb654ab-8334-4d1c-9413-1e07f5d9c0f7 | ostest-c7zn4-master-0       | 3674d08c0f4546b495677e0bbf046bd8 | ACTIVE              | 10.196.0.71  |         32664 | ERROR            |      1 |
| ae708565-761d-46d9-9514-8b3954e488cc | ostest-c7zn4-worker-0-zqcrk | 3674d08c0f4546b495677e0bbf046bd8 | ACTIVE              | 10.196.3.246 |         32664 | ONLINE           |      1 |
| da714d19-fef0-43a4-9d0c-69e1ce040b90 | ostest-c7zn4-master-2       | 3674d08c0f4546b495677e0bbf046bd8 | ACTIVE              | 10.196.2.186 |         32664 | ERROR            |      1 |
+--------------------------------------+-----------------------------+----------------------------------+---------------------+--------------+---------------+------------------+--------+
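The member list above can be reduced to the set of ONLINE addresses and an aggregate status. The following is a minimal sketch, not the provider's code: it assumes the documented Octavia meaning of the operating statuses (DEGRADED when some members are in ERROR, ERROR when all of them are), and the helper names `parse_cli_table` and `summarize_pool` are hypothetical. The sample is a trimmed copy of the table above with only three of its columns.

```python
from collections import Counter


def parse_cli_table(text):
    """Parse an ASCII table printed by the openstack CLI into a list of dicts."""
    rows = [
        [cell.strip() for cell in line.strip().strip("|").split("|")]
        for line in text.splitlines()
        if line.strip().startswith("|")  # skips the +----+ border lines
    ]
    if not rows:
        return []
    header, body = rows[0], rows[1:]
    return [dict(zip(header, row)) for row in body]


def summarize_pool(members):
    """Derive an aggregate status from the members' operating_status values.

    Assumption: DEGRADED means some members are in ERROR and ERROR means
    all members are in ERROR; anything else is reported as ONLINE here.
    """
    counts = Counter(m["operating_status"] for m in members)
    errors = counts.get("ERROR", 0)
    if members and errors == len(members):
        return "ERROR"
    if errors:
        return "DEGRADED"
    return "ONLINE"


# Trimmed copy of the member list (name, address, operating_status only).
SAMPLE = """
+-----------------------------+--------------+------------------+
| name                        | address      | operating_status |
+-----------------------------+--------------+------------------+
| ostest-c7zn4-master-1       | 10.196.0.143 | ERROR            |
| ostest-c7zn4-worker-0-hpjlx | 10.196.3.225 | ERROR            |
| ostest-c7zn4-worker-0-5mldk | 10.196.0.93  | ONLINE           |
| ostest-c7zn4-master-0       | 10.196.0.71  | ERROR            |
| ostest-c7zn4-worker-0-zqcrk | 10.196.3.246 | ONLINE           |
| ostest-c7zn4-master-2       | 10.196.2.186 | ERROR            |
+-----------------------------+--------------+------------------+
"""

members = parse_cli_table(SAMPLE)
online = [m["address"] for m in members if m["operating_status"] == "ONLINE"]
print(summarize_pool(members), online)
```

With two of six members ONLINE, the sketch reports DEGRADED, which matches the pool status shown above. Only the two ONLINE addresses should be receiving traffic.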


This behavior makes the test "[sig-installer][Suite:openshift/openstack][lb][Serial] The Openstack platform should apply lb-method on UDP OVN LoadBalancer when an UDP svc with monitors and ETP:Local is created on Openshift" [1] fail in OCP.

Version-Release number of selected component (if applicable):
OSP 17.1.0 (RHOS-17.1-RHEL-9-20230719.n.1)
OCP 4.13.0-0.nightly-2023-07-27-013427


How reproducible: always


Steps to Reproduce:
1. Deploy OCP on OSP 17.1
2. Create a project, a deployment and a UDP LB type svc (the svc definition enables monitors and sets ETP:Local)
cat <<EOF | oc apply -f -
---
apiVersion: project.openshift.io/v1
kind: Project
metadata:
  name: udp-lb-etplocal-ns
  labels:
    kubernetes.io/metadata.name: udp-lb-etplocal-ns
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: udp-lb-etplocal-dep
  namespace: udp-lb-etplocal-ns
  labels:
    app: udp-lb-etplocal-dep
spec:
  replicas: 2
  selector:
    matchLabels:
      app: udp-lb-etplocal-dep
  template:
    metadata:
      labels:
        app: udp-lb-etplocal-dep
    spec:
      containers:
      - name: udp-test
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            drop:
            - ALL
          runAsNonRoot: true
          seccompProfile:
            type: RuntimeDefault
        image: k8s.gcr.io/e2e-test-images/agnhost:2.43
        args:
          - netexec
          - --udp-port=8081
        ports:
        - containerPort: 8081
          protocol: UDP
---
apiVersion: v1
kind: Service
metadata:
  name: udp-lb-etplocal-svc
  namespace: udp-lb-etplocal-ns
  labels:
    app: udp-lb-etplocal-dep
  annotations:
    loadbalancer.openstack.org/enable-health-monitor: "true"
    loadbalancer.openstack.org/health-monitor-delay: "5"
    loadbalancer.openstack.org/health-monitor-max-retries: "2"
    loadbalancer.openstack.org/health-monitor-timeout: "5"
spec:
  ports:
  - port: 8082
    targetPort: 8081
    protocol: UDP
  selector:
    app: udp-lb-etplocal-dep
  type: LoadBalancer
  externalTrafficPolicy: Local
EOF


3. Install nc
$ sudo yum install nmap-ncat

4. Test the svc
$ for i in {1..100}; do cat <(echo hostname) <(sleep 1) | nc -w 1 -u <service FIP> 8082; echo; done > /tmp/result.txt && cat /tmp/result.txt | sort | uniq -c
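The `sort | uniq -c` tally in step 4 can also be done in a few lines of Python, which makes the number of lost requests explicit. This is a minimal sketch that assumes each request leaves exactly one line in the result file: the replying hostname, or an empty line when no reply arrived. The function name `tally_replies` and the pod names in the sample are hypothetical.

```python
from collections import Counter


def tally_replies(lines):
    """Count replies per responding hostname and count empty (lost) lines."""
    replies = Counter()
    lost = 0
    for line in lines:
        name = line.strip()
        if name:
            replies[name] += 1
        else:
            lost += 1
    return replies, lost


# Hypothetical sample: pod names are made up, empty strings stand for lost requests.
sample = ["pod-a", "", "pod-b", "pod-a", "", ""]
replies, lost = tally_replies(sample)
print(dict(replies), "lost:", lost)
```

To run it against the real output, pass the open file instead of the sample, for example `tally_replies(open("/tmp/result.txt"))`. A non-zero `lost` count corresponds to the lost requests reported under "Actual results".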

Actual results: traffic from the LB is sent to all the pool members and some requests are lost


Expected results: traffic from the LB is sent only to the ONLINE pool members


Additional info:

$ metalsmith list
+--------------------------------------+--------------+--------------------------------------+--------------+--------+------------------------+
| UUID                                 | Node Name    | Allocation UUID                      | Hostname     | State  | IP Addresses           |
+--------------------------------------+--------------+--------------------------------------+--------------+--------+------------------------+
| 4313377f-b172-4385-adb0-76933ca27c50 | compute-0    | 0a0dfabc-ccfc-454f-96b6-da5c8957d63b | compute-0    | ACTIVE | ctlplane=192.168.24.37 |
| cc85ca1c-c309-4bad-bfed-62983c881fc6 | controller-0 | 7f9988db-6bb7-4098-b3d5-9ed567121118 | controller-0 | ACTIVE | ctlplane=192.168.24.20 |
| 6b993b79-7403-45ef-a889-753c31b49d13 | controller-1 | da2308f8-598e-49bc-8359-a6e03b565072 | controller-1 | ACTIVE | ctlplane=192.168.24.49 |
| 8d82aadf-2c0b-4bb7-8991-2d335d83c139 | controller-2 | 5f8dfc5c-d833-45d2-8ce1-018ca82b178b | controller-2 | ACTIVE | ctlplane=192.168.24.40 |
+--------------------------------------+--------------+--------------------------------------+--------------+--------+------------------------+

$ oc -n openshift-config get cm cloud-provider-config -o yaml
apiVersion: v1
data:
[...]
  config: |
    [Global]
    secret-name = openstack-credentials
    secret-namespace = kube-system
    region = regionOne
    ca-file = /etc/kubernetes/static-pod-resources/configmaps/cloud-config/ca-bundle.pem
    [LoadBalancer]
    lb-provider = ovn
    lb-method = SOURCE_IP_PORT
    floating-network-id = xx
    subnet-id = xx
    create-monitor = False
    monitor-delay = 10s
    monitor-timeout = 10s
    monitor-max-retries = 1
kind: ConfigMap
[...]
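The cloud-provider-config above sets `create-monitor = False`, while the service in step 2 carries health-monitor annotations, and the health monitor shown earlier has delay 5, timeout 5 and max_retries 2. The sketch below illustrates that precedence under a stated assumption: that per-service annotations override the `[LoadBalancer]` defaults. It is not the cloud provider's implementation, and the function name `effective_monitor` is hypothetical. The INI text is a trimmed copy of the config above.

```python
import configparser

# Trimmed copy of the [LoadBalancer] section from the ConfigMap above.
CLOUD_CONF = """\
[LoadBalancer]
lb-provider = ovn
lb-method = SOURCE_IP_PORT
create-monitor = False
monitor-delay = 10s
monitor-timeout = 10s
monitor-max-retries = 1
"""

# Annotations copied from the Service manifest in step 2.
PREFIX = "loadbalancer.openstack.org/"
ANNOTATIONS = {
    PREFIX + "enable-health-monitor": "true",
    PREFIX + "health-monitor-delay": "5",
    PREFIX + "health-monitor-max-retries": "2",
    PREFIX + "health-monitor-timeout": "5",
}


def effective_monitor(conf_text, annotations):
    """Merge config defaults with annotations (assumed to take precedence)."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    lb = parser["LoadBalancer"]

    enabled = annotations.get(PREFIX + "enable-health-monitor")
    if enabled is None:
        enabled_flag = lb.getboolean("create-monitor")
    else:
        enabled_flag = enabled.lower() == "true"

    def seconds(annotation_key, conf_key):
        # Config values carry an "s" suffix (e.g. "10s"); annotations do not.
        raw = annotations.get(PREFIX + annotation_key, lb[conf_key])
        return int(raw.rstrip("s"))

    return {
        "enabled": enabled_flag,
        "delay": seconds("health-monitor-delay", "monitor-delay"),
        "timeout": seconds("health-monitor-timeout", "monitor-timeout"),
        "max_retries": int(
            annotations.get(
                PREFIX + "health-monitor-max-retries", lb["monitor-max-retries"]
            )
        ),
    }


monitor = effective_monitor(CLOUD_CONF, ANNOTATIONS)
print(monitor)
```

Under that assumption the merged values agree with the `delay`, `timeout` and `max_retries` fields of the health monitor shown in the description.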


[1] https://github.com/openshift/openstack-test/blob/b2f8871fe72c24285c4cc22dd4491ea0e609c492/test/extended/openstack/loadbalancers.go#L245C4-L245C4

Comment 3 Jon Uriarte 2023-08-01 14:39:20 UTC
This issue affects LB type services (with External Traffic Policy set to Local) [1] and LB type ingress controllers or routes when using the ovn-octavia provider in OCP 4.13 and OCP 4.14, which are the supported versions for OSP 17.1.

[1] https://docs.openshift.com/container-platform/4.13/networking/load-balancing-openstack.html#nw-osp-loadbalancer-etp-local_load-balancing-openstack