Bug 2055661 - migrate loadbalancers from amphora to ovn not working
Summary: migrate loadbalancers from amphora to ovn not working
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.10
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: high
Target Milestone: ---
Target Release: 4.10.0
Assignee: Michał Dulko
QA Contact: Itay Matza
URL:
Whiteboard:
Depends On: 2051423
Blocks:
Reported: 2022-02-17 13:31 UTC by OpenShift BugZilla Robot
Modified: 2022-04-21 13:16 UTC

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-04-21 13:15:55 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift kuryr-kubernetes pull 637 | 0 | None | open | [release-4.10] Bug 2055661: Update KLB .spec.provider when required | 2022-02-17 13:31:33 UTC
Red Hat Product Errata | RHSA-2022:1356 | 0 | None | None | None | 2022-04-21 13:16:05 UTC

Description OpenShift BugZilla Robot 2022-02-17 13:31:20 UTC
+++ This bug was initially created as a clone of Bug #2051423 +++

Description of problem:

Applying the procedure below to migrate the load balancers from the amphora provider to the ovn provider does not work in 4.10:

https://docs.openshift.com/container-platform/4.9/networking/load-balancing-openstack.html#installation-osp-kuryr-octavia-configure_load-balancing-openstack

The existing amphora load balancers are not migrated to ovn.

Version-Release number of selected component (if applicable):
core_puddle: RHOS-16.1-RHEL-8-20211126.n.1
openshift_puddle: 4.10.0-0.nightly-2022-02-06-060618


How reproducible: Always


Steps to Reproduce:
1. Perform OCP installation with only amphora provider enabled. As a result, 58 load balancers are created:

2022-02-06 14:48:14.212 | TASK [install : Get number of amphoras] ****************************************
2022-02-06 14:48:14.214 | task path: /home/rhos-ci/jenkins/workspace/DFG-osasinfra-shiftstack_periodic-16.1_director-rhel-virthost-3cont_3comp_1ipa-ipv4-geneve,vlan-hybrid_4.10_3master_3worker-updateoctavia/infrared/plugins/openshift/roles/install/tasks/update_octavia_provider.yml:8
2022-02-06 14:48:14.217 | Sunday 06 February 2022  14:48:14 +0000 (0:00:04.665)       0:54:50.222 ******* 
2022-02-06 14:48:19.067 | changed: [undercloud-0] => {
2022-02-06 14:48:19.070 |     "changed": true,
2022-02-06 14:48:19.073 |     "cmd": "source ~/overcloudrc\nopenstack loadbalancer list -c provider -f value | grep amphora | wc -l\n",
2022-02-06 14:48:19.076 |     "delta": "0:00:04.527868",
2022-02-06 14:48:19.078 |     "end": "2022-02-06 14:48:19.029284",
2022-02-06 14:48:19.081 |     "rc": 0,
2022-02-06 14:48:19.083 |     "start": "2022-02-06 14:48:14.501416"
2022-02-06 14:48:19.086 | }
2022-02-06 14:48:19.088 | 
2022-02-06 14:48:19.091 | STDOUT:
2022-02-06 14:48:19.093 | 
2022-02-06 14:48:19.096 | 58

2. Enable OVN again in octavia.conf on all the controllers:

- name: Enable ovn-octavia
  block:
    - name: Add ovn-octavia on enabled_provider_drivers option
      ini_file:
        path: /var/lib/config-data/puppet-generated/octavia/etc/octavia/octavia.conf
        section: api_settings
        option: enabled_provider_drivers
        value: "amphora: The Octavia Amphora driver.,octavia: Deprecated alias of the Octavia Amphora driver.,ovn: Octavia OVN driver."
      loop: "{{ groups['controller'] }}"
      delegate_to: "{{ item }}"

    - name: Restart Octavia containers
      shell: "podman restart octavia_api"
      loop: "{{ groups['controller'] }}"
      delegate_to: "{{ item }}"
  become: true


3. Remove the kuryr-octavia-provider annotation from kuryr-config to trigger the update:

2022-02-06 14:53:04.737 | TASK [install : Octavia provider update - trigger update removing kuryr-octavia-provider annotation] ***
2022-02-06 14:53:04.740 | task path: /home/rhos-ci/jenkins/workspace/DFG-osasinfra-shiftstack_periodic-16.1_director-rhel-virthost-3cont_3comp_1ipa-ipv4-geneve,vlan-hybrid_4.10_3master_3worker-updateoctavia/infrared/plugins/openshift/roles/install/tasks/update_octavia_provider.yml:46
2022-02-06 14:53:04.743 | Sunday 06 February 2022  14:53:04 +0000 (0:00:06.444)       0:59:40.719 ******* 
2022-02-06 14:53:05.266 | changed: [undercloud-0] => {
2022-02-06 14:53:05.269 |     "changed": true,
2022-02-06 14:53:05.272 |     "cmd": "kubectl -n openshift-kuryr annotate cm kuryr-config networkoperator.openshift.io/kuryr-octavia-provider-\n",
2022-02-06 14:53:05.276 |     "delta": "0:00:00.205740",
2022-02-06 14:53:05.279 |     "end": "2022-02-06 14:53:05.231370",
2022-02-06 14:53:05.283 |     "rc": 0,
2022-02-06 14:53:05.286 |     "start": "2022-02-06 14:53:05.025630"
2022-02-06 14:53:05.290 | }
2022-02-06 14:53:05.293 | 
2022-02-06 14:53:05.297 | STDOUT:
2022-02-06 14:53:05.300 | 
2022-02-06 14:53:05.304 | configmap/kuryr-config annotated

4. Wait until the process starts:

2022-02-06 14:53:05.310 | TASK [install : Active wait for the update to start (max 240 seconds)] *********
2022-02-06 14:53:05.314 | task path: /home/rhos-ci/jenkins/workspace/DFG-osasinfra-shiftstack_periodic-16.1_director-rhel-virthost-3cont_3comp_1ipa-ipv4-geneve,vlan-hybrid_4.10_3master_3worker-updateoctavia/infrared/plugins/openshift/roles/install/tasks/update_octavia_provider.yml:50
2022-02-06 14:53:05.317 | Sunday 06 February 2022  14:53:05 +0000 (0:00:00.593)       0:59:41.313 ******* 
2022-02-06 14:53:05.908 | FAILED - RETRYING: Active wait for the update to start (max 240 seconds) (24 retries left).
2022-02-06 14:53:16.381 | FAILED - RETRYING: Active wait for the update to start (max 240 seconds) (23 retries left).
2022-02-06 14:53:26.852 | FAILED - RETRYING: Active wait for the update to start (max 240 seconds) (22 retries left).
2022-02-06 14:53:37.301 | FAILED - RETRYING: Active wait for the update to start (max 240 seconds) (21 retries left).
2022-02-06 14:53:47.730 | FAILED - RETRYING: Active wait for the update to start (max 240 seconds) (20 retries left).
2022-02-06 14:53:58.199 | FAILED - RETRYING: Active wait for the update to start (max 240 seconds) (19 retries left).
2022-02-06 14:54:08.789 | changed: [undercloud-0] => {
2022-02-06 14:54:08.792 |     "attempts": 7,
2022-02-06 14:54:08.795 |     "changed": true,
2022-02-06 14:54:08.798 |     "cmd": "kubectl get -n openshift-kuryr cm/kuryr-config -o jsonpath='{.metadata.annotations.networkoperator\\.openshift\\.io/kuryr-octavia-provider}'\n",
2022-02-06 14:54:08.801 |     "delta": "0:00:00.249561",
2022-02-06 14:54:08.803 |     "end": "2022-02-06 14:54:08.749260",
2022-02-06 14:54:08.806 |     "rc": 0,
2022-02-06 14:54:08.808 |     "start": "2022-02-06 14:54:08.499699"
2022-02-06 14:54:08.811 | }
2022-02-06 14:54:08.813 | 
2022-02-06 14:54:08.816 | STDOUT:
2022-02-06 14:54:08.818 | 
2022-02-06 14:54:08.821 | ovn

5. Check the existing loadbalancers:

2022-02-06 14:59:20.413 | TASK [install : check number of loadbalancers after octavia update (> 4.8)] ****
2022-02-06 14:59:20.416 | task path: /home/rhos-ci/jenkins/workspace/DFG-osasinfra-shiftstack_periodic-16.1_director-rhel-virthost-3cont_3comp_1ipa-ipv4-geneve,vlan-hybrid_4.10_3master_3worker-updateoctavia/infrared/plugins/openshift/roles/install/tasks/update_octavia_provider.yml:93
2022-02-06 14:59:20.419 | Sunday 06 February 2022  14:59:20 +0000 (0:00:00.059)       1:05:56.433 ******* 
2022-02-06 14:59:20.454 | fatal: [undercloud-0]: FAILED! => {
2022-02-06 14:59:20.457 |     "assertion": "ovnlbs.stdout|int == total_lbs.stdout|int",
2022-02-06 14:59:20.459 |     "changed": false,
2022-02-06 14:59:20.462 |     "evaluated_to": false
2022-02-06 14:59:20.465 | }
2022-02-06 14:59:20.468 | 
2022-02-06 14:59:20.470 | MSG:
2022-02-06 14:59:20.473 | 
2022-02-06 14:59:20.476 | unexpected types of loadbalancer after octavia provider update.
2022-02-06 14:59:20.479 | ovn-octavia lbs: 0
2022-02-06 14:59:20.481 | Total: 59

Actual results: Loadbalancers are not migrated at all.
Expected results: All loadbalancers are moved to ovn provider.
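
The check in step 5 can be sketched as a small shell helper. This is only an illustration, not the CI job's actual code: the function name `all_lbs_use_provider` is hypothetical, and it assumes the input is one provider name per line, as printed by `openstack loadbalancer list -c provider -f value`. Unlike the `grep` in the logs, it compares the provider name exactly.

```shell
# Hypothetical helper (not part of the CI job): reads one provider name per
# line on stdin and succeeds only if every load balancer uses the wanted one.
all_lbs_use_provider() {
  wanted="$1"
  awk -v want="$wanted" '
    NF == 0 { next }              # skip blank lines
    { total++ }                   # every remaining line is one load balancer
    $1 == want { match_count++ }  # exact match on the provider name
    END {
      printf "%s lbs: %d\nTotal: %d\n", want, match_count, total
      exit (total > 0 && match_count == total) ? 0 : 1
    }'
}

# Usage, assuming overcloudrc has been sourced:
#   openstack loadbalancer list -c provider -f value | all_lbs_use_provider ovn
```

A non-zero exit status corresponds to the failed assertion shown in step 5.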

--- Additional comment from mdulko on 2022-02-08 11:30:44 UTC ---

Okay, I debugged this. I'm setting blocker- for several reasons:

* This will mostly happen on an OSP13->OSP16 upgrade, but 4.10 does not support OSP13, so that migration should happen in 4.9, where it works okay.
* There might be folks running OSP16 with OVS+Amphora, in which case they probably won't move to OVN anyway.
* Setups running OSP16 with OVN+Amphora might be affected, but as OVN-Octavia is preferred, I bet this is a really minor use case.
* For the rare cases described above I can provide a workaround draft:

1. After executing step 3 in [1], manually update all your KuryrLoadBalancer custom resources (CRs) and set `.spec.provider` to "ovn". This should trigger the migration of the LBs.

[1] https://docs.openshift.com/container-platform/4.9/networking/load-balancing-openstack.html#installation-osp-kuryr-octavia-configure_load-balancing-openstack
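
A possible way to script the workaround draft above is sketched below. It is untested against a real cluster and rests on assumptions: the helper name `gen_provider_patches` is hypothetical, the resource is assumed to be addressable as `kuryrloadbalancers`, and a JSON merge patch is assumed to be acceptable for setting `.spec.provider`. The helper only prints the `kubectl patch` commands so they can be reviewed before anything is changed.

```shell
# Hypothetical helper: reads "<namespace> <name>" pairs on stdin and prints
# one kubectl patch command per KuryrLoadBalancer. Nothing is executed here.
gen_provider_patches() {
  provider="${1:-ovn}"
  while read -r ns name; do
    # Skip blank or incomplete lines.
    [ -n "$ns" ] && [ -n "$name" ] || continue
    printf "kubectl -n %s patch kuryrloadbalancers %s --type merge -p '{\"spec\":{\"provider\":\"%s\"}}'\n" \
      "$ns" "$name" "$provider"
  done
}

# Usage (assumption: this lists every KuryrLoadBalancer in the cluster):
#   kubectl get kuryrloadbalancers -A \
#     -o jsonpath='{range .items[*]}{.metadata.namespace}{" "}{.metadata.name}{"\n"}{end}' \
#     | gen_provider_patches ovn
```

Printing instead of executing keeps the sketch safe to try; the generated lines can be piped to `sh` once they look right.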

Comment 1 ShiftStack Bugwatcher 2022-03-05 07:07:43 UTC
Removing the Triaged keyword because:
* the QE automation assessment (flag qe_test_coverage) is missing

Comment 8 Itay Matza 2022-04-18 06:41:48 UTC
Verified with OCP 4.10.0-0.nightly-2022-04-14-121105 on top of RHOS-16.1-RHEL-8-20220329.n.1.

Migrating the load balancers from the amphora provider to the ovn provider is working.
Verification steps:

1. Perform OCP installation with only amphora provider enabled. As a result, 58 load balancers are created:
```
2022-04-17 15:39:05.997 | TASK [install : Get number of amphoras] ****************************************
2022-04-17 15:39:05.999 | task path: /home/rhos-ci/jenkins/workspace/DFG-osasinfra-shiftstack_periodic-16.1_director-rhel-virthost-3cont_3comp_1ipa-ipv4-geneve,vlan-hybrid_4.10_3master_3worker-updateoctavia/infrared/plugins/openshift/roles/install/tasks/update_octavia_provider.yml:8
2022-04-17 15:39:06.002 | Sunday 17 April 2022  15:39:05 +0000 (0:00:04.403)       0:51:40.257 ********** 
2022-04-17 15:39:09.309 | changed: [undercloud-0] => {
2022-04-17 15:39:09.311 |     "changed": true,
2022-04-17 15:39:09.314 |     "cmd": "source ~/overcloudrc\nopenstack loadbalancer list -c provider -f value | grep amphora | wc -l\n",
2022-04-17 15:39:09.317 |     "delta": "0:00:03.004725",
2022-04-17 15:39:09.319 |     "end": "2022-04-17 15:39:09.274338",
2022-04-17 15:39:09.322 |     "rc": 0,
2022-04-17 15:39:09.325 |     "start": "2022-04-17 15:39:06.269613"
2022-04-17 15:39:09.327 | }
2022-04-17 15:39:09.330 | 
2022-04-17 15:39:09.332 | STDOUT:
2022-04-17 15:39:09.335 | 
2022-04-17 15:39:09.338 | 58
```

2. Enable OVN again in octavia.conf on all the controllers:
```
- name: Enable ovn-octavia
  block:
    - name: Add ovn-octavia on enabled_provider_drivers option
      ini_file:
        path: /var/lib/config-data/puppet-generated/octavia/etc/octavia/octavia.conf
        section: api_settings
        option: enabled_provider_drivers
        value: "amphora: The Octavia Amphora driver.,octavia: Deprecated alias of the Octavia Amphora driver.,ovn: Octavia OVN driver."
      loop: "{{ groups['controller'] }}"
      delegate_to: "{{ item }}"

    - name: Restart Octavia containers
      shell: "podman restart octavia_api"
      loop: "{{ groups['controller'] }}"
      delegate_to: "{{ item }}"
  become: true
```

3. Remove the kuryr-octavia-provider annotation from kuryr-config to trigger the update:
```
2022-04-17 15:43:59.319 | TASK [install : Octavia provider update - trigger update removing kuryr-octavia-provider annotation] ***
2022-04-17 15:43:59.322 | task path: /home/rhos-ci/jenkins/workspace/DFG-osasinfra-shiftstack_periodic-16.1_director-rhel-virthost-3cont_3comp_1ipa-ipv4-geneve,vlan-hybrid_4.10_3master_3worker-updateoctavia/infrared/plugins/openshift/roles/install/tasks/update_octavia_provider.yml:46
2022-04-17 15:43:59.325 | Sunday 17 April 2022  15:43:59 +0000 (0:00:06.149)       0:56:33.566 ********** 
2022-04-17 15:43:59.836 | changed: [undercloud-0] => {
2022-04-17 15:43:59.839 |     "changed": true,
2022-04-17 15:43:59.843 |     "cmd": "kubectl -n openshift-kuryr annotate cm kuryr-config networkoperator.openshift.io/kuryr-octavia-provider-\n",
2022-04-17 15:43:59.846 |     "delta": "0:00:00.188458",
2022-04-17 15:43:59.849 |     "end": "2022-04-17 15:43:59.796880",
2022-04-17 15:43:59.852 |     "rc": 0,
2022-04-17 15:43:59.855 |     "start": "2022-04-17 15:43:59.608422"
2022-04-17 15:43:59.858 | }
2022-04-17 15:43:59.860 | 
2022-04-17 15:43:59.863 | STDOUT:
2022-04-17 15:43:59.866 | 
2022-04-17 15:43:59.868 | configmap/kuryr-config annotated
```

4. Wait until the process starts:
```
2022-04-17 15:43:59.873 | TASK [install : Active wait for the update to start (max 240 seconds)] *********
2022-04-17 15:43:59.876 | task path: /home/rhos-ci/jenkins/workspace/DFG-osasinfra-shiftstack_periodic-16.1_director-rhel-virthost-3cont_3comp_1ipa-ipv4-geneve,vlan-hybrid_4.10_3master_3worker-updateoctavia/infrared/plugins/openshift/roles/install/tasks/update_octavia_provider.yml:50
2022-04-17 15:43:59.879 | Sunday 17 April 2022  15:43:59 +0000 (0:00:00.564)       0:56:34.131 ********** 
2022-04-17 15:44:00.386 | FAILED - RETRYING: Active wait for the update to start (max 240 seconds) (24 retries left).
2022-04-17 15:44:10.998 | FAILED - RETRYING: Active wait for the update to start (max 240 seconds) (23 retries left).
2022-04-17 15:44:21.462 | FAILED - RETRYING: Active wait for the update to start (max 240 seconds) (22 retries left).
2022-04-17 15:44:31.878 | FAILED - RETRYING: Active wait for the update to start (max 240 seconds) (21 retries left).
2022-04-17 15:44:42.343 | FAILED - RETRYING: Active wait for the update to start (max 240 seconds) (20 retries left).
2022-04-17 15:44:52.780 | FAILED - RETRYING: Active wait for the update to start (max 240 seconds) (19 retries left).
2022-04-17 15:45:03.177 | FAILED - RETRYING: Active wait for the update to start (max 240 seconds) (18 retries left).
2022-04-17 15:45:13.657 | changed: [undercloud-0] => {
2022-04-17 15:45:13.660 |     "attempts": 8,
2022-04-17 15:45:13.663 |     "changed": true,
2022-04-17 15:45:13.666 |     "cmd": "kubectl get -n openshift-kuryr cm/kuryr-config -o jsonpath='{.metadata.annotations.networkoperator\\.openshift\\.io/kuryr-octavia-provider}'\n",
2022-04-17 15:45:13.669 |     "delta": "0:00:00.151587",
2022-04-17 15:45:13.672 |     "end": "2022-04-17 15:45:13.616926",
2022-04-17 15:45:13.674 |     "rc": 0,
2022-04-17 15:45:13.677 |     "start": "2022-04-17 15:45:13.465339"
2022-04-17 15:45:13.680 | }
2022-04-17 15:45:13.683 | 
2022-04-17 15:45:13.686 | STDOUT:
2022-04-17 15:45:13.690 | 
2022-04-17 15:45:13.693 | ovn
```

5. Check the existing loadbalancers:
```
2022-04-17 15:50:14.443 | TASK [install : Get number of ovn lbs] *****************************************
2022-04-17 15:50:14.446 | task path: /home/rhos-ci/jenkins/workspace/DFG-osasinfra-shiftstack_periodic-16.1_director-rhel-virthost-3cont_3comp_1ipa-ipv4-geneve,vlan-hybrid_4.10_3master_3worker-updateoctavia/infrared/plugins/openshift/roles/install/tasks/update_octavia_provider.yml:69
2022-04-17 15:50:14.449 | Sunday 17 April 2022  15:50:14 +0000 (0:00:00.083)       1:02:48.712 ********** 
2022-04-17 15:50:19.813 | changed: [undercloud-0] => {
2022-04-17 15:50:19.816 |     "changed": true,
2022-04-17 15:50:19.819 |     "cmd": "source ~/overcloudrc\nopenstack loadbalancer list -c provider -f value | grep ovn | wc -l\n",
2022-04-17 15:50:19.821 |     "delta": "0:00:04.239528",
2022-04-17 15:50:19.824 |     "end": "2022-04-17 15:50:19.760138",
2022-04-17 15:50:19.826 |     "rc": 0,
2022-04-17 15:50:19.829 |     "start": "2022-04-17 15:50:15.520610"
2022-04-17 15:50:19.832 | }
2022-04-17 15:50:19.835 | 
2022-04-17 15:50:19.838 | STDOUT:
2022-04-17 15:50:19.840 | 
2022-04-17 15:50:19.845 | 59
2022-04-17 15:50:19.847 | 
2022-04-17 15:50:19.851 | TASK [install : Get number of total loadbalancers] *****************************
2022-04-17 15:50:19.854 | task path: /home/rhos-ci/jenkins/workspace/DFG-osasinfra-shiftstack_periodic-16.1_director-rhel-virthost-3cont_3comp_1ipa-ipv4-geneve,vlan-hybrid_4.10_3master_3worker-updateoctavia/infrared/plugins/openshift/roles/install/tasks/update_octavia_provider.yml:75
2022-04-17 15:50:19.857 | Sunday 17 April 2022  15:50:19 +0000 (0:00:05.395)       1:02:54.108 ********** 
2022-04-17 15:50:23.749 | changed: [undercloud-0] => {
2022-04-17 15:50:23.752 |     "changed": true,
2022-04-17 15:50:23.754 |     "cmd": "source ~/overcloudrc\nopenstack loadbalancer list -c provider -f value | wc -l\n",
2022-04-17 15:50:23.757 |     "delta": "0:00:03.574638",
2022-04-17 15:50:23.759 |     "end": "2022-04-17 15:50:23.714989",
2022-04-17 15:50:23.762 |     "rc": 0,
2022-04-17 15:50:23.764 |     "start": "2022-04-17 15:50:20.140351"
2022-04-17 15:50:23.767 | }
2022-04-17 15:50:23.769 | 
2022-04-17 15:50:23.773 | STDOUT:
2022-04-17 15:50:23.777 | 
2022-04-17 15:50:23.780 | 59
2022-04-17 15:50:23.837 | TASK [install : check number of loadbalancers after octavia update (> 4.8)] ****
2022-04-17 15:50:23.839 | task path: /home/rhos-ci/jenkins/workspace/DFG-osasinfra-shiftstack_periodic-16.1_director-rhel-virthost-3cont_3comp_1ipa-ipv4-geneve,vlan-hybrid_4.10_3master_3worker-updateoctavia/infrared/plugins/openshift/roles/install/tasks/update_octavia_provider.yml:93
2022-04-17 15:50:23.842 | Sunday 17 April 2022  15:50:23 +0000 (0:00:00.062)       1:02:58.107 ********** 
2022-04-17 15:50:23.884 | ok: [undercloud-0] => {
2022-04-17 15:50:23.887 |     "changed": false
2022-04-17 15:50:23.890 | }
2022-04-17 15:50:23.894 | 
2022-04-17 15:50:23.897 | MSG:
2022-04-17 15:50:23.899 | 
2022-04-17 15:50:23.902 | All assertions passed
```

Results: All loadbalancers are moved to ovn provider.
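
The active wait in step 4 (24 retries, roughly 10 seconds apart) can be approximated outside Ansible with a small polling loop. This is a sketch under assumptions, not the job's implementation: the helper name `wait_for_output` and its argument order are hypothetical.

```shell
# Hypothetical helper: rerun a command until its stdout equals the expected
# value, giving up after a maximum number of attempts.
# usage: wait_for_output EXPECTED RETRIES DELAY_SECONDS COMMAND [ARGS...]
wait_for_output() {
  expected="$1"; retries="$2"; delay="$3"
  shift 3
  attempt=1
  while [ "$attempt" -le "$retries" ]; do
    if [ "$("$@" 2>/dev/null)" = "$expected" ]; then
      echo "matched after $attempt attempt(s)"
      return 0
    fi
    # Only sleep when another attempt will follow.
    if [ "$attempt" -lt "$retries" ]; then
      sleep "$delay"
    fi
    attempt=$((attempt + 1))
  done
  echo "gave up after $retries attempt(s)" >&2
  return 1
}

# Usage mirroring the log above (24 tries, 10 s apart):
#   wait_for_output ovn 24 10 kubectl get -n openshift-kuryr cm/kuryr-config \
#     -o jsonpath='{.metadata.annotations.networkoperator\.openshift\.io/kuryr-octavia-provider}'
```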

Comment 10 errata-xmlrpc 2022-04-21 13:15:55 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.10.10 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:1356

