Bug 1843784 - LBaaSLoadBalancer object has wrong default value for security_groups
Summary: LBaaSLoadBalancer object has wrong default value for security_groups
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.5
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.5.0
Assignee: Maysa Macedo
QA Contact: GenadiC
URL:
Whiteboard:
Depends On: 1843674
Blocks: 1844093
 
Reported: 2020-06-04 06:46 UTC by OpenShift BugZilla Robot
Modified: 2020-07-13 17:43 UTC
CC List: 2 users

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-07-13 17:43:10 UTC
Target Upstream Version:
Embargoed:




Links
Github openshift/kuryr-kubernetes pull 258 (closed): [release-4.5] Bug 1843784: Ensure security_groups on LBaaSLoadBalancer defaults to empty list (last updated 2020-08-03 09:34:01 UTC)
Red Hat Product Errata RHBA-2020:2409 (last updated 2020-07-13 17:43:28 UTC)

Description OpenShift BugZilla Robot 2020-06-04 06:46:37 UTC
+++ This bug was initially created as a clone of Bug #1843674 +++

Description of problem:

When no security groups are present on the LBaaSLoadBalancer oslo
versioned object, the security_groups field should default to an empty
list rather than None. Otherwise, iterating over the security_groups
field fails:

2020-05-29 10:10:02.892 1 ERROR kuryr_kubernetes.controller.drivers.lbaasv2 if sg.id in loadbalancer.security_groups:
2020-05-29 10:10:02.892 1 ERROR kuryr_kubernetes.controller.drivers.lbaasv2 TypeError: argument of type 'NoneType' is not iterable.
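For illustration, a minimal sketch (assuming oslo.versionedobjects; the class and field below are simplified and hypothetical, not the actual kuryr_kubernetes.objects.lbaas definition) of how declaring the field with an empty-list default keeps the membership check above from raising TypeError:

# Hypothetical, simplified versioned object for illustration only.
from oslo_versionedobjects import base as obj_base
from oslo_versionedobjects import fields as obj_fields

@obj_base.VersionedObjectRegistry.register
class LBaaSLoadBalancerSketch(obj_base.VersionedObject):
    VERSION = '1.0'
    fields = {
        # Defaulting to [] keeps "sg.id in loadbalancer.security_groups"
        # safe even when no security groups were ever assigned.
        'security_groups': obj_fields.ListOfStringsField(default=[]),
    }

lb = LBaaSLoadBalancerSketch()
lb.obj_set_defaults()  # applies the declared defaults to unset fields
print('some-sg-id' in lb.security_groups)  # False, instead of TypeError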


Comment 3 rlobillo 2020-06-08 15:46:02 UTC
Verified on OCP 4.5.0-0.nightly-2020-06-05-021159 on OSP13 puddle: 2020-05-19.2

Forcing the controller to clean up a leftover load balancer does not raise any error, and Kuryr keeps providing its service normally.

NetworkPolicy (NP) and conformance test suites were run with the expected results.

Steps:

#1. Set up the environment:

$ oc new-project test && oc run --image kuryr/demo demo && oc run --image kuryr/demo demo-caller && oc expose pod/demo --port 80 --target-port 8080
$ oc get all
NAME              READY   STATUS    RESTARTS   AGE
pod/demo          1/1     Running   0          21m
pod/demo-caller   1/1     Running   0          45s

NAME           TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)   AGE
service/demo   ClusterIP   172.30.106.155   <none>        80/TCP    64s

$ openstack loadbalancer list | grep demo
| 76509d3f-7708-4d67-b815-ea4e753ec7f7 | test/demo                                                                   | 1fe9a73db0c7406cbac908c2b6a041a5 | 172.30.106.155 | ACTIVE              | octavia  |

$ oc get pods -n openshift-kuryr
NAME                                   READY   STATUS    RESTARTS   AGE
kuryr-cni-4pcj2                        1/1     Running   0          3h12m
kuryr-cni-7q5w8                        1/1     Running   0          3h14m
kuryr-cni-b4c29                        1/1     Running   0          3h12m
kuryr-cni-g9z4n                        1/1     Running   0          3h13m
kuryr-cni-wm5zr                        1/1     Running   0          3h15m
kuryr-cni-zv6pc                        1/1     Running   0          3h15m
kuryr-controller-54db8f998d-rf6rm      1/1     Running   0          18m
kuryr-dns-admission-controller-db6m4   1/1     Running   0          3h10m
kuryr-dns-admission-controller-lbzmx   1/1     Running   0          3h9m
kuryr-dns-admission-controller-pkrsq   1/1     Running   0          3h8m


#2. Remove the service and restart the controller, leaving the load balancer ACTIVE:

$ date && oc delete service/demo && \
	oc delete pod -n openshift-kuryr $(oc get pods -n openshift-kuryr -o jsonpath='{.items[6].metadata.name}') && \
	openstack loadbalancer list | grep demo 
	
Mon Jun  8 05:50:44 EDT 2020
service "demo" deleted
pod "kuryr-controller-54db8f998d-rf6rm" deleted
| 76509d3f-7708-4d67-b815-ea4e753ec7f7 | test/demo                                                                   | 1fe9a73db0c7406cbac908c2b6a041a5 | 172.30.106.155 | ACTIVE              | octavia  |

(overcloud) [stack@undercloud-0 ~]$ oc get pods -n openshift-kuryr
NAME                                   READY   STATUS    RESTARTS   AGE
kuryr-cni-4pcj2                        1/1     Running   0          174m
kuryr-cni-7q5w8                        1/1     Running   0          176m
kuryr-cni-b4c29                        1/1     Running   0          173m
kuryr-cni-g9z4n                        1/1     Running   0          175m
kuryr-cni-wm5zr                        1/1     Running   0          176m
kuryr-cni-zv6pc                        1/1     Running   0          177m
kuryr-controller-54db8f998d-rf6rm      0/1     Running   0          20s
kuryr-dns-admission-controller-db6m4   1/1     Running   0          172m
kuryr-dns-admission-controller-lbzmx   1/1     Running   0          171m
kuryr-dns-admission-controller-pkrsq   1/1     Running   0          170m

#3. Wait a few minutes and confirm that the load balancer is deleted by the new controller pod:
$ date && openstack loadbalancer list | grep demo
Mon Jun  8 05:30:51 EDT 2020

# No errors observed:
$ oc logs -n openshift-kuryr $(oc get pods -n openshift-kuryr -o jsonpath='{.items[6].metadata.name}') | grep ERROR
$

#4. A new service can be created and works as expected:
$ oc expose pod/demo --port 80 --target-port 8080
$ oc get all
NAME              READY   STATUS    RESTARTS   AGE
pod/demo          1/1     Running   0          21m
pod/demo-caller   1/1     Running   0          45s

NAME           TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)   AGE
service/demo   ClusterIP   172.30.106.155   <none>        80/TCP    64s

$ oc rsh demo-caller curl 172.30.106.155
demo: HELLO! I AM ALIVE!!!

Comment 4 rlobillo 2020-06-08 16:05:38 UTC
Also verified on OSP16 (RHOS_TRUNK-16.0-RHEL-8-20200513.n.1) with OVN-Octavia and OCP 4.5.0-0.nightly-2020-06-08-053957.

#1. Setup ready:
(shiftstack) [stack@undercloud-0 ~]$  oc rsh demo-caller curl 172.30.136.161
demo: HELLO! I AM ALIVE!!!

#2. Force a removal event during controller restart:
$ date && oc delete pod -n openshift-kuryr $(oc get pods -n openshift-kuryr -o jsonpath='{.items[6].metadata.name}') &
$ oc delete service/demo && openstack loadbalancer list | grep demo
service "demo" deleted
[1]+  Done                    date && oc delete pod -n openshift-kuryr $(oc get pods -n openshift-kuryr -o jsonpath='{.items[6].metadata.name}')
| e9efc65e-6eaf-4584-80de-40bf0f01a2ee | test1/demo                                                                  | c4b979c3ddf249b1aa83a5ef16efb22b | 172.30.136.161 | ACTIVE              | ovn      |

#3. Check that cleanup works while deleting the OVN-Octavia LB (which has no SGs on the VIP port):
(wait a few minutes) The LB is deleted and there are no ERROR entries in the kuryr-controller logs:

(shiftstack) [stack@undercloud-0 ~]$ openstack loadbalancer list | grep demo
(shiftstack) [stack@undercloud-0 ~]$ oc logs -n openshift-kuryr $(oc get pods -n openshift-kuryr -o jsonpath='{.items[6].metadata.name}') | grep ERROR

Comment 5 errata-xmlrpc 2020-07-13 17:43:10 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2409

