Bug 1928029 - kuryr-controller in crash loop if IP is removed from secondary interfaces
Summary: kuryr-controller in crash loop if IP is removed from secondary interfaces
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.6
Hardware: All
OS: All
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.7.z
Assignee: Michał Dulko
QA Contact: GenadiC
URL:
Whiteboard:
Depends On: 1924917
Blocks: 1938161
 
Reported: 2021-02-12 07:48 UTC by OpenShift BugZilla Robot
Modified: 2021-03-25 01:53 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-03-25 01:53:01 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Github openshift kuryr-kubernetes pull 449 0 None open [release-4.7] Bug 1928029: Get trunks more diligently 2021-02-16 16:33:20 UTC
Red Hat Product Errata RHBA-2021:0821 0 None None None 2021-03-25 01:53:16 UTC

Comment 6 rlobillo 2021-03-15 15:42:28 UTC
Verified on 4.7.0-0.nightly-2021-03-14-223051 over OSP16.1 (RHOS-16.1-RHEL-8-20201214.n.3) with OVN-Octavia.

Steps:

1. Create extra network and subnet:

$ openstack network create data-network
$ openstack subnet create data-subnet --network data-network --gateway 10.0.0.1 --subnet-range 10.0.0.0/24 --dns-nameserver 10.46.0.31


2. Load the MachineSet manifest below, which defines two interfaces: one on the regular ostest-858gf-nodes subnet and a secondary one on the newly created data-network:

apiVersion: machine.openshift.io/v1beta1
kind: MachineSet
metadata:
  name: ostest-858gf-data-0
  namespace: openshift-machine-api
spec:
  replicas: 1
  selector:
    matchLabels:
      machine.openshift.io/cluster-api-cluster: ostest-858gf
      machine.openshift.io/cluster-api-machineset: ostest-858gf-data-0
  template:
    metadata:
      labels:
        machine.openshift.io/cluster-api-cluster: ostest-858gf
        machine.openshift.io/cluster-api-machine-role: worker
        machine.openshift.io/cluster-api-machine-type: worker
        machine.openshift.io/cluster-api-machineset: ostest-858gf-data-0
    spec:
      metadata: {}
      providerSpec:
        value:
          apiVersion: openstackproviderconfig.openshift.io/v1alpha1
          cloudName: openstack
          cloudsSecret:
            name: openstack-cloud-credentials
            namespace: openshift-machine-api
          flavor: m4.xlarge
          image: ostest-858gf-rhcos
          kind: OpenstackProviderSpec
          metadata:
            creationTimestamp: null
          networks:
          - filter: {}
            subnets:
            - filter:
                name: ostest-858gf-nodes
                tags: openshiftClusterID=ostest-858gf
            - filter:
                name: data-subnet
          primarySubnet: 7bbfcc1c-247f-4d72-927a-e188c082848c
          securityGroups:
          - filter: {}
            name: ostest-858gf-worker
          serverMetadata:
            Name: ostest-858gf-worker
            openshiftClusterID: ostest-858gf
          tags:
          - openshiftClusterID=ostest-858gf
          trunk: true
          userDataSecret:
            name: worker-user-data

3. Wait until the new worker is up, then remove the secondary IP manually:

$ openstack server list
+--------------------------------------+-----------------------------+--------+-------------------------------------------------------------+--------------------+--------+
| ID                                   | Name                        | Status | Networks                                                    | Image              | Flavor |
+--------------------------------------+-----------------------------+--------+-------------------------------------------------------------+--------------------+--------+
| b856bbf4-dfb5-45df-b3f6-0f6546465fe6 | ostest-858gf-data-0-297tt   | ACTIVE | data-network=10.0.0.245; ostest-858gf-openshift=10.196.1.87 | ostest-858gf-rhcos |        |
| 3b045e98-12f7-422b-b0b2-391c8f0082cd | ostest-858gf-worker-0-w6psd | ACTIVE | ostest-858gf-openshift=10.196.1.240                         | ostest-858gf-rhcos |        |
| 19046102-21e1-4cc0-9282-e1dd5dbbc398 | ostest-858gf-worker-0-qtc8n | ACTIVE | ostest-858gf-openshift=10.196.3.163                         | ostest-858gf-rhcos |        |
| 9f38334e-29e7-4c55-86e8-404da832e6bc | ostest-858gf-worker-0-9pgwp | ACTIVE | ostest-858gf-openshift=10.196.0.247                         | ostest-858gf-rhcos |        |
| 7119c59c-6ac9-48f4-93ef-3549da71d5a8 | ostest-858gf-master-2       | ACTIVE | ostest-858gf-openshift=10.196.2.78                          | ostest-858gf-rhcos |        |
| 5d591107-7c86-4c23-aa4f-78f6a50581fc | ostest-858gf-master-1       | ACTIVE | ostest-858gf-openshift=10.196.0.178                         | ostest-858gf-rhcos |        |
| 27d67be7-8f4a-488a-8323-54022dc288a9 | ostest-858gf-master-0       | ACTIVE | ostest-858gf-openshift=10.196.3.229                         | ostest-858gf-rhcos |        |
+--------------------------------------+-----------------------------+--------+-------------------------------------------------------------+--------------------+--------+

$ oc get nodes -o wide
NAME                          STATUS   ROLES    AGE     VERSION           INTERNAL-IP    EXTERNAL-IP   OS-IMAGE                                                       KERNEL-VERSION                 CONTAINER-RUNTIME
ostest-858gf-data-0-297tt     Ready    worker   2m58s   v1.20.0+bafe72f   10.196.1.87    <none>        Red Hat Enterprise Linux CoreOS 47.83.202103142039-0 (Ootpa)   4.18.0-240.15.1.el8_3.x86_64   cri-o://1.20.1-5.rhaos4.7.git62f21aa.el8
ostest-858gf-master-0         Ready    master   5h55m   v1.20.0+bafe72f   10.196.3.229   <none>        Red Hat Enterprise Linux CoreOS 47.83.202103142039-0 (Ootpa)   4.18.0-240.15.1.el8_3.x86_64   cri-o://1.20.1-5.rhaos4.7.git62f21aa.el8
ostest-858gf-master-1         Ready    master   5h55m   v1.20.0+bafe72f   10.196.0.178   <none>        Red Hat Enterprise Linux CoreOS 47.83.202103142039-0 (Ootpa)   4.18.0-240.15.1.el8_3.x86_64   cri-o://1.20.1-5.rhaos4.7.git62f21aa.el8
ostest-858gf-master-2         Ready    master   5h55m   v1.20.0+bafe72f   10.196.2.78    <none>        Red Hat Enterprise Linux CoreOS 47.83.202103142039-0 (Ootpa)   4.18.0-240.15.1.el8_3.x86_64   cri-o://1.20.1-5.rhaos4.7.git62f21aa.el8
ostest-858gf-worker-0-9pgwp   Ready    worker   5h35m   v1.20.0+bafe72f   10.196.0.247   <none>        Red Hat Enterprise Linux CoreOS 47.83.202103142039-0 (Ootpa)   4.18.0-240.15.1.el8_3.x86_64   cri-o://1.20.1-5.rhaos4.7.git62f21aa.el8
ostest-858gf-worker-0-qtc8n   Ready    worker   5h35m   v1.20.0+bafe72f   10.196.3.163   <none>        Red Hat Enterprise Linux CoreOS 47.83.202103142039-0 (Ootpa)   4.18.0-240.15.1.el8_3.x86_64   cri-o://1.20.1-5.rhaos4.7.git62f21aa.el8
ostest-858gf-worker-0-w6psd   Ready    worker   5h34m   v1.20.0+bafe72f   10.196.1.240   <none>        Red Hat Enterprise Linux CoreOS 47.83.202103142039-0 (Ootpa)   4.18.0-240.15.1.el8_3.x86_64   cri-o://1.20.1-5.rhaos4.7.git62f21aa.el8


$ oc debug node/ostest-858gf-data-0-297tt
sh-4.4# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: ens3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1442 qdisc fq_codel state UP group default qlen 1000
    link/ether fa:16:3e:1c:63:15 brd ff:ff:ff:ff:ff:ff
    inet 10.196.1.87/16 brd 10.196.255.255 scope global dynamic noprefixroute ens3
       valid_lft 42969sec preferred_lft 42969sec
    inet6 fe80::5cf:a645:1ae1:b49/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever
3: ens4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1442 qdisc fq_codel state UP group default qlen 1000
    link/ether fa:16:3e:12:1a:bd brd ff:ff:ff:ff:ff:ff
    inet 10.0.0.245/24 brd 10.0.0.255 scope global dynamic noprefixroute ens4
       valid_lft 42969sec preferred_lft 42969sec
    inet6 fe80::8572:6f9c:6dc:c005/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever

$ openstack port list --network data-network | grep ostest-858gf-data-0-297tt
| 78b76dcc-6db7-49be-9e49-8ffce4197432 | ostest-858gf-data-0-297tt | fa:16:3e:12:1a:bd | ip_address='10.0.0.245', subnet_id='52f120c2-2a71-43b6-8ad0-79232c6a8bfd' | ACTIVE |

$ openstack port set 78b76dcc-6db7-49be-9e49-8ffce4197432 --no-fixed-ip --no-allowed-address --allowed-address ip-address=10.0.0.0/24
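
Note on the command above: it clears the port's fixed IP, drops the existing allowed address pairs, and then adds a single allowed address pair covering the whole data-subnet range. As a rough sanity check, assuming only the addresses quoted in this comment, the following sketch confirms that the node's former address and the subnet gateway both fall inside that range. The variable names are illustrative only.

```python
import ipaddress

# Values copied from the commands and output in this comment.
allowed_pair = ipaddress.ip_network("10.0.0.0/24")      # --allowed-address ip-address=...
former_fixed_ip = ipaddress.ip_address("10.0.0.245")    # address shown on ens4 before removal
gateway = ipaddress.ip_address("10.0.0.1")              # --gateway used for data-subnet

# True when the removed fixed IP is still permitted by the allowed address pair.
covered = former_fixed_ip in allowed_pair
gateway_in_subnet = gateway in allowed_pair

print(f"former fixed IP covered by allowed pair: {covered}")
print(f"gateway inside subnet range: {gateway_in_subnet}")
```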

$ openstack server list
+--------------------------------------+-----------------------------+--------+-------------------------------------+--------------------+--------+
| ID                                   | Name                        | Status | Networks                            | Image              | Flavor |
+--------------------------------------+-----------------------------+--------+-------------------------------------+--------------------+--------+
| b856bbf4-dfb5-45df-b3f6-0f6546465fe6 | ostest-858gf-data-0-297tt   | ACTIVE | ostest-858gf-openshift=10.196.1.87  | ostest-858gf-rhcos |        |
| 3b045e98-12f7-422b-b0b2-391c8f0082cd | ostest-858gf-worker-0-w6psd | ACTIVE | ostest-858gf-openshift=10.196.1.240 | ostest-858gf-rhcos |        |
| 19046102-21e1-4cc0-9282-e1dd5dbbc398 | ostest-858gf-worker-0-qtc8n | ACTIVE | ostest-858gf-openshift=10.196.3.163 | ostest-858gf-rhcos |        |
| 9f38334e-29e7-4c55-86e8-404da832e6bc | ostest-858gf-worker-0-9pgwp | ACTIVE | ostest-858gf-openshift=10.196.0.247 | ostest-858gf-rhcos |        |
| 7119c59c-6ac9-48f4-93ef-3549da71d5a8 | ostest-858gf-master-2       | ACTIVE | ostest-858gf-openshift=10.196.2.78  | ostest-858gf-rhcos |        |
| 5d591107-7c86-4c23-aa4f-78f6a50581fc | ostest-858gf-master-1       | ACTIVE | ostest-858gf-openshift=10.196.0.178 | ostest-858gf-rhcos |        |
| 27d67be7-8f4a-488a-8323-54022dc288a9 | ostest-858gf-master-0       | ACTIVE | ostest-858gf-openshift=10.196.3.229 | ostest-858gf-rhcos |        |
+--------------------------------------+-----------------------------+--------+-------------------------------------+--------------------+--------+
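
Comparing the two `openstack server list` outputs by eye is error-prone, so here is a minimal sketch of how the Networks column could be compared programmatically. It assumes the `name=addr; name=addr` cell layout seen above; `parse_networks` is a hypothetical helper written for this comment, not part of any OpenStack client.

```python
def parse_networks(cell):
    """Parse a Networks cell such as 'net-a=10.0.0.5; net-b=10.1.0.7'
    into a dict mapping network name to a list of addresses."""
    networks = {}
    for entry in cell.split(";"):
        entry = entry.strip()
        if not entry:
            continue
        name, _, addrs = entry.partition("=")
        networks[name.strip()] = [a.strip() for a in addrs.split(",") if a.strip()]
    return networks


# Networks cells for ostest-858gf-data-0-297tt, copied from the two listings.
before = parse_networks("data-network=10.0.0.245; ostest-858gf-openshift=10.196.1.87")
after = parse_networks("ostest-858gf-openshift=10.196.1.87")

# Networks on which the server no longer reports any address.
removed = sorted(set(before) - set(after))
print("networks without an address after the change:", removed)
```

For the data worker this reports only data-network as removed, while the address on ostest-858gf-openshift is unchanged.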


4. Restart kuryr-controller and confirm that it remains stable:

$ oc delete pods -n openshift-kuryr -l app=kuryr-controller
pod "kuryr-controller-8699ddb754-zmtpq" deleted

$ oc get pod -n openshift-kuryr
NAME                                READY   STATUS    RESTARTS   AGE
kuryr-cni-cs8nk                     1/1     Running   0          5h57m
kuryr-cni-fsptw                     1/1     Running   0          7m44s
kuryr-cni-gbffk                     1/1     Running   0          5h39m
kuryr-cni-ghjjk                     1/1     Running   0          5h57m
kuryr-cni-tn2wp                     1/1     Running   0          5h40m
kuryr-cni-twlhx                     1/1     Running   0          5h57m
kuryr-cni-v2jrv                     1/1     Running   0          5h39m
kuryr-controller-8699ddb754-nstpj   1/1     Running   0          118s
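
For repeated verification runs, the stability check could be scripted. The sketch below assumes the plain-text column layout of `oc get pod` shown above (NAME, READY, STATUS, RESTARTS, AGE); `controller_is_stable` is a hypothetical helper, and the crash-loop sample row is made up purely for contrast.

```python
def controller_is_stable(oc_get_pod_output, prefix="kuryr-controller"):
    """Return True if every pod whose name starts with `prefix` is Running,
    fully ready, and has zero restarts. Returns False if no such pod exists."""
    lines = oc_get_pod_output.strip().splitlines()[1:]  # skip the header row
    rows = [line.split() for line in lines]
    controllers = [r for r in rows if len(r) >= 4 and r[0].startswith(prefix)]
    if not controllers:
        return False
    for _name, ready, status, restarts, *_rest in controllers:
        ready_now, _, ready_total = ready.partition("/")
        if status != "Running" or ready_now != ready_total or restarts != "0":
            return False
    return True


# Rows taken from the output in this comment (trimmed to two pods).
verified_output = """\
NAME                                READY   STATUS    RESTARTS   AGE
kuryr-cni-cs8nk                     1/1     Running   0          5h57m
kuryr-controller-8699ddb754-nstpj   1/1     Running   0          118s
"""

# Hypothetical example of what a crash-looping controller might look like.
crashing_output = """\
NAME                                READY   STATUS             RESTARTS   AGE
kuryr-cni-cs8nk                     1/1     Running            0          5h57m
kuryr-controller-8699ddb754-xxxxx   0/1     CrashLoopBackOff   5          6m
"""

print(controller_is_stable(verified_output))
print(controller_is_stable(crashing_output))
```

A single snapshot only shows the state at one moment; checking again after a few minutes gives more confidence that the controller is not restarting.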

Comment 8 errata-xmlrpc 2021-03-25 01:53:01 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.7.3 bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:0821

