Description of problem:
When the fixed IP is removed from a node's secondary (Multus) network interface and the kuryr controller is restarted, the controller crash loops with the following error message:
~~~
2021-02-03 17:44:59.534 1 ERROR oslo_service.service [-] Error starting thread.: IndexError: list index out of range
2021-02-03 17:44:59.534 1 ERROR oslo_service.service Traceback (most recent call last):
2021-02-03 17:44:59.534 1 ERROR oslo_service.service   File "/usr/lib/python3.6/site-packages/oslo_service/service.py", line 810, in run_service
2021-02-03 17:44:59.534 1 ERROR oslo_service.service     service.start()
2021-02-03 17:44:59.534 1 ERROR oslo_service.service   File "/usr/lib/python3.6/site-packages/kuryr_kubernetes/controller/service.py", line 110, in start
2021-02-03 17:44:59.534 1 ERROR oslo_service.service     self.pool_driver.sync_pools()
2021-02-03 17:44:59.534 1 ERROR oslo_service.service   File "/usr/lib/python3.6/site-packages/kuryr_kubernetes/controller/drivers/vif_pool.py", line 1241, in sync_pools
2021-02-03 17:44:59.534 1 ERROR oslo_service.service     vif_drv.sync_pools()
2021-02-03 17:44:59.534 1 ERROR oslo_service.service   File "/usr/lib/python3.6/site-packages/kuryr_kubernetes/controller/drivers/vif_pool.py", line 994, in sync_pools
2021-02-03 17:44:59.534 1 ERROR oslo_service.service     self._recover_precreated_ports()
2021-02-03 17:44:59.534 1 ERROR oslo_service.service   File "/usr/lib/python3.6/site-packages/kuryr_kubernetes/controller/drivers/vif_pool.py", line 999, in _recover_precreated_ports
2021-02-03 17:44:59.534 1 ERROR oslo_service.service     self._precreated_ports(action='recover')
2021-02-03 17:44:59.534 1 ERROR oslo_service.service   File "/usr/lib/python3.6/site-packages/kuryr_kubernetes/controller/drivers/vif_pool.py", line 1022, in _precreated_ports
2021-02-03 17:44:59.534 1 ERROR oslo_service.service     parent_ports, available_subports, subnets = self._get_trunks_info()
2021-02-03 17:44:59.534 1 ERROR oslo_service.service   File "/usr/lib/python3.6/site-packages/kuryr_kubernetes/controller/drivers/vif_pool.py", line 412, in _get_trunks_info
2021-02-03 17:44:59.534 1 ERROR oslo_service.service     'ip': port.fixed_ips[0]['ip_address'],
2021-02-03 17:44:59.534 1 ERROR oslo_service.service IndexError: list index out of range
2021-02-03 17:44:59.534 1 ERROR oslo_service.service
~~~
The use case for this scenario is a setup with a limited number of IP addresses. Multus secondary interfaces with ipvlan should be configured with whereabouts, and only the pods should need IP addresses from the network, not the nodes. This is why the deployment automation removes the node IPs after deployment.

Version-Release number of selected component (if applicable):
OpenShift 4.6.12
OpenStack 16.1.3 (AIO)
Instances Deployed with additional network interfaces

How reproducible:
Always

Steps to Reproduce:
1. Deploy Cluster with Kuryr
2. Configure MachineSet with additional network interfaces
~~~
[stack@osp16amd ocp-test1]$ oc get machineset -n openshift-machine-api
NAME                 DESIRED   CURRENT   READY   AVAILABLE   AGE
ocp-phnb2-worker-0   2         2         2       2           8d
ocp-phnb2-worker-1   1         1         1       1           3h27m
[stack@osp16amd ocp-test1]$ oc get machineset ocp-phnb2-worker-1 -n openshift-machine-api -o json | jq -r .spec.template.spec.providerSpec.value.networks
[
  {
    "filter": {},
    "subnets": [
      {
        "filter": {
          "name": "ocp-phnb2-nodes",
          "tags": "openshiftClusterID=ocp-phnb2"
        }
      }
    ]
  },
  {
    "filter": {},
    "noAllowedAddressPairs": true,
    "subnets": [
      {
        "filter": {
          "name": "additional-network-subnet"
        }
      }
    ]
  }
]
~~~
3.
Remove IP from Node Network interface
~~~
[stack@osp16amd ocp-test1]$ openstack port list --network additional-network
+--------------------------------------+--------------------------+-------------------+-------------------------------------------------------------------------------+--------+
| ID                                   | Name                     | MAC Address       | Fixed IP Addresses                                                            | Status |
+--------------------------------------+--------------------------+-------------------+-------------------------------------------------------------------------------+--------+
| 52a86be6-763c-497e-9d06-21cff0fa4dab | ocp-phnb2-worker-1-9wpjw | fa:16:3e:bd:58:45 | ip_address='192.168.123.68', subnet_id='8ea4d2d7-5541-4bd3-8828-86b441ae06f9' | ACTIVE |
| 66a3e17b-b111-4e06-a709-083c96cf57e6 |                          | fa:16:3e:d6:02:3f | ip_address='192.168.123.10', subnet_id='8ea4d2d7-5541-4bd3-8828-86b441ae06f9' | DOWN   |
| aaf9493c-bcfd-42bf-9cb3-b8eedbb5cb69 |                          | fa:16:3e:66:7e:2f | ip_address='192.168.123.1', subnet_id='8ea4d2d7-5541-4bd3-8828-86b441ae06f9'  | ACTIVE |
+--------------------------------------+--------------------------+-------------------+-------------------------------------------------------------------------------+--------+
[stack@osp16amd ocp-test1]$ openstack port set 52a86be6-763c-497e-9d06-21cff0fa4dab --no-fixed-ip --no-allowed-address --allowed-address ip-address=192.168.123.0/24
[stack@osp16amd ocp-test1]$ openstack port list --network additional-network
+--------------------------------------+--------------------------+-------------------+-------------------------------------------------------------------------------+--------+
| ID                                   | Name                     | MAC Address       | Fixed IP Addresses                                                            | Status |
+--------------------------------------+--------------------------+-------------------+-------------------------------------------------------------------------------+--------+
| 52a86be6-763c-497e-9d06-21cff0fa4dab | ocp-phnb2-worker-1-9wpjw | fa:16:3e:bd:58:45 |                                                                               | ACTIVE |
| 66a3e17b-b111-4e06-a709-083c96cf57e6 |                          | fa:16:3e:d6:02:3f | ip_address='192.168.123.10', subnet_id='8ea4d2d7-5541-4bd3-8828-86b441ae06f9' | DOWN   |
| aaf9493c-bcfd-42bf-9cb3-b8eedbb5cb69 |                          | fa:16:3e:66:7e:2f | ip_address='192.168.123.1', subnet_id='8ea4d2d7-5541-4bd3-8828-86b441ae06f9'  | ACTIVE |
+--------------------------------------+--------------------------+-------------------+-------------------------------------------------------------------------------+--------+
~~~
4. Restart Kuryr Pod
~~~
[stack@osp16amd ocp-test1]$ oc delete pods -n openshift-kuryr -l app=kuryr-controller
pod "kuryr-controller-75957cd77d-fd444" deleted
~~~
5. Check the Kuryr Pod status
~~~
[stack@osp16amd ocp-test1]$ oc get pods -n openshift-kuryr
NAME                                READY   STATUS             RESTARTS   AGE
kuryr-cni-57vnm                     1/1     Running            0          8d
kuryr-cni-59wzz                     1/1     Running            0          59m
kuryr-cni-6cjsx                     1/1     Running            0          28h
kuryr-cni-hgftg                     1/1     Running            0          8d
kuryr-cni-hlkq4                     1/1     Running            0          29h
kuryr-cni-xmf4k                     1/1     Running            0          8d
kuryr-controller-75957cd77d-4wtn5   0/1     CrashLoopBackOff   1          68s
[stack@osp16amd ocp-test1]$ oc logs -n openshift-kuryr kuryr-controller-75957cd77d-4wtn5
2021-02-03 17:44:33.247 1 INFO kuryr_kubernetes.config [-] Logging enabled!
2021-02-03 17:44:33.248 1 INFO kuryr_kubernetes.config [-] /usr/bin/kuryr-k8s-controller version 4.6.0
2021-02-03 17:44:34.335 1 INFO os_vif [-] Loaded VIF plugins: linux_bridge, noop, ovs, noop, sriov
2021-02-03 17:44:34.344 1 INFO kuryr_kubernetes.controller.service [-] Configured handlers: ['vif', 'kuryrport', 'service', 'endpoints', 'kuryrloadbalancer', 'policy', 'pod_label', 'namespace', 'kuryrnetworkpolicy', 'kuryrnetwork']
2021-02-03 17:44:34.384 1 INFO kuryr_kubernetes.controller.drivers.lbaasv2 [-] Octavia supports ACLs for Amphora provider.
2021-02-03 17:44:34.384 1 INFO kuryr_kubernetes.controller.drivers.lbaasv2 [-] Octavia supports double listeners (different protocol, same port) for Amphora provider.
2021-02-03 17:44:34.384 1 INFO kuryr_kubernetes.controller.drivers.lbaasv2 [-] Octavia supports resource tags.
2021-02-03 17:44:34.462 1 INFO kuryr_kubernetes.controller.service [-] Loaded handlers: ['endpoints', 'kuryrloadbalancer', 'kuryrnetwork', 'kuryrnetworkpolicy', 'kuryrport', 'namespace', 'pod_label', 'policy', 'service', 'vif']
2021-02-03 17:44:34.472 1 WARNING oslo_config.cfg [-] Deprecated: Option "sg_mode" from group "octavia_defaults" is deprecated for removal (enforce_sg_rules option can be used instead). Its value may be silently ignored in the future.
2021-02-03 17:44:34.481 1 INFO kuryr_kubernetes.controller.service [-] Service 'KuryrK8sService' stopped
2021-02-03 17:44:34.482 1 INFO kuryr_kubernetes.controller.service [-] Service 'KuryrK8sService' starting
2021-02-03 17:44:34.483 1 INFO kuryr_kubernetes.controller.service [-] Running in non-HA mode, starting watcher immediately.
2021-02-03 17:44:34.487 1 INFO kuryr_kubernetes.watcher [-] Started watching '/api/v1/endpoints'
2021-02-03 17:44:34.492 1 INFO kuryr_kubernetes.watcher [-] Started watching '/apis/openstack.org/v1/kuryrloadbalancers'
2021-02-03 17:44:34.502 1 INFO kuryr_kubernetes.watcher [-] Started watching '/apis/openstack.org/v1/kuryrports'
2021-02-03 17:44:34.509 1 INFO kuryr_kubernetes.watcher [-] Started watching '/api/v1/namespaces'
2021-02-03 17:44:34.512 1 INFO kuryr_kubernetes.watcher [-] Started watching '/api/v1/pods'
2021-02-03 17:44:34.517 1 INFO kuryr_kubernetes.watcher [-] Started watching '/apis/networking.k8s.io/v1/networkpolicies'
2021-02-03 17:44:34.522 1 INFO kuryr_kubernetes.watcher [-] Started watching '/apis/openstack.org/v1/kuryrnetworkpolicies'
2021-02-03 17:44:34.527 1 INFO kuryr_kubernetes.watcher [-] Started watching '/apis/openstack.org/v1/kuryrnetworks'
2021-02-03 17:44:34.531 1 INFO kuryr_kubernetes.watcher [-] Started watching '/api/v1/services'
2021-02-03 17:44:59.534 1 ERROR oslo_service.service [-] Error starting thread.: IndexError: list index out of range
2021-02-03 17:44:59.534 1 ERROR oslo_service.service Traceback (most recent call last):
2021-02-03 17:44:59.534 1 ERROR oslo_service.service   File "/usr/lib/python3.6/site-packages/oslo_service/service.py", line 810, in run_service
2021-02-03 17:44:59.534 1 ERROR oslo_service.service     service.start()
2021-02-03 17:44:59.534 1 ERROR oslo_service.service   File "/usr/lib/python3.6/site-packages/kuryr_kubernetes/controller/service.py", line 110, in start
2021-02-03 17:44:59.534 1 ERROR oslo_service.service     self.pool_driver.sync_pools()
2021-02-03 17:44:59.534 1 ERROR oslo_service.service   File "/usr/lib/python3.6/site-packages/kuryr_kubernetes/controller/drivers/vif_pool.py", line 1241, in sync_pools
2021-02-03 17:44:59.534 1 ERROR oslo_service.service     vif_drv.sync_pools()
2021-02-03 17:44:59.534 1 ERROR oslo_service.service   File "/usr/lib/python3.6/site-packages/kuryr_kubernetes/controller/drivers/vif_pool.py", line 994, in sync_pools
2021-02-03 17:44:59.534 1 ERROR oslo_service.service     self._recover_precreated_ports()
2021-02-03 17:44:59.534 1 ERROR oslo_service.service   File "/usr/lib/python3.6/site-packages/kuryr_kubernetes/controller/drivers/vif_pool.py", line 999, in _recover_precreated_ports
2021-02-03 17:44:59.534 1 ERROR oslo_service.service     self._precreated_ports(action='recover')
2021-02-03 17:44:59.534 1 ERROR oslo_service.service   File "/usr/lib/python3.6/site-packages/kuryr_kubernetes/controller/drivers/vif_pool.py", line 1022, in _precreated_ports
2021-02-03 17:44:59.534 1 ERROR oslo_service.service     parent_ports, available_subports, subnets = self._get_trunks_info()
2021-02-03 17:44:59.534 1 ERROR oslo_service.service   File "/usr/lib/python3.6/site-packages/kuryr_kubernetes/controller/drivers/vif_pool.py", line 412, in _get_trunks_info
2021-02-03 17:44:59.534 1 ERROR oslo_service.service     'ip': port.fixed_ips[0]['ip_address'],
2021-02-03 17:44:59.534 1 ERROR oslo_service.service IndexError: list index out of range
2021-02-03 17:44:59.534 1 ERROR oslo_service.service
2021-02-03 17:44:59.536 1 INFO kuryr_kubernetes.controller.service [-] Service 'KuryrK8sService' stopping
2021-02-03 17:44:59.536 1 INFO kuryr_kubernetes.watcher [-] Stopped watching '/api/v1/endpoints'
2021-02-03 17:44:59.537 1 INFO kuryr_kubernetes.watcher [-] Stopped watching '/apis/openstack.org/v1/kuryrloadbalancers'
2021-02-03 17:44:59.537 1 INFO kuryr_kubernetes.watcher [-] Stopped watching '/apis/openstack.org/v1/kuryrports'
2021-02-03 17:44:59.538 1 INFO kuryr_kubernetes.watcher [-] Stopped watching '/api/v1/namespaces'
2021-02-03 17:44:59.539 1 INFO kuryr_kubernetes.watcher [-] Stopped watching '/api/v1/pods'
2021-02-03 17:44:59.539 1 INFO kuryr_kubernetes.watcher [-] Stopped watching '/apis/networking.k8s.io/v1/networkpolicies'
2021-02-03 17:44:59.540 1 INFO kuryr_kubernetes.watcher [-] Stopped watching '/apis/openstack.org/v1/kuryrnetworkpolicies'
2021-02-03 17:44:59.540 1 INFO kuryr_kubernetes.watcher [-] Stopped watching '/apis/openstack.org/v1/kuryrnetworks'
2021-02-03 17:44:59.541 1 INFO kuryr_kubernetes.watcher [-] Stopped watching '/api/v1/services'
2021-02-03 17:44:59.541 1 INFO kuryr_kubernetes.watcher [-] No remaining active watchers, Exiting...
~~~

Actual results:
kuryr-controller in Crash Loop

Expected results:
kuryr-controller OK

Additional info:
Re-adding the fixed IP to the port fixes the issue:

Add IP:
~~~
openstack port set 52a86be6-763c-497e-9d06-21cff0fa4dab --no-fixed-ip --fixed-ip subnet=additional-network-subnet,ip-address=192.168.123.68 --no-security-group --no-allowed-address --allowed-address ip-address=192.168.123.0/24
~~~
Restart Kuryr:
~~~
[stack@osp16amd ocp-test1]$ oc delete pods -n openshift-kuryr -l app=kuryr-controller
pod "kuryr-controller-75957cd77d-4wtn5" deleted
~~~
Check Status:
~~~
[stack@osp16amd ocp-test1]$ oc get pods -n openshift-kuryr -l app=kuryr-controller
NAME                                READY   STATUS    RESTARTS   AGE
kuryr-controller-75957cd77d-n7mbp   1/1     Running   0          44s
~~~
Looks like kuryr is configured to use that interface (the VM trunk port). Having the VM's trunk port without an IP is not supported, because that port is the one used by the subports attached to the containers.
Actually only the IP of the SECONDARY interface (ens4) is removed, not the ip of the PRIMARY interface (ens3) used for kuryr and the subports Note: Port where IP was removed was 52a86be6-763c-497e-9d06-21cff0fa4dab ~~~ [stack@osp16amd ocp-test1]$ openstack network trunk list | grep ocp-phnb2-worker-1 | 0b491376-57a3-44c4-9576-40581c62b6b5 | ocp-phnb2-worker-1-9wpjw | 52a86be6-763c-497e-9d06-21cff0fa4dab | | | cf9547e0-5ce4-4120-9b70-36112c7b359e | ocp-phnb2-worker-1-9wpjw | 40607804-4fca-4db0-9191-8552481d61bf | | # This is one where the IP was removed [stack@osp16amd ocp-test1]$ openstack port show 52a86be6-763c-497e-9d06-21cff0fa4dab -f value -c mac_address -c name -c trunk_details fa:16:3e:bd:58:45 ocp-phnb2-worker-1-9wpjw {'trunk_id': '0b491376-57a3-44c4-9576-40581c62b6b5', 'sub_ports': []} # Here NO IP was removed [stack@osp16amd ocp-test1]$ openstack port show 40607804-4fca-4db0-9191-8552481d61bf -f value -c mac_address -c name -c trunk_details fa:16:3e:7b:87:db ocp-phnb2-worker-1-9wpjw {'trunk_id': 'cf9547e0-5ce4-4120-9b70-36112c7b359e', 'sub_ports': [{'segmentation_id': 6, 'segmentation_type': 'vlan', 'port_id': '953352a7-04e1-4837-875f-87002d6dd9a4', 'mac_address': 'fa:16:3e:66:bd:2c'}, {'segmentation_id': 57, 'segmentation_type': 'vlan', 'port_id': '7a9e7cd5-f2e8-4e5f-8146-49a504c6f119', 'mac_address': 'fa:16:3e:2d:eb:10'}, {'segmentation_id': 802, 'segmentation_type': 'vlan', 'port_id': '5bfbbaa8-97ea-415b-977d-3d8c2e089e6c', 'mac_address': 'fa:16:3e:c0:64:c1'}, {'segmentation_id': 878, 'segmentation_type': 'vlan', 'port_id': '3c317835-d264-4bf7-b7dc-511c4db6c9e3', 'mac_address': 'fa:16:3e:65:7b:a3'}, {'segmentation_id': 920, 'segmentation_type': 'vlan', 'port_id': 'e75716eb-5fe9-4828-98ae-1ffba35a2b44', 'mac_address': 'fa:16:3e:62:7c:03'}, {'segmentation_id': 1653, 'segmentation_type': 'vlan', 'port_id': 'fce27960-2151-48c8-98e2-db174971ecc1', 'mac_address': 'fa:16:3e:2c:f4:03'}, {'segmentation_id': 1699, 'segmentation_type': 'vlan', 'port_id': 
'573a65e0-0b6b-461b-b5b2-8e80ecc37258', 'mac_address': 'fa:16:3e:9b:26:b8'}, {'segmentation_id': 2009, 'segmentation_type': 'vlan', 'port_id': '6c98ab03-6175-4047-9caf-94ca0ff43baa', 'mac_address': 'fa:16:3e:30:1c:ea'}, {'segmentation_id': 2138, 'segmentation_type': 'vlan', 'port_id': '1d618b48-fdaa-4651-abb2-d33e67964916', 'mac_address': 'fa:16:3e:d1:3e:5c'}, {'segmentation_id': 2222, 'segmentation_type': 'vlan', 'port_id': 'f8b667d2-5ca3-4b09-80b9-429549939ec9', 'mac_address': 'fa:16:3e:f9:0a:fb'}, {'segmentation_id': 2280, 'segmentation_type': 'vlan', 'port_id': '32bc7b34-6fbf-46cb-ac73-c31943bcdcaa', 'mac_address': 'fa:16:3e:64:ed:4b'}, {'segmentation_id': 2302, 'segmentation_type': 'vlan', 'port_id': 'f2560925-e025-4c2e-b0ac-1c80e2769891', 'mac_address': 'fa:16:3e:d8:1a:e0'}, {'segmentation_id': 2428, 'segmentation_type': 'vlan', 'port_id': 'c922ba07-261e-441e-a76c-8c631299cf07', 'mac_address': 'fa:16:3e:61:32:af'}, {'segmentation_id': 2499, 'segmentation_type': 'vlan', 'port_id': 'a5a49479-89ae-4bcf-b450-1666cbfb7a83', 'mac_address': 'fa:16:3e:80:39:30'}, {'segmentation_id': 2598, 'segmentation_type': 'vlan', 'port_id': '46d19092-a5a9-4f33-9f83-39ef7124d09b', 'mac_address': 'fa:16:3e:4c:5d:08'}, {'segmentation_id': 2656, 'segmentation_type': 'vlan', 'port_id': '5ce9b3ed-d615-4427-a673-18de43f6753b', 'mac_address': 'fa:16:3e:af:22:bd'}, {'segmentation_id': 2935, 'segmentation_type': 'vlan', 'port_id': 'c75c3e74-e24d-457e-ac8d-7264cfdd8973', 'mac_address': 'fa:16:3e:72:5e:df'}, {'segmentation_id': 3011, 'segmentation_type': 'vlan', 'port_id': '6691528a-b062-4eb8-a532-e4e90c817880', 'mac_address': 'fa:16:3e:94:5a:4c'}, {'segmentation_id': 3203, 'segmentation_type': 'vlan', 'port_id': 'dc4fee62-e4fd-48a0-ba49-03d3f99a8fc9', 'mac_address': 'fa:16:3e:65:e8:30'}, {'segmentation_id': 3436, 'segmentation_type': 'vlan', 'port_id': '5adf3b0f-26f7-42af-91d2-08b34b5f1ab0', 'mac_address': 'fa:16:3e:bd:94:71'}, {'segmentation_id': 3475, 'segmentation_type': 'vlan', 
'port_id': '762e55b8-fd28-43fa-b086-51560802f640', 'mac_address': 'fa:16:3e:34:7c:8b'}, {'segmentation_id': 3593, 'segmentation_type': 'vlan', 'port_id': 'ba9b5ed1-aa89-4183-bc0b-0b9af4890039', 'mac_address': 'fa:16:3e:9a:45:14'}, {'segmentation_id': 3628, 'segmentation_type': 'vlan', 'port_id': 'ada2be75-2edd-4c42-877b-f08b86133174', 'mac_address': 'fa:16:3e:3b:0f:29'}, {'segmentation_id': 3634, 'segmentation_type': 'vlan', 'port_id': '2336f374-1c4b-4d15-bf19-dcf4508ed236', 'mac_address': 'fa:16:3e:b5:0f:64'}, {'segmentation_id': 3650, 'segmentation_type': 'vlan', 'port_id': '899c5761-4bd6-4f9e-9c77-d5a3a132e95a', 'mac_address': 'fa:16:3e:79:1b:9b'}, {'segmentation_id': 3715, 'segmentation_type': 'vlan', 'port_id': '7d7899e8-a0dd-4d38-b865-15544c2415b3', 'mac_address': 'fa:16:3e:69:36:27'}, {'segmentation_id': 3822, 'segmentation_type': 'vlan', 'port_id': 'e535223e-2fa6-475a-b34f-d332c8084e10', 'mac_address': 'fa:16:3e:b4:a3:17'}, {'segmentation_id': 3937, 'segmentation_type': 'vlan', 'port_id': '2e7ca23a-93dc-4402-a318-6dffc6b2294c', 'mac_address': 'fa:16:3e:eb:59:ea'}, {'segmentation_id': 3977, 'segmentation_type': 'vlan', 'port_id': '7b3dd5db-d1f6-4e1a-b719-3e26e2bffb31', 'mac_address': 'fa:16:3e:42:dd:4e'}, {'segmentation_id': 4005, 'segmentation_type': 'vlan', 'port_id': '2df82508-9f57-4e58-8871-531401a7a9e1', 'mac_address': 'fa:16:3e:b5:84:c4'}]} [stack@osp16amd ocp-test1]$ oc debug node/ocp-phnb2-worker-1-9wpjw -- ip link Creating debug namespace/openshift-debug-node-rkknj ... Starting pod/ocp-phnb2-worker-1-9wpjw-debug ... 
To use host binaries, run `chroot /host` 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 2: ens3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1442 qdisc fq_codel state UP mode DEFAULT group default qlen 1000 link/ether fa:16:3e:7b:87:db brd ff:ff:ff:ff:ff:ff 3: ens4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1442 qdisc fq_codel state UP mode DEFAULT group default qlen 1000 link/ether fa:16:3e:bd:58:45 brd ff:ff:ff:ff:ff:ff Removing debug pod ... Removing debug namespace/openshift-debug-node-rkknj ... ~~~
I think I can confirm this happens. I filed BZ [1] because of it: the culprit is that the trunk port for the worker's secondary interfaces should not be created by machine-api/CAPO in the first place. This means that a possible workaround (untested, but it should work unless CAPO recreates the trunks) would be to remove the trunks on the secondary interfaces. We'll add a workaround in Kuryr anyway. [1] https://bugzilla.redhat.com/show_bug.cgi?id=1925233
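To illustrate the kind of guard such a workaround could use, here is a minimal Python sketch. It is not the actual Kuryr patch: the function name `collect_trunk_parents` and the plain-dict port records are hypothetical stand-ins for what `_get_trunks_info` iterates over, and the primary interface's IP address is made up. The only point is that a trunk parent port with an empty `fixed_ips` list is skipped rather than indexed with `[0]`.

```python
from typing import Any, Dict, Iterable


def collect_trunk_parents(ports: Iterable[Dict[str, Any]]) -> Dict[str, Dict[str, Any]]:
    """Map trunk ID to parent port info, skipping parents without a fixed IP.

    Hypothetical helper for illustration only; not Kuryr's real code.
    """
    parents: Dict[str, Dict[str, Any]] = {}
    for port in ports:
        trunk = port.get("trunk_details")
        if not trunk:
            continue  # not a trunk parent port
        fixed_ips = port.get("fixed_ips") or []
        if not fixed_ips:
            # Indexing fixed_ips[0] on an empty list is what raises the
            # IndexError in the traceback; skip the port instead.
            continue
        parents[trunk["trunk_id"]] = {
            "ip": fixed_ips[0]["ip_address"],
            "subports": trunk.get("sub_ports", []),
        }
    return parents


ports = [
    {
        # Secondary interface (ens4): fixed IP removed, trunk has no subports.
        "id": "52a86be6-763c-497e-9d06-21cff0fa4dab",
        "fixed_ips": [],
        "trunk_details": {
            "trunk_id": "0b491376-57a3-44c4-9576-40581c62b6b5",
            "sub_ports": [],
        },
    },
    {
        # Primary interface (ens3); the IP address is an assumption.
        "id": "40607804-4fca-4db0-9191-8552481d61bf",
        "fixed_ips": [{"ip_address": "10.0.0.5"}],
        "trunk_details": {
            "trunk_id": "cf9547e0-5ce4-4120-9b70-36112c7b359e",
            "sub_ports": [
                {"port_id": "953352a7-04e1-4837-875f-87002d6dd9a4", "segmentation_id": 6},
            ],
        },
    },
]

result = collect_trunk_parents(ports)
print(sorted(result))
```

With the two ports from this report, only the trunk on the primary interface is returned, so the IP-less secondary port no longer aborts the pool sync.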
Verified on OCP4.8.0-0.nightly-2021-02-21-102854 over OSP13 (2021-01-20.1) with amphora provider.

Steps:
1. Create extra network and subnet:
$ openstack network create data-network
$ openstack subnet create data-subnet --network data-network --gateway 10.196.0.1 --subnet-range 10.196.0.0/16 --dns-nameserver 10.46.0.31

2. Create new machineset including 1 worker with 2 interfaces (https://gist.github.com/rlobillo/4e80b1bdf1c5da995378db4aea01c76a)

3. Wait until the new worker is up and remove the secondary IP manually:
$ openstack server list
+--------------------------------------+-----------------------------+--------+----------------------------------------------------------------+---------------------------------------+-----------+
| ID                                   | Name                        | Status | Networks                                                       | Image                                 | Flavor    |
+--------------------------------------+-----------------------------+--------+----------------------------------------------------------------+---------------------------------------+-----------+
| 1b8ebbbd-c460-451c-ab22-ff139ac62b58 | ostest-dzghr-data-0-d7rzc   | ACTIVE | data-network=10.196.0.71; installer_host-network=172.16.40.235 | ostest-dzghr-rhcos                    | m4.xlarge |
| d70b20e0-48fb-4df8-90f0-380bd4eb749e | ostest-dzghr-worker-0-wkn9v | ACTIVE | installer_host-network=172.16.40.187                           | ostest-dzghr-rhcos                    | m4.xlarge |
| b9a6e2bb-562f-46bc-b973-ef19db04a1f7 | ostest-dzghr-master-2       | ACTIVE | installer_host-network=172.16.40.84                            | ostest-dzghr-rhcos                    | m4.xlarge |
| 276f710a-af2d-4c7c-83dc-6bf69c45488c | ostest-dzghr-master-1       | ACTIVE | installer_host-network=172.16.40.216                           | ostest-dzghr-rhcos                    | m4.xlarge |
| a2bfb533-49cb-4155-8572-f4257de47c33 | ostest-dzghr-master-0       | ACTIVE | installer_host-network=172.16.40.156                           | ostest-dzghr-rhcos                    | m4.xlarge |
| f19bc5aa-4232-43a8-9827-17312294b997 | installer_host              | ACTIVE | installer_host-network=172.16.40.120, 10.46.22.245             | rhel-guest-image-8.3-401.x86_64.qcow2 | m1.medium |
+--------------------------------------+-----------------------------+--------+----------------------------------------------------------------+---------------------------------------+-----------+
$ openstack port list --network data-network | grep 10.196.0.71
| 323ac436-e40b-4c6f-aa87-e41bde227a7a | ostest-dzghr-data-0-d7rzc | fa:16:3e:7a:0c:94 | ip_address='10.196.0.71', subnet_id='dd2c3046-6a23-4e82-94a3-57a556e03fff' | ACTIVE |
$ openstack port set 323ac436-e40b-4c6f-aa87-e41bde227a7a --no-fixed-ip --no-allowed-address --allowed-address ip-address=10.196.0.0/16
$ openstack server list
+--------------------------------------+-----------------------------+--------+----------------------------------------------------+---------------------------------------+-----------+
| ID                                   | Name                        | Status | Networks                                           | Image                                 | Flavor    |
+--------------------------------------+-----------------------------+--------+----------------------------------------------------+---------------------------------------+-----------+
| 1b8ebbbd-c460-451c-ab22-ff139ac62b58 | ostest-dzghr-data-0-d7rzc   | ACTIVE | installer_host-network=172.16.40.235               | ostest-dzghr-rhcos                    | m4.xlarge |
| d70b20e0-48fb-4df8-90f0-380bd4eb749e | ostest-dzghr-worker-0-wkn9v | ACTIVE | installer_host-network=172.16.40.187               | ostest-dzghr-rhcos                    | m4.xlarge |
| b9a6e2bb-562f-46bc-b973-ef19db04a1f7 | ostest-dzghr-master-2       | ACTIVE | installer_host-network=172.16.40.84                | ostest-dzghr-rhcos                    | m4.xlarge |
| 276f710a-af2d-4c7c-83dc-6bf69c45488c | ostest-dzghr-master-1       | ACTIVE | installer_host-network=172.16.40.216               | ostest-dzghr-rhcos                    | m4.xlarge |
| a2bfb533-49cb-4155-8572-f4257de47c33 | ostest-dzghr-master-0       | ACTIVE | installer_host-network=172.16.40.156               | ostest-dzghr-rhcos                    | m4.xlarge |
| f19bc5aa-4232-43a8-9827-17312294b997 | installer_host              | ACTIVE | installer_host-network=172.16.40.120, 10.46.22.245 | rhel-guest-image-8.3-401.x86_64.qcow2 | m1.medium |
+--------------------------------------+-----------------------------+--------+----------------------------------------------------+---------------------------------------+-----------+
$ oc delete pods -n openshift-kuryr -l app=kuryr-controller
pod "kuryr-controller-566f9cf79f-8794k" deleted

kuryr-controller remains stable after that:
$ oc get pods -n openshift-kuryr
NAME                                READY   STATUS    RESTARTS   AGE
kuryr-cni-6jtlc                     1/1     Running   0          123m
kuryr-cni-f9rk7                     1/1     Running   0          18m
kuryr-cni-n6lf4                     1/1     Running   0          106m
kuryr-cni-qlrdv                     1/1     Running   0          123m
kuryr-cni-v4csm                     1/1     Running   0          123m
kuryr-controller-566f9cf79f-dq24h   1/1     Running   0          12m

Furthermore, kuryr-tempest tests, NP tests and conformance tests passed for this build. Please refer to the attachment on https://bugzilla.redhat.com/show_bug.cgi?id=1927244#c6
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:2438