Bug 2017532
| Summary: | [ovn-controller] OVN controller is not binding ports | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux Fast Datapath | Reporter: | Surya Seetharaman <surya> |
| Component: | OVN | Assignee: | Numan Siddique <nusiddiq> |
| Status: | CLOSED WONTFIX | QA Contact: | Jianlin Shi <jishi> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | FDP 21.C | CC: | ctrautma, jiji, jlema, mmichels |
| Target Milestone: | --- | ||
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2022-01-07 14:21:58 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Description of problem:

On 500 node scale cluster, we saw one worker node where pods were not getting created. They were failing with the error:

failed to configure pod interface: timed out waiting for OVS port binding (ovn-installed) for 0a:58:0a:81:10:07 [10.129.16.7/23]

We could see the lsp getting created on nbdb:

sh-4.4# ovn-nbctl find logical-switch-port name=default_client-on-ovn-worker3
_uuid : b8d7381c-391e-4adb-b69d-2b1b5584a970
addresses : ["0a:58:0a:81:10:30 10.129.16.48"]
dhcpv4_options : []
dhcpv6_options : []
dynamic_addresses : []
enabled : []
external_ids : {namespace=default, pod="true"}
ha_chassis_group : []
name : default_client-on-ovn-worker3
options : {iface-id-ver="62170fe5-07bf-486d-a851-2d1938dd32f2", requested-chassis=worker007-fc640}
parent_name : []
port_security : ["0a:58:0a:81:10:30 10.129.16.48"]
tag : []
tag_request : []
type : ""
up : false

The ovs interface was also getting created (and recreated with every CNI_ADD failure because ovn wasn't binding the port correctly)

sh-4.4# ovs-vsctl list interface | grep default_client-on-ovn-worker3 -C20
ofport_request : []
options : {csum="true", key=flow, remote_ip="192.168.217.57"}
other_config : {}
statistics : {rx_bytes=13319, rx_packets=141, tx_bytes=13415, tx_packets=141}
status : {tunnel_egress_iface=br-ex, tunnel_egress_iface_carrier=up}
type : geneve

_uuid : a2592a95-6667-45d8-9bf9-0d4f5b91b384
admin_state : up
bfd : {}
bfd_status : {}
cfm_fault : []
cfm_fault_status : []
cfm_flap_count : []
cfm_health : []
cfm_mpid : []
cfm_remote_mpids : []
cfm_remote_opstate : []
duplex : full
error : []
external_ids : {attached_mac="0a:58:0a:81:10:30", iface-id=default_client-on-ovn-worker3, iface-id-ver="62170fe5-07bf-486d-a851-2d1938dd32f2", ip_addresses="10.129.16.48/23", sandbox="2670dcaf7ea8444df4feb02226a8f2a695b33b15cf796cbcac84656c0fd3751b"}
ifindex : 11573
ingress_policing_burst: 0
ingress_policing_rate: 0
lacp_current : []
link_resets : 0
link_speed : 10000000000
link_state : up
lldp : {}
mac : []
mac_in_use : "ce:85:47:39:f2:c7"
mtu : 1400
mtu_request : []
name : "2670dcaf7ea8444"
ofport : 11932
ofport_request : []
options : {}
other_config : {}
statistics : {collisions=0, rx_bytes=516, rx_crc_err=0, rx_dropped=1, rx_errors=0, rx_frame_err=0, rx_missed_errors=0, rx_over_err=0, rx_packets=6, tx_bytes=836, tx_dropped=0, tx_errors=0, tx_packets=8}
status : {driver_name=veth, driver_version="1.0", firmware_version=""}
type : ""

Version-Release number of selected component (if applicable):

OCP version: 4.10.0-0.nightly-2021-10-21-105053
ARG ovsver=2.16.0-15.el8fdp
ARG ovnver=21.09.0-20.el8fdp

OVN controller simply wasn't claiming any ovs interfaces and creation was timing out.

How reproducible: not always

Steps to Reproduce:
1. N/A unfortunately. Out of the 500 nodes, we only saw one such ovn-controller having issues
2.
3.
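
On a cluster of this size, spotting unbound ports by eye in the `ovn-nbctl find logical-switch-port` output is tedious. The following is a minimal sketch, not part of the original report, of how that record layout could be parsed to flag ports that are still `up : false` and to show which chassis they were requested on. The function names (`parse_record`, `requested_chassis`, `is_unbound`) are hypothetical helpers, and the sketch assumes one `column : value` pair per line, as the CLI prints it.

```python
import re

# Sample copied from the ovn-nbctl output in this report, trimmed to the
# columns used below. Assumption: one "column : value" pair per line.
SAMPLE = '''\
_uuid : b8d7381c-391e-4adb-b69d-2b1b5584a970
addresses : ["0a:58:0a:81:10:30 10.129.16.48"]
name : default_client-on-ovn-worker3
options : {iface-id-ver="62170fe5-07bf-486d-a851-2d1938dd32f2", requested-chassis=worker007-fc640}
up : false
'''


def parse_record(text):
    """Parse a single 'column : value' record into a dict of strings."""
    record = {}
    for line in text.splitlines():
        # Column names never contain ':', so split on the first one only;
        # values such as MAC addresses keep their own colons intact.
        key, sep, value = line.partition(":")
        if sep and key.strip():
            record[key.strip()] = value.strip()
    return record


def requested_chassis(record):
    """Return the requested-chassis option of a port record, or None."""
    match = re.search(r"requested-chassis=([^,}\s]+)", record.get("options", ""))
    return match.group(1).strip('"') if match else None


def is_unbound(record):
    """True when the logical switch port is reported as not up."""
    return record.get("up") == "false"


if __name__ == "__main__":
    port = parse_record(SAMPLE)
    if is_unbound(port):
        print(port["name"], "is not up; requested chassis:", requested_chassis(port))
```

Feeding it the record from this report flags `default_client-on-ovn-worker3` as unbound on `worker007-fc640`. This only confirms the symptom seen from the northbound database; it does not explain why ovn-controller failed to claim the interface.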