Bug 2002010

Summary: ovn-kube may never attempt to retry a pod creation
Product: OpenShift Container Platform
Component: Networking
Networking sub component: ovn-kubernetes
Version: 4.6
Hardware: Unspecified
OS: Unspecified
Status: CLOSED ERRATA
Severity: medium
Priority: high
Keywords: FastFix
Target Release: 4.10.0
Type: Bug
Reporter: Tim Rozet <trozet>
Assignee: Tim Rozet <trozet>
QA Contact: Anurag saxena <anusaxen>
CC: bbennett, wking
Bug Blocks: 2005462
Last Closed: 2022-03-10 16:08:18 UTC

Description Tim Rozet 2021-09-07 17:19:56 UTC
Description of problem:
ovnkube-master keeps a cache of pods whose OVN logical port creation needs to be retried. If the initial pod add fails in ovnkube-master (say, because the pod has not been scheduled yet), we add the pod to the cache. Subsequent retries never succeed, though, because we always check whether the pod is scheduled using the version stored in the cache rather than the latest version of the pod.

The end result of this is that the pod will never get a logical switch port in OVN and never come up.
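
A minimal Go sketch of the failure mode described above, not the actual ovn-kubernetes code. All names here (Pod, Store, RetryCache, RetryStale, RetryLatest) are hypothetical, and the pod type is a simplified stand-in for the real Kubernetes object. RetryStale checks the cached snapshot, so a pod cached while unscheduled is skipped on every retry. RetryLatest looks the pod up in a store of current objects before checking, which is one possible way to avoid the stale check.

```go
package main

import "fmt"

// Pod is a hypothetical, simplified stand-in for a Kubernetes pod.
type Pod struct {
	Name     string
	NodeName string // empty until the scheduler assigns a node
}

// Store is a hypothetical stand-in for a cache of the latest pod versions.
type Store map[string]Pod

// RetryCache holds the pod snapshot taken when the initial add failed.
type RetryCache map[string]Pod

func scheduled(p Pod) bool { return p.NodeName != "" }

// RetryStale mirrors the reported behaviour: it decides based on the
// snapshot in the retry cache, which never changes after it is stored.
func RetryStale(cache RetryCache) []string {
	var created []string
	for name, snapshot := range cache {
		if !scheduled(snapshot) {
			continue // snapshot stays unscheduled forever, so this never passes
		}
		created = append(created, name)
		delete(cache, name)
	}
	return created
}

// RetryLatest checks the current version of the pod instead of the snapshot.
func RetryLatest(cache RetryCache, latest Store) []string {
	var created []string
	for name := range cache {
		pod, ok := latest[name]
		if !ok {
			delete(cache, name) // pod no longer exists, nothing to retry
			continue
		}
		if !scheduled(pod) {
			continue // still unscheduled, keep it for the next retry
		}
		created = append(created, name) // the logical port would be created here
		delete(cache, name)
	}
	return created
}

func main() {
	// The initial add failed while the pod had no node, so that version is cached.
	cache := RetryCache{"web": {Name: "web"}}
	// The pod has since been scheduled.
	latest := Store{"web": {Name: "web", NodeName: "worker-0"}}

	fmt.Println("stale check created ports for:", RetryStale(cache))
	fmt.Println("latest check created ports for:", RetryLatest(cache, latest))
}
```

In this sketch the stale check leaves "web" in the cache no matter how often it runs, while the latest-version check picks it up on the first retry after scheduling.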



Steps to Reproduce:
1. Create a pod that cannot be scheduled (for example, by marking the workers as not ready). ovnkube-master still receives the pod add event and fails to create the pod's logical switch port.
2. Remove the taint so that the pod can be scheduled.
3. Check whether the pod comes up. With this bug it stays stuck indefinitely unless ovnkube-master is restarted.

Comment 5 errata-xmlrpc 2022-03-10 16:08:18 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:0056