Hi, I talked to the OVN team and we agreed that this is not necessarily related to the aforementioned scale BZ. "failed to get pod annotation: timed out waiting for annotations" can have several different causes, and this particular cluster only has 6 nodes, whereas the scale BZ was about 300+ nodes.

Example event from the openshift-image-registry namespace (from the attached must-gather):

~~~
apiVersion: v1
items:
- apiVersion: v1
  count: 5137
  eventTime: null
  firstTimestamp: "2021-12-18T00:07:03Z"
  involvedObject:
    apiVersion: v1
    kind: Pod
    name: image-pruner-27329760-6m6bn
    namespace: openshift-image-registry
    resourceVersion: "10264877"
    uid: c7f02a03-aa39-428b-ad98-262f607818dd
  kind: Event
  lastTimestamp: "2021-12-20T14:30:31Z"
  message: |-
    (combined from similar events): Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create pod network sandbox k8s_image-pruner-27329760-6m6bn_openshift-image-registry_c7f02a03-aa39-428b-ad98-262f607818dd_0(4df65342b14239568ba6c396dd909e7257690541a7b76a9d522d585f20a7cc9a): error adding pod openshift-image-registry_image-pruner-27329760-6m6bn to CNI network "multus-cni-network": [openshift-image-registry/image-pruner-27329760-6m6bn:ovn-kubernetes]: error adding container to network "ovn-kubernetes": CNI request failed with status 400: '[openshift-image-registry/image-pruner-27329760-6m6bn 4df65342b14239568ba6c396dd909e7257690541a7b76a9d522d585f20a7cc9a] [openshift-image-registry/image-pruner-27329760-6m6bn 4df65342b14239568ba6c396dd909e7257690541a7b76a9d522d585f20a7cc9a] failed to get pod annotation: timed out waiting for annotations
~~~

The error message is ambiguous, because all it really reveals is that:

* the CNI plugin on the node dialed into the K8s API and looked for the annotation k8s.ovn.org/pod-networks on the pod in question
* the lookup gave up after 30 seconds, meaning the pod had not been annotated within that window.

The error is raised in cmdAdd:

https://github.com/openshift/ovn-kubernetes/blob/6c81aa64359cf47eea0211c4c4a7113a8f8325f6/go-controller/pkg/cni/cni.go#L118

~~~
func (pr *PodRequest) cmdAdd(kubeAuth *KubeAPIAuth, podLister corev1listers.PodLister, useOVSExternalIDs bool, kclient kubernetes.Interface) ([]byte, error) {
	namespace := pr.PodNamespace
	podName := pr.PodName
	if namespace == "" || podName == "" {
		return nil, fmt.Errorf("required CNI variable missing")
	}

	kubecli := &kube.Kube{KClient: kclient}
	annotCondFn := isOvnReady

	if pr.IsSmartNIC {
		// Add Smart-NIC connection-details annotation so ovnkube-node running on smart-NIC
		// performs the needed network plumbing.
		if err := pr.addSmartNICConnectionDetailsAnnot(kubecli); err != nil {
			return nil, err
		}
		annotCondFn = isSmartNICReady
	}

	// Get the IP address and MAC address of the pod
	// for Smart-Nic, ensure connection-details is present
	podUID, annotations, err := GetPodAnnotations(pr.ctx, podLister, kclient, namespace, podName, annotCondFn)
	if err != nil {
		return nil, fmt.Errorf("failed to get pod annotation: %v", err)
	}
	if err := pr.checkOrUpdatePodUID(podUID); err != nil {
		return nil, err
	}
~~~

...
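If this is hit again, one way to narrow it down is to watch the pod for the annotation directly, outside of the CNI. Below is a minimal standalone sketch of the same wait the CNI performs (a hypothetical diagnostic of my own, not part of ovn-kubernetes; the flag names and defaults are assumptions):

~~~
// annotwait: poll a pod for the k8s.ovn.org/pod-networks annotation the same
// way the CNI shim does, to confirm whether ovnkube-master ever wrote it.
package main

import (
	"context"
	"flag"
	"fmt"
	"os"
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

const ovnPodAnnotationName = "k8s.ovn.org/pod-networks"

func main() {
	kubeconfig := flag.String("kubeconfig", os.Getenv("KUBECONFIG"), "path to kubeconfig")
	namespace := flag.String("namespace", "openshift-image-registry", "pod namespace")
	name := flag.String("name", "", "pod name")
	flag.Parse()

	config, err := clientcmd.BuildConfigFromFlags("", *kubeconfig)
	if err != nil {
		fmt.Fprintf(os.Stderr, "failed to load kubeconfig: %v\n", err)
		os.Exit(1)
	}
	client, err := kubernetes.NewForConfig(config)
	if err != nil {
		fmt.Fprintf(os.Stderr, "failed to build client: %v\n", err)
		os.Exit(1)
	}

	// Same shape as GetPodAnnotations below: 30s overall timeout,
	// 200ms retry interval between GETs.
	ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
	defer cancel()
	for {
		pod, err := client.CoreV1().Pods(*namespace).Get(ctx, *name, metav1.GetOptions{})
		if err == nil {
			if val, ok := pod.Annotations[ovnPodAnnotationName]; ok {
				fmt.Printf("annotation present: %s\n", val)
				return
			}
			fmt.Println("pod exists but has no pod-networks annotation yet")
		} else {
			fmt.Fprintf(os.Stderr, "get pod: %v\n", err)
		}
		select {
		case <-ctx.Done():
			// This is the condition the CNI reports as the timeout error.
			fmt.Fprintln(os.Stderr, "timed out waiting for annotation")
			os.Exit(1)
		case <-time.After(200 * time.Millisecond):
		}
	}
}
~~~

The ad-hoc equivalent is oc get pod <pod> -n <namespace> -o jsonpath='{.metadata.annotations.k8s\.ovn\.org/pod-networks}'.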
cmdAdd calls GetPodAnnotations, which implements the actual wait and the 30 second timeout:

https://github.com/openshift/ovn-kubernetes/blob/6c81aa64359cf47eea0211c4c4a7113a8f8325f6/go-controller/pkg/cni/utils.go#L78

~~~
// GetPodAnnotations obtains the pod UID and annotation from the cache or apiserver
func GetPodAnnotations(ctx context.Context, podLister corev1listers.PodLister, kclient kubernetes.Interface, namespace, name string, annotCond podAnnotWaitCond) (string, map[string]string, error) {
	var notFoundCount uint

	timeout := time.After(30 * time.Second)
	for {
		select {
		case <-ctx.Done():
			return "", nil, fmt.Errorf("canceled waiting for annotations")
		case <-timeout:
			return "", nil, fmt.Errorf("timed out waiting for annotations")
		default:
			pod, err := getPod(podLister, kclient, namespace, name)
			if err != nil {
				if !apierrors.IsNotFound(err) {
					return "", nil, fmt.Errorf("failed to get pod for annotations: %v", err)
				}
				// Allow up to 1 second for pod to be found
				notFoundCount++
				if notFoundCount >= 5 {
					return "", nil, fmt.Errorf("timed out waiting for pod after 1s: %v", err)
				}
				// drop through to try again
			} else if pod != nil {
				if annotCond(pod.Annotations) {
					return string(pod.UID), pod.Annotations, nil
				}
			}

			// try again later
			time.Sleep(200 * time.Millisecond)
		}
	}
}
~~~

The pod is read either from the informer cache or directly from the API:

https://github.com/openshift/ovn-kubernetes/blob/6c81aa64359cf47eea0211c4c4a7113a8f8325f6/go-controller/pkg/cni/utils.go#L48

~~~
// getPod tries to read a Pod object from the informer cache, or if the pod
// doesn't exist there, the apiserver. If neither a list or a kube client is
// given, returns no pod and no error
func getPod(podLister corev1listers.PodLister, kclient kubernetes.Interface, namespace, name string) (*kapi.Pod, error) {
	var pod *kapi.Pod
	var err error

	if podLister != nil {
		pod, err = podLister.Pods(namespace).Get(name)
		if err != nil && !apierrors.IsNotFound(err) {
			return nil, err
		}
		// drop through
	}
	if kclient != nil {
		// If the pod wasn't in our local cache, ask for it directly
		pod, err = kclient.CoreV1().Pods(namespace).Get(context.TODO(), name, metav1.GetOptions{})
	}

	return pod, err
}
~~~

After that, the passed annotCond function checks whether the expected annotation is present:

https://github.com/openshift/ovn-kubernetes/blob/6c81aa64359cf47eea0211c4c4a7113a8f8325f6/go-controller/pkg/cni/utils.go#L92
https://github.com/openshift/ovn-kubernetes/blob/6c81aa64359cf47eea0211c4c4a7113a8f8325f6/go-controller/pkg/cni/utils.go#L22

~~~
// isOvnReady is a wait condition for OVN master to set pod-networks annotation
func isOvnReady(podAnnotation map[string]string) bool {
	if _, ok := podAnnotation[util.OvnPodAnnotationName]; ok {
		return true
	}
	return false
}
~~~

With OvnPodAnnotationName = "k8s.ovn.org/pod-networks" defined in go-controller/pkg/util/pod_annotation.go.
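For context, ovnkube-master is the component that writes that annotation once it has plumbed the pod's network; on a healthy pod it looks roughly like the following (a sketch from memory of the annotation's shape; the addresses and MAC are illustrative):

~~~
metadata:
  annotations:
    k8s.ovn.org/pod-networks: '{"default":{"ip_addresses":["10.128.2.5/23"],"mac_address":"0a:58:0a:80:02:05","gateway_ips":["10.128.2.1"],"ip_address":"10.128.2.5/23","gateway_ip":"10.128.2.1"}}'
~~~

So if pods in the must-gather never receive this annotation at all, the problem is on the ovnkube-master side (the annotation writer), not in the CNI shim that is merely waiting for it.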
Just a side note in case this turns out to be related to the scale issue after all. The BZ also mentions the following:

https://bugzilla.redhat.com/show_bug.cgi?id=2014332#c5

>> "We are only backporting to 4.8 and 4.7 what is necessary to fix the problem in the <redacted customer name> scenario, which is only 20-25 pods per node instead of 250. I'm not sure if you want to try reducing the number of pods per node (...)"

There's also the next comment:

https://bugzilla.redhat.com/show_bug.cgi?id=2014332#c7

>> "Verified on 4.8 cluster-bot cluster build from openshift ovn-kubernetes pull 798 using the workload from comment 6 on a 120 node AWS cluster. No annotation timeouts were seen. Also regression tested on a small bare metal cluster"

which refers to comment 6:

>> "but we are really only looking to back port to 4.7 the fixes that will get us to <redacted customer name> level scale targets. Right now as per our understanding there won't be more than 50 pods per node (including OCP/infra pods) on the 120 node cluster."