Bug 1937594 - multiple pods in ContainerCreating state after migration from OpenshiftSDN to OVNKubernetes
Summary: multiple pods in ContainerCreating state after migration from OpenshiftSDN to...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.8
Hardware: ppc64le
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.8.0
Assignee: Peng Liu
QA Contact: huirwang
URL:
Whiteboard:
Depends On:
Blocks: 2019093
Reported: 2021-03-11 05:23 UTC by Tania Kapoor
Modified: 2021-11-01 15:43 UTC (History)
CC List: 5 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 2019093
Environment:
Last Closed: 2021-07-27 22:52:37 UTC
Target Upstream Version:
Embargoed:


Attachments
oc adm must gather logs (99.27 KB, text/plain)
2021-03-11 05:23 UTC, Tania Kapoor


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift cluster-network-operator pull 1057 | 0 | None | open | Bug 1937594: Bump openshift/api | 2021-04-14 13:34:54 UTC
Github | openshift cluster-network-operator pull 1063 | 0 | None | open | Bug 1937594: Update the codegen with the latest API | 2021-04-19 12:56:07 UTC
Github | openshift cluster-network-operator pull 763 | 0 | None | open | Split SDN migration into 2 phase | 2021-03-22 15:00:53 UTC
Github | openshift machine-config-operator pull 2518 | 0 | None | open | Bug 1937594: Respect status.Migration in network.config when exsits | 2021-04-12 03:48:43 UTC
Red Hat Product Errata | RHSA-2021:2438 | 0 | None | None | None | 2021-07-27 22:53:05 UTC

Description Tania Kapoor 2021-03-11 05:23:40 UTC
Created attachment 1762537 [details]
oc adm must gather logs

Description of problem:

After migrating from OpenShiftSDN to OVNKubernetes, the machine-config pods remain in the ContainerCreating state.

Version-Release number of selected component (if applicable):
4.8.0-0.nightly-ppc64le-2021-03-08-045421

How reproducible:

After the OVN-Kubernetes migration completes, several cluster operators (co) become degraded or unavailable, and many pods remain stuck in the ContainerCreating state.

[root@ktania-48-bastion ~]# oc get network.config/cluster -o jsonpath='{.status.networkType}{"\n"}'
OVNKubernetes


[root@ktania-48-bastion ~]# oc get machineconfigpool -n openshift-machine-config-operator
NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
master   rendered-master-39eb02fc1740972313bfb43b25984015   True      False      False      3              3                   3                     0                      40h
worker   rendered-worker-0eaf2d65761bd2fbb9984835a4986e26   True      False      False      2              2                   2                     0                      40h


[root@ktania-48-bastion ~]# oc get csr | grep "Pending"

[root@ktania-48-bastion ~]# oc get nodes
NAME       STATUS   ROLES    AGE   VERSION
master-0   Ready    master   42h   v1.20.0+69d7e87
master-1   Ready    master   42h   v1.20.0+69d7e87
master-2   Ready    master   42h   v1.20.0+69d7e87
worker-0   Ready    worker   42h   v1.20.0+69d7e87
worker-1   Ready    worker   42h   v1.20.0+69d7e87



[root@ktania-48-bastion ~]# oc get co
NAME                                       VERSION                                     AVAILABLE   PROGRESSING   DEGRADED   SINCE
authentication                             4.8.0-0.nightly-ppc64le-2021-03-08-045421   False       True          False      23h
baremetal                                  4.8.0-0.nightly-ppc64le-2021-03-08-045421   True        False         False      40h
cloud-credential                           4.8.0-0.nightly-ppc64le-2021-03-08-045421   True        False         False      40h
cluster-autoscaler                         4.8.0-0.nightly-ppc64le-2021-03-08-045421   True        False         False      40h
config-operator                            4.8.0-0.nightly-ppc64le-2021-03-08-045421   True        False         False      40h
console                                    4.8.0-0.nightly-ppc64le-2021-03-08-045421   True        False         True       39h
csi-snapshot-controller                    4.8.0-0.nightly-ppc64le-2021-03-08-045421   True        False         False      40h
dns                                        4.8.0-0.nightly-ppc64le-2021-03-08-045421   True        False         False      40h
etcd                                       4.8.0-0.nightly-ppc64le-2021-03-08-045421   True        False         False      40h
image-registry                             4.8.0-0.nightly-ppc64le-2021-03-08-045421   True        False         False      39h
ingress                                    4.8.0-0.nightly-ppc64le-2021-03-08-045421   True        False         False      39h
insights                                   4.8.0-0.nightly-ppc64le-2021-03-08-045421   True        False         False      39h
kube-apiserver                             4.8.0-0.nightly-ppc64le-2021-03-08-045421   True        False         False      40h
kube-controller-manager                    4.8.0-0.nightly-ppc64le-2021-03-08-045421   True        False         False      40h
kube-scheduler                             4.8.0-0.nightly-ppc64le-2021-03-08-045421   True        False         False      40h
kube-storage-version-migrator              4.8.0-0.nightly-ppc64le-2021-03-08-045421   True        False         False      39h
machine-api                                4.8.0-0.nightly-ppc64le-2021-03-08-045421   True        False         False      40h
machine-approver                           4.8.0-0.nightly-ppc64le-2021-03-08-045421   True        False         False      40h
machine-config                             4.8.0-0.nightly-ppc64le-2021-03-08-045421   True        False         False      39h
marketplace                                4.8.0-0.nightly-ppc64le-2021-03-08-045421   True        False         False      40h
monitoring                                 4.8.0-0.nightly-ppc64le-2021-03-08-045421   True        False         False      39h
network                                    4.8.0-0.nightly-ppc64le-2021-03-08-045421   True        True          True       40h
node-tuning                                4.8.0-0.nightly-ppc64le-2021-03-08-045421   True        False         False      40h
openshift-apiserver                        4.8.0-0.nightly-ppc64le-2021-03-08-045421   False       False         False      23h
openshift-controller-manager               4.8.0-0.nightly-ppc64le-2021-03-08-045421   True        False         False      40h
openshift-samples                          4.8.0-0.nightly-ppc64le-2021-03-08-045421   True        False         False      40h
operator-lifecycle-manager                 4.8.0-0.nightly-ppc64le-2021-03-08-045421   True        False         False      40h
operator-lifecycle-manager-catalog         4.8.0-0.nightly-ppc64le-2021-03-08-045421   True        False         False      40h
operator-lifecycle-manager-packageserver   4.8.0-0.nightly-ppc64le-2021-03-08-045421   True        False         False      23h
service-ca                                 4.8.0-0.nightly-ppc64le-2021-03-08-045421   True        False         False      40h
storage                                    4.8.0-0.nightly-ppc64le-2021-03-08-045421   True        False         False      40h


[root@ktania-48-bastion ~]# oc get pod -n openshift-machine-config-operator
NAME                                         READY   STATUS              RESTARTS   AGE
machine-config-controller-6f467668cd-977kr   0/1     ContainerCreating   0          6h16m
machine-config-daemon-8d9bp                  2/2     Running             0          42h
machine-config-daemon-9nqkg                  2/2     Running             0          42h
machine-config-daemon-bsvhl                  2/2     Running             0          42h
machine-config-daemon-cgzsq                  2/2     Running             0          42h
machine-config-daemon-f7b5h                  2/2     Running             0          42h
machine-config-operator-86c7698f5f-hlt2n     0/1     ContainerCreating   0          26m
machine-config-server-kp9s7                  1/1     Running             0          42h
machine-config-server-qw4m6                  1/1     Running             0          42h
machine-config-server-tnvg8                  1/1     Running             0          42h



[root@ktania-48-bastion ~]# oc describe pod machine-config-controller-6f467668cd-ws6z9 -n openshift-machine-config-operator
Name:                      machine-config-controller-6f467668cd-ws6z9
Namespace:                 openshift-machine-config-operator
Priority:                  2000000000
Priority Class Name:       system-cluster-critical
Node:                      master-1/9.114.99.134
Start Time:                Wed, 10 Mar 2021 01:05:14 -0500
Labels:                    k8s-app=machine-config-controller
                           pod-template-hash=6f467668cd
Annotations:               k8s.ovn.org/pod-networks:
                             {"default":{"ip_addresses":["10.131.0.24/23"],"mac_address":"0a:58:0a:83:00:18","gateway_ips":["10.131.0.1"],"ip_address":"10.131.0.24/23"...
Status:                    Terminating (lasts 4h2m)
Termination Grace Period:  30s
IP:                        
IPs:                       <none>
Controlled By:             ReplicaSet/machine-config-controller-6f467668cd
Containers:
  machine-config-controller:
    Container ID:  
    Image:         quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:3a821d740256cb7b9951add0ef0af6bbe1870433c439b0aaf7b8ffeb5ef1655e
    Image ID:      
    Port:          <none>
    Host Port:     <none>
    Command:
      /usr/bin/machine-config-controller
    Args:
      start
      --resourcelock-namespace=openshift-machine-config-operator
      --v=2
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Requests:
      cpu:        20m
      memory:     50Mi
    Environment:  <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from machine-config-controller-token-lrjn9 (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  machine-config-controller-token-lrjn9:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  machine-config-controller-token-lrjn9
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  node-role.kubernetes.io/master=
Tolerations:     node-role.kubernetes.io/master:NoSchedule op=Exists
                 node.kubernetes.io/memory-pressure:NoSchedule op=Exists
                 node.kubernetes.io/not-ready:NoExecute op=Exists for 120s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 120s
Events:          <none>



[root@ktania-48-bastion ~]# oc get pods -A | grep -v "Running\|Completed"
NAMESPACE                                          NAME                                                      READY   STATUS              RESTARTS   AGE
nfs-provisioner                                    nfs-client-provisioner-65ddb449dd-5b5t8                   0/1     ContainerCreating   0          22h
openshift-apiserver-operator                       openshift-apiserver-operator-77798d5ddf-vswwc             0/1     ContainerCreating   0          23h
openshift-apiserver                                apiserver-6cbd49df69-44x5d                                0/2     Pending             0          4h2m
openshift-apiserver                                apiserver-6cbd49df69-7fc9s                                0/2     Terminating         0          23h
openshift-apiserver                                apiserver-6cbd49df69-7mmq4                                0/2     Init:0/1            0          23h
openshift-apiserver                                apiserver-6cbd49df69-jttqx                                0/2     Init:0/1            0          23h
openshift-authentication-operator                  authentication-operator-644777f6ff-jf742                  0/1     ContainerCreating   0          23h
openshift-authentication                           oauth-openshift-659b4b6565-chchm                          0/1     Pending             0          4h2m
openshift-authentication                           oauth-openshift-659b4b6565-fgw5d                          0/1     ContainerCreating   0          23h
openshift-authentication                           oauth-openshift-659b4b6565-qkl59                          0/1     Terminating         0          23h
openshift-cloud-credential-operator                cloud-credential-operator-7844cd5f7b-7n65x                0/2     ContainerCreating   0          23h
openshift-cluster-machine-approver                 machine-approver-568b48b94d-dbgnw                         1/2     CrashLoopBackOff    77         23h
openshift-cluster-node-tuning-operator             cluster-node-tuning-operator-777f99848d-948tk             0/1     Terminating         0          23h
openshift-cluster-node-tuning-operator             cluster-node-tuning-operator-777f99848d-jvhfs             0/1     Pending             0          4h2m
openshift-cluster-samples-operator                 cluster-samples-operator-556b4f4958-fwfqn                 0/2     ContainerCreating   0          23h
openshift-cluster-storage-operator                 cluster-storage-operator-57bb9bf6d5-qcdbf                 0/1     Pending             0          4h2m
openshift-cluster-storage-operator                 cluster-storage-operator-57bb9bf6d5-xgppm                 0/1     Terminating         0          23h
openshift-cluster-storage-operator                 csi-snapshot-controller-5ff576bc58-qmlzp                  0/1     Terminating         0          23h
openshift-cluster-storage-operator                 csi-snapshot-controller-5ff576bc58-wst8r                  0/1     Pending             0          4h2m
openshift-cluster-storage-operator                 csi-snapshot-controller-operator-5bc6875cd6-qrgd4         0/1     ContainerCreating   0          23h
openshift-cluster-storage-operator                 csi-snapshot-webhook-86dd954b68-8s2pk                     0/1     Terminating         0          23h
openshift-cluster-storage-operator                 csi-snapshot-webhook-86dd954b68-dnwsg                     0/1     Pending             0          4h2m
openshift-config-operator                          openshift-config-operator-654744db69-lpt9w                0/1     ContainerCreating   0          23h
openshift-console-operator                         console-operator-75cfcc996f-cw9t7                         0/1     Terminating         0          23h
openshift-console-operator                         console-operator-75cfcc996f-zxrf5                         0/1     Pending             0          4h2m
openshift-console                                  console-7c84c6885-92w2m                                   0/1     Pending             0          4h2m
openshift-console                                  console-7c84c6885-f2snv                                   0/1     Terminating         0          23h
openshift-console                                  console-7c84c6885-jkbmk                                   0/1     Pending             0          4h2m
openshift-console                                  console-7c84c6885-lnn6l                                   0/1     Terminating         0          23h
openshift-console                                  downloads-77766fb9b9-d4jcw                                0/1     Terminating         0          22h
openshift-console                                  downloads-77766fb9b9-hnr8g                                0/1     Pending             0          4h2m
openshift-console                                  downloads-77766fb9b9-j8qxb                                0/1     Terminating         0          22h
openshift-console                                  downloads-77766fb9b9-jxq6r                                0/1     Pending             0          4h2m
openshift-controller-manager-operator              openshift-controller-manager-operator-5b95959987-j8s8d    0/1     ContainerCreating   0          23h
openshift-controller-manager                       controller-manager-24xbv                                  0/1     ContainerCreating   0          39h
openshift-controller-manager                       controller-manager-c8xwj                                  0/1     ContainerCreating   0          39h
openshift-controller-manager                       controller-manager-gll8s                                  0/1     ContainerCreating   0          39h
openshift-dns-operator                             dns-operator-8655d97566-b4qqm                             0/2     Pending             0          4h2m
openshift-dns-operator                             dns-operator-8655d97566-xbfrp                             0/2     Terminating         0          23h
openshift-dns                                      dns-default-2b9xt                                         0/3     ContainerCreating   0          40h
openshift-dns                                      dns-default-7fdcj                                         0/3     ContainerCreating   0          40h
openshift-dns                                      dns-default-7sfxp                                         0/3     ContainerCreating   0          40h
openshift-dns                                      dns-default-9h6xq                                         0/3     ContainerCreating   0          39h
openshift-dns                                      dns-default-w7wcj                                         0/3     ContainerCreating   0          39h
openshift-etcd-operator                            etcd-operator-5fb99985b4-fkb8d                            0/1     ContainerCreating   0          23h
openshift-image-registry                           cluster-image-registry-operator-6775554c75-xq9r6          0/1     ContainerCreating   0          23h
openshift-image-registry                           image-pruner-1615420800-wk462                             0/1     Pending             0          5h16m
openshift-image-registry                           image-registry-ffd7ccd7d-drt6l                            0/1     ContainerCreating   0          22h
openshift-ingress-canary                           ingress-canary-6wbdz                                      0/1     ContainerCreating   0          39h
openshift-ingress-canary                           ingress-canary-p99m7                                      0/1     ContainerCreating   0          39h
openshift-ingress-operator                         ingress-operator-6588d5bc87-5qdq8                         0/2     ContainerCreating   0          23h
openshift-ingress                                  router-default-9966b6b6b-n69db                            0/1     CrashLoopBackOff    125        23h
openshift-ingress                                  router-default-9966b6b6b-pnnl9                            0/1     CrashLoopBackOff    103        22h
openshift-insights                                 insights-operator-65cf84578d-bkpnh                        0/1     ContainerCreating   1          40h
openshift-kube-apiserver-operator                  kube-apiserver-operator-798b887d75-5bln8                  0/1     ContainerCreating   0          23h
openshift-kube-controller-manager-operator         kube-controller-manager-operator-6b5546947d-lqkrw         0/1     ContainerCreating   0          23h
openshift-kube-scheduler-operator                  openshift-kube-scheduler-operator-7d6d89856c-5dntc        0/1     Pending             0          4h2m
openshift-kube-scheduler-operator                  openshift-kube-scheduler-operator-7d6d89856c-x8bbm        0/1     Terminating         0          23h
openshift-kube-storage-version-migrator-operator   kube-storage-version-migrator-operator-6f6b6d7f5c-92gvr   0/1     ContainerCreating   0          23h
openshift-kube-storage-version-migrator            migrator-84d8f6c6dc-672vg                                 0/1     ContainerCreating   0          22h
openshift-machine-api                              cluster-autoscaler-operator-dc9f865cf-56j9k               0/2     ContainerCreating   0          23h
openshift-machine-api                              cluster-baremetal-operator-664c5999c8-jw5jt               0/2     ContainerCreating   0          23h
openshift-machine-api                              machine-api-operator-689bccd7f5-t25m4                     0/2     ContainerCreating   0          23h
openshift-machine-config-operator                  machine-config-controller-6f467668cd-977kr                0/1     Pending             0          4h2m
openshift-machine-config-operator                  machine-config-controller-6f467668cd-ws6z9                0/1     Terminating         0          23h
openshift-machine-config-operator                  machine-config-operator-86c7698f5f-rrwfm                  0/1     ContainerCreating   0          23h
openshift-marketplace                              marketplace-operator-5b5dd7bcc9-f2wj2                     0/1     ContainerCreating   0          23h
openshift-monitoring                               alertmanager-main-0                                       0/5     ContainerCreating   0          23h
openshift-monitoring                               alertmanager-main-1                                       0/5     ContainerCreating   0          22h
openshift-monitoring                               alertmanager-main-2                                       0/5     ContainerCreating   0          22h
openshift-monitoring                               cluster-monitoring-operator-854c6c68b5-fj7hr              0/2     ContainerCreating   0          23h
openshift-monitoring                               grafana-989865765-vdhrs                                   0/2     ContainerCreating   0          22h
openshift-monitoring                               kube-state-metrics-5bb8cb9bc5-ppld8                       0/3     ContainerCreating   0          22h
openshift-monitoring                               openshift-state-metrics-848bd7d949-pfqxn                  0/3     ContainerCreating   0          22h
openshift-monitoring                               prometheus-adapter-84c57d866f-4xp9z                       0/1     ContainerCreating   0          22h
openshift-monitoring                               prometheus-adapter-84c57d866f-fs2sz                       0/1     ContainerCreating   0          22h
openshift-monitoring                               prometheus-k8s-0                                          0/7     ContainerCreating   0          22h
openshift-monitoring                               prometheus-k8s-1                                          0/7     ContainerCreating   0          23h
openshift-monitoring                               prometheus-operator-dbb5d666b-6wdr4                       0/2     Terminating         0          23h
openshift-monitoring                               prometheus-operator-dbb5d666b-c84rh                       0/2     Pending             0          3h59m
openshift-monitoring                               telemeter-client-78f9657f88-skrg6                         0/3     ContainerCreating   0          22h
openshift-monitoring                               thanos-querier-54f8c6c887-4r46v                           0/5     ContainerCreating   0          22h
openshift-monitoring                               thanos-querier-54f8c6c887-rqvzv                           0/5     ContainerCreating   0          22h
openshift-multus                                   multus-admission-controller-8k5vq                         0/2     ContainerCreating   0          40h
openshift-multus                                   multus-admission-controller-fksxv                         0/2     ContainerCreating   0          40h
openshift-multus                                   multus-admission-controller-s45pz                         0/2     ContainerCreating   0          40h
openshift-multus                                   network-metrics-daemon-8gfb9                              0/2     ContainerCreating   0          40h
openshift-multus                                   network-metrics-daemon-gz6bp                              0/2     ContainerCreating   0          40h
openshift-multus                                   network-metrics-daemon-kqbwj                              0/2     ContainerCreating   0          40h
openshift-multus                                   network-metrics-daemon-nkctp                              0/2     ContainerCreating   0          39h
openshift-multus                                   network-metrics-daemon-tppj9                              0/2     ContainerCreating   0          39h
openshift-network-diagnostics                      network-check-source-5ccc7fb9cd-rdnqr                     0/1     ContainerCreating   0          22h
openshift-network-diagnostics                      network-check-target-59tzc                                0/1     ContainerCreating   0          40h
openshift-network-diagnostics                      network-check-target-5vsbs                                0/1     ContainerCreating   0          39h
openshift-network-diagnostics                      network-check-target-rvnsf                                0/1     ContainerCreating   0          40h
openshift-network-diagnostics                      network-check-target-vs6tq                                0/1     ContainerCreating   0          40h
openshift-network-diagnostics                      network-check-target-zzrfr                                0/1     ContainerCreating   0          39h
openshift-oauth-apiserver                          apiserver-6cf594b4b5-6gwrj                                0/1     Init:0/1            0          23h
openshift-oauth-apiserver                          apiserver-6cf594b4b5-czcc9                                0/1     Init:0/1            0          23h
openshift-oauth-apiserver                          apiserver-6cf594b4b5-gnwmh                                0/1     Terminating         0          23h
openshift-oauth-apiserver                          apiserver-6cf594b4b5-rpfp9                                0/1     Pending             0          4h2m
openshift-operator-lifecycle-manager               catalog-operator-9574c4ff5-n478j                          0/1     ContainerCreating   0          23h
openshift-operator-lifecycle-manager               olm-operator-675f7cb4cf-fbdxn                             0/1     Pending             0          4h2m
openshift-operator-lifecycle-manager               olm-operator-675f7cb4cf-svnp5                             0/1     Terminating         0          23h
openshift-operator-lifecycle-manager               packageserver-5567566c55-8jqxs                            0/1     ContainerCreating   0          23h
openshift-operator-lifecycle-manager               packageserver-5567566c55-dpmxw                            0/1     Terminating         0          23h
openshift-operator-lifecycle-manager               packageserver-5567566c55-vhgsl                            0/1     Pending             0          4h2m
openshift-ovn-kubernetes                           ovnkube-node-dzm9t                                        2/3     CrashLoopBackOff    81         23h
openshift-ovn-kubernetes                           ovnkube-node-fh2cw                                        2/3     CrashLoopBackOff    85         23h
openshift-ovn-kubernetes                           ovnkube-node-mr8vm                                        2/3     CrashLoopBackOff    86         23h
openshift-ovn-kubernetes                           ovnkube-node-smhgn                                        2/3     Error               83         23h
openshift-ovn-kubernetes                           ovnkube-node-v95rn                                        2/3     CrashLoopBackOff    84         23h
openshift-service-ca-operator                      service-ca-operator-6f4cdfb89c-mll8t                      0/1     ContainerCreating   0          23h
openshift-service-ca                               service-ca-6c94fc887d-hfns4                               0/1     ContainerCreating   0          23h
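
As a rough triage aid for a listing like the one above, the pods can be tallied by status. This is a minimal sketch and not part of the original report: summarize_pod_status is a hypothetical helper name, and it assumes the default `oc get pods -A` column order (NAMESPACE NAME READY STATUS RESTARTS AGE), so that STATUS is the fourth whitespace-separated field.

```shell
# Hypothetical helper: tally pods by STATUS from `oc get pods -A` output on stdin.
# Assumes the default columns: NAMESPACE NAME READY STATUS RESTARTS AGE.
summarize_pod_status() {
  # Skip the header row (NR > 1) and count occurrences of the 4th field.
  awk 'NR > 1 && NF >= 6 { count[$4]++ }
       END { for (s in count) printf "%s %d\n", s, count[s] }' | sort
}

# Usage (requires a logged-in oc session):
#   oc get pods -A | grep -v "Running\|Completed" | summarize_pod_status
```

Sorting the result by status name keeps the output stable between runs, which makes it easier to compare the tally before and after a change.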



[root@ktania-48-bastion ~]# oc describe pod ovnkube-node-mr8vm -n openshift-ovn-kubernetes
Name:                 ovnkube-node-mr8vm
Namespace:            openshift-ovn-kubernetes
Priority:             2000001000
Priority Class Name:  system-node-critical
Node:                 master-0/9.114.99.70
Start Time:           Wed, 10 Mar 2021 00:54:22 -0500
Labels:               app=ovnkube-node
                      component=network
                      controller-revision-hash=db4c44784
                      kubernetes.io/os=linux
                      openshift.io/component=network
                      pod-template-generation=1
                      type=infra
Annotations:          <none>
Status:               Running
IP:                   9.114.99.70
IPs:
  IP:           9.114.99.70
Controlled By:  DaemonSet/ovnkube-node
Containers:
  ovn-controller:
    Container ID:  cri-o://3a3fe21c490972caf3e88385e5c1fad0c11e80cbbb2285007952a128230acbf9
    Image:         quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:bc9b1c6f7e550147e8e1b53e668a44037e02912487e990bb1c24771656573a6c
    Image ID:      quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:bc9b1c6f7e550147e8e1b53e668a44037e02912487e990bb1c24771656573a6c
    Port:          <none>
    Host Port:     <none>
    Command:
      /bin/bash
      -c
      set -e
      if [[ -f "/env/${K8S_NODE}" ]]; then
        set -o allexport
        source "/env/${K8S_NODE}"
        set +o allexport
      fi
      echo "$(date -Iseconds) - starting ovn-controller"
      exec ovn-controller unix:/var/run/openvswitch/db.sock -vfile:off \
        --no-chdir --pidfile=/var/run/ovn/ovn-controller.pid \
        -p /ovn-cert/tls.key -c /ovn-cert/tls.crt -C /ovn-ca/ca-bundle.crt \
        -vconsole:"${OVN_LOG_LEVEL}"
      
    State:          Running
      Started:      Wed, 10 Mar 2021 00:54:54 -0500
    Ready:          True
    Restart Count:  0
    Requests:
      cpu:     10m
      memory:  300Mi
    Environment:
      OVN_LOG_LEVEL:  info
      K8S_NODE:        (v1:spec.nodeName)
    Mounts:
      /env from env-overrides (rw)
      /etc/openvswitch from etc-openvswitch (rw)
      /etc/ovn/ from etc-openvswitch (rw)
      /ovn-ca from ovn-ca (rw)
      /ovn-cert from ovn-cert (rw)
      /run/openvswitch from run-openvswitch (rw)
      /run/ovn/ from run-ovn (rw)
      /var/lib/openvswitch from var-lib-openvswitch (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from ovn-kubernetes-node-token-22qmz (ro)
  kube-rbac-proxy:
    Container ID:  cri-o://71d1d89ecac60830f14ba9ce917ea0f3962b0bb261ef7647ed4b42749f1dd440
    Image:         quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:cedc48906e2064b38d982dff88e74e639897039f602928414581fc0a6330d1ed
    Image ID:      quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:cedc48906e2064b38d982dff88e74e639897039f602928414581fc0a6330d1ed
    Port:          9103/TCP
    Host Port:     9103/TCP
    Command:
      /bin/bash
      -c
      #!/bin/bash
      set -euo pipefail
      TLS_PK=/etc/pki/tls/metrics-cert/tls.key
      TLS_CERT=/etc/pki/tls/metrics-cert/tls.crt
      # As the secret mount is optional we must wait for the files to be present.
      # The service is created in monitor.yaml and this is created in sdn.yaml.
      # If it isn't created there is probably an issue so we want to crashloop.
      retries=0
      TS=$(date +%s)
      WARN_TS=$(( ${TS} + $(( 20 * 60)) ))
      HAS_LOGGED_INFO=0
      
      log_missing_certs(){
          CUR_TS=$(date +%s)
          if [[ "${CUR_TS}" -gt "WARN_TS"  ]]; then
            echo $(date -Iseconds) WARN: ovn-node-metrics-cert not mounted after 20 minutes.
          elif [[ "${HAS_LOGGED_INFO}" -eq 0 ]] ; then
            echo $(date -Iseconds) INFO: ovn-node-metrics-cert not mounted. Waiting one hour.
            HAS_LOGGED_INFO=1
          fi
      }
      while [[ ! -f "${TLS_PK}" ||  ! -f "${TLS_CERT}" ]] ; do
        log_missing_certs
        sleep 5
      done
      
      echo $(date -Iseconds) INFO: ovn-node-metrics-certs mounted, starting kube-rbac-proxy
      exec /usr/bin/kube-rbac-proxy \
        --logtostderr \
        --secure-listen-address=:9103 \
        --tls-cipher-suites=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_RSA_WITH_AES_128_CBC_SHA256,TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256,TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256 \
        --upstream=http://127.0.0.1:29103/ \
        --tls-private-key-file=${TLS_PK} \
        --tls-cert-file=${TLS_CERT}
      
    State:          Running
      Started:      Wed, 10 Mar 2021 00:54:54 -0500
    Ready:          True
    Restart Count:  0
    Requests:
      cpu:        10m
      memory:     20Mi
    Environment:  <none>
    Mounts:
      /etc/pki/tls/metrics-cert from ovn-node-metrics-cert (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from ovn-kubernetes-node-token-22qmz (ro)
  ovnkube-node:
    Container ID:  cri-o://38338a9cadfdd5ec01f7185dc9e67295c7bde937d762688cd6a2daa8aef4ba5d
    Image:         quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:bc9b1c6f7e550147e8e1b53e668a44037e02912487e990bb1c24771656573a6c
    Image ID:      quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:bc9b1c6f7e550147e8e1b53e668a44037e02912487e990bb1c24771656573a6c
    Port:          29103/TCP
    Host Port:     29103/TCP
    Command:
      /bin/bash
      -c
      set -xe
      if [[ -f "/env/${K8S_NODE}" ]]; then
        set -o allexport
        source "/env/${K8S_NODE}"
        set +o allexport
      fi
      echo "I$(date "+%m%d %H:%M:%S.%N") - waiting for db_ip addresses"
      cp -f /usr/libexec/cni/ovn-k8s-cni-overlay /cni-bin-dir/
      ovn_config_namespace=openshift-ovn-kubernetes
      echo "I$(date "+%m%d %H:%M:%S.%N") - disable conntrack on geneve port"
      iptables -t raw -A PREROUTING -p udp --dport 6081 -j NOTRACK
      iptables -t raw -A OUTPUT -p udp --dport 6081 -j NOTRACK
      retries=0
      while true; do
        # TODO: change to use '--request-timeout=30s', if https://github.com/kubernetes/kubernetes/issues/49343 is fixed. 
        db_ip=$(timeout 30 kubectl get ep -n ${ovn_config_namespace} ovnkube-db -o jsonpath='{.subsets[0].addresses[0].ip}')
        if [[ -n "${db_ip}" ]]; then
          break
        fi
        (( retries += 1 ))
        if [[ "${retries}" -gt 40 ]]; then
          echo "E$(date "+%m%d %H:%M:%S.%N") - db endpoint never came up"
          exit 1
        fi
        echo "I$(date "+%m%d %H:%M:%S.%N") - waiting for db endpoint"
        sleep 5
      done
      
      echo "I$(date "+%m%d %H:%M:%S.%N") - starting ovnkube-node db_ip ${db_ip}"
      
      gateway_mode_flags=
      # Check to see if ovs is provided by the node. This is only for upgrade from 4.5->4.6 or
      # openshift-sdn to ovn-kube conversion
      if grep -q OVNKubernetes /etc/systemd/system/ovs-configuration.service ; then
        gateway_mode_flags="--gateway-mode local --gateway-interface br-ex"
      else
        gateway_mode_flags="--gateway-mode local --gateway-interface none"
      fi
      
      exec /usr/bin/ovnkube --init-node "${K8S_NODE}" \
        --nb-address "ssl:9.114.99.105:9641,ssl:9.114.99.134:9641,ssl:9.114.99.70:9641" \
        --sb-address "ssl:9.114.99.105:9642,ssl:9.114.99.134:9642,ssl:9.114.99.70:9642" \
        --nb-client-privkey /ovn-cert/tls.key \
        --nb-client-cert /ovn-cert/tls.crt \
        --nb-client-cacert /ovn-ca/ca-bundle.crt \
        --nb-cert-common-name "ovn" \
        --sb-client-privkey /ovn-cert/tls.key \
        --sb-client-cert /ovn-cert/tls.crt \
        --sb-client-cacert /ovn-ca/ca-bundle.crt \
        --sb-cert-common-name "ovn" \
        --config-file=/run/ovnkube-config/ovnkube.conf \
        --loglevel "${OVN_KUBE_LOG_LEVEL}" \
        --inactivity-probe="${OVN_CONTROLLER_INACTIVITY_PROBE}" \
        ${gateway_mode_flags} \
        --metrics-bind-address "127.0.0.1:29103"
      
    State:      Terminated
      Reason:   Error
      Message:  9] exec(3): stdout: ""
I0311 07:29:01.685654  979736 ovs.go:170] exec(3): stderr: ""
I0311 07:29:01.685680  979736 ovs.go:166] exec(4): /usr/bin/ovs-ofctl dump-aggregate br-int
I0311 07:29:01.690370  979736 ovs.go:169] exec(4): stdout: "NXST_AGGREGATE reply (xid=0x4): packet_count=0 byte_count=0 flow_count=2382\n"
I0311 07:29:01.690418  979736 ovs.go:170] exec(4): stderr: ""
I0311 07:29:01.690462  979736 ovs.go:166] exec(5): /usr/bin/ovs-vsctl --timeout=15 -- --if-exists del-port br-int k8s-master-0 -- --may-exist add-port br-int ovn-k8s-mp0 -- set interface ovn-k8s-mp0 type=internal mtu_request=1400 external-ids:iface-id=k8s-master-0
I0311 07:29:01.697343  979736 ovs.go:169] exec(5): stdout: ""
I0311 07:29:01.697380  979736 ovs.go:170] exec(5): stderr: ""
I0311 07:29:01.697404  979736 ovs.go:166] exec(6): /usr/bin/ovs-vsctl --timeout=15 --if-exists get interface ovn-k8s-mp0 mac_in_use
I0311 07:29:01.703189  979736 ovs.go:169] exec(6): stdout: "\"02:d7:2b:fa:dd:74\"\n"
I0311 07:29:01.703241  979736 ovs.go:170] exec(6): stderr: ""
I0311 07:29:01.703282  979736 ovs.go:166] exec(7): /usr/bin/ovs-vsctl --timeout=15 set interface ovn-k8s-mp0 mac=02\:d7\:2b\:fa\:dd\:74
I0311 07:29:01.711008  979736 ovs.go:169] exec(7): stdout: ""
I0311 07:29:01.711056  979736 ovs.go:170] exec(7): stderr: ""
I0311 07:29:01.759473  979736 gateway_init.go:162] Initializing Gateway Functionality
I0311 07:29:01.759824  979736 gateway_localnet.go:184] Node local addresses initialized to: map[10.129.0.2:{10.129.0.0 fffffe00} 127.0.0.1:{127.0.0.0 ff000000} 9.114.99.70:{9.114.96.0 fffffc00} ::1:{::1 ffffffffffffffffffffffffffffffff} fe80::50b2:16ff:fe43:a8a3:{fe80:: ffffffffffffffff0000000000000000} fe80::b47a:e4de:548:45a5:{fe80:: ffffffffffffffff0000000000000000} fe80::d7:2bff:fefa:dd74:{fe80:: ffffffffffffffff0000000000000000}]
I0311 07:29:01.760000  979736 helper_linux.go:73] Found default gateway interface env32 9.114.96.1
F0311 07:29:01.760053  979736 ovnkube.go:130] could not find IP addresses: failed to lookup link none: Link not found

      Exit Code:  1
      Started:    Thu, 11 Mar 2021 02:29:00 -0500
      Finished:   Thu, 11 Mar 2021 02:29:01 -0500
    Last State:   Terminated
      Reason:     Error
      Message:    9] exec(3): stdout: ""
I0311 07:28:09.571730  978518 ovs.go:170] exec(3): stderr: ""
I0311 07:28:09.571751  978518 ovs.go:166] exec(4): /usr/bin/ovs-ofctl dump-aggregate br-int
I0311 07:28:09.576322  978518 ovs.go:169] exec(4): stdout: "NXST_AGGREGATE reply (xid=0x4): packet_count=0 byte_count=0 flow_count=2382\n"
I0311 07:28:09.576395  978518 ovs.go:170] exec(4): stderr: ""
I0311 07:28:09.576468  978518 ovs.go:166] exec(5): /usr/bin/ovs-vsctl --timeout=15 -- --if-exists del-port br-int k8s-master-0 -- --may-exist add-port br-int ovn-k8s-mp0 -- set interface ovn-k8s-mp0 type=internal mtu_request=1400 external-ids:iface-id=k8s-master-0
I0311 07:28:09.583200  978518 ovs.go:169] exec(5): stdout: ""
I0311 07:28:09.583276  978518 ovs.go:170] exec(5): stderr: ""
I0311 07:28:09.583321  978518 ovs.go:166] exec(6): /usr/bin/ovs-vsctl --timeout=15 --if-exists get interface ovn-k8s-mp0 mac_in_use
I0311 07:28:09.588979  978518 ovs.go:169] exec(6): stdout: "\"02:d7:2b:fa:dd:74\"\n"
I0311 07:28:09.589011  978518 ovs.go:170] exec(6): stderr: ""
I0311 07:28:09.589045  978518 ovs.go:166] exec(7): /usr/bin/ovs-vsctl --timeout=15 set interface ovn-k8s-mp0 mac=02\:d7\:2b\:fa\:dd\:74
I0311 07:28:09.594582  978518 ovs.go:169] exec(7): stdout: ""
I0311 07:28:09.594609  978518 ovs.go:170] exec(7): stderr: ""
I0311 07:28:09.638485  978518 gateway_init.go:162] Initializing Gateway Functionality
I0311 07:28:09.638959  978518 gateway_localnet.go:184] Node local addresses initialized to: map[10.129.0.2:{10.129.0.0 fffffe00} 127.0.0.1:{127.0.0.0 ff000000} 9.114.99.70:{9.114.96.0 fffffc00} ::1:{::1 ffffffffffffffffffffffffffffffff} fe80::50b2:16ff:fe43:a8a3:{fe80:: ffffffffffffffff0000000000000000} fe80::b47a:e4de:548:45a5:{fe80:: ffffffffffffffff0000000000000000} fe80::d7:2bff:fefa:dd74:{fe80:: ffffffffffffffff0000000000000000}]
I0311 07:28:09.639157  978518 helper_linux.go:73] Found default gateway interface env32 9.114.96.1
F0311 07:28:09.639228  978518 ovnkube.go:130] could not find IP addresses: failed to lookup link none: Link not found

      Exit Code:    1
      Started:      Thu, 11 Mar 2021 02:28:08 -0500
      Finished:     Thu, 11 Mar 2021 02:28:09 -0500
    Ready:          False
    Restart Count:  91
    Requests:
      cpu:      10m
      memory:   300Mi
    Readiness:  exec [test -f /etc/cni/net.d/10-ovn-kubernetes.conf] delay=5s timeout=1s period=5s #success=1 #failure=3
    Environment:
      KUBERNETES_SERVICE_PORT:          6443
      KUBERNETES_SERVICE_HOST:          api-int.ktania-48.redhat.com
      OVN_CONTROLLER_INACTIVITY_PROBE:  30000
      OVN_KUBE_LOG_LEVEL:               4
      K8S_NODE:                          (v1:spec.nodeName)
    Mounts:
      /cni-bin-dir from host-cni-bin (rw)
      /env from env-overrides (rw)
      /etc/cni/net.d from host-cni-netd (rw)
      /etc/openvswitch from etc-openvswitch (rw)
      /etc/ovn/ from etc-openvswitch (rw)
      /etc/systemd/system from systemd-units (ro)
      /host from host-slash (ro)
      /ovn-ca from ovn-ca (rw)
      /ovn-cert from ovn-cert (rw)
      /run/netns from host-run-netns (ro)
      /run/openvswitch from run-openvswitch (rw)
      /run/ovn-kubernetes/ from host-run-ovn-kubernetes (rw)
      /run/ovn/ from run-ovn (rw)
      /run/ovnkube-config/ from ovnkube-config (rw)
      /var/lib/cni/networks/ovn-k8s-cni-overlay from host-var-lib-cni-networks-ovn-kubernetes (rw)
      /var/lib/openvswitch from var-lib-openvswitch (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from ovn-kubernetes-node-token-22qmz (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  systemd-units:
    Type:          HostPath (bare host directory volume)
    Path:          /etc/systemd/system
    HostPathType:  
  host-slash:
    Type:          HostPath (bare host directory volume)
    Path:          /
    HostPathType:  
  host-run-netns:
    Type:          HostPath (bare host directory volume)
    Path:          /run/netns
    HostPathType:  
  var-lib-openvswitch:
    Type:          HostPath (bare host directory volume)
    Path:          /var/lib/openvswitch/data
    HostPathType:  
  etc-openvswitch:
    Type:          HostPath (bare host directory volume)
    Path:          /var/lib/openvswitch/etc
    HostPathType:  
  run-openvswitch:
    Type:          HostPath (bare host directory volume)
    Path:          /var/run/openvswitch
    HostPathType:  
  run-ovn:
    Type:          HostPath (bare host directory volume)
    Path:          /var/run/ovn
    HostPathType:  
  host-run-ovn-kubernetes:
    Type:          HostPath (bare host directory volume)
    Path:          /run/ovn-kubernetes
    HostPathType:  
  host-cni-bin:
    Type:          HostPath (bare host directory volume)
    Path:          /var/lib/cni/bin
    HostPathType:  
  host-cni-netd:
    Type:          HostPath (bare host directory volume)
    Path:          /var/run/multus/cni/net.d
    HostPathType:  
  host-var-lib-cni-networks-ovn-kubernetes:
    Type:          HostPath (bare host directory volume)
    Path:          /var/lib/cni/networks/ovn-k8s-cni-overlay
    HostPathType:  
  ovnkube-config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      ovnkube-config
    Optional:  false
  env-overrides:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      env-overrides
    Optional:  true
  ovn-ca:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      ovn-ca
    Optional:  false
  ovn-cert:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  ovn-cert
    Optional:    false
  ovn-node-metrics-cert:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  ovn-node-metrics-cert
    Optional:    true
  ovn-kubernetes-node-token-22qmz:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  ovn-kubernetes-node-token-22qmz
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  beta.kubernetes.io/os=linux
Tolerations:     op=Exists
Events:
  Type     Reason   Age                 From     Message
  ----     ------   ----                ----     -------
  Normal   Pulled   12s (x4 over 109s)  kubelet  Container image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:bc9b1c6f7e550147e8e1b53e668a44037e02912487e990bb1c24771656573a6c" already present on machine
  Normal   Created  12s (x4 over 108s)  kubelet  Created container ovnkube-node
  Normal   Started  12s (x4 over 108s)  kubelet  Started container ovnkube-node
  Warning  BackOff  10s (x8 over 107s)  kubelet  Back-off restarting failed container
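
Reading the output above: the ovnkube-node command passes `--gateway-interface none` unless /etc/systemd/system/ovs-configuration.service mentions OVNKubernetes, and the fatal log line is "failed to lookup link none: Link not found". This suggests, though the report does not confirm it, that the node took the else branch. A minimal sketch of that selection logic follows, for checking a node by hand. The function name `select_gateway_flags` and its `unit_file` parameter are hypothetical; the real script inlines the check.

```shell
#!/bin/bash
# Hypothetical helper mirroring the gateway flag selection in the
# ovnkube-node container command shown above. The function name and the
# unit_file parameter are assumptions made for illustration only.
select_gateway_flags() {
  local unit_file="$1"
  # grep -q exits non-zero when the marker is absent or the file is missing;
  # both cases fall through to the "none" branch.
  if grep -q OVNKubernetes "${unit_file}" 2>/dev/null; then
    echo "--gateway-mode local --gateway-interface br-ex"
  else
    echo "--gateway-mode local --gateway-interface none"
  fi
}
```

Calling `select_gateway_flags /etc/systemd/system/ovs-configuration.service` on the affected node would show which branch is taken there. If it prints the `none` variant, that matches the "link none" lookup failure in the log.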

Comment 1 lmcfadde 2021-03-12 17:59:56 UTC
pliu, any updates?

Comment 2 Peng Liu 2021-03-15 14:05:05 UTC
In 4.8.0, some ovn-kube changes break the current SDN migration approach, so we need to refactor the migration solution. Here's the PR I proposed: https://github.com/openshift/cluster-network-operator/pull/763. After this PR is merged, there will be a new operation procedure for 4.8, which I think will fix this BZ.

Comment 3 lmcfadde 2021-03-16 19:40:07 UTC
Part of the test is to validate the migration from OpenShiftSDN to OVNKubernetes. Installation with OpenShiftSDN was successful; however, the migration failed as described in this BZ, so this BZ is currently blocking a regression validation story. Adding this info here. I can see that activity is happening on the PR.

Comment 4 Dan Li 2021-03-17 14:06:43 UTC
The IBM Z team has also reported observing this issue on the s390x platform. Should we escalate this bug to a blocker?

Comment 5 Peng Liu 2021-03-22 09:27:08 UTC
This BZ should be a blocker.

Comment 11 errata-xmlrpc 2021-07-27 22:52:37 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438

