Description of problem:

Version-Release number of selected component (if applicable):
4.5.0-0.nightly-2020-03-30-174407

How reproducible:

Steps to Reproduce:
1. Set up a cluster with 3 masters and 7 workers.
2. Create more pods with:
   for i in {1..9} ; do oc new-project project$i ; done
   for i in {1..9} ; do oc create -f https://raw.githubusercontent.com/openshift-qe/v3-testfiles/master/networking/list_for_pods.json -n project$i ; done
3. Create a new project named test and create a NetworkPolicy in it:
   kind: NetworkPolicy
   apiVersion: networking.k8s.io/v1
   metadata:
     name: test-podselector-and-ipblock
   spec:
     podSelector: {}
     ingress:
     - from:
       - ipBlock:
           cidr: 10.131.0.0/24
4. Create a PV load in the test project, following the guide at https://github.com/qinpingli/external-storage/tree/master/iscsi/targetd2
   oc get pvc
   NAME      STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS               AGE
   myclaim   Bound    pvc-4a01bb9a-e406-440e-af35-a3790b796358   100Mi      RWO            iscsi-targetd-vg-targetd   9s
   oc get pod iscsi-pv-pod1
   NAME            READY   STATUS    RESTARTS   AGE
   iscsi-pv-pod1   1/1     Running   0          23s
5. oc annotate Network.operator.openshift.io cluster "networkoperator.openshift.io/network-migration"=""
6. oc patch Network.config.openshift.io cluster --type='merge' --patch '{"spec":{"networkType":"OVNKubernetes"}}'
7. Wait until all the old pods (openshift-sdn) are gone.
8. Reboot all the nodes:
   for ip in `oc get node -o wide | egrep -v "NAME" |awk '{print $6}'`
   do
     echo "reboot node $ip"
     ssh -i ~/.ssh/openshift-qe.pem -o StrictHostKeyChecking=no core@$ip sudo shutdown -r -t 3
   done

Actual results:
After the nodes reboot and return to Ready status, check the pod status:
oc get pods --all-namespaces -o wide | egrep -v "Running|Comple" | wc -l
119

Warning FailedCreatePodSandBox 94s kubelet, hrw-bar5-77h2d-compute-2 (combined from similar events): Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create pod network sandbox k8s_test-rc-xm4r9_test_19f60add-93f2-4b9e-805b-81f5a99b5126_0(c24e4057c820edf25465d91466a7b9ee96aaad1f09036e19fe793a5a497c6bcd): Multus: [test/test-rc-xm4r9]: error adding container to network "ovn-kubernetes": delegateAdd: error invoking DelegateAdd - "ovn-k8s-cni-overlay": error in getting result from AddNetwork: CNI request failed with status 400: '[test/test-rc-xm4r9] failed to configure pod interface: failed to open netns "/var/run/crio/ns/2e699a7e-ed0c-4bc7-9c38-196cc3e5ee83/net": failed to Statfs "/var/run/crio/ns/2e699a7e-ed0c-4bc7-9c38-196cc3e5ee83/net": no such file or directory

Expected results:
All the pods should be in Running or Completed status.

Additional info:
Rebooting all the nodes again does not help.
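The pod check under "Actual results" can be wrapped in a small helper so the count is easy to compare before and after a reboot. This is only a sketch: the function name count_unhealthy_pods is hypothetical, it assumes the default column layout of `oc get pods --all-namespaces` (STATUS in column 4), and unlike the egrep one-liner it does not count the header row.

```shell
# Hypothetical helper: count pods that are neither Running nor Completed.
# Reads `oc get pods --all-namespaces -o wide` output on stdin.
count_unhealthy_pods() {
  # With --all-namespaces, column 4 is STATUS; NR > 1 skips the header row.
  awk 'NR > 1 && $4 != "Running" && $4 != "Completed" { n++ } END { print n + 0 }'
}

# Usage (requires a logged-in oc client):
#   oc get pods --all-namespaces -o wide | count_unhealthy_pods
```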
This does not seem to be a migration issue. The cluster does not work with OVN at all; this PR may have caused it: https://github.com/openshift/machine-config-operator/pull/1568/
GCP cluster installation with OVN failed with this error:

# oc describe -n openshift-apiserver-operator pod/openshift-apiserver-operator-7f64c8f747-8czpd
Warning FailedCreatePodSandBox <invalid> (x309 over 69m) kubelet, yy3775-6pnkt-m-0.c.openshift-qe.internal (combined from similar events): Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create pod network sandbox k8s_openshift-apiserver-operator-7f64c8f747-8czpd_openshift-apiserver-operator_ed803cd5-47b9-45cc-b80c-b15a37765bc1_0(b79e99040efde0ac4818ff48e50d385cfb9cde8d02b4c2a0574f9f81d7a9f763): Multus: [openshift-apiserver-operator/openshift-apiserver-operator-7f64c8f747-8czpd]: error adding container to network "ovn-kubernetes": delegateAdd: error invoking DelegateAdd - "ovn-k8s-cni-overlay": error in getting result from AddNetwork: CNI request failed with status 400: '[openshift-apiserver-operator/openshift-apiserver-operator-7f64c8f747-8czpd] failed to configure pod interface: failed to open netns "/var/run/crio/ns/b92b8d46-b3f4-40b4-b338-efd3765234e7/net": failed to Statfs "/var/run/crio/ns/b92b8d46-b3f4-40b4-b338-efd3765234e7/net": no such file or directory
'

The ovnkube-master pod has the error below:

[core@yy3775-6pnkt-m-0 ~]$ sudo crictl logs $(sudo crictl ps --pod=$(sudo crictl pods --name=ovnkube-master --quiet) --quiet)
E0401 07:09:02.951433 1 reflector.go:283] k8s.io/client-go/informers/factory.go:133: Failed to watch *v1.Namespace: Get https://api-int.yy3775.qe.gcp.devcluster.openshift.com:6443/api/v1/namespaces?resourceVersion=2345&timeout=6m33s&timeoutSeconds=393&watch=true: dial tcp 34.71.121.10:6443: connect: connection refused

The ovs-node pod has the error below:

# oc logs ovs-node-4xrsp -n openshift-ovn-kubernetes
2020-04-01T06:52:34.886Z|00004|reconnect|INFO|unix:/var/run/openvswitch/db.sock: connecting...
2020-04-01T06:52:34.886Z|00005|reconnect|INFO|unix:/var/run/openvswitch/db.sock: connected
2020-04-01T06:52:34.888Z|00006|dpdk|INFO|DPDK Disabled - Use other_config:dpdk-init to enable
2020-04-01T06:52:34.929Z|00007|bridge|INFO|ovs-vswitchd (Open vSwitch) 2.12.0
2020-04-01T06:52:44.581Z|00003|memory|INFO|6296 kB peak resident set size after 10.0 seconds
2020-04-01T06:52:44.582Z|00004|memory|INFO|cells:38 monitors:3 sessions:2
2020-04-01T06:53:12.613Z|00005|jsonrpc|WARN|unix#7: receive error: Connection reset by peer
2020-04-01T06:53:12.613Z|00006|reconnect|WARN|unix#7: connection dropped (Connection reset by peer)
2020-04-01T06:53:12.646Z|00007|stream_ssl|ERR|SSL_use_certificate_file: error:02001002:system library:fopen:No such file or directory
2020-04-01T06:53:12.646Z|00008|stream_ssl|ERR|SSL_use_PrivateKey_file: error:20074002:BIO routines:FILE_CTRL:system lib
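To check on the node whether the namespace file named in the sandbox error is really missing, the path can be pulled out of the event text. A rough sketch only: extract_netns_path is a hypothetical helper name, and it assumes the message format shown in the FailedCreatePodSandBox event above.

```shell
# Hypothetical helper: print the netns path quoted after 'failed to open netns'.
# Reads event text (for example `oc describe pod` output) on stdin.
extract_netns_path() {
  # Capture the quoted path that follows 'failed to open netns'; keep the first match.
  sed -n 's/.*failed to open netns "\([^"]*\)".*/\1/p' | head -n 1
}

# Usage (pod and namespace are placeholders):
#   oc describe pod <pod> -n <namespace> | extract_netns_path
# Then, on the affected node, list the directory to see which namespaces exist:
#   sudo ls -l /var/run/crio/ns/
```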
AWS installs also hit this problem, so I set the testblocker keyword.
https://github.com/openshift/machine-config-operator/pull/1600 has been merged. This issue should be fixed; please check with registry.svc.ci.openshift.org/ocp/release:4.5.0-0.nightly-2020-04-01-232323
*** Bug 1819930 has been marked as a duplicate of this bug. ***
Verified this bug according to comment 5
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:2409