Bug 1826463

Summary: TestContainerRuntimeConfigPidsLimit test frequently failing
Product: OpenShift Container Platform
Component: Machine Config Operator
Reporter: Kirsten Garrison <kgarriso>
Assignee: Peter Hunt <pehunt>
QA Contact: Michael Nguyen <mnguyen>
Status: CLOSED DEFERRED
Severity: high
Priority: medium
Version: 4.5
Target Release: 4.5.0
CC: amurdaca, smilner, zyu
Hardware: Unspecified
OS: Unspecified
Type: Bug
Last Closed: 2020-04-29 12:42:07 UTC

Description Kirsten Garrison 2020-04-21 18:09:30 UTC
Description of problem:
As of last evening, TestContainerRuntimeConfigPidsLimit started failing across PRs in the MCO repo.

Version-Release number of selected component (if applicable):
4.5

How reproducible:
Frequently; see the example CI runs linked below.

machine-config-controller log excerpt from a failing run:

I0421 14:16:43.378036       1 status.go:84] Pool node-pids-limit-00fa9209-0569-4914-903b-ecad037596d2: All nodes are updated with rendered-node-pids-limit-00fa9209-0569-4914-903b-ecad037596d2-634eaf836ce89aff6462d23da0346981
E0421 14:23:12.707088       1 render_controller.go:253] error finding pools for machineconfig: could not find any MachineConfigPool set for MachineConfig mc-pids-limit-00fa9209-0569-4914-903b-ecad037596d2 with labels: map[machineconfiguration.openshift.io/role:node-pids-limit-00fa9209-0569-4914-903b-ecad037596d2]
E0421 14:23:12.761894       1 render_controller.go:253] error finding pools for machineconfig: no MachineConfigPool found for MachineConfig rendered-node-pids-limit-00fa9209-0569-4914-903b-ecad037596d2-37c6d1d18b03faa3c537d237a672dba2 because it has no labels
E0421 14:23:12.764990       1 render_controller.go:253] error finding pools for machineconfig: no MachineConfigPool found for MachineConfig rendered-node-pids-limit-00fa9209-0569-4914-903b-ecad037596d2-634eaf836ce89aff6462d23da0346981 because it has no labels
I0421 14:23:18.130009       1 render_controller.go:497] Generated machineconfig rendered-worker-9d2efc5bce4e84a0546bd8a44f8f9965 from 6 configs: [{MachineConfig  00-worker  machineconfiguration.openshift.io/v1  } {MachineConfig  01-worker-container-runtime  machineconfiguration.openshift.io/v1  } {MachineConfig  01-worker-kubelet  machineconfiguration.openshift.io/v1  } {MachineConfig  99-worker-3f894b33-60fa-4a57-81c3-704cdfedfb7b-registries  machineconfiguration.openshift.io/v1  } {MachineConfig  99-worker-ssh  machineconfiguration.openshift.io/v1  } {MachineConfig  add-a-file-874e00b5-e0b9-4dd9-bd28-0f27ec2f9708  machineconfiguration.openshift.io/v1  }]
I0421 14:23:18.143079       1 render_controller.go:516] Pool worker: now targeting: rendered-worker-9d2efc5bce4e84a0546bd8a44f8f9965
I0421 14:43:24.161981       1 render_controller.go:497] Generated machineconfig rendered-worker-675914441b553b8d9dbb9547260b51b6 from 7 configs: [{MachineConfig  00-worker  machineconfiguration.openshift.io/v1  } {MachineConfig  01-worker-container-runtime  machineconfiguration.openshift.io/v1  } {MachineConfig  01-worker-kubelet  machineconfiguration.openshift.io/v1  } {MachineConfig  99-worker-3f894b33-60fa-4a57-81c3-704cdfedfb7b-registries  machineconfiguration.openshift.io/v1  } {MachineConfig  99-worker-ssh  machineconfiguration.openshift.io/v1  } {MachineConfig  add-a-file-874e00b5-e0b9-4dd9-bd28-0f27ec2f9708  machineconfiguration.openshift.io/v1  } {MachineConfig  sshkeys-worker-8c1a5281-4d92-43c6-bd62-f7c5b77d66b7  machineconfiguration.openshift.io/v1  }]
I0421 14:43:24.173947       1 render_controller.go:516] Pool worker: now targeting: rendered-worker-675914441b553b8d9dbb9547260b51b6
E0421 14:56:20.034287       1 render_controller.go:216] error finding pools for machineconfig: could not find any MachineConfigPool set for MachineConfig 99-node-pids-limit-00fa9209-0569-4914-903b-ecad037596d2-7eeb753b-3a6a-4fb5-aee8-a5089c26fdbe-containerruntime with labels: map[machineconfiguration.openshift.io/role:node-pids-limit-00fa9209-0569-4914-903b-ecad037596d2]
I0421 15:03:30.195606       1 render_controller.go:497] Generated machineconfig rendered-worker-dda5263d5acf90f62166be65dbf62dbd from 8 configs: [{MachineConfig  00-worker  machineconfiguration.openshift.io/v1  } {MachineConfig  01-worker-container-runtime  machineconfiguration.openshift.io/v1  } {MachineConfig  01-worker-kubelet  machineconfiguration.openshift.io/v1  } {MachineConfig  99-worker-3f894b33-60fa-4a57-81c3-704cdfedfb7b-registries  machineconfiguration.openshift.io/v1  } {MachineConfig  99-worker-ssh  machineconfiguration.openshift.io/v1  } {MachineConfig  add-a-file-874e00b5-e0b9-4dd9-bd28-0f27ec2f9708  machineconfiguration.openshift.io/v1  } {MachineConfig  kargs-8a131c63-fa29-4ca4-a9e8-6a383da4ed17  machineconfiguration.openshift.io/v1  } {MachineConfig  sshkeys-worker-8c1a5281-4d92-43c6-bd62-f7c5b77d66b7  machineconfiguration.openshift.io/v1  }]
I0421 15:03:30.209304       1 render_controller.go:516] Pool worker: now targeting: rendered-worker-dda5263d5acf90f62166be65dbf62dbd
I0421 15:23:36.222171       1 render_controller.go:497] Generated machineconfig rendered-worker-84e8ff69f53b20985f5531045c857815 from 9 configs: [{MachineConfig  00-worker  machineconfiguration.openshift.io/v1  } {MachineConfig  01-worker-container-runtime  machineconfiguration.openshift.io/v1  } {MachineConfig  01-worker-kubelet  machineconfiguration.openshift.io/v1  } {MachineConfig  99-worker-3f894b33-60fa-4a57-81c3-704cdfedfb7b-registries  machineconfiguration.openshift.io/v1  } {MachineConfig  99-worker-ssh  machineconfiguration.openshift.io/v1  } {MachineConfig  add-a-file-874e00b5-e0b9-4dd9-bd28-0f27ec2f9708  machineconfiguration.openshift.io/v1  } {MachineConfig  kargs-8a131c63-fa29-4ca4-a9e8-6a383da4ed17  machineconfiguration.openshift.io/v1  } {MachineConfig  kerneltype-71a0d6a1-0657-4006-aecd-2dbeff6665f7  machineconfiguration.openshift.io/v1  } {MachineConfig  sshkeys-worker-8c1a5281-4d92-43c6-bd62-f7c5b77d66b7  machineconfiguration.openshift.io/v1  }]
I0421 15:23:36.235619       1 render_controller.go:516] Pool worker: now targeting: rendered-worker-84e8ff69f53b20985f5531045c857815
E0421 15:36:01.672378       1 render_controller.go:216] error finding pools for machineconfig: could not find any MachineConfigPool set for MachineConfig 99-node-pids-limit-00fa9209-0569-4914-903b-ecad037596d2-7eeb753b-3a6a-4fb5-aee8-a5089c26fdbe-containerruntime with labels: map[machineconfiguration.openshift.io/role:node-pids-limit-00fa9209-0569-4914-903b-ecad037596d2]
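
(Context for the "error finding pools for machineconfig" lines: the render controller looks up a pool for each MachineConfig by matching the MC's labels against every pool's machineConfigSelector. Once the test's custom node-pids-limit pool is deleted, its leftover MCs match nothing. A minimal sketch of that matching using the upstream apimachinery helpers; the MC label is taken from the log above, and the worker selector is the stock one:)

```
package main

import (
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/labels"
)

func main() {
	// Labels carried by the orphaned MachineConfig from the log above.
	mcLabels := labels.Set{
		"machineconfiguration.openshift.io/role": "node-pids-limit-00fa9209-0569-4914-903b-ecad037596d2",
	}

	// machineConfigSelector of the worker pool (master is analogous),
	// the only pools left once the test tears down its custom pool.
	workerSelector := &metav1.LabelSelector{
		MatchLabels: map[string]string{
			"machineconfiguration.openshift.io/role": "worker",
		},
	}

	sel, err := metav1.LabelSelectorAsSelector(workerSelector)
	if err != nil {
		panic(err)
	}

	// No remaining pool selector matches the test's role label, hence
	// "could not find any MachineConfigPool set for MachineConfig ...".
	fmt.Println(sel.Matches(mcLabels)) // false
}
```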

Example runs: 
https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/pr-logs/pull/openshift_machine-config-operator/1649/pull-ci-openshift-machine-config-operator-master-e2e-gcp-op/1929

https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/pr-logs/pull/openshift_machine-config-operator/1659/pull-ci-openshift-machine-config-operator-master-e2e-gcp-op/1930

https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/pr-logs/pull/openshift_machine-config-operator/1474/pull-ci-openshift-machine-config-operator-master-e2e-gcp-op/1927

https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/pr-logs/pull/openshift_machine-config-operator/1474/pull-ci-openshift-machine-config-operator-master-e2e-gcp-op/1925

Comment 1 Kirsten Garrison 2020-04-21 18:12:01 UTC
Topmost failure seen on the CI run homepage:

 --- FAIL: TestContainerRuntimeConfigPidsLimit (2609.90s)
    utils_test.go:60: Pool node-pids-limit-55df0de7-fbf2-4d45-b46b-d3a00cf3d600 has rendered config mc-pids-limit-55df0de7-fbf2-4d45-b46b-d3a00cf3d600 with rendered-node-pids-limit-55df0de7-fbf2-4d45-b46b-d3a00cf3d600-8af41a11d07a4a871ca61c3410da5427 (waited 6.00957333s)
    utils_test.go:82: Pool node-pids-limit-55df0de7-fbf2-4d45-b46b-d3a00cf3d600 has completed rendered-node-pids-limit-55df0de7-fbf2-4d45-b46b-d3a00cf3d600-8af41a11d07a4a871ca61c3410da5427 (waited 2m14.007560508s)
    utils_test.go:60: Pool node-pids-limit-55df0de7-fbf2-4d45-b46b-d3a00cf3d600 has rendered config 99-node-pids-limit-55df0de7-fbf2-4d45-b46b-d3a00cf3d600-d7a28ec4-491b-4759-b841-0c256b760067-containerruntime with rendered-node-pids-limit-55df0de7-fbf2-4d45-b46b-d3a00cf3d600-2cf2fc59855924e9fd231eb4cef9fc18 (waited 2.015521109s)
    utils_test.go:82: Pool node-pids-limit-55df0de7-fbf2-4d45-b46b-d3a00cf3d600 has completed rendered-node-pids-limit-55df0de7-fbf2-4d45-b46b-d3a00cf3d600-2cf2fc59855924e9fd231eb4cef9fc18 (waited 54.006398508s)
    ctrcfg_test.go:99: Deleted ContainerRuntimeConfig ctrcfg-pids-limit-55df0de7-fbf2-4d45-b46b-d3a00cf3d600
    utils_test.go:60: Pool node-pids-limit-55df0de7-fbf2-4d45-b46b-d3a00cf3d600 has rendered config mc-pids-limit-55df0de7-fbf2-4d45-b46b-d3a00cf3d600 with rendered-node-pids-limit-55df0de7-fbf2-4d45-b46b-d3a00cf3d600-8af41a11d07a4a871ca61c3410da5427 (waited 7.247007ms)
    utils_test.go:36: 
        	Error Trace:	utils_test.go:36
        	            				ctrcfg_test.go:105
        	            				ctrcfg_test.go:21
        	Error:      	Expected nil, but got: pool node-pids-limit-55df0de7-fbf2-4d45-b46b-d3a00cf3d600 didn't report rendered-node-pids-limit-55df0de7-fbf2-4d45-b46b-d3a00cf3d600-8af41a11d07a4a871ca61c3410da5427 to updated (waited 20m0.009915796s): timed out waiting for the condition
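
(The trailing "timed out waiting for the condition" is the standard apimachinery wait error. Roughly, the test helper polls the pool status until it reports the target rendered config as rolled out; a sketch of that shape, not the actual utils_test.go code, with isPoolUpdated as a hypothetical stand-in for the real status check:)

```
package e2e

import (
	"fmt"
	"time"

	"k8s.io/apimachinery/pkg/util/wait"
)

// waitForPoolComplete approximates the helper in utils_test.go: poll until
// the pool reports the target rendered config, with the 20m budget seen in
// the failure above.
func waitForPoolComplete(target string, isPoolUpdated func(string) bool) error {
	start := time.Now()
	if err := wait.Poll(2*time.Second, 20*time.Minute, func() (bool, error) {
		return isPoolUpdated(target), nil
	}); err != nil {
		// wait.ErrWaitTimeout stringifies to the
		// "timed out waiting for the condition" seen above.
		return fmt.Errorf("pool didn't report %s to updated (waited %v): %v",
			target, time.Since(start), err)
	}
	return nil
}
```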

Comment 2 Kirsten Garrison 2020-04-21 18:52:06 UTC
In an old passing run:

    utils_test.go:60: Pool node-pids-limit-38a610f4-78bd-4813-8d1c-a0d636f44af4 has rendered config mc-pids-limit-38a610f4-78bd-4813-8d1c-a0d636f44af4 with rendered-node-pids-limit-38a610f4-78bd-4813-8d1c-a0d636f44af4-c8fdd7eabb731fced9b7a47d657ae6c1 (waited 6.011953781s)
    utils_test.go:82: Pool node-pids-limit-38a610f4-78bd-4813-8d1c-a0d636f44af4 has completed rendered-node-pids-limit-38a610f4-78bd-4813-8d1c-a0d636f44af4-c8fdd7eabb731fced9b7a47d657ae6c1 (waited 1m40.006386104s)
    utils_test.go:60: Pool node-pids-limit-38a610f4-78bd-4813-8d1c-a0d636f44af4 has rendered config 99-node-pids-limit-38a610f4-78bd-4813-8d1c-a0d636f44af4-96d2d0c8-3828-4577-85b6-dd9c8ee26c59-containerruntime with rendered-node-pids-limit-38a610f4-78bd-4813-8d1c-a0d636f44af4-f79292b27668b87d0a12ebed841d9ccb (waited 2.020746925s)
    utils_test.go:82: Pool node-pids-limit-38a610f4-78bd-4813-8d1c-a0d636f44af4 has completed rendered-node-pids-limit-38a610f4-78bd-4813-8d1c-a0d636f44af4-f79292b27668b87d0a12ebed841d9ccb (waited 54.012257612s)
    ctrcfg_test.go:99: Deleted ContainerRuntimeConfig ctrcfg-pids-limit-38a610f4-78bd-4813-8d1c-a0d636f44af4
    utils_test.go:60: Pool node-pids-limit-38a610f4-78bd-4813-8d1c-a0d636f44af4 has rendered config mc-pids-limit-38a610f4-78bd-4813-8d1c-a0d636f44af4 with rendered-node-pids-limit-38a610f4-78bd-4813-8d1c-a0d636f44af4-c8fdd7eabb731fced9b7a47d657ae6c1 (waited 6.773856ms)
    utils_test.go:82: Pool node-pids-limit-38a610f4-78bd-4813-8d1c-a0d636f44af4 has completed rendered-node-pids-limit-38a610f4-78bd-4813-8d1c-a0d636f44af4-c8fdd7eabb731fced9b7a47d657ae6c1 (waited 50.010949018s)
    utils_test.go:82: Pool worker has completed rendered-worker-c8fdd7eabb731fced9b7a47d657ae6c1 (waited 50.011878179s)

We have this mc-pids-limit and also 99-pids-limit, both producing new rendered-node-pids-limit configs... is this correct?

Comment 3 Kirsten Garrison 2020-04-21 18:53:39 UTC
To clarify the above:

ctrcfg-pids-limit-xxx -> mc-pids-limit/99-pids-limit -> rendered-node-pids-limit-xxx

I don't think we need both... or why do we have two?

Comment 4 Kirsten Garrison 2020-04-21 20:28:17 UTC
In the failed runs we seem to lose a worker node:

                },
                "degradedMachineCount": 0,
                "machineCount": 3,
                "observedGeneration": 6,
                "readyMachineCount": 0,
                "unavailableMachineCount": 1,
                "updatedMachineCount": 0
            }
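
(With counts like these the pool can never satisfy the usual "fully updated" condition, which is roughly the check below over the stock MachineConfigPoolStatus counters, so the test's 20m wait is guaranteed to time out. A sketch with a trimmed stand-in type:)

```
package e2e

// poolStatus is a trimmed stand-in for MachineConfigPoolStatus, holding
// just the counters quoted above.
type poolStatus struct {
	MachineCount            int32
	ReadyMachineCount       int32
	UpdatedMachineCount     int32
	UnavailableMachineCount int32
	DegradedMachineCount    int32
}

// poolUpdated approximates the condition a wait loop needs to observe:
// every machine updated and ready, none unavailable or degraded. With
// machineCount=3, updatedMachineCount=0, unavailableMachineCount=1, it
// stays false indefinitely.
func poolUpdated(s poolStatus) bool {
	return s.UpdatedMachineCount == s.MachineCount &&
		s.ReadyMachineCount == s.MachineCount &&
		s.UnavailableMachineCount == 0 &&
		s.DegradedMachineCount == 0
}
```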

I'm going through the worker journals now and am seeing quite a few sdn/ovs/nto failures during the drain/reboot to apply the rendered config. For example:
```
Apr 21 10:10:43.587097 ci-op-wg85w-w-d-sbn7r.c.openshift-gce-devel-ci.internal hyperkube[1593]: I0421 10:10:43.587025    1593 status_manager.go:570] Patch status for pod "sdn-dd2qz_openshift-sdn(2b78e6b7-0035-4aa5-9bc9-82659e308899)" with "{\"metadata\":{\"uid\":\"2b78e6b7-0035-4aa5-9bc9-82659e308899\"},\"status\":{\"containerStatuses\":[{\"containerID\":\"cri-o://c3018fedd27ecf4c92d196c6fe2d5ec947e51b99a53546f65e51e372ff95afec\",\"image\":\"registry.svc.ci.openshift.org/ci-op-x3q91yfw/stable@sha256:7c23c0e8ecd2689f09c4c0d651a4c1d2dbf9275a0ef42bfd80d9034f5d5e3668\",\"imageID\":\"registry.svc.ci.openshift.org/ci-op-x3q91yfw/stable@sha256:7c23c0e8ecd2689f09c4c0d651a4c1d2dbf9275a0ef42bfd80d9034f5d5e3668\",\"lastState\":{\"terminated\":{\"containerID\":\"cri-o://e4052528899875c1212dc497d04cec2f958cc3c05f43e6152827b4f45147c4b9\",\"exitCode\":255,\"finishedAt\":\"2020-04-21T10:10:28Z\",\"message\":\"I0421 10:09:28.039335    2015 node.go:148] Initializing SDN node \\\"ci-op-wg85w-w-d-sbn7r.c.openshift-gce-devel-ci.internal\\\" (10.0.32.4) of type \\\"redhat/openshift-ovs-networkpolicy\\\"\\nI0421 10:09:28.090242    2015 cmd.go:159] Starting node networking (unknown)\\nF0421 10:10:28.266902    2015 cmd.go:111] Failed to start sdn: node SDN setup failed: timed out waiting for the condition\\n\",\"reason\":\"Error\",\"startedAt\":\"2020-04-21T10:09:27Z\"}},\"name\":\"sdn\",\"ready\":false,\"restartCount\":3,\"started\":true,\"state\":{\"running\":{\"startedAt\":\"2020-04-21T10:10:43Z\"}}}]}}"
Apr 21 10:10:43.587259 ci-op-wg85w-w-d-sbn7r.c.openshift-gce-devel-ci.internal hyperkube[1593]: I0421 10:10:43.587078    1593 status_manager.go:578] Status for pod "sdn-dd2qz_openshift-sdn(2b78e6b7-0035-4aa5-9bc9-82659e308899)" updated successfully: (5, {Phase:Running Conditions:[{Type:Initialized Status:True LastProbeTime:0001-01-01 00:00:00 +0000 UTC LastTransitionTime:2020-04-21 09:52:43 +0000 UTC Reason: Message:} {Type:Ready Status:False LastProbeTime:0001-01-01 00:00:00 +0000 UTC LastTransitionTime:2020-04-21 10:09:26 +0000 UTC Reason:ContainersNotReady Message:containers with unready status: [sdn]} {Type:ContainersReady Status:False LastProbeTime:0001-01-01 00:00:00 +0000 UTC LastTransitionTime:2020-04-21 10:09:26 +0000 UTC Reason:ContainersNotReady Message:containers with unready status: [sdn]} {Type:PodScheduled Status:True LastProbeTime:0001-01-01 00:00:00 +0000 UTC LastTransitionTime:2020-04-21 09:52:40 +0000 UTC Reason: Message:}] Message: Reason: NominatedNodeName: HostIP:10.0.32.4 PodIP:10.0.32.4 PodIPs:[{IP:10.0.32.4}] StartTime:2020-04-21 09:52:43 +0000 UTC InitContainerStatuses:[] ContainerStatuses:[{Name:sdn State:{Waiting:nil Running:&ContainerStateRunning{StartedAt:2020-04-21 10:10:43 +0000 UTC,} Terminated:nil} LastTerminationState:{Waiting:nil Running:nil Terminated:&ContainerStateTerminated{ExitCode:255,Signal:0,Reason:Error,Message:I0421 10:09:28.039335    2015 node.go:148] Initializing SDN node "ci-op-wg85w-w-d-sbn7r.c.openshift-gce-devel-ci.internal" (10.0.32.4) of type "redhat/openshift-ovs-networkpolicy"
Apr 21 10:10:43.587259 ci-op-wg85w-w-d-sbn7r.c.openshift-gce-devel-ci.internal hyperkube[1593]: I0421 10:09:28.090242    2015 cmd.go:159] Starting node networking (unknown)
Apr 21 10:10:43.587259 ci-op-wg85w-w-d-sbn7r.c.openshift-gce-devel-ci.internal hyperkube[1593]: F0421 10:10:28.266902    2015 cmd.go:111] Failed to start sdn: node SDN setup failed: timed out waiting for the condition
Apr 21 10:10:43.587259 ci-op-wg85w-w-d-sbn7r.c.openshift-gce-devel-ci.internal hyperkube[1593]: ,StartedAt:2020-04-21 10:09:27 +0000 UTC,FinishedAt:2020-04-21 10:10:28 +0000 
```

Comment 5 Peter Hunt 2020-04-21 20:36:19 UTC
Kirsten,

do you have access to the rendered crio.conf on nodes that upgraded, and can you post them here?

I don't have time left today to launch a cluster and check. If you can't get it, I'll be able to get it tomorrow.
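
(For anyone pulling the artifacts: the pidsLimit from a ContainerRuntimeConfig is rendered into crio.conf under the [crio.runtime] table. A sketch of the stanza to look for; 2048 is an illustrative value, not one recovered from these runs:)

```
[crio.runtime]
# pids_limit is what the ContainerRuntimeConfig's pidsLimit renders to.
pids_limit = 2048
```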

Comment 6 Kirsten Garrison 2020-04-21 21:58:00 UTC
These are CI runs, so everything would be in the artifacts folder of each run (if it exists).

Comment 7 Kirsten Garrison 2020-04-21 23:44:30 UTC
Another thing that I see is some sdn errors:
```
Apr 21 10:23:13.233143 ci-op-wg85w-w-d-sbn7r.c.openshift-gce-devel-ci.internal hyperkube[1593]: I0421 10:20:05.480540    9367 cmd.go:159] Starting node networking (unknown)
Apr 21 10:23:13.233143 ci-op-wg85w-w-d-sbn7r.c.openshift-gce-devel-ci.internal hyperkube[1593]: F0421 10:21:05.563116    9367 cmd.go:111] Failed to start sdn: node SDN setup failed: timed out waiting for the condition
Apr 21 10:23:13.233143 ci-op-wg85w-w-d-sbn7r.c.openshift-gce-devel-ci.internal hyperkube[1593]: ,StartedAt:2020-04-21 10:20:05 +0000 UTC,FinishedAt:2020-04-21 10:21:05 +0000 
```

```
Apr 21 10:27:07.413522 ci-op-wg85w-w-d-sbn7r.c.openshift-gce-devel-ci.internal hyperkube[1593]: E0421 10:27:07.413056    1593 pod_workers.go:191] Error syncing pod 2b78e6b7-0035-4aa5-9bc9-82659e308899 ("sdn-dd2qz_openshift-sdn(2b78e6b7-0035-4aa5-9bc9-82659e308899)"), skipping: failed to "StartContainer" for "sdn" with CrashLoopBackOff: "back-off 5m0s restarting failed container=sdn pod=sdn-dd2qz_openshift-sdn(2b78e6b7-0035-4aa5-9bc9-82659e308899)"
Apr 21 10:27:07.413522 ci-op-wg85w-w-d-sbn7r.c.openshift-gce-devel-ci.internal hyperkube[1593]: I0421 10:27:07.413086    1593 event.go:278] Event(v1.ObjectReference{Kind:"Pod", Namespace:"openshift-sdn", Name:"sdn-dd2qz", UID:"2b78e6b7-0035-4aa5-9bc9-82659e308899", APIVersion:"v1", ResourceVersion:"29701", FieldPath:"spec.containers{sdn}"}): type: 'Warning' reason: 'BackOff' Back-off restarting failed container
Apr 21 10:27:07.433004 ci-op-wg85w-w-d-sbn7r.c.openshift-gce-devel-ci.internal hyperkube[1593]: I0421 10:27:07.432896    1593 status_manager.go:570] Patch status for pod "sdn-dd2qz_openshift-sdn(2b78e6b7-0035-4aa5-9bc9-82659e308899)" with "{\"metadata\":{\"uid\":\"2b78e6b7-0035-4aa5-9bc9-82659e308899\"},\"status\":{\"containerStatuses\":[{\"containerID\":\"cri-o://1bd5e1cfae85762e44ff687816bb989e6494dbc3e64a92889966910594b275fa\",\"image\":\"registry.svc.ci.openshift.org/ci-op-x3q91yfw/stable@sha256:7c23c0e8ecd2689f09c4c0d651a4c1d2dbf9275a0ef42bfd80d9034f5d5e3668\",\"imageID\":\"registry.svc.ci.openshift.org/ci-op-x3q91yfw/stable@sha256:7c23c0e8ecd2689f09c4c0d651a4c1d2dbf9275a0ef42bfd80d9034f5d5e3668\",\"lastState\":{\"terminated\":{\"containerID\":\"cri-o://446e86e16b7a073a902a1ae494e7aa579ddb49d06cda9f2a96682acb2ee284fd\",\"exitCode\":255,\"finishedAt\":\"2020-04-21T10:21:05Z\",\"message\":\"I0421 10:20:05.474357    9367 node.go:148] Initializing SDN node \\\"ci-op-wg85w-w-d-sbn7r.c.openshift-gce-devel-ci.internal\\\" (10.0.32.4) of type \\\"redhat/openshift-ovs-networkpolicy\\\"\\nI0421 10:20:05.480540    9367 cmd.go:159] Starting node networking (unknown)\\nF0421 10:21:05.563116    9367 cmd.go:111] Failed to start sdn: node SDN setup failed: timed out waiting for the condition\\n\",\"reason\":\"Error\",\"startedAt\":\"2020-04-21T10:20:05Z\"}},\"name\":\"sdn\",\"ready\":false,\"restartCount\":8,\"started\":false,\"state\":{\"terminated\":{\"containerID\":\"cri-o://1bd5e1cfae85762e44ff687816bb989e6494dbc3e64a92889966910594b275fa\",\"exitCode\":255,\"finishedAt\":\"2020-04-21T10:27:06Z\",\"message\":\"I0421 10:26:06.469058   12442 node.go:148] Initializing SDN node \\\"ci-op-wg85w-w-d-sbn7r.c.openshift-gce-devel-ci.internal\\\" (10.0.32.4) of type \\\"redhat/openshift-ovs-networkpolicy\\\"\\nI0421 10:26:06.476353   12442 cmd.go:159] Starting node networking (unknown)\\nF0421 10:27:06.558969   12442 cmd.go:111] Failed to start sdn: node SDN setup failed: timed out waiting for the condition\\n\",\"reason\":\"Error\",\"startedAt\":\"2020-04-21T10:26:06Z\"}}}]}}"
```

Comment 9 Kirsten Garrison 2020-04-21 23:57:16 UTC
Spoke with Ryan and he thinks this might actually be: https://bugzilla.redhat.com/show_bug.cgi?id=1802534

I'm going to keep this open until we confirm, and use it to track...

Comment 10 Antonio Murdaca 2020-04-28 21:24:14 UTC
This hasn't failed this week, so whatever was causing it no longer appears to be the case. (If I searched CI wrong, please reopen.)

Comment 11 Antonio Murdaca 2020-04-28 21:24:43 UTC
(In reply to Antonio Murdaca from comment #10)
> This hasn't failed this week, so whatever was causing it no longer appears
> to be the case. (If I searched CI wrong, please reopen.)

Also, that test has since been dropped for another one that isn't failing, but hopefully we'll bring it back.