Description of problem:
This cluster appears to have self-destructed catastrophically during the e2e run:
https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-openstack-4.4/1335
Among other things, there appear to be no available nodes:
https://storage.googleapis.com/origin-ci-test/logs/release-openshift-ocp-installer-e2e-openstack-4.4/1335/artifacts/e2e-openstack/nodes.json
My theory is that this led to the variety of other errors seen (connections being reset when talking to the apiserver, failures to contact the apiserver, watches being closed).
Version-Release number of selected component (if applicable):
4.4 on openstack
Possibly related: this run (also on openstack) saw similar failures, and in this case all the nodes are marked as "unreachable" (maybe a networking issue?):
https://storage.googleapis.com/origin-ci-test/logs/release-openshift-ocp-installer-e2e-openstack-4.4/1348/artifacts/e2e-openstack/nodes.json
e.g.:
"taints": [
  { "effect": "NoSchedule", "key": "node-role.kubernetes.io/master" },
  { "effect": "NoSchedule", "key": "node.kubernetes.io/unreachable", "timeAdded": "2020-03-28T15:53:22Z" }
]
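For anyone triaging further runs, a minimal sketch of how the nodes.json artifact could be checked for this taint. It assumes the artifact is a standard Kubernetes NodeList (an "items" array whose entries carry "metadata.name" and "spec.taints"); the function name `unreachable_nodes` and the node name "master-0" in the sample are hypothetical, used only for illustration.

```python
import json

# Taint key taken from the nodes.json excerpt quoted above.
UNREACHABLE_KEY = "node.kubernetes.io/unreachable"


def unreachable_nodes(node_list):
    """Return (node name, timeAdded) for each node carrying the unreachable taint.

    Assumes node_list is a parsed Kubernetes NodeList document.
    """
    hits = []
    for node in node_list.get("items", []):
        name = node.get("metadata", {}).get("name", "<unknown>")
        # "taints" may be absent or null on a healthy, untainted node.
        for taint in node.get("spec", {}).get("taints") or []:
            if taint.get("key") == UNREACHABLE_KEY:
                hits.append((name, taint.get("timeAdded")))
                break  # one report per node is enough
    return hits


# Hypothetical sample mirroring the taints shown above; the node name is made up.
sample = json.loads("""
{
  "items": [
    {
      "metadata": {"name": "master-0"},
      "spec": {
        "taints": [
          {"effect": "NoSchedule", "key": "node-role.kubernetes.io/master"},
          {"effect": "NoSchedule", "key": "node.kubernetes.io/unreachable",
           "timeAdded": "2020-03-28T15:53:22Z"}
        ]
      }
    }
  ]
}
""")

for name, added in unreachable_nodes(sample):
    print(f"{name} unreachable since {added}")
```

To run this against a real artifact, download nodes.json from one of the runs and pass the result of `json.load()` on that file to `unreachable_nodes` in place of `sample`.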
One more incident:
https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-openstack-4.4/1330
https://storage.googleapis.com/origin-ci-test/logs/release-openshift-ocp-installer-e2e-openstack-4.4/1330/artifacts/e2e-openstack/nodes.json
Again, the nodes were marked unreachable. Raising severity to urgent, as this seems to represent a fundamental stability problem for clusters on openstack.
Same (nodes unreachable):
https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-openstack-4.4/1358
https://storage.googleapis.com/origin-ci-test/logs/release-openshift-ocp-installer-e2e-openstack-4.4/1358/artifacts/e2e-openstack/nodes.json
This is a problem with exec liveness probes within conmon. We have a fix and are getting it backported into the tree. Severity of 1817568 has been raised to Urgent.
*** This bug has been marked as a duplicate of bug 1817568 ***