The etcd quorum guard test does not correctly make nodes unschedulable, which causes occasional failures now that the quorum guard exits more quickly following the fix for bug 1712507.
https://github.com/openshift/machine-config-operator/pull/822 is still open.
I searched the last 14 days of CI results for log messages that were removed or changed in the PR (https://github.com/openshift/machine-config-operator/pull/822):
- "etcdQuotaGard deployment not present"
- "Node object was modified and not up to date; retrying"
- "Failed to make node %s %sschedulable"
I was unable to find any evidence of those messages.
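For anyone repeating the search above, here is a rough sketch of how it could be done against a local copy of the CI logs. The function name count_removed_messages and the idea of a local log directory are assumptions for illustration only; this is not the tooling that was actually used. The third message is matched on its fixed prefix because the original is a format string with %s placeholders.

```shell
# Sketch only: counts log lines containing any of the messages removed/changed by the PR.
# Assumes the CI logs have already been downloaded into a local directory (hypothetical layout).
count_removed_messages() {
  local logs_dir=$1
  # -r: recurse into the directory, -F: treat patterns as fixed strings, not regexes.
  # grep exits non-zero when nothing matches; wc -l still reports 0 in that case.
  grep -rF \
    -e 'etcdQuotaGard deployment not present' \
    -e 'Node object was modified and not up to date; retrying' \
    -e 'Failed to make node' \
    "$logs_dir" 2>/dev/null | wc -l
}
```

Calling count_removed_messages with the path to the downloaded logs prints the number of matching lines; a result of 0 corresponds to "no evidence of those messages".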
Additionally, I pulled the machine-config-operator image included in the 4.1.0-0.nightly-2019-06-19-033215 release and inspected the contents of the changed manifest:
$ ./oc image info -a ../all-the-pull-secrets.json $(./oc adm release info -a ../all-the-pull-secrets.json --image-for=machine-config-operator registry.svc.ci.openshift.org/ocp/release:4.1.0-0.nightly-2019-06-19-033215) | grep Name
$ sudo podman pull --authfile ../all-the-pull-secrets.json quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:976cd21a9b96fa
$ ctr=$(sudo podman create quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:976cd21a9b96fa2e4e1bed568e3f34b9087703f4d18c91
$ mnt=$(sudo podman mount $ctr)
$ sudo grep -C 10 TERM $mnt/manifests/0000_80_machine-config-operator_07_etcdquorumguard_deployment.yaml
- mountPath: /mnt/kube
# properly handle TERM and exit as soon as it is signaled
set -euo pipefail
trap 'jobs -p | xargs -r kill; exit 0' TERM
sleep infinity & wait
declare -r croot=/mnt/kube
declare -r health_endpoint="https://127.0.0.1:2379/health"
declare -r cert="$(find $croot -name 'system:etcd-peer*.crt' -print -quit)"
This confirms that the manifest includes the changes from https://github.com/openshift/machine-config-operator/pull/822
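The TERM handling shown in the manifest excerpt can also be exercised outside the cluster. The following is a minimal sketch, assuming bash and an xargs that supports -r (GNU xargs does); the variable names guard_script, guard_pid and guard_status are made up for this example, and a finite sleep replaces "sleep infinity" so nothing lingers if the check goes wrong. It starts a child shell with the same trap, sends it TERM, and records the exit status.

```shell
#!/usr/bin/env bash
# Sketch only: reproduces the trap pattern from the manifest in a child bash process.
guard_script='
set -euo pipefail
# On TERM, kill any background jobs and exit successfully right away.
trap "jobs -p | xargs -r kill; exit 0" TERM
sleep 60 & wait
'

bash -c "$guard_script" &
guard_pid=$!

# Give the child a moment to install its trap before signalling it.
sleep 1
kill -TERM "$guard_pid"

# Collect the child exit status without aborting this script on a non-zero value.
guard_status=0
wait "$guard_pid" || guard_status=$?
echo "guard exit status: $guard_status"
```

With the trap in place, the child should return shortly after the signal with status 0 rather than being killed with the default TERM status of 143.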
Moving to VERIFIED
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.