Bug 1703699
| Summary: | MCD is being killed and recreated causing a failed sync | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Antonio Murdaca <amurdaca> |
| Component: | Machine Config Operator | Assignee: | Antonio Murdaca <amurdaca> |
| Status: | CLOSED ERRATA | QA Contact: | Micah Abbott <miabbott> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | unspecified | CC: | ccoleman, deads, mpatel, walters, wking |
| Target Milestone: | --- | Keywords: | Upgrades |
| Target Release: | 4.1.0 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2019-06-04 10:48:05 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 1703879 | | |
Description
Antonio Murdaca
2019-04-27 17:12:23 UTC
I see the probes failing in this log - https://openshift-gce-devel.appspot.com/build/origin-ci-test/pr-logs/pull/22653/pull-ci-openshift-origin-master-e2e-aws-upgrade/89

Colin Walters (comment #2):

> I see the probes failing in this log

Which probes?

(In reply to Colin Walters from comment #2)
> > I see the probes failing in this log
>
> Which probes?

Lots of probes?

```
$ curl -s https://storage.googleapis.com/origin-ci-test/pr-logs/pull/22653/pull-ci-openshift-origin-master-e2e-aws-upgrade/89/build-log.txt | sed -n 's|.*ns/\([a-z-]*\) pod/\([a-z0-9-]*\) \([A-Za-z]*\) probe \([a-z]*\):.*|\3\t\4\t\1/\2|p' | sort | uniq -c | sort -n
      1 Liveness   errored  openshift-sdn/ovs-79jml
      1 Liveness   errored  openshift-sdn/ovs-bd22c
      1 Liveness   errored  openshift-sdn/ovs-vzq8s
      1 Liveness   failed   openshift-console/console-5bb6bf7db4-6bwl7
      1 Liveness   failed   openshift-operator-lifecycle-manager/catalog-operator-6478bf6988-f4d5l
      1 Readiness  errored  kube-system/etcd-quorum-guard-69b7b4499b-6pqrv
      1 Readiness  failed   openshift-marketplace/certified-operators-747d97b84-mvcds
      2 Liveness   errored  openshift-marketplace/community-operators-6dd8c5c5f4-g7p7q
      3 Liveness   failed   openshift-apiserver/apiserver-qxc4z
      3 Liveness   failed   openshift-console/console-5bb6bf7db4-kjfvp
      3 Liveness   failed   openshift-operator-lifecycle-manager/packageserver-5ffdb9d78c-6ntft
      3 Readiness  failed   openshift-marketplace/community-operators-6dd8c5c5f4-g7p7q
      3 Readiness  failed   openshift-operator-lifecycle-manager/packageserver-6bb686cfbb-4gkkg
      3 Readiness  failed   openshift-operator-lifecycle-manager/packageserver-79c89fc4bd-257l7
      4 Liveness   failed   openshift-operator-lifecycle-manager/packageserver-6bb686cfbb-4gkkg
      4 Liveness   failed   openshift-operator-lifecycle-manager/packageserver-6bb686cfbb-hkt49
      4 Liveness   failed   openshift-operator-lifecycle-manager/packageserver-79c89fc4bd-257l7
      4 Readiness  failed   openshift-operator-lifecycle-manager/packageserver-5ffdb9d78c-6ntft
      5 Liveness   failed   openshift-operator-lifecycle-manager/packageserver-5ffdb9d78c-j6k9t
      5 Readiness  failed   openshift-operator-lifecycle-manager/packageserver-5ffdb9d78c-j6k9t
      5 Readiness  failed   openshift-sdn/sdn-579mq
      6 Liveness   failed   openshift-operator-lifecycle-manager/packageserver-5ffdb9d78c-xgbgl
      8 Readiness  failed   openshift-operator-lifecycle-manager/packageserver-6bb686cfbb-hkt49
     11 Readiness  failed   openshift-apiserver/apiserver-qxc4z
     13 Liveness   failed   openshift-dns/dns-default-fjn84
     17 Readiness  failed   openshift-console/console-5bb6bf7db4-kjfvp
     18 Readiness  failed   kube-system/etcd-quorum-guard-69b7b4499b-6pqrv
     28 Readiness  failed   openshift-operator-lifecycle-manager/packageserver-5ffdb9d78c-xgbgl
```

This is not happening anymore in the last jobs. 1162 recent -e2e- jobs aren't throwing this anymore, and it's probably related to the systemd fix which went in as well.

*** Bug 1702390 has been marked as a duplicate of this bug. ***

I checked some of the recent failures in the CI jobs that were referenced in comment #1. I don't see any evidence of the same kind of failures anymore (thanks to Trevor for the handy one-liner!). Moving to VERIFIED.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0758
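
For readers skimming the thread, here is a commented restatement of the probe-tally pipeline quoted above. It is only a readability sketch: the URL and the sed expression are taken verbatim from that comment, and nothing beyond standard curl, sed, and coreutils behavior is assumed.

```sh
# Fetch the raw Prow build log for the e2e-aws-upgrade run cited in the description.
curl -s https://storage.googleapis.com/origin-ci-test/pr-logs/pull/22653/pull-ci-openshift-origin-master-e2e-aws-upgrade/89/build-log.txt |
  # Keep only lines mentioning a probe event, rewriting each one to
  # "<ProbeType> <outcome> <namespace>/<pod>" via the sed capture groups.
  sed -n 's|.*ns/\([a-z-]*\) pod/\([a-z0-9-]*\) \([A-Za-z]*\) probe \([a-z]*\):.*|\3\t\4\t\1/\2|p' |
  # Collapse duplicates into counts and print the noisiest probes last.
  sort | uniq -c | sort -n
```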