Bug 1866868
Summary: | Flake: error waiting for deployment e2e-aws-fips | | |
---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Matthew Heon <mheon> |
Component: | kube-controller-manager | Assignee: | Maciej Szulik <maszulik> |
Status: | CLOSED ERRATA | QA Contact: | RamaKasturi <knarra> |
Severity: | medium | Docs Contact: | |
Priority: | medium | | |
Version: | unspecified | CC: | aos-bugs, danili, fromani, jokerman, knarra, mfojtik |
Target Milestone: | --- | | |
Target Release: | 4.6.0 | | |
Hardware: | Unspecified | | |
OS: | Unspecified | | |
Whiteboard: | non-multi-arch | | |
Fixed In Version: | | Doc Type: | No Doc Update |
Doc Text: | | Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2020-10-27 16:26:05 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | | | |
Description
Matthew Heon
2020-08-06 16:26:37 UTC
e2e-fips is having a myriad of issues right now. Sometimes the cluster doesn't even initialize, and other times it has a ton of flakes/failures. I want to try to narrow this down before suggesting changes or passing it along.

https://search.ci.openshift.org/?search=k8s.io%2Fkubernetes%2Ftest%2Fe2e%2Fapps%2Fdeployment.go%3A904

1 of the 3 deployment test pods is not being scheduled:

Aug 12 18:33:01.831: INFO: At 0001-01-01 00:00:00 +0000 UTC - event for test-rolling-update-with-lb-b9c9c6bcc-4phtw: { } Scheduled: Successfully assigned e2e-deployment-5371/test-rolling-update-with-lb-b9c9c6bcc-4phtw to ip-10-0-146-124.us-east-2.compute.internal

Aug 12 18:33:01.831: INFO: At 0001-01-01 00:00:00 +0000 UTC - event for test-rolling-update-with-lb-b9c9c6bcc-hlsnz: { } FailedScheduling: 0/5 nodes are available: 2 node(s) didn't match pod affinity/anti-affinity, 2 node(s) didn't match pod anti-affinity rules, 3 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate.

Aug 12 18:33:01.831: INFO: At 0001-01-01 00:00:00 +0000 UTC - event for test-rolling-update-with-lb-b9c9c6bcc-vx4j5: { } Scheduled: Successfully assigned e2e-deployment-5371/test-rolling-update-with-lb-b9c9c6bcc-vx4j5 to ip-10-0-228-200.us-east-2.compute.internal

I’m adding UpcomingSprint, because I was occupied by fixing bugs with higher priority/severity, developing new features with higher priority, or developing new features to improve stability at a macro level. I will revisit this bug next sprint.

The fix actually landed in https://github.com/openshift/origin/pull/25010. I will wait a few more days to check for the flake and then move the bug to the verified state.

Moving the bug to the verified state, as I see that the fix landed about 6 days ago and no failures have been seen since then in the roughly 7 days of results checked here:
https://search.ci.openshift.org/?search=k8s.io%2Fkubernetes%2Ftest%2Fe2e%2Fapps%2Fdeployment.go%3A904

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4196
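
For readers trying to interpret the FailedScheduling event quoted in the description, the following is a minimal sketch of how the "0/5 nodes are available" count can arise. It is a toy model and not the kube-scheduler: it assumes, based only on the event message, that the test pods carry a required anti-affinity rule against pods with the same label, and that the cluster has 3 tainted masters and 2 workers. The node names, the `schedule` and `simulate` helpers, and the use of the deployment name as the pod label are all hypothetical. It does not describe what the linked fix changed.

```python
from dataclasses import dataclass, field

# Taint key copied from the event message in the description.
MASTER_TAINT = "node-role.kubernetes.io/master"


@dataclass
class Node:
    name: str
    taints: set = field(default_factory=set)
    pods: list = field(default_factory=list)  # labels of pods already placed here


def schedule(pod_label, nodes, tolerations=frozenset()):
    """Place one pod; return the chosen node name, or None if no node is feasible."""
    for node in nodes:
        if node.taints - set(tolerations):
            continue  # node has a taint the pod does not tolerate
        if pod_label in node.pods:
            continue  # assumed required anti-affinity against same-labelled pods
        node.pods.append(pod_label)
        return node.name
    return None


def simulate(replicas, workers=2, masters=3):
    """Schedule `replicas` identical pods onto a hypothetical cluster."""
    nodes = [Node(f"master-{i}", {MASTER_TAINT}) for i in range(masters)]
    nodes += [Node(f"worker-{i}") for i in range(workers)]
    return [schedule("test-rolling-update-with-lb", nodes) for _ in range(replicas)]


if __name__ == "__main__":
    # With 2 workers and 3 replicas, the third pod has nowhere to go.
    print(simulate(replicas=3))  # ['worker-0', 'worker-1', None]
```

Under these assumptions, two pods land on the two workers and the third is rejected by all five nodes, which matches the shape of the logged event. Calling `simulate(replicas=3, workers=3)` places every pod.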