Bug 1707510
| Summary: | Install failed: unable to check route health: failed to GET route: dial tcp: lookup [...]: no such host | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Samuel Padgett <spadgett> |
| Component: | Networking | Assignee: | Dan Mace <dmace> |
| Networking sub component: | router | QA Contact: | Hongan Li <hongli> |
| Status: | CLOSED DUPLICATE | Docs Contact: | |
| Severity: | high | | |
| Priority: | unspecified | CC: | aos-bugs, bbennett, nagrawal, wking |
| Version: | 4.1.0 | | |
| Target Milestone: | --- | | |
| Target Release: | 4.1.0 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2019-05-07 19:12:41 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description
Samuel Padgett 2019-05-07 16:46:31 UTC
cluster-authentication-operator is reporting Degraded due to an error checking the current version:

    unable to check route health: failed to GET route: dial tcp: lookup oauth-openshift.apps.ci-op-3gbj403q-c4a31.origin-ci-int-aws.dev.rhcloud.com on 172.30.0.10:53: no such host

Actually, this looks like an issue with LB provisioning. Moving back to Routing.

Every cited CI run is the same AWS LoadBalancer quota issue. So this isn't a routing bug.

> Every cited CI run is the same AWS LoadBalancer quota issue.
I'm reaping leaked AWS resources, which should help with this. But it would be nice for an extended inability to create a load balancer to be bubbled up into a Degraded status. The ingress operator should be able to monitor this and set Degraded if its LoadBalancer request remains unfulfilled for $TOO_LONG. It could also watch for Error/Warning Events in the openshift-ingress namespace to surface the reason from the Service controller.
As it stands, I don't think:

```console
$ curl -s https://storage.googleapis.com/origin-ci-test/pr-logs/pull/openshift_machine-config-operator/716/pull-ci-openshift-machine-config-operator-master-e2e-aws/3512/artifacts/e2e-aws/clusteroperators.json | jq '.items[] | select(.metadata.name == "ingress").status.conditions'
[
  {
    "lastTransitionTime": "2019-05-07T15:58:56Z",
    "message": "operand namespace exists",
    "status": "False",
    "type": "Degraded"
  },
  {
    "lastTransitionTime": "2019-05-07T15:59:44Z",
    "message": "desired and current number of IngressControllers are equal",
    "status": "False",
    "type": "Progressing"
  },
  {
    "lastTransitionTime": "2019-05-07T15:59:43Z",
    "message": "desired and current number of IngressControllers are equal",
    "status": "True",
    "type": "Available"
  }
]
```

is an accurate summary for an ingress operator without a fulfilled LoadBalancer Service.

Agree that we need to improve the conditions in this case. During this, Miciah also realized we haven't implemented our declared API around DNS status: https://github.com/openshift/api/blob/master/operator/v1/types_ingress.go#L253-L275

Given that this was caused by AWS quota issues and https://bugzilla.redhat.com/show_bug.cgi?id=1707545 tracks the fix to report the status properly, I'm closing this.

*** This bug has been marked as a duplicate of bug 1707545 ***
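For reference, ClusterOperator conditions like the ones captured in the artifact above can be checked programmatically. A minimal sketch follows; the `condition` struct and `conditionStatus` helper are hypothetical names here, and the struct mirrors only the fields visible in the JSON, not the full ClusterOperator API type.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// condition mirrors the fields of a ClusterOperator status condition
// as they appear in the clusteroperators.json artifact.
type condition struct {
	Type    string `json:"type"`
	Status  string `json:"status"`
	Message string `json:"message"`
}

// conditionStatus returns the status of the named condition,
// or "" if no condition with that type is present.
func conditionStatus(conds []condition, name string) string {
	for _, c := range conds {
		if c.Type == name {
			return c.Status
		}
	}
	return ""
}

func main() {
	// Abbreviated sample matching the shape of the jq output above.
	raw := `[
	  {"type": "Degraded", "status": "False", "message": "operand namespace exists"},
	  {"type": "Available", "status": "True", "message": "desired and current number of IngressControllers are equal"}
	]`
	var conds []condition
	if err := json.Unmarshal([]byte(raw), &conds); err != nil {
		panic(err)
	}
	fmt.Println(conditionStatus(conds, "Degraded"))  // False
	fmt.Println(conditionStatus(conds, "Available")) // True
}
```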