Bug 1957584
Summary: | Routes are not getting created when using hostname without FQDN standard | ||
---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Jobin A T <jat> |
Component: | Networking | Assignee: | Miciah Dashiel Butler Masters <mmasters> |
Networking sub component: | router | QA Contact: | Arvind iyengar <aiyengar> |
Status: | CLOSED ERRATA | Docs Contact: | |
Severity: | high | ||
Priority: | high | CC: | aos-bugs, bmcelvee, cholman, dpateriy, ffranz, fminafra, hongli, jat, kuiwang, lmohanty, mjoseph, mmasters, redhat-info, rkant, scuppett, sgreene, wking |
Version: | 4.7 | Keywords: | FastFix, Regression, Reopened |
Target Milestone: | --- | ||
Target Release: | 4.8.0 | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | If docs needed, set a value | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2021-07-27 23:06:39 UTC | Type: | Bug |
Bug Blocks: | 1965402 |
Description
Jobin A T
2021-05-06 04:52:11 UTC
*** Bug 1957583 has been marked as a duplicate of this bug. ***

This broke half of our internal routes after the upgrade to 4.7.11. We can't use shortname (non fqdn) DNS entries anymore. I don't understand the reasoning behind enforcing this globally for all customers without even an option to disable this behavior if needed? Is there any workaround to disable this check on the ingress controller?

Previously, OpenShift API admitted routes that had invalid DNS names in spec.host, and OpenShift router rejected these routes in some cases. Having a route admitted by the API and rejected by the router caused confusion, and the issue was reported as bug 1896977, which we fixed by modifying the API to reject routes with host names that the router would reject, such as hosts with overly long labels. Unintentionally, we also changed the router to reject routes with host names that the router previously accepted, such as hosts with single labels.

Ideally, the API should reject routes that the router would reject, but the API and router should continue to admit any routes that the router admitted in previous versions of OpenShift.

In order to provide an expedited solution for this new report (bug 1957584), we will revert the fix for bug 1896977. We will need to first revert the change in OpenShift 4.8 and then backport the reversion to OpenShift 4.7. For now, I am setting the target version of this report to 4.8.0, and we will clone the report for the 4.7.z backport.

verified in "4.8.0-0.ci.test-2021-05-26-040433-ci-ln-zdb6fwk-latest" ci image.
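As a rough illustration of the two cases contrasted above (overly long labels versus single-label hosts), the following Python sketch checks a host name against the general DNS label rules: labels of 1 to 63 characters, lowercase alphanumerics and hyphens, no leading or trailing hyphen, and at most 253 characters overall. The function name `host_problems` and the `allow_single_label` switch are hypothetical names chosen for this example; this is not the router's or the API server's actual validation code.

```python
import re
from typing import List

# One DNS label: 1-63 characters, lowercase alphanumerics or '-',
# and no leading or trailing '-'.
_LABEL_RE = re.compile(r"[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?")


def host_problems(host: str, allow_single_label: bool = True) -> List[str]:
    """Return the reasons a host would be rejected (empty list if accepted).

    allow_single_label is a hypothetical switch: True models accepting
    names without dots, False models rejecting them.
    """
    if not host:
        return ["host is empty"]
    problems = []
    if len(host) > 253:
        problems.append("host exceeds 253 characters")
    labels = host.split(".")
    for label in labels:
        if not _LABEL_RE.fullmatch(label):
            problems.append(f"invalid label: {label!r}")
    if len(labels) == 1 and not allow_single_label:
        problems.append("single-label host (no dots)")
    return problems
```

With `allow_single_label=True`, a host such as `foobar` passes, which corresponds to the behavior the revert is meant to restore; a 64-character label is flagged in either mode.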
With the patch revert in place, the unformatted hostnames are now accepted and functional:

-------
oc get clusterversion
NAME      VERSION                                                   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.8.0-0.ci.test-2021-05-26-040433-ci-ln-zdb6fwk-latest   True        False         80m     Cluster version is 4.8.0-0.ci.test-2021-05-26-040433-ci-ln-zdb6fwk-latest

oc expose svc service-unsecure --hostname=foobar
route.route.openshift.io/service-unsecure exposed

oc get route
NAME               HOST/PORT   PATH   SERVICES           PORT   TERMINATION   WILDCARD
service-unsecure   foobar             service-unsecure   http                 None

oc get route service-unsecure -o yaml
apiVersion: route.openshift.io/v1
kind: Route
metadata:
  creationTimestamp: "2021-05-26T05:54:47Z"
  labels:
    name: service-unsecure
  name: service-unsecure
  namespace: test1
  resourceVersion: "58363"
  uid: b869b810-656a-437d-8b40-77170f0330ec
spec:
  host: foobar
  port:
    targetPort: http
  to:
    kind: Service
    name: service-unsecure
    weight: 100
  wildcardPolicy: None
status:
  ingress:
  - conditions:
    - lastTransitionTime: "2021-05-26T05:54:47Z"
      status: "True"
      type: Admitted
    host: foobar
    routerCanonicalHostname: apps.ci-ln-zdb6fwk-f76d1.origin-ci-int-gce.dev.openshift.com
    routerName: default
    wildcardPolicy: None
-------

We're asking the following questions to evaluate whether or not this bug warrants blocking an upgrade edge from either the previous X.Y or X.Y.Z. The ultimate goal is to avoid delivering an update which introduces new risk or reduces cluster functionality in any way. Sample answers are provided to give more context and the UpgradeBlocker flag has been added to this bug. It will be removed if the assessment indicates that this should not block upgrade edges. The expectation is that the assignee answers these questions.

Who is impacted? If we have to block upgrade edges based on this issue, which edges would need blocking?
  example: Customers upgrading from 4.y.Z to 4.y+1.z running on GCP with thousands of namespaces, approximately 5% of the subscribed fleet
  example: All customers upgrading from 4.y.z to 4.y+1.z fail approximately 10% of the time

What is the impact? Is it serious enough to warrant blocking edges?
  example: Up to 2 minute disruption in edge routing
  example: Up to 90 seconds of API downtime
  example: etcd loses quorum and you have to restore from backup

How involved is remediation (even moderately serious impacts might be acceptable if they are easy to mitigate)?
  example: Issue resolves itself after five minutes
  example: Admin uses oc to fix things
  example: Admin must SSH to hosts, restore from backups, or other non standard admin activities

Is this a regression (if all previous versions were also vulnerable, updating to the new, vulnerable version does not increase exposure)?
  example: No, it’s always been like this we just never noticed
  example: Yes, from 4.y.z to 4.y+1.z Or 4.y.z to 4.y.z+1

I've added ImpactStatementRequested, per [1]. Once we have an impact statement candidate, can you clear that and set ImpactStatementProposed? If not, no worries, we'll probably notice anyway ;)

[1]: https://github.com/openshift/enhancements/pull/475

> Who is impacted? If we have to block upgrade edges based on this issue, which edges would need blocking?

Customers upgrading from 4.6.z to 4.7.z with routes that have invalid DNS names that the router previously accepted, namely single-label names (i.e., host names without dots).

> What is the impact? Is it serious enough to warrant blocking edges?

Routes with invalid DNS names are now rejected. Applications will be unreachable.

> How involved is remediation (even moderately serious impacts might be acceptable if they are easy to mitigate)?

Admin must change routes to use valid DNS names or downgrade to an earlier OpenShift release.

> Is this a regression (if all previous versions were also vulnerable, updating to the new, vulnerable version does not increase exposure)?

Yes, from 4.6.z to 4.7.z.

The PR merge made it into the "4.8.0-0.nightly-2021-05-27-185001" nightly release, and the bug has been verified via pre-merge (c#15), but the bot likely did not move it to "verified". Hence manually setting the appropriate state.

We are not considering this as UpgradeBlocker because we don't know how many clusters are impacted, as this information isn't submitted via Telemetry/Insights. However, if we see more customer cases we might change our stance.

Hi, does this bug require doc text? If so, please update the doc text field.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438
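
Because the impact assessment in this report turns on how many routes use single-label host names, a short Python sketch along the following lines could help an administrator list such routes before upgrading. It assumes the input is JSON produced by `oc get routes --all-namespaces -o json`; the function name `single_label_routes` and the sample data are illustrative only and are not part of any OpenShift tooling.

```python
import json
from typing import List, Tuple


def single_label_routes(route_list_json: str) -> List[Tuple[str, str, str]]:
    """Return (namespace, name, host) for each route whose host has no dots."""
    items = json.loads(route_list_json).get("items", [])
    found = []
    for route in items:
        host = route.get("spec", {}).get("host", "")
        if host and "." not in host:
            meta = route.get("metadata", {})
            found.append((meta.get("namespace", ""), meta.get("name", ""), host))
    return found


# Illustrative input shaped like `oc get routes --all-namespaces -o json` output.
sample = json.dumps({
    "items": [
        {"metadata": {"namespace": "test1", "name": "service-unsecure"},
         "spec": {"host": "foobar"}},
        {"metadata": {"namespace": "test1", "name": "web"},
         "spec": {"host": "web.apps.example.com"}},
    ]
})

for namespace, name, host in single_label_routes(sample):
    print(f"{namespace}/{name}: {host}")
```

In practice the JSON string would come from the `oc` command output rather than from the hard-coded sample; only the route with host `foobar` is reported here because the other host contains dots.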