Bug 1957584

Summary:	Routes are not getting created when using hostname without FQDN standard
Product:	OpenShift Container Platform	Reporter:	Jobin A T <jat>
Component:	Networking	Assignee:	Miciah Dashiel Butler Masters <mmasters>
Networking sub component:	router	QA Contact:	Arvind iyengar <aiyengar>
Status:	CLOSED ERRATA	Docs Contact:
Severity:	high
Priority:	high	CC:	aos-bugs, bmcelvee, cholman, dpateriy, ffranz, fminafra, hongli, jat, kuiwang, lmohanty, mjoseph, mmasters, redhat-info, rkant, scuppett, sgreene, wking
Version:	4.7	Keywords:	FastFix, Regression, Reopened
Target Milestone:	---
Target Release:	4.8.0
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:
Fixed In Version:		Doc Type:	If docs needed, set a value
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2021-07-27 23:06:39 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks:	1965402

Description Jobin A T 2021-05-06 04:52:11 UTC

Description of problem:
Routes are not created when the hostname is not a fully qualified domain name (FQDN).
In OCP 4.6.3 we are able to create routes using the hostname "foo", which is not an FQDN.
But in OCP 4.7 we are not able to create the same kind of route. Creating the route fails with the error below.

The Route "httpd" is invalid: spec.host: Invalid value: "pipequery": host must conform to DNS 1123 naming conventions: [spec.host: Invalid value: "pipequery": should be a domain with at least two segments separated by dots] 

Version-Release number of selected component (if applicable):
version   4.7.1

How reproducible:
100%

Steps to Reproduce:

1. oc new-project demo
2. oc new-app httpd
3. oc expose svc/httpd --hostname=foo

Actual results:

The Route "httpd" is invalid: spec.host: Invalid value: "foo": host must conform to DNS 1123 naming conventions: [spec.host: Invalid value: "foo": should be a domain with at least two segments separated by dots]

Expected results:

The route should be created without any warning.

Additional info:

This works in OCP version 4.6.3 but does not work in OCP version 4.7.1.

Comment 2 Miciah Dashiel Butler Masters 2021-05-07 16:04:26 UTC

*** Bug 1957583 has been marked as a duplicate of this bug. ***

Comment 9 redhat-info 2021-05-22 12:08:24 UTC

This broke half of our internal routes after the upgrade to 4.7.11. 
We can't use short-name (non-FQDN) DNS entries anymore. 

I don't understand the reasoning behind enforcing this globally for all customers without even an option to disable this behavior if needed? 

Is there any workaround to disable this check on the ingress controller?

Comment 14 Miciah Dashiel Butler Masters 2021-05-24 22:18:59 UTC

Previously, OpenShift API admitted routes that had invalid DNS names in spec.host, and OpenShift router rejected these routes in some cases.  Having a route admitted by the API and rejected by the router caused confusion, and the issue was reported as bug 1896977, which we fixed by modifying the API to reject routes with host names that the router would reject, such as hosts with overly long labels.  Unintentionally, we also changed the router to reject routes with host names that the router previously accepted, such as hosts with single labels.  
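The difference between the two rule sets described above can be sketched in a few lines of Python. This is only an illustration, not the OpenShift validation code: the function names label_errors and strict_errors are hypothetical, and the limits (63 characters per label, 253 per name) are the usual DNS 1123 limits. The second function adds the "at least two segments" rule quoted in the error message of this report.

```python
import re

# One DNS-1123 label: lowercase alphanumerics and '-', starting and ending
# with an alphanumeric character.
_LABEL_RE = re.compile(r"[a-z0-9]([-a-z0-9]*[a-z0-9])?")
MAX_LABEL_LEN = 63
MAX_NAME_LEN = 253


def label_errors(host):
    """Return problems that apply to any DNS-1123 name (hypothetical helper)."""
    errors = []
    if not host or len(host) > MAX_NAME_LEN:
        errors.append("must be 1 to 253 characters long")
    for label in host.split("."):
        if len(label) > MAX_LABEL_LEN:
            errors.append(f"label {label!r} is longer than 63 characters")
        elif not _LABEL_RE.fullmatch(label):
            errors.append(f"label {label!r} is not a valid DNS-1123 label")
    return errors


def strict_errors(host):
    """Label checks plus the extra 'at least two segments' rule (hypothetical helper)."""
    errors = label_errors(host)
    if len(host.split(".")) < 2:
        errors.append(
            "should be a domain with at least two segments separated by dots"
        )
    return errors
```

Under these assumptions, a host such as "foo" passes the per-label checks but fails the stricter check, while a host with an overly long label fails both.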

Ideally, the API should reject routes that the router would reject, but the API and router should continue to admit any routes that the router admitted in previous versions of OpenShift.  

In order to provide an expedited solution for this new report (bug 1957584), we will revert the fix for bug 1896977.  We will need to first revert the change in OpenShift 4.8 and then backport the reversion to OpenShift 4.7.  For now, I am setting the target version of this report to 4.8.0, and we will clone the report for the 4.7.z backport.

Comment 15 Arvind iyengar 2021-05-26 06:22:49 UTC

Verified in the "4.8.0-0.ci.test-2021-05-26-040433-ci-ln-zdb6fwk-latest" CI image. With the revert in place, hostnames that are not FQDNs are accepted and functional again: 
-------
oc get clusterversion   
NAME      VERSION                                                  AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.8.0-0.ci.test-2021-05-26-040433-ci-ln-zdb6fwk-latest   True        False         80m     Cluster version is 4.8.0-0.ci.test-2021-05-26-040433-ci-ln-zdb6fwk-latest

oc expose svc service-unsecure --hostname=foobar
route.route.openshift.io/service-unsecure exposed

oc get route                        
NAME               HOST/PORT   PATH   SERVICES           PORT   TERMINATION   WILDCARD
service-unsecure   foobar             service-unsecure   http                 None

oc get route service-unsecure -o yaml
apiVersion: route.openshift.io/v1
kind: Route
metadata:
  creationTimestamp: "2021-05-26T05:54:47Z"
  labels:
    name: service-unsecure
  name: service-unsecure
  namespace: test1
  resourceVersion: "58363"
  uid: b869b810-656a-437d-8b40-77170f0330ec
spec:
  host: foobar
  port:
    targetPort: http
  to:
    kind: Service
    name: service-unsecure
    weight: 100
  wildcardPolicy: None
status:
  ingress:
  - conditions:
    - lastTransitionTime: "2021-05-26T05:54:47Z"
      status: "True"
      type: Admitted
    host: foobar
    routerCanonicalHostname: apps.ci-ln-zdb6fwk-f76d1.origin-ci-int-gce.dev.openshift.com
    routerName: default
    wildcardPolicy: None
-------
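The admission status in the YAML above can also be checked programmatically. The following is a minimal Python sketch, assuming the route has been loaded into a dict (for example from the output of "oc get route service-unsecure -o json"); the function name admitted_by is hypothetical.

```python
def admitted_by(route):
    """Return the names of routers that report Admitted=True for the route."""
    routers = []
    for ingress in route.get("status", {}).get("ingress", []):
        for cond in ingress.get("conditions", []):
            if cond.get("type") == "Admitted" and cond.get("status") == "True":
                routers.append(ingress.get("routerName"))
    return routers


# Trimmed copy of the route shown in this comment.
sample_route = {
    "spec": {"host": "foobar"},
    "status": {
        "ingress": [
            {
                "conditions": [{"status": "True", "type": "Admitted"}],
                "host": "foobar",
                "routerName": "default",
            }
        ]
    },
}

print(admitted_by(sample_route))
```

An empty list would mean that no router has admitted the route.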

Comment 16 Lalatendu Mohanty 2021-05-27 16:20:20 UTC

We're asking the following questions to evaluate whether or not this bug warrants blocking an upgrade edge from either the previous X.Y or X.Y.Z. The ultimate goal is to avoid delivering an update which introduces new risk or reduces cluster functionality in any way. Sample answers are provided to give more context and the UpgradeBlocker flag has been added to this bug. It will be removed if the assessment indicates that this should not block upgrade edges. The expectation is that the assignee answers these questions.

Who is impacted?  If we have to block upgrade edges based on this issue, which edges would need blocking?
  example: Customers upgrading from 4.y.z to 4.y+1.z running on GCP with thousands of namespaces, approximately 5% of the subscribed fleet
  example: All customers upgrading from 4.y.z to 4.y+1.z fail approximately 10% of the time
What is the impact?  Is it serious enough to warrant blocking edges?
  example: Up to 2 minute disruption in edge routing
  example: Up to 90 seconds of API downtime
  example: etcd loses quorum and you have to restore from backup
How involved is remediation (even moderately serious impacts might be acceptable if they are easy to mitigate)?
  example: Issue resolves itself after five minutes
  example: Admin uses oc to fix things
  example: Admin must SSH to hosts, restore from backups, or other non standard admin activities
Is this a regression (if all previous versions were also vulnerable, updating to the new, vulnerable version does not increase exposure)?
  example: No, it’s always been like this; we just never noticed
  example: Yes, from 4.y.z to 4.y+1.z or 4.y.z to 4.y.z+1

Comment 17 W. Trevor King 2021-05-27 16:27:12 UTC

I've added ImpactStatementRequested, per [1].  Once we have an impact statement candidate, can you clear that and set ImpactStatementProposed?  If not, no worries, we'll probably notice anyway ;)

[1]: https://github.com/openshift/enhancements/pull/475

Comment 19 Miciah Dashiel Butler Masters 2021-05-29 07:31:43 UTC

> Who is impacted?  If we have to block upgrade edges based on this issue, which edges would need blocking?

Customers upgrading from 4.6.z to 4.7.z with routes that have invalid DNS names that the router previously accepted, namely single-label names (i.e., host names without dots).  

> What is the impact?  Is it serious enough to warrant blocking edges?

Routes with invalid DNS names are now rejected.  Applications will be unreachable.  

> How involved is remediation (even moderately serious impacts might be acceptable if they are easy to mitigate)?

Admin must change routes to use valid DNS names or downgrade to an earlier OpenShift release.

> Is this a regression (if all previous versions were also vulnerable, updating to the new, vulnerable version does not increase exposure)?

Yes, from 4.6.z to 4.7.z.
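To judge exposure before upgrading, an admin could list the routes that use single-label host names. The following Python sketch assumes the route list has been parsed from the output of "oc get routes --all-namespaces -o json"; the function name single_label_routes is hypothetical, and the check (no dot in spec.host) follows the definition of single-label names given above.

```python
def single_label_routes(route_list):
    """Yield (namespace, name, host) for routes whose spec.host has no dot."""
    for route in route_list.get("items", []):
        host = route.get("spec", {}).get("host", "")
        if host and "." not in host:
            meta = route.get("metadata", {})
            yield meta.get("namespace"), meta.get("name"), host


# Illustrative input in the shape of a route list; the second route's
# host name is made up for the example.
sample = {
    "items": [
        {"metadata": {"namespace": "demo", "name": "httpd"},
         "spec": {"host": "foo"}},
        {"metadata": {"namespace": "demo", "name": "web"},
         "spec": {"host": "web.apps.example.com"}},
    ]
}

for namespace, name, host in single_label_routes(sample):
    print(f"{namespace}/{name}: {host}")
```

On a real cluster, the sample dict would be replaced with the parsed JSON, for instance by passing json.load(sys.stdin) to the function.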

Comment 20 Arvind iyengar 2021-05-31 03:32:10 UTC

The PR merge made it into the "4.8.0-0.nightly-2021-05-27-185001" nightly release, and the bug was verified pre-merge (c#15), but the bot likely did not move it to "verified". Hence, manually setting the appropriate state.

Comment 21 Lalatendu Mohanty 2021-06-02 13:07:11 UTC

We are not considering this an UpgradeBlocker because we don't know how many clusters are impacted, as this information isn't submitted via Telemetry/Insights. However, if we see more customer cases, we might change our stance.

Comment 23 Brandi Munilla 2021-06-24 16:49:57 UTC

Hi, does this bug require doc text? If so, please update the doc text field.

Comment 25 errata-xmlrpc 2021-07-27 23:06:39 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438