Bug 1916401 - Deleting an ingress controller with a bad DNS Record hangs
Summary: Deleting an ingress controller with a bad DNS Record hangs
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Routing
Version: 4.7
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.7.0
Assignee: Stephen Greene
QA Contact: Hongan Li
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-01-14 17:17 UTC by Stephen Greene
Modified: 2021-02-24 15:53 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: An ingress controller is created with a bad hostname, so its DNS record is never successfully created, and the ingress controller is then deleted.
Consequence: The DNS record cannot be deleted from the provider because it was never successfully created, so deleting the ingress controller hangs. The ingress-operator finalizer has to be removed manually from the DNS record for deletion to complete.
Fix: The ingress-operator now deletes an ingress controller's DNS record in a given zone only if the record actually exists in that zone for the provider.
Result: A broken ingress controller created with a bad domain can be deleted.
Clone Of:
Environment:
Last Closed: 2021-02-24 15:53:18 UTC
Target Upstream Version:


Links
System ID Private Priority Status Summary Last Updated
Github openshift cluster-ingress-operator pull 529 0 None closed Bug 1916401: DNS: Skip deleting records that were not published. 2021-01-20 02:34:12 UTC
Red Hat Product Errata RHSA-2020:5633 0 None None None 2021-02-24 15:53:39 UTC

Description Stephen Greene 2021-01-14 17:17:31 UTC
Description of problem:
Deleting a custom ingress controller with a bad domain name hangs on DNS record deletion (at least on GCP; other platforms still need to be tested).


Version-Release number of selected component (if applicable): 4.7 (and prior)


How reproducible:
100%


Steps to Reproduce:
1. Create a trivial ingress controller with spec.domain set to an invalid domain (i.e., use a mutated version of the default ingress controller's domain).
Example on GCP via cluster-bot:
---
apiVersion: operator.openshift.io/v1
kind: IngressController
metadata:
  name: test-ic
spec:
  domain: apps.<your-garbage-here>.origin-ci-int-gce.dev.openshift.com
---

2. Observe the ingress operator fail to create the DNS record for the ingress controller.
3. Delete the ingress controller via oc.


Actual results:
The broken ingress controller cannot be deleted via `oc delete ingresscontroller ...`. The DNS record finalizer has to be removed by hand so that the delete operation can complete (this is safe to do because no DNS record was ever created).

Expected results:
Deleting an ingress controller with a bad domain works without delays or user interventions.

Additional info:
Observed this bug when accidentally creating an ingress controller on GCP with the base domain of a prior cluster.
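
For illustration only, the behavior described in the Doc Text (delete a DNS record from a zone only if it was actually published there) could be sketched as below. This is not the code from the linked pull request: `ZoneStatus`, `Provider`, `deletePublished`, and `fakeProvider` are hypothetical, simplified names invented for this sketch, and the real operator tracks publication state differently.

```go
package main

import "fmt"

// ZoneStatus is a hypothetical, simplified stand-in for the per-zone
// status kept for a DNS record. Published is true only if the provider
// accepted the record in that zone.
type ZoneStatus struct {
	ZoneID    string
	Published bool
}

// Provider is a hypothetical interface over a cloud DNS API.
type Provider interface {
	Delete(zoneID, name string) error
}

// deletePublished removes the record only from zones where it was
// published and returns the IDs of the zones it deleted from. Zones where
// publishing failed are skipped, so there is nothing to hang on.
func deletePublished(p Provider, name string, zones []ZoneStatus) ([]string, error) {
	var deleted []string
	for _, z := range zones {
		if !z.Published {
			// The record never existed in this zone; skip the provider call.
			continue
		}
		if err := p.Delete(z.ZoneID, name); err != nil {
			return deleted, fmt.Errorf("deleting %s from zone %s: %w", name, z.ZoneID, err)
		}
		deleted = append(deleted, z.ZoneID)
	}
	return deleted, nil
}

// fakeProvider is an in-memory Provider that records which zones were
// asked to delete a record.
type fakeProvider struct {
	calls []string
}

func (f *fakeProvider) Delete(zoneID, name string) error {
	f.calls = append(f.calls, zoneID)
	return nil
}
```

With a record whose zones were all unpublished, as in the reproduction steps above, `deletePublished` makes no provider calls and returns without error, so nothing would block removal of the finalizer.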

Comment 3 Hongan Li 2021-01-20 03:30:57 UTC
Verified with 4.7.0-0.nightly-2021-01-19-095812; passed.

An ingress controller with a bad domain can be deleted successfully (without removing finalizers by hand).

Comment 6 errata-xmlrpc 2021-02-24 15:53:18 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5633

