Bug 1916401

Summary: Deleting an ingress controller with a bad DNS Record hangs
Product: OpenShift Container Platform Reporter: Stephen Greene <sgreene>
Component: Networking Assignee: Stephen Greene <sgreene>
Networking sub component: router QA Contact: Hongan Li <hongli>
Status: CLOSED ERRATA Docs Contact:
Severity: medium    
Priority: medium CC: amcdermo, aos-bugs
Version: 4.7   
Target Milestone: ---   
Target Release: 4.7.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Cause: An ingress controller is created with a bad hostname and then deleted. The DNS record for the ingress controller is never successfully created at the provider.
Consequence: Because the DNS record was never created, it cannot be deleted from the provider, so deleting the ingress controller hangs. The ingress-operator finalizer has to be removed manually from the DNS record for deletion to complete.
Fix: The ingress-operator now deletes the DNS record for an ingress controller in a given zone only if the record actually exists in that zone for the provider.
Result: A broken ingress controller that was created with a bad domain can be deleted.
Story Points: ---
Clone Of: Environment:
Last Closed: 2021-02-24 15:53:18 UTC Type: Bug

Description Stephen Greene 2021-01-14 17:17:31 UTC
Description of problem:
Deleting a custom ingress controller with a bad domain name hangs on DNS record deletion (at least on GCP; other platforms still need to be tested).


Version-Release number of selected component (if applicable): 4.7 (and prior)


How reproducible:
100%


Steps to Reproduce:
1. Create a trivial ingress controller with spec.domain set to an invalid domain (e.g., use a mutated version of the default ingress controller's domain).
Example on GCP via cluster-bot:
---
apiVersion: operator.openshift.io/v1
kind: IngressController
metadata:
  name: test-ic
spec:
  domain: apps.<your-garbage-here>.origin-ci-int-gce.dev.openshift.com
---

2. Observe the ingress operator fail to create the DNS record for the ingress controller.
3. Delete the ingress controller via oc.


Actual results:
The broken ingress controller cannot be deleted via `oc delete ingresscontroller ...`. The DNS record finalizer has to be removed by hand so the delete operation can complete (this is safe to do since no DNS record was ever created).

Expected results:
Deleting an ingress controller with a bad domain works without delays or user intervention.

Additional info:
Observed this bug when accidentally creating an ingress controller on GCP with the base domain of a prior cluster.
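
The behavior described in the Doc Text fix (delete a DNS record in a zone only if it actually exists there) can be sketched roughly as follows. This is a minimal illustration and not the ingress-operator's actual code: the `Zone`, `Record`, `ZoneStatus`, and `Provider` types, the `Published` flag, and the `fakeProvider` are all hypothetical names introduced for this example.

```go
package main

import "fmt"

// Zone and Record are hypothetical stand-ins for the operator's DNS types.
type Zone string

type Record struct {
	Name string
}

// ZoneStatus (hypothetical) remembers whether the record was ever
// successfully created in a given zone.
type ZoneStatus struct {
	Zone      Zone
	Published bool
}

// Provider is a hypothetical, minimal DNS provider interface.
type Provider interface {
	Delete(zone Zone, rec Record) error
}

// deleteRecord asks the provider to delete rec only in zones where it was
// published. Zones where creation never succeeded are skipped, so a record
// with a bad domain no longer blocks finalization.
func deleteRecord(p Provider, rec Record, statuses []ZoneStatus) error {
	var firstErr error
	for _, s := range statuses {
		if !s.Published {
			continue // nothing exists at the provider for this zone
		}
		if err := p.Delete(s.Zone, rec); err != nil && firstErr == nil {
			firstErr = fmt.Errorf("zone %s: %w", s.Zone, err)
		}
	}
	return firstErr
}

// fakeProvider rejects deletes for records it does not hold, which mimics
// the failure that made deletion hang before the fix.
type fakeProvider struct {
	records map[Zone]bool
	deleted []Zone
}

func (f *fakeProvider) Delete(zone Zone, rec Record) error {
	if !f.records[zone] {
		return fmt.Errorf("record %q not found in zone %q", rec.Name, zone)
	}
	delete(f.records, zone)
	f.deleted = append(f.deleted, zone)
	return nil
}

func main() {
	p := &fakeProvider{records: map[Zone]bool{}}
	rec := Record{Name: "*.apps.bad-domain.example.test."}
	// The record was never published, so no provider call is made.
	err := deleteRecord(p, rec, []ZoneStatus{{Zone: "public", Published: false}})
	fmt.Println("delete error:", err)
}
```

With the unconditional approach, the provider call in this scenario would fail on every retry and the finalizer would never be cleared; skipping unpublished zones lets the deletion finish.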

Comment 3 Hongan Li 2021-01-20 03:30:57 UTC
Verified with 4.7.0-0.nightly-2021-01-19-095812 and passed.

The ingress controller with a bad domain can be deleted successfully (without removing finalizers by hand).

Comment 6 errata-xmlrpc 2021-02-24 15:53:18 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5633