Bug 1686204
Summary: | Restoring the default ingress controller requires operator restart | ||
---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Dan Mace <dmace> |
Component: | Networking | Assignee: | Dan Mace <dmace> |
Networking sub component: | router | QA Contact: | Hongan Li <hongli> |
Status: | CLOSED ERRATA | Docs Contact: | |
Severity: | medium | ||
Priority: | unspecified | CC: | aos-bugs, dhansen |
Version: | 4.1.0 | ||
Target Milestone: | --- | ||
Target Release: | 4.1.0 | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | If docs needed, set a value | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2019-06-04 10:45:17 UTC | Type: | Bug |
Description
Dan Mace
2019-03-06 22:34:20 UTC
I was able to delete and recreate the default ingress controller (aka clusteringress) without restarting the cluster-ingress-operator:

$ oc delete clusteringress/default -n openshift-ingress-operator
clusteringress.ingress.openshift.io "default" deleted

$ oc get clusteringresses -n openshift-ingress-operator
No resources found.

$ oc get deploy -n openshift-ingress
No resources found.

$ oc get svc -n openshift-ingress
No resources found.

$ oc create -f assets/defaults/cluster-ingress.yaml
clusteringress.ingress.openshift.io/default created

$ oc get clusteringresses -n openshift-ingress-operator
NAME      AGE
default   56s

$ oc get deploy -n openshift-ingress
NAME             READY   UP-TO-DATE   AVAILABLE   AGE
router-default   2/2     2            2           61s

$ oc get svc -n openshift-ingress
NAME                      TYPE           CLUSTER-IP       EXTERNAL-IP                                                               PORT(S)                      AGE
router-default            LoadBalancer   172.30.197.85    aa41d9d6c406e11e9bd5a0e63434e58b-1280454278.us-east-1.elb.amazonaws.com   80:30794/TCP,443:32084/TCP   65s
router-internal-default   ClusterIP      172.30.139.207   <none>                                                                    80/TCP,443/TCP,1936/TCP      65s

I verified before/after ingress connectivity by accessing the web console using the hostname from the console route. Keep in mind that propagating DNS names from authoritative name servers to resolvers can take several minutes:

$ dig console-openshift-console.apps.danehans.devcluster.openshift.com

; <<>> DiG 9.10.6 <<>> console-openshift-console.apps.danehans.devcluster.openshift.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 12580
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;console-openshift-console.apps.danehans.devcluster.openshift.com. IN A

;; ANSWER SECTION:
console-openshift-console.apps.danehans.devcluster.openshift.com. 5 IN A 34.199.157.5
console-openshift-console.apps.danehans.devcluster.openshift.com. 5 IN A 18.214.218.55

;; Query time: 42 msec
;; SERVER: 10.192.20.245#53(10.192.20.245)
;; WHEN: Wed Mar 06 19:26:40 EST 2019
;; MSG SIZE rcvd: 125

For the "The ingress operator should recreate the default without user intervention" part of the bug, it sounds like this ingress controller should be named "mandatory" or "required" instead of default.

Verified with 4.0.0-0.nightly-2019-03-20-153904; the issue has been fixed. The ingresscontroller/default is recreated automatically after it is deleted.

$ oc get pod -n openshift-ingress
NAME                              READY   STATUS    RESTARTS   AGE
router-default-7cf558bd7f-hj5cm   1/1     Running   0          4h48m
router-default-7cf558bd7f-r55hj   1/1     Running   0          4h48m

$ oc get svc -n openshift-ingress
NAME                      TYPE           CLUSTER-IP       EXTERNAL-IP                                                                    PORT(S)                      AGE
router-default            LoadBalancer   172.30.110.219   a74971e0e4b7511e9a8eb0a21cd44590-1114946490.ap-northeast-1.elb.amazonaws.com   80:32276/TCP,443:31791/TCP   4h50m
router-internal-default   ClusterIP      172.30.180.114   <none>                                                                         80/TCP,443/TCP,1936/TCP      4h50m

$ oc delete -n openshift-ingress-operator ingresscontroller/default

$ oc get pod -n openshift-ingress
NAME                              READY   STATUS              RESTARTS   AGE
router-default-7cf558bd7f-rj46k   0/1     ContainerCreating   0          2s
router-default-7cf558bd7f-sp8fh   0/1     ContainerCreating   0          2s

$ oc get svc -n openshift-ingress
NAME                      TYPE           CLUSTER-IP       EXTERNAL-IP                                                                    PORT(S)                      AGE
router-default            LoadBalancer   172.30.129.155   a4123fbf24b9e11e980c9069c412f6e1-1870353468.ap-northeast-1.elb.amazonaws.com   80:31146/TCP,443:31551/TCP   65s
router-internal-default   ClusterIP      172.30.145.212   <none>                                                                         80/TCP,443/TCP,1936/TCP      65s

As mentioned in Comment 1, the load balancer changed and DNS propagation takes some time, so no route can be accessed until the new records have propagated.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2019:0758
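
The comments above note that routes are unreachable for a while after the default ingress controller is recreated, because the load balancer changes and DNS has to propagate. A minimal sketch of how one might wait for that before re-testing a route is shown here. It is not part of the fix or of any OpenShift tooling: the function names (`resolve_ipv4`, `wait_for_dns_change`), the default timeout and interval, and the rule "resolved and sharing no address with the old load balancer" are all assumptions made for illustration. It relies only on the Python standard library and the local resolver.

```python
import socket
import time


def resolve_ipv4(hostname):
    """Return the set of IPv4 addresses the local resolver reports for hostname.

    Hypothetical helper for this sketch; returns an empty set if the name
    does not resolve.
    """
    try:
        infos = socket.getaddrinfo(
            hostname, None, family=socket.AF_INET, type=socket.SOCK_STREAM
        )
    except socket.gaierror:
        return set()
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return {info[4][0] for info in infos}


def wait_for_dns_change(hostname, old_addresses, timeout=600.0, interval=10.0,
                        resolve=resolve_ipv4, sleep=time.sleep,
                        clock=time.monotonic):
    """Poll until hostname resolves to addresses that share nothing with old_addresses.

    Assumption: the recreated load balancer has different addresses than the
    old one. Returns the new address set, or None if the timeout elapses first.
    The resolve, sleep and clock parameters exist so the loop can be exercised
    without network access.
    """
    old = set(old_addresses)
    deadline = clock() + timeout
    while True:
        current = resolve(hostname)
        if current and current.isdisjoint(old):
            return current
        if clock() >= deadline:
            return None
        sleep(interval)
```

With the values from the dig output above, a call might look like `wait_for_dns_change("console-openshift-console.apps.danehans.devcluster.openshift.com", {"34.199.157.5", "18.214.218.55"})`, run after the ingress controller has been deleted and recreated. Because cloud load balancer addresses can also rotate on their own, a changed answer is only a hint that propagation has finished, not proof.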