Bug 1740374

Summary: 5 clusters reporting ingress unavailable with reason IngressUnavailable
Product: OpenShift Container Platform
Component: Networking (sub component: router)
Reporter: Ben Parees <bparees>
Assignee: Miciah Dashiel Butler Masters <mmasters>
QA Contact: Hongan Li <hongli>
Status: CLOSED ERRATA
Severity: low
Priority: low
CC: aos-bugs, nagrawal
Version: 4.2.0
Target Release: 4.3.0
Type: Bug
Last Closed: 2020-01-23 11:05:05 UTC

Comment 3 Dan Mace 2019-08-14 15:01:15 UTC
What would be a more meaningful reason?

At the end of the day, beyond just knowing Available=False, I doubt a more refined Reason is going to be of any serious use in further diagnosing the problem. You have to get the actual operator and operand resources and look at their details (including related objects, condition messages, etc) to have any hope of understanding what's wrong.

Comment 4 Ben Parees 2019-08-14 15:07:15 UTC
Some hint of why it's unavailable (basically the same thing I said about DNS).

Is the operator hitting an error that is preventing it from creating deployments/pods (e.g. API server inaccessibility)? Is the config invalid? Are the pods created but not running?

Comment 5 Dan Mace 2019-08-14 15:16:59 UTC
The reason means "some ingresscontroller resources have an Available=False condition". To be any more specific here would be to try and aggregate those potentially disparate ingresscontroller condition reasons into a single reason, which would be incomprehensible. This is why messages and related resources exist. The only path I see to further understanding is looking at the conditions on the specific ingress controllers.

Would "SomeIngressControllersUnavailable" be more clear? AtLeastOneIngressControllerUnavailable?
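The aggregation comment 5 describes can be sketched in Go. This is a minimal, hypothetical illustration (the `Condition` type and `aggregateDegraded` function are simplified stand-ins, not the operator's actual code; the real types live in the OpenShift API packages): the roll-up keeps a single generic reason and pushes the per-controller detail into the message, exactly because merging disparate per-controller reasons into one reason string would be incomprehensible.

```go
package main

import "fmt"

// Condition is a simplified stand-in for an ingresscontroller status
// condition (the real operator uses richer API types).
type Condition struct {
	Type   string
	Status string
}

// aggregateDegraded rolls per-ingresscontroller Degraded conditions up into
// a single clusteroperator-style condition: the reason stays generic, and
// the message names the specific degraded controllers.
func aggregateDegraded(controllers map[string][]Condition) (status, reason, message string) {
	var degraded []string
	for name, conds := range controllers {
		for _, c := range conds {
			if c.Type == "Degraded" && c.Status == "True" {
				degraded = append(degraded, name)
			}
		}
	}
	if len(degraded) == 0 {
		return "False", "IngressControllersNotDegraded", "No ingresscontrollers are degraded"
	}
	return "True", "IngressControllersDegraded",
		fmt.Sprintf("Some ingresscontrollers are degraded: %v", degraded)
}

func main() {
	status, reason, msg := aggregateDegraded(map[string][]Condition{
		"default": {{Type: "Degraded", Status: "True"}},
	})
	fmt.Println(status, reason, msg)
	// → True IngressControllersDegraded Some ingresscontrollers are degraded: [default]
}
```

This matches the shape of the verified output in comment 11, where co/ingress reports reason `IngressControllersDegraded` with message "Some ingresscontrollers are degraded: default".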

Comment 7 Ben Parees 2019-08-14 16:05:53 UTC
Dan and I discussed this at his desk. I think distinguishing some vs. all unavailable is useful, as well as potentially adding alerts that can fire for specific controllers being unavailable.

Comment 9 Dan Mace 2019-11-12 14:22:16 UTC
I think the status improvements in https://github.com/openshift/cluster-ingress-operator/pull/314 cover the spirit of this issue sufficiently. Let's call it a fix.

Comment 11 Hongan Li 2019-11-18 08:15:42 UTC
Verified with 4.3.0-0.nightly-2019-11-17-224250; the issue has been fixed.

### DNSReady
  - lastTransitionTime: "2019-11-18T07:57:11Z"
    message: 'The record failed to provision in some zones: [{hongxx-mbtt2-private-zone
      map[]}]'
    reason: FailedZones
    status: "False"
    type: DNSReady

### DeploymentDegraded
  - lastTransitionTime: "2019-11-18T08:05:57Z"
    message: 'The deployment has Available status condition set to False (reason:
      MinimumReplicasUnavailable) with message: Deployment does not have minimum availability.'
    reason: DeploymentUnavailable
    status: "True"
    type: DeploymentDegraded

### co/ingress
status:
  conditions:
  - lastTransitionTime: "2019-11-18T07:58:10Z"
    message: 'Some ingresscontrollers are degraded: default'
    reason: IngressControllersDegraded
    status: "True"
    type: Degraded

Comment 13 errata-xmlrpc 2020-01-23 11:05:05 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0062