Bug 1896730 - Ingresscontroller are recreated and router pods scaled to 0 causing intermittent outage
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.5
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Rick Rackow
QA Contact: Hongan Li
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2020-11-11 12:03 UTC by Rick Rackow
Modified: 2022-08-04 22:30 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-11-11 16:56:05 UTC
Target Upstream Version:
Embargoed:



Description Rick Rackow 2020-11-11 12:03:22 UTC
Description of problem:
On an OSD cluster, ingresscontrollers seem to be recreated fairly frequently without any apparent reason.


Version-Release number of selected component (if applicable):
```
    Group:
    Name:       openshift-ingress-operator
    Resource:   namespaces
    Group:      operator.openshift.io
    Name:
    Namespace:  openshift-ingress-operator
    Resource:   IngressController
    Group:      ingress.operator.openshift.io
    Name:
    Namespace:  openshift-ingress-operator
    Resource:   DNSRecord
    Group:
    Name:       openshift-ingress
    Resource:   namespaces
  Versions:
    Name:     operator
    Version:  4.5.11
    Name:     ingress-controller
    Version:  quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:d0d21ae3e27140e1fa13b49d6b2883a0f1466d8e47a2a4839f22de80668d5c9
```


```
$ oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.5.11    True        False         28d     Cluster version is 4.5.11
```

How reproducible:

It is unclear what is causing this, and therefore unclear how to reproduce it.

Comment 1 Rick Rackow 2020-11-11 14:55:09 UTC
Assigning to SRE-P for further investigation.
Moving to the upcoming sprint because we will not have a fix out the door by tomorrow.

Comment 3 Rick Rackow 2020-11-11 16:56:05 UTC
This was caused by a race condition in CLOUD-ingress-operator, which is OSD-specific tooling.
Closing, as this is not a bug in cluster ingress or any other component maintained by the networking team.

