Bug 1749209 - [IPI] [OSP] Kuryr - pods not getting annotated in Kuryr Controller due to watcher stopped
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.2.0
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: 4.2.0
Assignee: Michał Dulko
QA Contact: GenadiC
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-09-05 07:09 UTC by Udi Shkalim
Modified: 2019-10-16 06:40 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-10-16 06:40:33 UTC
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
OpenStack gerrit 680096 0 None MERGED Timeout connections when watching K8s API 2020-06-15 12:42:30 UTC
Red Hat Product Errata RHBA-2019:2922 0 None None None 2019-10-16 06:40:44 UTC

Description Udi Shkalim 2019-09-05 07:09:03 UTC
Description of problem:
The installation failed to complete.
In HA setups, the kuryr-controller watchers appear to die after a while (typically when the cluster is left running overnight) and are never recreated. As a result, pods stop getting annotated.
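
For illustration only, here is a minimal Python sketch of the general technique named in the linked gerrit change ("Timeout connections when watching K8s API"): a watch loop that treats a silent stream as dead and reconnects. This is not the kuryr-kubernetes code. The names `read_events`, `watch`, `WatchTimeout`, and `max_connections` are hypothetical, and a `queue.Queue` stands in for the HTTP watch stream so that the example runs without a cluster.

```python
import queue


class WatchTimeout(Exception):
    """Raised when the stream stays silent for longer than the timeout."""


def read_events(stream, timeout):
    """Yield events from `stream` until it is closed (signalled by None).

    Raises WatchTimeout if no item arrives within `timeout` seconds, so a
    stalled connection cannot block the caller forever.
    """
    while True:
        try:
            event = stream.get(timeout=timeout)
        except queue.Empty:
            raise WatchTimeout() from None
        if event is None:  # hypothetical marker for a cleanly closed stream
            return
        yield event


def watch(connect, handle, timeout, max_connections):
    """Open up to `max_connections` streams, passing every event to `handle`.

    A stream that times out or closes is simply replaced by a new one.
    """
    for _ in range(max_connections):
        stream = connect()
        try:
            for event in read_events(stream, timeout):
                handle(event)
        except WatchTimeout:
            continue  # stream went silent: drop it and reconnect


# The first stream delivers one event and then stalls without closing;
# the second delivers one event and closes cleanly.
stalled, healthy = queue.Queue(), queue.Queue()
stalled.put("pod-a")
healthy.put("pod-b")
healthy.put(None)

streams = iter([stalled, healthy])
seen = []
watch(lambda: next(streams), seen.append, timeout=0.05, max_connections=2)
print(seen)
```

Without the timeout in `read_events`, the call on the stalled stream would block indefinitely and "pod-b" would never be handled, which resembles the symptom described above.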

Version-Release number of selected component (if applicable):
4.2.0-0.nightly-2019-09-04-102339

How reproducible:
1/4

Steps to Reproduce:
1. Deploy IPI on OSP with Kuryr.

Actual results:
Installation failed

Expected results:
Installation passes

Additional info:
[stack@undercloud-0 ~]$ oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version             False       True          12h     Unable to apply 4.2.0-0.nightly-2019-09-04-102339: an unknown error has occurred

[stack@undercloud-0 ~]$ oc get co
NAME                                       VERSION                             AVAILABLE   PROGRESSING   DEGRADED   SINCE
cloud-credential                           4.2.0-0.nightly-2019-09-04-102339   True        False         False      12h
dns                                        4.2.0-0.nightly-2019-09-04-102339   True        False         False      12h
insights                                   4.2.0-0.nightly-2019-09-04-102339   True        False         True       12h
kube-apiserver                             4.2.0-0.nightly-2019-09-04-102339   True        False         False      11h
kube-controller-manager                    4.2.0-0.nightly-2019-09-04-102339   True        False         False      11h
kube-scheduler                             4.2.0-0.nightly-2019-09-04-102339   True        False         False      11h
machine-api                                4.2.0-0.nightly-2019-09-04-102339   True        False         False      12h
machine-config                             4.2.0-0.nightly-2019-09-04-102339   True        False         False      11h
network                                    4.2.0-0.nightly-2019-09-04-102339   True        False         False      12h
openshift-apiserver                        4.2.0-0.nightly-2019-09-04-102339   False       False         False      11h
openshift-controller-manager               4.2.0-0.nightly-2019-09-04-102339   True        False         False      12h
operator-lifecycle-manager                 4.2.0-0.nightly-2019-09-04-102339   True        False         False      12h
operator-lifecycle-manager-catalog         4.2.0-0.nightly-2019-09-04-102339   True        False         False      12h
operator-lifecycle-manager-packageserver   4.2.0-0.nightly-2019-09-04-102339   True        False         False      11h
service-ca                                 4.2.0-0.nightly-2019-09-04-102339   True        False         False      12h



[stack@undercloud-0 ~]$ oc -n openshift-kuryr get pods
NAME                                   READY   STATUS    RESTARTS   AGE
kuryr-cni-b5978                        1/1     Running   0          12h
kuryr-cni-fnp9r                        1/1     Running   0          11h
kuryr-cni-gpf9x                        1/1     Running   3          12h
kuryr-cni-hrc5x                        1/1     Running   1          12h
kuryr-cni-qvq7s                        1/1     Running   0          12h
kuryr-cni-vc9v7                        1/1     Running   3          12h
kuryr-controller-77c68665db-vm2m7      1/1     Running   6          12h
kuryr-dns-admission-controller-6ssxj   1/1     Running   0          12h
kuryr-dns-admission-controller-grzkr   1/1     Running   0          12h
kuryr-dns-admission-controller-gw9h7   1/1     Running   0          12h

Comment 1 Michał Dulko 2019-09-10 10:03:01 UTC
The fix is now merged into openshift/kuryr-kubernetes.

Comment 3 Udi Shkalim 2019-10-06 13:10:19 UTC
Verified on 4.2.0-0.nightly-2019-10-02-122541.
Installation completed successfully on multiple runs.

[stack@undercloud-0 ~]$ oc get co
NAME                                       VERSION                             AVAILABLE   PROGRESSING   DEGRADED   SINCE
authentication                             4.2.0-0.nightly-2019-10-02-122541   True        False         False      3d19h
cloud-credential                           4.2.0-0.nightly-2019-10-02-122541   True        False         False      3d20h
cluster-autoscaler                         4.2.0-0.nightly-2019-10-02-122541   True        False         False      3d19h
console                                    4.2.0-0.nightly-2019-10-02-122541   True        False         False      3d19h
dns                                        4.2.0-0.nightly-2019-10-02-122541   True        False         False      3d20h
image-registry                             4.2.0-0.nightly-2019-10-02-122541   True        False         False      3d19h
ingress                                    4.2.0-0.nightly-2019-10-02-122541   True        False         False      3d19h
insights                                   4.2.0-0.nightly-2019-10-02-122541   True        False         True       3d20h
kube-apiserver                             4.2.0-0.nightly-2019-10-02-122541   True        False         False      3d20h
kube-controller-manager                    4.2.0-0.nightly-2019-10-02-122541   True        False         False      3d20h
kube-scheduler                             4.2.0-0.nightly-2019-10-02-122541   True        False         False      3d20h
machine-api                                4.2.0-0.nightly-2019-10-02-122541   True        False         False      3d20h
machine-config                             4.2.0-0.nightly-2019-10-02-122541   True        False         False      3d20h
marketplace                                4.2.0-0.nightly-2019-10-02-122541   True        False         False      3d18h
monitoring                                 4.2.0-0.nightly-2019-10-02-122541   True        False         False      3d18h
network                                    4.2.0-0.nightly-2019-10-02-122541   True        False         False      3d20h
node-tuning                                4.2.0-0.nightly-2019-10-02-122541   True        False         False      3d19h
openshift-apiserver                        4.2.0-0.nightly-2019-10-02-122541   True        False         False      3d20h
openshift-controller-manager               4.2.0-0.nightly-2019-10-02-122541   True        False         False      3d20h
openshift-samples                          4.2.0-0.nightly-2019-10-02-122541   True        False         False      3d19h
operator-lifecycle-manager                 4.2.0-0.nightly-2019-10-02-122541   True        False         False      3d20h
operator-lifecycle-manager-catalog         4.2.0-0.nightly-2019-10-02-122541   True        False         False      3d20h
operator-lifecycle-manager-packageserver   4.2.0-0.nightly-2019-10-02-122541   True        False         False      3d1h
service-ca                                 4.2.0-0.nightly-2019-10-02-122541   True        False         False      3d20h
service-catalog-apiserver                  4.2.0-0.nightly-2019-10-02-122541   True        False         False      3d19h
service-catalog-controller-manager         4.2.0-0.nightly-2019-10-02-122541   True        False         False      3d19h
storage                                    4.2.0-0.nightly-2019-10-02-122541   True        False         False      3d19h

Comment 4 errata-xmlrpc 2019-10-16 06:40:33 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:2922

