Description of problem:
Installation failed to complete. In HA setups, kuryr-controller's watchers appear to die after a while (typically when the cluster is left overnight) and are never recreated. As a result, pods do not get annotated.

Version-Release number of selected component (if applicable):
4.2.0-0.nightly-2019-09-04-102339

How reproducible:
1/4

Steps to Reproduce:
1. Deploy IPI OSP + kuryr

Actual results:
Installation failed

Expected results:
Installation passes

Additional info:

[stack@undercloud-0 ~]$ oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version             False       True          12h     Unable to apply 4.2.0-0.nightly-2019-09-04-102339: an unknown error has occurred

[stack@undercloud-0 ~]$ oc get co
NAME                                       VERSION                             AVAILABLE   PROGRESSING   DEGRADED   SINCE
cloud-credential                           4.2.0-0.nightly-2019-09-04-102339   True        False         False      12h
dns                                        4.2.0-0.nightly-2019-09-04-102339   True        False         False      12h
insights                                   4.2.0-0.nightly-2019-09-04-102339   True        False         True       12h
kube-apiserver                             4.2.0-0.nightly-2019-09-04-102339   True        False         False      11h
kube-controller-manager                    4.2.0-0.nightly-2019-09-04-102339   True        False         False      11h
kube-scheduler                             4.2.0-0.nightly-2019-09-04-102339   True        False         False      11h
machine-api                                4.2.0-0.nightly-2019-09-04-102339   True        False         False      12h
machine-config                             4.2.0-0.nightly-2019-09-04-102339   True        False         False      11h
network                                    4.2.0-0.nightly-2019-09-04-102339   True        False         False      12h
openshift-apiserver                        4.2.0-0.nightly-2019-09-04-102339   False       False         False      11h
openshift-controller-manager               4.2.0-0.nightly-2019-09-04-102339   True        False         False      12h
operator-lifecycle-manager                 4.2.0-0.nightly-2019-09-04-102339   True        False         False      12h
operator-lifecycle-manager-catalog         4.2.0-0.nightly-2019-09-04-102339   True        False         False      12h
operator-lifecycle-manager-packageserver   4.2.0-0.nightly-2019-09-04-102339   True        False         False      11h
service-ca                                 4.2.0-0.nightly-2019-09-04-102339   True        False         False      12h

[stack@undercloud-0 ~]$ oc -n openshift-kuryr get pods
NAME                                   READY   STATUS    RESTARTS   AGE
kuryr-cni-b5978                        1/1     Running   0          12h
kuryr-cni-fnp9r                        1/1     Running   0          11h
kuryr-cni-gpf9x                        1/1     Running   3          12h
kuryr-cni-hrc5x                        1/1     Running   1          12h
kuryr-cni-qvq7s                        1/1     Running   0          12h
kuryr-cni-vc9v7                        1/1     Running   3          12h
kuryr-controller-77c68665db-vm2m7      1/1     Running   6          12h
kuryr-dns-admission-controller-6ssxj   1/1     Running   0          12h
kuryr-dns-admission-controller-grzkr   1/1     Running   0          12h
kuryr-dns-admission-controller-gw9h7   1/1     Running   0          12h
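
To illustrate the failure mode described above (a watcher that exits and is never recreated), here is a minimal Python sketch of a supervisor that restarts dead watcher threads. It is not the actual fix merged into openshift/kuryr-kubernetes; the `Watcher` class, `restart_dead_watchers` and the `watch_fn` callback are hypothetical names made up for this example.

```python
import threading


class Watcher:
    """Hypothetical stand-in for a resource watcher (not kuryr's class)."""

    def __init__(self, path, watch_fn):
        self.path = path
        self._watch_fn = watch_fn  # assumed to block while the watch is healthy
        self._thread = None

    def start(self):
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        try:
            self._watch_fn(self.path)
        except Exception as exc:
            # The thread ends here; without a supervisor nothing restarts it.
            print(f"watcher for {self.path} died: {exc!r}")

    def is_alive(self):
        return self._thread is not None and self._thread.is_alive()


def restart_dead_watchers(watchers):
    """Restart every watcher whose thread has exited; return their paths."""
    restarted = []
    for watcher in watchers:
        if not watcher.is_alive():
            watcher.start()
            restarted.append(watcher.path)
    return restarted
```

A controller could call `restart_dead_watchers` periodically, or fail its health check when the returned list is non-empty, so that a dead watcher does not go unnoticed overnight.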
The fix is now merged into openshift/kuryr-kubernetes.
Verified on 4.2.0-0.nightly-2019-10-02-122541

Installation completed multiple times.

[stack@undercloud-0 ~]$ oc get co
NAME                                       VERSION                             AVAILABLE   PROGRESSING   DEGRADED   SINCE
authentication                             4.2.0-0.nightly-2019-10-02-122541   True        False         False      3d19h
cloud-credential                           4.2.0-0.nightly-2019-10-02-122541   True        False         False      3d20h
cluster-autoscaler                         4.2.0-0.nightly-2019-10-02-122541   True        False         False      3d19h
console                                    4.2.0-0.nightly-2019-10-02-122541   True        False         False      3d19h
dns                                        4.2.0-0.nightly-2019-10-02-122541   True        False         False      3d20h
image-registry                             4.2.0-0.nightly-2019-10-02-122541   True        False         False      3d19h
ingress                                    4.2.0-0.nightly-2019-10-02-122541   True        False         False      3d19h
insights                                   4.2.0-0.nightly-2019-10-02-122541   True        False         True       3d20h
kube-apiserver                             4.2.0-0.nightly-2019-10-02-122541   True        False         False      3d20h
kube-controller-manager                    4.2.0-0.nightly-2019-10-02-122541   True        False         False      3d20h
kube-scheduler                             4.2.0-0.nightly-2019-10-02-122541   True        False         False      3d20h
machine-api                                4.2.0-0.nightly-2019-10-02-122541   True        False         False      3d20h
machine-config                             4.2.0-0.nightly-2019-10-02-122541   True        False         False      3d20h
marketplace                                4.2.0-0.nightly-2019-10-02-122541   True        False         False      3d18h
monitoring                                 4.2.0-0.nightly-2019-10-02-122541   True        False         False      3d18h
network                                    4.2.0-0.nightly-2019-10-02-122541   True        False         False      3d20h
node-tuning                                4.2.0-0.nightly-2019-10-02-122541   True        False         False      3d19h
openshift-apiserver                        4.2.0-0.nightly-2019-10-02-122541   True        False         False      3d20h
openshift-controller-manager               4.2.0-0.nightly-2019-10-02-122541   True        False         False      3d20h
openshift-samples                          4.2.0-0.nightly-2019-10-02-122541   True        False         False      3d19h
operator-lifecycle-manager                 4.2.0-0.nightly-2019-10-02-122541   True        False         False      3d20h
operator-lifecycle-manager-catalog         4.2.0-0.nightly-2019-10-02-122541   True        False         False      3d20h
operator-lifecycle-manager-packageserver   4.2.0-0.nightly-2019-10-02-122541   True        False         False      3d1h
service-ca                                 4.2.0-0.nightly-2019-10-02-122541   True        False         False      3d20h
service-catalog-apiserver                  4.2.0-0.nightly-2019-10-02-122541   True        False         False      3d19h
service-catalog-controller-manager         4.2.0-0.nightly-2019-10-02-122541   True        False         False      3d19h
storage                                    4.2.0-0.nightly-2019-10-02-122541   True        False         False      3d19h
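
For output like the `oc get co` listings in this report, a small Python sketch can pick out operators that are unavailable or degraded. The function name `unhealthy_operators` is hypothetical, and the parsing assumes whitespace-separated columns with no empty cells; rows that do not match the header width are skipped. The sample rows are copied from the failing build's output in the description.

```python
def unhealthy_operators(table_text):
    """Return {name: [problems]} for rows that are unavailable or degraded."""
    lines = [ln for ln in table_text.splitlines() if ln.strip()]
    header = lines[0].split()
    problems = {}
    for line in lines[1:]:
        fields = line.split()
        if len(fields) != len(header):
            continue  # assumption: skip rows with empty or extra cells
        row = dict(zip(header, fields))
        found = []
        if row["AVAILABLE"] != "True":
            found.append("unavailable")
        if row["DEGRADED"] == "True":
            found.append("degraded")
        if found:
            problems[row["NAME"]] = found
    return problems


SAMPLE = """\
NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE
dns 4.2.0-0.nightly-2019-09-04-102339 True False False 12h
insights 4.2.0-0.nightly-2019-09-04-102339 True False True 12h
openshift-apiserver 4.2.0-0.nightly-2019-09-04-102339 False False False 11h
"""

print(unhealthy_operators(SAMPLE))
# prints {'insights': ['degraded'], 'openshift-apiserver': ['unavailable']}
```

Applied to the verified build's listing, the same check would still report insights as degraded, since its DEGRADED column is True in both outputs.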
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:2922