Description of problem:
Upgrading a UPI/GCP cluster from 4.2.2 to 4.3.0-0.nightly-2019-10-29-073252 fails.

# ./oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.2.2     True        True          70m     Unable to apply 4.3.0-0.nightly-2019-10-29-073252: the cluster operator ingress is degraded

# ./oc get co
NAME             VERSION                             AVAILABLE   PROGRESSING   DEGRADED   SINCE
...
dns              4.3.0-0.nightly-2019-10-29-073252   True        False         False      99m
image-registry   4.3.0-0.nightly-2019-10-29-073252   True        False         False      91m
ingress          4.3.0-0.nightly-2019-10-29-073252   True        False         True       91m
...
machine-api      4.3.0-0.nightly-2019-10-29-073252   True        False         False      100m
machine-config   4.2.2                               True        False         False      99m
...
===========================================================================
# ./oc describe co ingress
Name:         ingress
Namespace:
...
Status:
  Conditions:
    Last Transition Time:  2019-10-30T02:55:27Z
    Message:               Some ingresscontrollers are degraded: default
    Reason:                IngressControllersDegraded
    Status:                True
    Type:                  Degraded
    Last Transition Time:  2019-10-30T02:45:26Z
    Message:               desired and current number of IngressControllers are equal
    Status:                False
    Type:                  Progressing
    Last Transition Time:  2019-10-30T02:14:24Z
    Message:               desired and current number of IngressControllers are equal
    Status:                True
    Type:                  Available
...

Checked the ingresscontroller: the new deployment timed out.

# oc get IngressController default -o yaml
...
  - lastTransitionTime: "2019-10-30T02:55:27Z"
    message: 'The deployment failed (reason: ProgressDeadlineExceeded) with message:
      ReplicaSet "router-default-5c94bd7d94" has timed out progressing.'
    reason: DeploymentFailed
    status: "True"
    type: Degraded
...

Checked that the newly deployed router pod is stuck in Pending status.
# oc get all -n openshift-ingress
NAME                                  READY   STATUS    RESTARTS   AGE
pod/router-default-5c94bd7d94-xl4pt   0/1     Pending   0          68m
pod/router-default-78f49dfd9-2p4pk    1/1     Running   0          104m
pod/router-default-78f49dfd9-hx5vm    1/1     Running   0          104m

NAME                              TYPE           CLUSTER-IP      EXTERNAL-IP      PORT(S)                      AGE
service/router-default            LoadBalancer   172.30.12.124   35.239.159.216   80:30778/TCP,443:30771/TCP   104m
service/router-internal-default   ClusterIP      172.30.31.142   <none>           80/TCP,443/TCP,1936/TCP      104m

NAME                             READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/router-default   2/2     1            2           104m

NAME                                        DESIRED   CURRENT   READY   AGE
replicaset.apps/router-default-5c94bd7d94   1         1         0       68m
replicaset.apps/router-default-78f49dfd9    2         2         2       104m

# oc describe pod/router-default-5c94bd7d94-xl4pt -n openshift-ingress
Name:                 router-default-5c94bd7d94-xl4pt
Namespace:            openshift-ingress
Priority:             2000000000
Priority Class Name:  system-cluster-critical
Node:                 <none>
Labels:               ingresscontroller.operator.openshift.io/deployment-ingresscontroller=default
                      ingresscontroller.operator.openshift.io/hash=576796cc5d
                      pod-template-hash=5c94bd7d94
Annotations:          openshift.io/scc: restricted
Status:               Pending
IP:
IPs:                  <none>
Controlled By:        ReplicaSet/router-default-5c94bd7d94
Containers:
  router:
    Image:       quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:83defbeee71841e5d3f2d9d5a971f3fb89605fdc4503c5a7d60af9609bf1a5bb
    Ports:       80/TCP, 443/TCP, 1936/TCP
    Host Ports:  0/TCP, 0/TCP, 0/TCP
    Requests:
      cpu:      100m
      memory:   256Mi
    Liveness:   http-get http://:1936/healthz delay=10s timeout=1s period=10s #success=1 #failure=3
    Readiness:  http-get http://:1936/healthz/ready delay=10s timeout=1s period=10s #success=1 #failure=3
    Environment:
      DEFAULT_CERTIFICATE_DIR:    /etc/pki/tls/private
      ROUTER_CANONICAL_HOSTNAME:  apps.jliu-bug.qe.gcp.devcluster.openshift.com
      ROUTER_CIPHERS:
TLS_AES_128_GCM_SHA256:TLS_AES_256_GCM_SHA384:TLS_CHACHA20_POLY1305_SHA256:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384
      ROUTER_METRICS_TLS_CERT_FILE:  /etc/pki/tls/metrics-certs/tls.crt
      ROUTER_METRICS_TLS_KEY_FILE:   /etc/pki/tls/metrics-certs/tls.key
      ROUTER_METRICS_TYPE:           haproxy
      ROUTER_SERVICE_NAME:           default
      ROUTER_SERVICE_NAMESPACE:      openshift-ingress
      ROUTER_THREADS:                4
      SSL_MIN_VERSION:               TLSv1.2
      STATS_PASSWORD:                <set to the key 'statsPassword' in secret 'router-stats-default'>  Optional: false
      STATS_PORT:                    1936
      STATS_USERNAME:                <set to the key 'statsUsername' in secret 'router-stats-default'>  Optional: false
    Mounts:
      /etc/pki/tls/metrics-certs from metrics-certs (ro)
      /etc/pki/tls/private from default-certificate (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from router-token-jpcxg (ro)
Conditions:
  Type           Status
  PodScheduled   False
Volumes:
  default-certificate:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  router-certs-default
    Optional:    false
  metrics-certs:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  router-metrics-certs-default
    Optional:    false
  router-token-jpcxg:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  router-token-jpcxg
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  kubernetes.io/os=linux
                 node-role.kubernetes.io/worker=
Tolerations:     node.kubernetes.io/memory-pressure:NoSchedule
                 node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason            Age        From               Message
  ----     ------            ----       ----               -------
  Warning  FailedScheduling  <unknown>  default-scheduler  0/5 nodes are available: 2 node(s) didn't match pod affinity/anti-affinity, 2 node(s) didn't satisfy existing pods anti-affinity rules, 3 node(s) didn't match node selector.
  Warning  FailedScheduling  <unknown>  default-scheduler  0/5 nodes are available: 2 node(s) didn't match pod affinity/anti-affinity, 2 node(s) didn't satisfy existing pods anti-affinity rules, 3 node(s) didn't match node selector.
  Warning  FailedScheduling  <unknown>  default-scheduler  0/5 nodes are available: 2 node(s) didn't match pod affinity/anti-affinity, 2 node(s) didn't satisfy existing pods anti-affinity rules, 3 node(s) didn't match node selector.
  Warning  FailedScheduling  <unknown>  default-scheduler  0/5 nodes are available: 2 node(s) didn't match pod affinity/anti-affinity, 2 node(s) didn't satisfy existing pods anti-affinity rules, 3 node(s) didn't match node selector.
  Warning  FailedScheduling  <unknown>  default-scheduler  0/5 nodes are available: 1 node(s) didn't match pod affinity/anti-affinity, 1 node(s) didn't satisfy existing pods anti-affinity rules, 1 node(s) had taints that the pod didn't tolerate, 3 node(s) didn't match node selector.
  Warning  FailedScheduling  <unknown>  default-scheduler  0/5 nodes are available: 1 node(s) didn't match pod affinity/anti-affinity, 1 node(s) didn't satisfy existing pods anti-affinity rules, 1 node(s) had taints that the pod didn't tolerate, 3 node(s) didn't match node selector.
  Warning  FailedScheduling  <unknown>  default-scheduler  0/5 nodes are available: 1 node(s) didn't match pod affinity/anti-affinity, 1 node(s) didn't satisfy existing pods anti-affinity rules, 1 node(s) had taints that the pod didn't tolerate, 3 node(s) didn't match node selector.
  Warning  FailedScheduling  <unknown>  default-scheduler  0/5 nodes are available: 2 node(s) didn't match pod affinity/anti-affinity, 2 node(s) didn't satisfy existing pods anti-affinity rules, 3 node(s) didn't match node selector.

Version-Release number of selected component (if applicable):
4.3.0-0.nightly-2019-10-29-073252

How reproducible:
Always

Steps to Reproduce:
1. Upgrade a UPI/GCP cluster from 4.2.2 to a 4.3 nightly build.
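The FailedScheduling events above pack several exclusion reasons into a single message. A small sketch (plain shell, no cluster access needed) that splits such a message into one reason per line for easier reading:

```shell
# One of the scheduler messages from the events above.
msg="0/5 nodes are available: 2 node(s) didn't match pod affinity/anti-affinity, 2 node(s) didn't satisfy existing pods anti-affinity rules, 3 node(s) didn't match node selector."

# Strip the "0/5 nodes are available: " summary prefix, then print each
# comma-separated exclusion reason on its own line.
echo "${msg#*: }" | tr ',' '\n' | sed 's/^ //; s/\.$//'
```

Mentally summing the per-reason counts (2 + 3 anti-affinity/selector, or 2 + 3 in the first events) accounts for all 5 nodes: the 3 masters fail the worker node selector, and both workers already run a router pod, so the new replica has nowhere to schedule.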
Actual results:
The upgrade fails at the ingress operator.

Expected results:
The upgrade succeeds.

Additional info:
Refer to the master gather logs.
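The router deployment spec itself is not included in this report, but the "didn't satisfy existing pods anti-affinity rules" events are the kind produced by a required pod anti-affinity rule. A hypothetical sketch (label keys and selector taken from the pod description above; the affinity stanza is an assumption, not the actual 4.3 manifest) of a constraint that would deadlock a surge replica on a two-worker cluster:

```yaml
# Hypothetical sketch, not the actual router-default manifest.
# A rule like this forbids two router pods on the same node; with only
# two schedulable workers, both already running routers, a third
# (surged) replica cannot be placed anywhere.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: router-default
  namespace: openshift-ingress
spec:
  replicas: 2
  template:
    spec:
      nodeSelector:
        kubernetes.io/os: linux
        node-role.kubernetes.io/worker: ""
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - topologyKey: kubernetes.io/hostname
            labelSelector:
              matchLabels:
                ingresscontroller.operator.openshift.io/deployment-ingresscontroller: default
```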
Version: 4.3.0-0.nightly-2019-11-12-000306

Upgrading from v4.2.4 to the latest 4.3.0-0.nightly-2019-11-12-000306 succeeds.

# ./oc get co
NAME                                       VERSION                             AVAILABLE   PROGRESSING   DEGRADED   SINCE
authentication                             4.3.0-0.nightly-2019-11-12-000306   True        False         False      112m
cloud-credential                           4.3.0-0.nightly-2019-11-12-000306   True        False         False      127m
cluster-autoscaler                         4.3.0-0.nightly-2019-11-12-000306   True        False         False      122m
console                                    4.3.0-0.nightly-2019-11-12-000306   True        False         False      73m
dns                                        4.3.0-0.nightly-2019-11-12-000306   True        False         False      126m
image-registry                             4.3.0-0.nightly-2019-11-12-000306   True        False         False      86m
ingress                                    4.3.0-0.nightly-2019-11-12-000306   True        False         False      81m
insights                                   4.3.0-0.nightly-2019-11-12-000306   True        False         False      127m
kube-apiserver                             4.3.0-0.nightly-2019-11-12-000306   True        False         False      126m
kube-controller-manager                    4.3.0-0.nightly-2019-11-12-000306   True        False         False      124m
kube-scheduler                             4.3.0-0.nightly-2019-11-12-000306   True        False         False      125m
machine-api                                4.3.0-0.nightly-2019-11-12-000306   True        False         False      127m
machine-config                             4.3.0-0.nightly-2019-11-12-000306   True        False         False      126m
marketplace                                4.3.0-0.nightly-2019-11-12-000306   True        False         False      79m
monitoring                                 4.3.0-0.nightly-2019-11-12-000306   True        False         False      75m
network                                    4.3.0-0.nightly-2019-11-12-000306   True        False         False      126m
node-tuning                                4.3.0-0.nightly-2019-11-12-000306   True        False         False      85m
openshift-apiserver                        4.3.0-0.nightly-2019-11-12-000306   True        False         False      83m
openshift-controller-manager               4.3.0-0.nightly-2019-11-12-000306   True        False         False      124m
openshift-samples                          4.3.0-0.nightly-2019-11-12-000306   True        False         False      98m
operator-lifecycle-manager                 4.3.0-0.nightly-2019-11-12-000306   True        False         False      126m
operator-lifecycle-manager-catalog         4.3.0-0.nightly-2019-11-12-000306   True        False         False      126m
operator-lifecycle-manager-packageserver   4.3.0-0.nightly-2019-11-12-000306   True        False         False      73m
service-ca                                 4.3.0-0.nightly-2019-11-12-000306   True        False         False      127m
service-catalog-apiserver                  4.3.0-0.nightly-2019-11-12-000306   True        False         False      123m
service-catalog-controller-manager         4.3.0-0.nightly-2019-11-12-000306   True        False         False      123m
storage                                    4.3.0-0.nightly-2019-11-12-000306   True        False         False      99m

# ./oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.3.0-0.nightly-2019-11-12-000306   True        False         70m     Cluster version is 4.3.0-0.nightly-2019-11-12-000306

# ./oc get all -n openshift-ingress
NAME                                  READY   STATUS    RESTARTS   AGE
pod/router-default-54c44cb495-flhrk   1/1     Running   0          128m
pod/router-default-54c44cb495-zldq9   1/1     Running   0          132m

NAME                              TYPE           CLUSTER-IP      EXTERNAL-IP    PORT(S)                      AGE
service/router-default            LoadBalancer   172.30.141.82   35.222.217.2   80:32378/TCP,443:31958/TCP   164m
service/router-internal-default   ClusterIP      172.30.53.70    <none>         80/TCP,443/TCP,1936/TCP      164m

NAME                             READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/router-default   2/2     2            2           164m

NAME                                        DESIRED   CURRENT   READY   AGE
replicaset.apps/router-default-54c44cb495   2         2         2       133m
replicaset.apps/router-default-5b4cc4c6f6   0         0         0       164m
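The report does not show what changed between the failing and passing nightlies, but the verified rollout above (old ReplicaSet scaled to 0, new one fully ready) is what one would expect from a rolling-update strategy that never needs a spare node. A hypothetical sketch of such a strategy, not a statement of the actual fix:

```yaml
# Hypothetical sketch: with maxSurge: 0 the scheduler is never asked to
# place an extra surge replica. An old pod is terminated first, freeing
# its node (and its anti-affinity slot) for the replacement pod, so a
# two-worker cluster can still roll the deployment.
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 0
      maxUnavailable: 1
```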
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0062