Bug 2020705 - Upgrade starting from 4.2.36 to 4.5.41-> 4.6.49 failing crashlooping container is waiting in apiserver-* pod
Keywords:
Status: CLOSED EOL
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Machine Config Operator
Version: 4.6
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: ---
Assignee: MCO Team
QA Contact: Rio Liu
URL:
Whiteboard:
Duplicates: 1940844
Depends On:
Blocks:
 
Reported: 2021-11-05 17:00 UTC by Paige Rubendall
Modified: 2021-11-29 16:58 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-11-29 16:58:46 UTC
Target Upstream Version:
Embargoed:


Description Paige Rubendall 2021-11-05 17:00:32 UTC
Description of problem:
Upgrade from 4.2.36 all the way to 4.6.49 fails with the following message:
APIServerDeployment_UnavailablePod APIServerDeploymentDegraded: 1 of 3 requested instances are unavailable for apiserver.openshift-oauth-apiserver (crashlooping container is waiting in apiserver-97cf8f95b-kngz9 pod)

Version-Release number of selected component (if applicable): 4.6.49


How reproducible: Unknown


Steps to Reproduce:
1. Create 4.2.36 cluster with IPI on GCP (FIPS off)
2. Upgrade to 4.3.40
3. Upgrade to 4.4.33 
4. Upgrade to 4.5.41
5. Upgrade to 4.6.49
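
The hop sequence above can be sanity-checked with a small Python sketch. It assumes that each hop is meant to advance exactly one minor version. The helper names (parse, skipped_hops) are hypothetical and not part of any OpenShift tooling.

```python
# Upgrade path taken from the reproduction steps above.
PATH = ["4.2.36", "4.3.40", "4.4.33", "4.5.41", "4.6.49"]


def parse(version):
    """Split 'major.minor.patch' into a tuple of integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def skipped_hops(path):
    """Return (source, target) pairs that do not advance exactly one minor version.

    Assumption: a hop is valid only when the major version is unchanged
    and the minor version increases by one.
    """
    bad = []
    for src, dst in zip(path, path[1:]):
        s, d = parse(src), parse(dst)
        if s[0] != d[0] or d[1] - s[1] != 1:
            bad.append((src, dst))
    return bad


if __name__ == "__main__":
    print(skipped_hops(PATH))
```

An empty result means every hop in the list moves a single minor version, so the path itself does not skip a release.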

Actual results:
Upgrade fails with degraded operators, the main message being: APIServerDeployment_UnavailablePod APIServerDeploymentDegraded: 1 of 3 requested instances are unavailable for apiserver.openshift-oauth-apiserver (crashlooping container is waiting in apiserver-97cf8f95b-kngz9 pod)

Expected results:
Upgrade completes with no degraded operators and no NotReady nodes

Additional info:


11-03 17:07:08.038  Post action: #oc get node:
11-03 17:07:08.038  NAME                                             STATUS   ROLES    AGE     VERSION           INTERNAL-IP   EXTERNAL-IP   OS-IMAGE                                                       KERNEL-VERSION                 CONTAINER-RUNTIME
11-03 17:07:08.038  ugd-18-l74kz-m-0.c.openshift-qe.internal         Ready    master   7h      v1.18.3+d8ef5ad   10.0.0.2                    Red Hat Enterprise Linux CoreOS 45.82.202106211530-0 (Ootpa)   4.18.0-193.56.1.el8_2.x86_64   cri-o://1.18.4-11.rhaos4.5.gitfa57051.el8
11-03 17:07:08.038  ugd-18-l74kz-m-1.c.openshift-qe.internal         Ready    master   7h      v1.18.3+d8ef5ad   10.0.0.3                    Red Hat Enterprise Linux CoreOS 45.82.202106211530-0 (Ootpa)   4.18.0-193.56.1.el8_2.x86_64   cri-o://1.18.4-11.rhaos4.5.gitfa57051.el8
11-03 17:07:08.038  ugd-18-l74kz-m-2.c.openshift-qe.internal         Ready    master   7h      v1.18.3+d8ef5ad   10.0.0.5                    Red Hat Enterprise Linux CoreOS 45.82.202106211530-0 (Ootpa)   4.18.0-193.56.1.el8_2.x86_64   cri-o://1.18.4-11.rhaos4.5.gitfa57051.el8
11-03 17:07:08.038  ugd-18-l74kz-w-a-jh4pj.c.openshift-qe.internal   Ready    worker   6h54m   v1.18.3+d8ef5ad   10.0.32.2                   Red Hat Enterprise Linux CoreOS 45.82.202106211530-0 (Ootpa)   4.18.0-193.56.1.el8_2.x86_64   cri-o://1.18.4-11.rhaos4.5.gitfa57051.el8
11-03 17:07:08.038  ugd-18-l74kz-w-b-8hwv4.c.openshift-qe.internal   Ready    worker   6h54m   v1.18.3+d8ef5ad   10.0.32.4                   Red Hat Enterprise Linux CoreOS 45.82.202106211530-0 (Ootpa)   4.18.0-193.56.1.el8_2.x86_64   cri-o://1.18.4-11.rhaos4.5.gitfa57051.el8
11-03 17:07:08.038  ugd-18-l74kz-w-c-fbhtz.c.openshift-qe.internal   Ready    worker   6h54m   v1.18.3+d8ef5ad   10.0.32.3                   Red Hat Enterprise Linux CoreOS 45.82.202106211530-0 (Ootpa)   4.18.0-193.56.1.el8_2.x86_64   cri-o://1.18.4-11.rhaos4.5.gitfa57051.el8
11-03 17:07:08.038  
11-03 17:07:08.038  
11-03 17:07:08.038  Post action: #oc get co:
11-03 17:07:08.038  NAME                                       VERSION   AVAILABLE   PROGRESSING   DEGRADED   SINCE
11-03 17:07:08.038  authentication                             4.6.49    True        False         True       146m
11-03 17:07:08.038  cloud-credential                           4.6.49    True        False         False      6h58m
11-03 17:07:08.038  cluster-autoscaler                         4.6.49    True        False         False      6h47m
11-03 17:07:08.038  config-operator                            4.6.49    True        False         False      3h48m
11-03 17:07:08.038  console                                    4.6.49    False       False         False      147m
11-03 17:07:08.038  csi-snapshot-controller                    4.6.49    True        False         False      3h15m
11-03 17:07:08.038  dns                                        4.5.41    True        True          False      6h58m
11-03 17:07:08.038  etcd                                       4.6.49    True        False         False      4h40m
11-03 17:07:08.038  image-registry                             4.6.49    True        False         False      3h14m
11-03 17:07:08.039  ingress                                    4.6.49    True        False         False      157m
11-03 17:07:08.039  insights                                   4.6.49    True        False         False      6h58m
11-03 17:07:08.039  kube-apiserver                             4.6.49    True        False         False      4h38m
11-03 17:07:08.039  kube-controller-manager                    4.6.49    True        False         False      4h36m
11-03 17:07:08.039  kube-scheduler                             4.6.49    True        False         False      4h36m
11-03 17:07:08.039  kube-storage-version-migrator              4.6.49    True        False         False      3h13m
11-03 17:07:08.039  machine-api                                4.6.49    True        False         False      6h58m
11-03 17:07:08.039  machine-approver                           4.6.49    True        False         False      3h44m
11-03 17:07:08.039  machine-config                             4.5.41    True        False         False      178m
11-03 17:07:08.039  marketplace                                4.6.49    True        False         False      148m
11-03 17:07:08.039  monitoring                                 4.6.49    False       False         True       147m
11-03 17:07:08.039  network                                    4.6.49    True        False         False      6h57m
11-03 17:07:08.039  node-tuning                                4.6.49    True        False         False      158m
11-03 17:07:08.039  openshift-apiserver                        4.6.49    True        False         False      148m
11-03 17:07:08.039  openshift-controller-manager               4.6.49    True        False         False      148m
11-03 17:07:08.039  openshift-samples                          4.6.49    True        False         False      156m
11-03 17:07:08.039  operator-lifecycle-manager                 4.6.49    True        False         False      6h48m
11-03 17:07:08.039  operator-lifecycle-manager-catalog         4.6.49    True        False         False      6h48m
11-03 17:07:08.039  operator-lifecycle-manager-packageserver   4.6.49    True        False         False      157m
11-03 17:07:08.039  service-ca                                 4.6.49    True        True          False      6h58m
11-03 17:07:08.039  service-catalog-apiserver                  4.4.33    True        False         False      148m
11-03 17:07:08.039  service-catalog-controller-manager         4.4.33    True        False         False      4h11m
11-03 17:07:08.039  storage                                    4.6.49    True        False         False      158m
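
To pick the problem operators out of a listing like the one above, a minimal Python sketch such as the following could be used. It assumes the timestamp-prefixed log format shown here and the six-column layout of the operator table. The function names (parse_operators, unhealthy) and the SAMPLE excerpt are illustrative only.

```python
import re

# Matches the "MM-DD HH:MM:SS.mmm" prefix used in the log above.
LOG_PREFIX = re.compile(r"^\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}\s+")

# Three rows copied from the listing above, used as sample input.
SAMPLE = """\
11-03 17:07:08.038  authentication                             4.6.49    True        False         True       146m
11-03 17:07:08.038  dns                                        4.5.41    True        True          False      6h58m
11-03 17:07:08.039  kube-apiserver                             4.6.49    True        False         False      4h38m
"""


def parse_operators(text):
    """Parse operator rows; header and blank lines are skipped."""
    rows = []
    for line in text.splitlines():
        fields = LOG_PREFIX.sub("", line).split()
        # A data row has six fields and a boolean in the AVAILABLE column.
        if len(fields) != 6 or fields[2] not in ("True", "False"):
            continue
        name, version, available, progressing, degraded, _since = fields
        rows.append({
            "name": name,
            "version": version,
            "available": available == "True",
            "progressing": progressing == "True",
            "degraded": degraded == "True",
        })
    return rows


def unhealthy(rows, target):
    """Map operator name to the reasons it is not settled at the target version."""
    problems = {}
    for row in rows:
        reasons = []
        if row["version"] != target:
            reasons.append("version " + row["version"])
        if not row["available"]:
            reasons.append("unavailable")
        if row["progressing"]:
            reasons.append("progressing")
        if row["degraded"]:
            reasons.append("degraded")
        if reasons:
            problems[row["name"]] = reasons
    return problems


if __name__ == "__main__":
    for name, reasons in unhealthy(parse_operators(SAMPLE), "4.6.49").items():
        print(name, ", ".join(reasons))
```

Run against the full listing with a target of 4.6.49, a filter of this kind would flag the operators still on an older version as well as those that are unavailable, progressing, or degraded.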


Will add the must-gather and the describe output for the affected components in a separate private comment.

Comment 2 Stefan Schimanski 2021-11-08 08:45:58 UTC
The oauth-apiserver pods report:

  transport: Error while dialing dial tcp 10.0.0.3:2379: connect: no route to host
  Get "https://172.30.0.1:443/api/v1/namespaces/kube-system/configmaps/extension-apiserver-authentication?timeout=10s": dial tcp 172.30.0.1:443: connect: no route to host 

This means SDN does not route our traffic.
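
A rough way to reproduce this check from inside an affected pod or node is a plain TCP connect probe. The Python sketch below uses only the standard library. The two targets are the addresses from the errors above. The function name probe and the status strings are illustrative, and the sketch assumes a Python interpreter is available where it runs.

```python
import errno
import socket

# Endpoints taken from the error messages above (etcd and the kubernetes service IP).
TARGETS = [("10.0.0.3", 2379), ("172.30.0.1", 443)]


def probe(host, port, timeout=3.0):
    """Attempt a TCP connect and return a short status string."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "reachable"
    except socket.timeout:
        return "timeout"
    except OSError as exc:
        # EHOSTUNREACH corresponds to the "no route to host" error quoted above.
        if exc.errno == errno.EHOSTUNREACH:
            return "no route to host"
        if exc.errno == errno.ECONNREFUSED:
            return "connection refused"
        return "error: " + str(exc)


def report(targets):
    """Return one 'host:port status' line per target."""
    return ["%s:%d %s" % (host, port, probe(host, port)) for host, port in targets]

# Usage (not run automatically, since it opens network connections):
#     print("\n".join(report(TARGETS)))
```

A "no route to host" result for both addresses would be consistent with the routing problem described in this comment, whereas "connection refused" would instead point at the listening service.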

Comment 5 Alexander Constantinescu 2021-11-11 14:17:04 UTC
*** Bug 1940844 has been marked as a duplicate of this bug. ***

