Bug 1800633 - Bootstrap times out waiting for etcd-member pods
Summary: Bootstrap times out waiting for etcd-member pods
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.4
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 4.4.0
Assignee: Maysa Macedo
QA Contact: Jon Uriarte
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-02-07 15:26 UTC by Jon Uriarte
Modified: 2020-05-04 11:35 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-05-04 11:34:48 UTC
Target Upstream Version:




Links
System | ID | Priority | Status | Summary | Last Updated
Github | openshift cluster-network-operator pull 468 | None | closed | Bug 1800633: Ensure etcd and authentication operator resolves dns over TCP | 2020-03-30 09:35:56 UTC
Red Hat Product Errata | RHBA-2020:0581 | None | None | None | 2020-05-04 11:35:11 UTC

Description Jon Uriarte 2020-02-07 15:26:42 UTC
Description of problem:

The OCP 4.4 installer fails when configured with Kuryr on top of OpenStack 13 (OSP 13), with the following error:

"Bootstrap failed to complete: failed to wait for bootstrapping to complete: timed out waiting for the condition"

$ oc get pods -n openshift-etcd
NAME                                READY   STATUS     RESTARTS   AGE
etcd-member-ostest-f29hc-master-0   0/2     Init:1/4   0          163m
etcd-member-ostest-f29hc-master-1   0/2     Init:1/4   0          163m
etcd-member-ostest-f29hc-master-2   0/2     Init:1/4   0          163m
etcd-staticpod-5n5qm                1/1     Running    0          160m
etcd-staticpod-htmr8                1/1     Running    0          160m
etcd-staticpod-mktcf                1/1     Running    0          160m
etcd-staticsync-8v8bs               1/1     Running    0          160m
etcd-staticsync-dg54h               1/1     Running    0          160m
etcd-staticsync-djmbc               1/1     Running    0          160m
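
In the listing above, a STATUS of Init:1/4 means that only one of the pod's four init containers has completed. As a rough triage aid, the sketch below flags pods stuck in that state. It assumes the plain-text column layout of `oc get pods` shown above; the function name `stuck_init_pods` and the `SAMPLE` text are illustrative, not part of any OpenShift tooling.

```python
import re


def stuck_init_pods(oc_output):
    """Return (pod, done, total) for pods whose STATUS is Init:<done>/<total>."""
    stuck = []
    for line in oc_output.strip().splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 3:
            continue
        # Column order assumed: NAME READY STATUS RESTARTS AGE
        match = re.fullmatch(r"Init:(\d+)/(\d+)", fields[2])
        if match:
            stuck.append((fields[0], int(match.group(1)), int(match.group(2))))
    return stuck


# Two rows copied from the listing above, used here as sample input.
SAMPLE = """\
NAME                                READY   STATUS     RESTARTS   AGE
etcd-member-ostest-f29hc-master-0   0/2     Init:1/4   0          163m
etcd-staticpod-5n5qm                1/1     Running    0          160m
"""

if __name__ == "__main__":
    for name, done, total in stuck_init_pods(SAMPLE):
        print(f"{name}: {done} of {total} init containers completed")
```

In practice the input could come from the saved output of `oc get pods -n openshift-etcd`.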

The etcd-member-* pods constantly log the following message:
"E0207 15:13:37.601140       1 run.go:267] error decoding result unexpected end of JSON input"

$ oc get pods -n openshift-apiserver
NAME              READY   STATUS             RESTARTS   AGE
apiserver-88shg   0/1     CrashLoopBackOff   32         154m
apiserver-bccpj   0/1     CrashLoopBackOff   32         154m
apiserver-nm4r4   0/1     CrashLoopBackOff   32         154m

$ oc get pods -n openshift-kuryr
NAME                                   READY   STATUS    RESTARTS   AGE
kuryr-cni-2bl2r                        1/1     Running   0          162m
kuryr-cni-47tp4                        1/1     Running   0          145m
kuryr-cni-7p4wq                        1/1     Running   0          162m
kuryr-cni-7xjd7                        1/1     Running   0          144m
kuryr-cni-bg8c9                        1/1     Running   0          145m
kuryr-cni-vt8n8                        1/1     Running   0          162m
kuryr-controller-6b95896f5f-2pxkj      1/1     Running   1          162m
kuryr-dns-admission-controller-4l5pb   1/1     Running   0          162m
kuryr-dns-admission-controller-b4txp   1/1     Running   0          162m
kuryr-dns-admission-controller-whcgv   1/1     Running   0          162m


$ oc get nodes
NAME                        STATUS   ROLES    AGE    VERSION
ostest-f29hc-master-0       Ready    master   163m   v1.17.1
ostest-f29hc-master-1       Ready    master   163m   v1.17.1
ostest-f29hc-master-2       Ready    master   163m   v1.17.1
ostest-f29hc-worker-6h749   Ready    worker   143m   v1.17.1
ostest-f29hc-worker-7qflg   Ready    worker   143m   v1.17.1
ostest-f29hc-worker-hn9rp   Ready    worker   144m   v1.17.1

$ oc get co
NAME                                       VERSION                             AVAILABLE   PROGRESSING   DEGRADED   SINCE
authentication                                                                 Unknown     Unknown       True       161m
cloud-credential                           4.4.0-0.nightly-2020-02-07-033907   True        False         False      169m
csi-snapshot-controller                    4.4.0-0.nightly-2020-02-07-033907   False       True          False      145m
dns                                        4.4.0-0.nightly-2020-02-07-033907   True        False         False      155m
etcd                                       4.4.0-0.nightly-2020-02-07-033907   False       True          True       161m
insights                                   4.4.0-0.nightly-2020-02-07-033907   True        False         False      160m
kube-apiserver                             4.4.0-0.nightly-2020-02-07-033907   True        False         False      157m
kube-controller-manager                    4.4.0-0.nightly-2020-02-07-033907   True        False         False      155m
kube-scheduler                             4.4.0-0.nightly-2020-02-07-033907   True        False         False      156m
kube-storage-version-migrator              4.4.0-0.nightly-2020-02-07-033907   True        False         False      146m
machine-api                                4.4.0-0.nightly-2020-02-07-033907   True        False         False      157m
machine-config                             4.4.0-0.nightly-2020-02-07-033907   True        False         False      151m
network                                    4.4.0-0.nightly-2020-02-07-033907   True        False         False      156m
node-tuning                                4.4.0-0.nightly-2020-02-07-033907   True        False         False      162m
openshift-apiserver                        4.4.0-0.nightly-2020-02-07-033907   False       False         True       162m
openshift-controller-manager               4.4.0-0.nightly-2020-02-07-033907   True        False         False      155m
operator-lifecycle-manager                 4.4.0-0.nightly-2020-02-07-033907   True        False         False      157m
operator-lifecycle-manager-catalog         4.4.0-0.nightly-2020-02-07-033907   True        False         False      157m
operator-lifecycle-manager-packageserver   4.4.0-0.nightly-2020-02-07-033907   True        False         False      150m
service-ca                                 4.4.0-0.nightly-2020-02-07-033907   True        False         False      159m
service-catalog-apiserver                  4.4.0-0.nightly-2020-02-07-033907   True        False         False      162m
service-catalog-controller-manager         4.4.0-0.nightly-2020-02-07-033907   True        False         False      162m
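
The operators that need attention in the table above are those that are not Available or that are Degraded. A minimal sketch for picking them out of the text output follows. It assumes the column layout shown above and reads the status columns from the right, because the VERSION column can be empty (as in the authentication row). The function name `unhealthy_operators` and the `SAMPLE` text are illustrative only.

```python
def unhealthy_operators(oc_output):
    """Return (name, available, progressing, degraded) for operators
    that are not Available=True or not Degraded=False."""
    unhealthy = []
    for line in oc_output.strip().splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 5:
            continue
        name = fields[0]
        # The last four columns are AVAILABLE, PROGRESSING, DEGRADED, SINCE;
        # indexing from the right tolerates a blank VERSION column.
        available, progressing, degraded = fields[-4:-1]
        if available != "True" or degraded != "False":
            unhealthy.append((name, available, progressing, degraded))
    return unhealthy


# Three rows copied from the table above, used here as sample input.
SAMPLE = """\
NAME                                       VERSION                             AVAILABLE   PROGRESSING   DEGRADED   SINCE
authentication                                                                 Unknown     Unknown       True       161m
dns                                        4.4.0-0.nightly-2020-02-07-033907   True        False         False      155m
etcd                                       4.4.0-0.nightly-2020-02-07-033907   False       True          True       161m
"""

if __name__ == "__main__":
    for name, available, progressing, degraded in unhealthy_operators(SAMPLE):
        print(f"{name}: available={available} progressing={progressing} degraded={degraded}")
```

Applied to the full table above, this would report authentication, csi-snapshot-controller, etcd, and openshift-apiserver.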


Version-Release number of selected component (if applicable):

OCP: 4.4.0-0.nightly-2020-02-07-033907
OSP: 13 puddle 2020-01-15.3

How reproducible: always


Steps to Reproduce:
1. Install OSP
2. Install OCP 4.4 on top of it with the Kuryr SDN

Actual results: The installer fails during bootstrap.

Expected results: The installation completes successfully.


Additional info: The bootstrap gather logs are attached (log-bundle-20200207080442.tar.gz).

Comment 3 Jon Uriarte 2020-02-18 07:26:28 UTC
Verified in 4.4.0-0.nightly-2020-02-17-131733 on top of OSP 13 2020-01-15.3 puddle.

The installer completes successfully, with no timeout during bootstrap.

>> INFO Install complete!


$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.4.0-0.nightly-2020-02-17-131733   True        False         13h     Cluster version is 4.4.0-0.nightly-2020-02-17-131733

$ oc get co
NAME                                       VERSION                             AVAILABLE   PROGRESSING   DEGRADED   SINCE
authentication                             4.4.0-0.nightly-2020-02-17-131733   True        False         False      14h
cloud-credential                           4.4.0-0.nightly-2020-02-17-131733   True        False         False      14h
cluster-autoscaler                         4.4.0-0.nightly-2020-02-17-131733   True        False         False      14h
console                                    4.4.0-0.nightly-2020-02-17-131733   True        False         False      14h
csi-snapshot-controller                    4.4.0-0.nightly-2020-02-17-131733   True        False         False      14h
dns                                        4.4.0-0.nightly-2020-02-17-131733   True        False         False      14h
etcd                                       4.4.0-0.nightly-2020-02-17-131733   True        False         False      14h
image-registry                             4.4.0-0.nightly-2020-02-17-131733   True        False         False      14h
ingress                                    4.4.0-0.nightly-2020-02-17-131733   True        False         False      14h
insights                                   4.4.0-0.nightly-2020-02-17-131733   True        False         False      14h
kube-apiserver                             4.4.0-0.nightly-2020-02-17-131733   True        False         False      14h
kube-controller-manager                    4.4.0-0.nightly-2020-02-17-131733   True        False         False      14h
kube-scheduler                             4.4.0-0.nightly-2020-02-17-131733   True        False         False      14h
kube-storage-version-migrator              4.4.0-0.nightly-2020-02-17-131733   True        False         False      14h
machine-api                                4.4.0-0.nightly-2020-02-17-131733   True        False         False      14h
machine-config                             4.4.0-0.nightly-2020-02-17-131733   True        False         False      14h
marketplace                                4.4.0-0.nightly-2020-02-17-131733   True        False         False      14h
monitoring                                 4.4.0-0.nightly-2020-02-17-131733   True        False         False      14h
network                                    4.4.0-0.nightly-2020-02-17-131733   True        False         False      14h
node-tuning                                4.4.0-0.nightly-2020-02-17-131733   True        False         False      14h
openshift-apiserver                        4.4.0-0.nightly-2020-02-17-131733   True        False         False      14h
openshift-controller-manager               4.4.0-0.nightly-2020-02-17-131733   True        False         False      14h
openshift-samples                          4.4.0-0.nightly-2020-02-17-131733   True        False         False      14h
operator-lifecycle-manager                 4.4.0-0.nightly-2020-02-17-131733   True        False         False      14h
operator-lifecycle-manager-catalog         4.4.0-0.nightly-2020-02-17-131733   True        False         False      14h
operator-lifecycle-manager-packageserver   4.4.0-0.nightly-2020-02-17-131733   True        False         False      14h
service-ca                                 4.4.0-0.nightly-2020-02-17-131733   True        False         False      14h
service-catalog-apiserver                  4.4.0-0.nightly-2020-02-17-131733   True        False         False      14h
service-catalog-controller-manager         4.4.0-0.nightly-2020-02-17-131733   True        False         False      14h
storage                                    4.4.0-0.nightly-2020-02-17-131733   True        False         False      14h

Comment 5 errata-xmlrpc 2020-05-04 11:34:48 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0581

