Bug 1555394 - [3.7] [free-int] kube-service-catalog/apiserver pod in crash loop after upgrade
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Cluster Version Operator
Version: 3.7.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 3.7.z
Assignee: Michael Gugino
QA Contact: Weihua Meng
URL:
Whiteboard:
Depends On: 1546365 1547803
Blocks:
Reported: 2018-03-14 15:29 UTC by Scott Dodson
Modified: 2021-06-10 15:19 UTC (History)

Fixed In Version:
Doc Text:
Clone Of: 1546365
Environment:
Last Closed: 2018-05-18 03:54:45 UTC
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2018:1576 0 None None None 2018-05-18 03:55:23 UTC

Comment 1 Michael Gugino 2018-04-02 19:57:44 UTC
Backport to 3.7 merged: https://github.com/openshift/openshift-ansible/pull/7523

Comment 2 Weihua Meng 2018-04-08 10:26:33 UTC
Not fixed.
openshift-ansible-3.7.42-1.git.2.9ee4e71.el7.noarch

The PR change is included in this RPM.

Steps:
1. Install OCP 3.6 with external etcd.
2. Upgrade to 3.7.42 (latest 3.7) with service-catalog disabled.
3. Run the service-catalog playbook.
The playbook failed at step 3.

PLAY RECAP *******************************************************************************
localhost                  : ok=11   changed=0    unreachable=0    failed=0   
qe-wmeng36etcd-etcd-1.0408-ln0.qe.rhcloud.com : ok=43   changed=2    unreachable=0    failed=0   
qe-wmeng36etcd-master-1.0408-ln0.qe.rhcloud.com : ok=94   changed=31   unreachable=0    failed=1   
qe-wmeng36etcd-node-registry-router-1.0408-ln0.qe.rhcloud.com : ok=50   changed=2    unreachable=0    failed=0   
qe-wmeng36etcd-node-registry-router-2.0408-ln0.qe.rhcloud.com : ok=50   changed=2    unreachable=0    failed=0   


INSTALLER STATUS *******************************************************************************
Initialization             : Complete
Service Catalog Install    : In Progress
	This phase can be restarted by running: playbooks/byo/openshift-cluster/service-catalog.yml
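
For triage, the recap above can be scanned for hosts that reported failures. The following is only a minimal sketch, assuming the recap lines keep the "host : ok=N changed=N unreachable=N failed=N" layout shown here; the function name failed_hosts is hypothetical and not part of openshift-ansible.

```python
import re

# Matches one PLAY RECAP line in the layout pasted in this comment.
RECAP_LINE = re.compile(
    r"^(?P<host>\S+)\s*:\s*ok=(?P<ok>\d+)\s+changed=(?P<changed>\d+)"
    r"\s+unreachable=(?P<unreachable>\d+)\s+failed=(?P<failed>\d+)"
)


def failed_hosts(recap_text):
    """Return hosts whose recap line shows failed or unreachable counts above zero."""
    hosts = []
    for line in recap_text.splitlines():
        match = RECAP_LINE.match(line.strip())
        if match is None:
            continue  # header lines, blank lines, installer status, etc.
        if int(match.group("failed")) > 0 or int(match.group("unreachable")) > 0:
            hosts.append(match.group("host"))
    return hosts
```

Run against the recap in this comment, only the master host would be returned, since it is the only line with a non-zero failed count.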

# curl -k https://apiserver.kube-service-catalog.svc/healthz
[+]ping ok
[+]poststarthook/generic-apiserver-start-informers ok
[+]poststarthook/start-service-catalog-apiserver-informers ok
[-]etcd failed: reason withheld
healthz check failed
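
The failing check can be picked out of the healthz output programmatically. This is a sketch only, assuming the "[+]name ok" / "[-]name failed: ..." line format shown above; failed_health_checks is a hypothetical helper name.

```python
def failed_health_checks(healthz_text):
    """Return the names of checks that the healthz output marks with '[-]'."""
    failed = []
    for line in healthz_text.splitlines():
        line = line.strip()
        if line.startswith("[-]"):
            # The check name is the first word after the '[-]' marker.
            failed.append(line[3:].split(" ", 1)[0])
    return failed
```

For the output above this would report only the etcd check, which matches the symptom described in this bug.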

# oc describe pod apiserver-fnnn8 -n kube-service-catalog
      --etcd-servers
      https://qe-wmeng36etcd-master-1:2379

# cat /etc/origin/master/master-config.yaml
etcdClientInfo:
  ca: master.etcd-ca.crt
  certFile: master.etcd-client.crt
  keyFile: master.etcd-client.key
  urls:
  - https://qe-wmeng36etcd-etcd-1:2379
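
The two excerpts above appear to disagree: the catalog apiserver pod is given the master host as its etcd server, while master-config.yaml lists the external etcd host. A rough way to check for that mismatch is sketched below. It is an illustration under assumptions, not the installer's logic: the helper names are hypothetical, and the tiny parser only handles the simple etcdClientInfo layout shown in this comment rather than general YAML.

```python
def etcd_urls_from_master_config(config_text):
    """Collect the list items under 'urls:' inside the etcdClientInfo block."""
    urls = []
    in_block = False  # inside the top-level etcdClientInfo mapping
    in_urls = False   # inside its 'urls:' list
    for raw in config_text.splitlines():
        stripped = raw.strip()
        if not stripped:
            continue
        if raw[0] not in " \t":
            # A new top-level key starts or ends the block of interest.
            in_block = stripped.startswith("etcdClientInfo:")
            in_urls = False
        elif in_block and stripped == "urls:":
            in_urls = True
        elif in_block and in_urls and stripped.startswith("- "):
            urls.append(stripped[2:].strip())
        elif in_block:
            in_urls = False
    return urls


def etcd_servers_mismatch(apiserver_etcd_server, config_text):
    """True if the pod's --etcd-servers value is not among the configured URLs."""
    return apiserver_etcd_server not in etcd_urls_from_master_config(config_text)
```

With the values pasted above, the pod's --etcd-servers value is not in the configured list, so the check would flag a mismatch.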

Comment 3 Michael Gugino 2018-04-10 20:24:47 UTC
PR Created: https://github.com/openshift/openshift-ansible/pull/7891

Comment 5 Weihua Meng 2018-04-16 23:54:59 UTC
The PR change is not in the latest RPM:
openshift-ansible-3.7.44-1.git.0.dbb912c.el7.noarch

Comment 9 Weihua Meng 2018-05-08 11:54:38 UTC
Fixed.
openshift-ansible-3.7.46-1.git.0.37f607e.el7

Steps:
1. Install OCP 3.6 with external etcd.
2. Upgrade to 3.7 with service-catalog disabled.
3. Run the service-catalog playbook.
The playbook succeeded on both RPM and containerized installs.

PLAY RECAP ********************************************************************************
localhost                  : ok=11   changed=0    unreachable=0    failed=0   
qe-wmengrpm3611322-etcd-1.0508-l5l.qe.rhcloud.com : ok=44   changed=2    unreachable=0    failed=0   
qe-wmengrpm3611322-master-1.0508-l5l.qe.rhcloud.com : ok=156  changed=63   unreachable=0    failed=0   
qe-wmengrpm3611322-nrr-1.0508-l5l.qe.rhcloud.com : ok=51   changed=2    unreachable=0    failed=0   
qe-wmengrpm3611322-nrr-2.0508-l5l.qe.rhcloud.com : ok=51   changed=2    unreachable=0    failed=0   


INSTALLER STATUS ********************************************************************************
Initialization             : Complete
Service Catalog Install    : Complete

[root@qe-wmengrpm3611322-master-1 ~]# oc describe pod apiserver-x8gs8 -n kube-service-catalog

      --etcd-servers
      https://qe-wmengrpm3611322-etcd-1:2379


Kernel Version: 3.10.0-862.el7.x86_64
Operating System: Red Hat Enterprise Linux Server 7.5 (Maipo)

Comment 12 errata-xmlrpc 2018-05-18 03:54:45 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:1576

