Bug 1647511
Summary: | Requirement of Liveness or Readiness probe in ds/controller-manager | ||
---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Jay Boyd <jaboyd> |
Component: | Service Catalog | Assignee: | Jay Boyd <jaboyd> |
Status: | CLOSED ERRATA | QA Contact: | Jian Zhang <jiazha> |
Severity: | medium | Docs Contact: | |
Priority: | unspecified | ||
Version: | 3.10.0 | CC: | andreas.eger, chezhang, jaboyd, jiazha, mrobson, rbost, steven.barre, suchaudh, zitang |
Target Milestone: | --- | ||
Target Release: | 3.10.z | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: | Liveness and readiness probes have been added for the Service Catalog API Server and Controller Manager. If these pods stop responding, OpenShift restarts them. Previously, there were no probes to monitor the health of Service Catalog. |
Story Points: | --- |
Clone Of: | 1630324 | Environment: | |
Last Closed: | 2018-12-13 17:09:08 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | 1630324 | ||
Bug Blocks: |
Comment 1
Jay Boyd
2018-11-07 18:43:59 UTC
Fixed in 3.10.z by https://github.com/openshift/openshift-ansible/pull/10629

I installed and uninstalled the Service Catalog successfully via the openshift-ansible release-3.10 branch, and it works as expected. Verified. The openshift-ansible info:

mac:openshift-ansible jianzhang$ git branch
  master
* release-3.10
mac:openshift-ansible jianzhang$ git log
commit 12699eb551747059c7db622cadd9237dde84205b (HEAD -> release-3.10, origin/release-3.10)
Author: AOS Automation Release Team <aos-team-art>
Date: Sat Dec 1 07:38:28 2018 -0500
    Automatic commit of package [openshift-ansible] release [3.10.83-1].
...

When I configure a different port (for example, 6444) for the Service Catalog controller-manager, the probes still target port 6443 and the following is observed:

1) The liveness probe works well: the probe fails and the container is killed and recreated.

[root@ip-172-18-9-32 ~]# oc describe pods controller-manager-6qr4k
...
Normal   Created    13s (x3 over 1m)  kubelet, ip-172-18-9-32.ec2.internal  Created container
Warning  Unhealthy  13s (x5 over 1m)  kubelet, ip-172-18-9-32.ec2.internal  Liveness probe failed: Get https://10.128.0.10:6443/healthz: dial tcp 10.128.0.10:6443: getsockopt: connection refused
Normal   Killing    13s (x2 over 1m)  kubelet, ip-172-18-9-32.ec2.internal  Killing container with id docker://controller-manager:Container failed liveness probe.. Container will be killed and recreated.
Normal   Started    12s (x3 over 1m)  kubelet, ip-172-18-9-32.ec2.internal  Started container

[root@ip-172-18-9-32 ~]# oc get pods
NAME                       READY   STATUS    RESTARTS   AGE
apiserver-gkfjf            1/1     Running   0          42m
controller-manager-6qr4k   0/1     Running   2          1m

2) The readiness probe works well: the pod is not ready (0/1), so the controller-manager service has no endpoints and the pod cannot serve traffic.

[root@ip-172-18-9-32 ~]# oc get ep
NAME                 ENDPOINTS         AGE
apiserver            10.128.0.8:6443   42m
controller-manager                     41m

The same operations on the Service Catalog apiserver also work as expected.
[root@ip-172-18-9-32 ~]# oc exec controller-manager-sqbcz -- service-catalog --version
v3.10.83;Upstream:v0.1.19

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:3750
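
For anyone reproducing the verification above by hand, the probe behavior can be approximated outside the cluster. The following is a rough Python sketch, not the kubelet's implementation and not the configuration from the openshift-ansible pull request. It assumes the probe is an HTTPS GET against /healthz (as the "Liveness probe failed" event suggests), that any 2xx or 3xx reply counts as healthy, and that a threshold of 3 consecutive failures applies (the Kubernetes default; the value actually set for Service Catalog is not stated in this report). The names http_probe and consecutive_failures_reached are hypothetical.

```python
import ssl
import urllib.error
import urllib.request


def http_probe(url: str, timeout: float = 1.0) -> bool:
    """Return True if a GET to `url` answers with a 2xx or 3xx status.

    Hypothetical helper that approximates an HTTP(S) health probe.
    """
    # Assumption: like the kubelet, skip certificate verification for HTTPS.
    ctx = ssl.create_default_context()
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    # Talk to the target directly; ignore any proxy settings in the environment.
    opener = urllib.request.build_opener(
        urllib.request.ProxyHandler({}),
        urllib.request.HTTPSHandler(context=ctx),
    )
    try:
        with opener.open(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except (urllib.error.URLError, OSError):
        # Connection refused, timeouts, and 4xx/5xx replies all count as failures.
        return False


def consecutive_failures_reached(results, failure_threshold: int = 3) -> bool:
    """Return True once `failure_threshold` probe results in a row are failures.

    `results` is an iterable of booleans (True = probe succeeded).
    The default of 3 is an assumption, see the note above.
    """
    streak = 0
    for ok in results:
        streak = 0 if ok else streak + 1
        if streak >= failure_threshold:
            return True
    return False


if __name__ == "__main__":
    # Address taken from the event log above; reachable only inside that cluster.
    target = "https://10.128.0.10:6443/healthz"
    results = [http_probe(target) for _ in range(3)]
    print("probe results:", results)
    print("would restart:", consecutive_failures_reached(results))
```

With the controller-manager listening on 6444 while the probe targets 6443, every call to http_probe fails with a refused connection, which corresponds to the repeated Unhealthy events and container restarts shown in the oc describe output.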