Bug 1647511
| Summary: | Requirement of Liveness or Readiness probe in ds/controller-manager | ||
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Jay Boyd <jaboyd> |
| Component: | Service Catalog | Assignee: | Jay Boyd <jaboyd> |
| Status: | CLOSED ERRATA | QA Contact: | Jian Zhang <jiazha> |
| Severity: | medium | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 3.10.0 | CC: | andreas.eger, chezhang, jaboyd, jiazha, mrobson, rbost, steven.barre, suchaudh, zitang |
| Target Milestone: | --- | ||
| Target Release: | 3.10.z | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | Bug Fix | |
| Doc Text: | Liveness and readiness probes have been added for the Service Catalog API Server and Controller Manager. If these pods stop responding, OpenShift restarts them. Previously, there were no probes to monitor the health of Service Catalog. | Story Points: | --- |
| Clone Of: | 1630324 | Environment: | |
| Last Closed: | 2018-12-13 17:09:08 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | 1630324 | ||
| Bug Blocks: | |||
|
Comment 1
Jay Boyd
2018-11-07 18:43:59 UTC
Fixed in 3.10.z by https://github.com/openshift/openshift-ansible/pull/10629
I installed and uninstalled the Service Catalog successfully using the openshift-ansible release-3.10 branch, and it works as expected. Verified.
The openshift-ansible info:
mac:openshift-ansible jianzhang$ git branch
master
* release-3.10
mac:openshift-ansible jianzhang$ git log
commit 12699eb551747059c7db622cadd9237dde84205b (HEAD -> release-3.10, origin/release-3.10)
Author: AOS Automation Release Team <aos-team-art>
Date: Sat Dec 1 07:38:28 2018 -0500
Automatic commit of package [openshift-ansible] release [3.10.83-1].
...
When I configure a different port (for example, 6444) for the Service Catalog controller-manager, the probes against port 6443 fail and the following is observed:
1) The liveness probe works well.
[root@ip-172-18-9-32 ~]# oc describe pods controller-manager-6qr4k
...
Normal Created 13s (x3 over 1m) kubelet, ip-172-18-9-32.ec2.internal Created container
Warning Unhealthy 13s (x5 over 1m) kubelet, ip-172-18-9-32.ec2.internal Liveness probe failed: Get https://10.128.0.10:6443/healthz: dial tcp 10.128.0.10:6443: getsockopt: connection refused
Normal Killing 13s (x2 over 1m) kubelet, ip-172-18-9-32.ec2.internal Killing container with id docker://controller-manager:Container failed liveness probe.. Container will be killed and recreated.
Normal Started 12s (x3 over 1m) kubelet, ip-172-18-9-32.ec2.internal Started container
[root@ip-172-18-9-32 ~]# oc get pods
NAME READY STATUS RESTARTS AGE
apiserver-gkfjf 1/1 Running 0 42m
controller-manager-6qr4k 0/1 Running 2 1m
2) The pod can no longer serve traffic: the controller-manager endpoint list is empty, so the readiness probe works as well.
[root@ip-172-18-9-32 ~]# oc get ep
NAME ENDPOINTS AGE
apiserver 10.128.0.8:6443 42m
controller-manager 41m
Performing the same operations on the Service Catalog apiserver also works as expected.
[root@ip-172-18-9-32 ~]# oc exec controller-manager-sqbcz -- service-catalog --version
v3.10.83;Upstream:v0.1.19
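For anyone reproducing the liveness check by hand, the behaviour seen in the events above (repeated failed GETs on /healthz, then a container restart) can be roughly sketched in Python. This is only an illustration under stated assumptions, not the kubelet's implementation: `probe_healthz` and `should_restart` are hypothetical helper names, and the threshold of 3 is the Kubernetes default for `failureThreshold`, which may differ from the value set by the openshift-ansible change.

```python
import http.client
import ssl
import urllib.error
import urllib.request


def probe_healthz(url: str, timeout: float = 1.0) -> bool:
    """Return True if a GET on `url` answers with a 2xx/3xx status, else False.

    Hypothetical helper: mimics an HTTP(S) probe, where a refused connection,
    a timeout, or a 4xx/5xx response all count as a failed probe.
    """
    # Serving certificates are often self-signed, so verification is skipped
    # here, as the kubelet does for HTTPS probes.
    ctx = ssl.create_default_context()
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    try:
        with urllib.request.urlopen(url, timeout=timeout, context=ctx) as resp:
            return 200 <= resp.status < 400
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Covers "connection refused", timeouts, TLS errors, and HTTP errors.
        return False


def should_restart(results, failure_threshold: int = 3) -> bool:
    """Return True once `failure_threshold` consecutive results are failures.

    `results` is an iterable of probe outcomes in time order (True = healthy).
    The default of 3 is an assumption (the Kubernetes default failureThreshold).
    """
    consecutive_failures = 0
    for healthy in results:
        consecutive_failures = 0 if healthy else consecutive_failures + 1
        if consecutive_failures >= failure_threshold:
            return True
    return False
```

Calling `probe_healthz("https://10.128.0.10:6443/healthz")` from a host that can reach the pod network would return False while the controller-manager listens on a different port, and feeding successive results into `should_restart` shows when a restart would be triggered.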
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2018:3750