Bug 1637737
| Summary: | Service catalog controller segmentation fault | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Robert Bost <rbost> |
| Component: | Service Catalog | Assignee: | Jay Boyd <jaboyd> |
| Status: | CLOSED ERRATA | QA Contact: | Jian Zhang <jiazha> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | | |
| Version: | 3.10.0 | CC: | chezhang, cshereme, dyan, jaboyd, jfan, jiazha, rbost, ssadhale, zitang |
| Target Milestone: | --- | | |
| Target Release: | 3.11.z | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
Doc Text: |
Previously, if a Service Instance failed provisioning for the maximum reconciliation period (7 days by default), the Service Catalog controller manager pod would crash while trying to finalize the state of the failed instance. This case is now handled properly, and the instance is set to a failed provisioning status.
|
| Story Points: | --- | | |
| Clone Of: | | Environment: | |
| Last Closed: | 2018-11-20 03:10:46 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description
Robert Bost
2018-10-09 22:03:22 UTC
Is this blocking the customer? Is the Service Catalog controller manager pod constantly in a panic/restart/panic/restart state (i.e., does the "bad" instance need to be deleted)? You indicated it's always reproducible; what are the steps to reproduce?

Looks like this may be addressed by upstream https://github.com/kubernetes-incubator/service-catalog/pull/2259

This appears to happen only when the reconciliationRetryDuration (7 days) is exceeded. So presumably someone tried to provision an instance, the broker failed with a retryable error, and we kept retrying (with an exponential backoff) for 7 days?

Correction from comment #8: fixed in 3.11.z in atomic-enterprise-service-catalog-3.11.0-0.30.0.

The version info:

[root@ip-172-18-0-56 ~]# oc exec controller-manager-x8jfr -- service-catalog --version
v3.11.36;Upstream:v0.1.35

The Service Catalog works well; I did not find the crash after a day's running, and I recreated it. LGTM, verifying it.

[root@ip-172-18-0-56 ~]# oc get pods
NAME                       READY   STATUS    RESTARTS   AGE
apiserver-bkhst            1/1     Running   0          1h
controller-manager-x8jfr   1/1     Running   0          1h

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:3537