Bug 1637737 - Service catalog controller segmentation fault
Summary: Service catalog controller segmentation fault
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Service Catalog
Version: 3.10.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 3.11.z
Assignee: Jay Boyd
QA Contact: Jian Zhang
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-10-09 22:03 UTC by Robert Bost
Modified: 2021-12-10 17:52 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Previously, if a Service Instance kept failing to provision for the maximum reconciliation period (7 days by default), the Service Catalog controller manager pod would crash while trying to finalize the state of the failed instance. This case is now handled properly, and the instance is set to a failed provisioning status.
Clone Of:
Environment:
Last Closed: 2018-11-20 03:10:46 UTC
Target Upstream Version:
Embargoed:



Links
System ID Private Priority Status Summary Last Updated
Red Hat Knowledge Base (Solution) 3676961 0 None None None 2018-11-02 23:33:53 UTC
Red Hat Product Errata RHBA-2018:3537 0 None None None 2018-11-20 03:11:30 UTC

Description Robert Bost 2018-10-09 22:03:22 UTC
Description of problem:
builddir/build/BUILD/atomic-enterprise-service-catalog-git-1446.727628e/_output/local/go/src/github.com/kubernetes-incubator/service-catalog/pkg/controller/controller_instance.go:1712
/builddir/build/BUILD/atomic-enterprise-service-catalog-git-1446.727628e/_output/local/go/src/github.com/kubernetes-incubator/service-catalog/pkg/controller/controller_instance.go:1699
/builddir/build/BUILD/atomic-enterprise-service-catalog-git-1446.727628e/_output/local/go/src/github.com/kubernetes-incubator/service-catalog/pkg/controller/controller_instance.go:856
/builddir/build/BUILD/atomic-enterprise-service-catalog-git-1446.727628e/_output/local/go/src/github.com/kubernetes-incubator/service-catalog/pkg/controller/controller_instance.go:713
/builddir/build/BUILD/atomic-enterprise-service-catalog-git-1446.727628e/_output/local/go/src/github.com/kubernetes-incubator/service-catalog/pkg/controller/controller_instance.go:275
/builddir/build/BUILD/atomic-enterprise-service-catalog-git-1446.727628e/_output/local/go/src/github.com/kubernetes-incubator/service-catalog/pkg/controller/controller_instance.go:241
/builddir/build/BUILD/atomic-enterprise-service-catalog-git-1446.727628e/_output/local/go/src/github.com/kubernetes-incubator/service-catalog/pkg/controller/controller.go:239
/builddir/build/BUILD/atomic-enterprise-service-catalog-git-1446.727628e/_output/local/go/src/github.com/kubernetes-incubator/service-catalog/pkg/controller/controller.go:357
/builddir/build/BUILD/atomic-enterprise-service-catalog-git-1446.727628e/_output/local/go/src/github.com/kubernetes-incubator/service-catalog/pkg/controller/controller.go:374
/builddir/build/BUILD/atomic-enterprise-service-catalog-git-1446.727628e/_output/local/go/src/github.com/kubernetes-incubator/service-catalog/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133
/builddir/build/BUILD/atomic-enterprise-service-catalog-git-1446.727628e/_output/local/go/src/github.com/kubernetes-incubator/service-catalog/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:134
/builddir/build/BUILD/atomic-enterprise-service-catalog-git-1446.727628e/_output/local/go/src/github.com/kubernetes-incubator/service-catalog/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88
/builddir/build/BUILD/atomic-enterprise-service-catalog-git-1446.727628e/_output/local/go/src/github.com/kubernetes-incubator/service-catalog/pkg/controller/controller.go:283
/usr/lib/golang/src/runtime/asm_amd64.s:2337
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
	panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x38 pc=0x158ccb0]

goroutine 200 [running]:
github.com/kubernetes-incubator/service-catalog/vendor/k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
	/builddir/build/BUILD/atomic-enterprise-service-catalog-git-1446.727628e/_output/local/go/src/github.com/kubernetes-incubator/service-catalog/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:58 +0x111
panic(0x18cbdc0, 0x357c7f0)
	/usr/lib/golang/src/runtime/panic.go:491 +0x283
github.com/kubernetes-incubator/service-catalog/pkg/controller.(*controller).processProvisionFailure(0xc420704380, 0xc4202be480, 0x0, 0xc420a7e5a0, 0x1b8b801, 0x41, 0xc420a7e5a0)
	/builddir/build/BUILD/atomic-enterprise-service-catalog-git-1446.727628e/_output/local/go/src/github.com/kubernetes-incubator/service-catalog/pkg/controller/controller_instance.go:1712 +0x40
github.com/kubernetes-incubator/service-catalog/pkg/controller.(*controller).processTerminalProvisionFailure(0xc420704380, 0xc4202be480, 0x0, 0xc420a7e5a0, 0x1b8b801, 0x41, 0xc420a7e5a0)
	/builddir/build/BUILD/atomic-enterprise-service-catalog-git-1446.727628e/_output/local/go/src/github.com/kubernetes-incubator/service-catalog/pkg/controller/controller_instance.go:1699 +0x5b
github.com/kubernetes-incubator/service-catalog/pkg/controller.(*controller).processServiceInstancePollingFailureRetryTimeout(0xc420704380, 0xc4202be480, 0x0, 0x1b2e1aa, 0x7)
	/builddir/build/BUILD/atomic-enterprise-service-catalog-git-1446.727628e/_output/local/go/src/github.com/kubernetes-incubator/service-catalog/pkg/controller/controller_instance.go:856 +0x25b
github.com/kubernetes-incubator/service-catalog/pkg/controller.(*controller).pollServiceInstance(0xc420704380, 0xc4206f3080, 0x1b2af7a, 0x4)
	/builddir/build/BUILD/atomic-enterprise-service-catalog-git-1446.727628e/_output/local/go/src/github.com/kubernetes-incubator/service-catalog/pkg/controller/controller_instance.go:713 +0x742
github.com/kubernetes-incubator/service-catalog/pkg/controller.(*controller).reconcileServiceInstance(0xc420704380, 0xc4206f3080, 0x0, 0xc4206f3080)
	/builddir/build/BUILD/atomic-enterprise-service-catalog-git-1446.727628e/_output/local/go/src/github.com/kubernetes-incubator/service-catalog/pkg/controller/controller_instance.go:275 +0x2f5
github.com/kubernetes-incubator/service-catalog/pkg/controller.(*controller).reconcileServiceInstanceKey(0xc420704380, 0xc4203aa5c0, 0x1c, 0xc4207bbc98, 0xc420534300)
	/builddir/build/BUILD/atomic-enterprise-service-catalog-git-1446.727628e/_output/local/go/src/github.com/kubernetes-incubator/service-catalog/pkg/controller/controller_instance.go:241 +0x2fc
github.com/kubernetes-incubator/service-catalog/pkg/controller.(*controller).(github.com/kubernetes-incubator/service-catalog/pkg/controller.reconcileServiceInstanceKey)-fm(0xc4203aa5c0, 0x1c, 0xc42056c700, 0x18220a0)


Version-Release number of selected component (if applicable):
registry.access.redhat.com/openshift3/ose-service-catalog:v3.10.34


How reproducible: Always for customer environment.

Comment 2 Jay Boyd 2018-10-10 13:30:34 UTC
Is this blocking the customer?  Is the Service Catalog controller manager pod constantly in a panic/restart/panic/restart state?  If so, the "bad" instance may need to be deleted.

You indicated it's always reproducible - what are the steps to reproduce?

Looks like this may be addressed by upstream https://github.com/kubernetes-incubator/service-catalog/pull/2259

Comment 3 Jay Boyd 2018-10-10 13:57:25 UTC
This appears to happen only when the reconciliationRetryDuration, which is 7 days, is exceeded. So I imagine someone tried to provision an instance, the broker failed with a retryable error, and we kept retrying (with exponential backoff) for 7 days?

Comment 10 Jay Boyd 2018-11-01 00:10:55 UTC
Correction to comment #8: fixed in 3.11.z in atomic-enterprise-service-catalog-3.11.0-0.30.0

Comment 12 Jian Zhang 2018-11-02 07:01:13 UTC
The version info:
[root@ip-172-18-0-56 ~]# oc exec controller-manager-x8jfr -- service-catalog --version
v3.11.36;Upstream:v0.1.35

The Service Catalog works well. I did not see the crash after running it for a day, and I also recreated it. LGTM, marking this as verified.

[root@ip-172-18-0-56 ~]# oc get pods
NAME                       READY     STATUS    RESTARTS   AGE
apiserver-bkhst            1/1       Running   0          1h
controller-manager-x8jfr   1/1       Running   0          1h

Comment 14 errata-xmlrpc 2018-11-20 03:10:46 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:3537

