Bug 1555245

Summary: service-catalog crashes with a segmentation fault when a serviceinstance is updated to an invalid plan and the project is then deleted
Product: OpenShift Container Platform Reporter: Zihan Tang <zitang>
Component: Service Catalog Assignee: Jay Boyd <jaboyd>
Status: CLOSED ERRATA QA Contact: Zihan Tang <zitang>
Severity: medium Docs Contact:
Priority: medium    
Version: 3.9.0 CC: chezhang, jaboyd, jiazha, zhsun, zitang
Target Milestone: ---   
Target Release: 3.10.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Cause: The plan ID was read from the Spec, which is incorrect; during update and delete it should always come from the InProgressProperties.
Consequence: If the plan was invalid, a segmentation fault could result.
Fix: The plan is now read from the InProgressProperties.
Result:
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-07-30 19:10:40 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Zihan Tang 2018-03-14 09:29:59 UTC
Description of problem:
After updating a serviceinstance to an invalid plan and then deleting the project, service-catalog crashes with the error:
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
	panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x150e58d]


Version-Release number of selected component (if applicable):
service-catalog: v0.1.9
asb: v1.1.16

How reproducible:
Always

Steps to Reproduce:
1. Provision postgresql with the dev plan in namespace 'post-1'.
2. Update the serviceinstance plan from 'dev' to 'dev-123'. Describing the serviceinstance then shows the expected message:
"The instance references a ClusterServicePlan that does not exist. References a non-existent ClusterServicePlan"
3. Delete the project 'post-1':
[root@host-172-16-120-9 ~]# oc delete project post-1


Actual results:
service-catalog crashed at step 3:
[root@host-172-16-120-130 ~]# oc get pod 
NAME                       READY     STATUS             RESTARTS   AGE
apiserver-q7bzj            1/1       Running            1          1d
controller-manager-h27sh   0/1       CrashLoopBackOff   27         1h

[root@host-172-16-120-130 ~]# oc logs -f controller-manager-h27sh
.......
/usr/lib/golang/src/runtime/asm_amd64.s:2337
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
	panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x150e58d]

goroutine 113 [running]:
github.com/kubernetes-incubator/service-catalog/vendor/k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
	/builddir/build/BUILD/atomic-openshift-git-0.e1a30c3/cmd/service-catalog/go/src/github.com/kubernetes-incubator/service-catalog/_output/local/go/src/github.com/kubernetes-incubator/service-catalog/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:58 +0x111
panic(0x17ca640, 0x31ad9b0)
	/usr/lib/golang/src/runtime/panic.go:491 +0x283
github.com/kubernetes-incubator/service-catalog/pkg/controller.(*controller).prepareServiceInstanceLastOperationRequest(0xc4203fb9e0, 0xc4204ce240, 0xc42033e4b0, 0x0, 0xc42026ea60, 0x16, 0x30375a0)
	/builddir/build/BUILD/atomic-openshift-git-0.e1a30c3/cmd/service-catalog/go/src/github.com/kubernetes-incubator/service-catalog/_output/local/go/src/github.com/kubernetes-incubator/service-catalog/pkg/controller/controller_instance.go:1382 +0xdd
github.com/kubernetes-incubator/service-catalog/pkg/controller.(*controller).pollServiceInstance(0xc4203fb9e0, 0xc4204ce240, 0x1a0af5b, 0x4)
	/builddir/build/BUILD/atomic-openshift-git-0.e1a30c3/cmd/service-catalog/go/src/github.com/kubernetes-incubator/service-catalog/_output/local/go/src/github.com/kubernetes-incubator/service-catalog/pkg/controller/controller_instance.go:569 +0x2b5
github.com/kubernetes-incubator/service-catalog/pkg/controller.(*controller).reconcileServiceInstance(0xc4203fb9e0, 0xc42087c400, 0x0, 0xc42087c400)
	/builddir/build/BUILD/atomic-openshift-git-0.e1a30c3/cmd/service-catalog/go/src/github.com/kubernetes-incubator/service-catalog/_output/local/go/src/github.com/kubernetes-incubator/service-catalog/pkg/controller/controller_instance.go:240 +0x279
github.com/kubernetes-incubator/service-catalog/pkg/controller.(*controller).reconcileServiceInstanceKey(0xc4203fb9e0, 0xc420018da0, 0x17, 0xc420763c98, 0x0)
	/builddir/build/BUILD/atomic-openshift-git-0.e1a30c3/cmd/service-catalog/go/src/github.com/kubernetes-incubator/service-catalog/_output/local/go/src/github.com/kubernetes-incubator/service-catalog/pkg/controller/controller_instance.go:224 +0x2fc
github.com/kubernetes-incubator/service-catalog/pkg/controller.(*controller).(github.com/kubernetes-incubator/service-catalog/pkg/controller.reconcileServiceInstanceKey)-fm(0xc420018da0, 0x17, 0xc4206e0120, 0x17270c0)
	/builddir/build/BUILD/atomic-openshift-git-0.e1a30c3/cmd/service-catalog/go/src/github.com/kubernetes-incubator/service-catalog/_output/local/go/src/github.com/kubernetes-incubator/service-catalog/pkg/controller/controller.go:179 +0x3e
github.com/kubernetes-incubator/service-catalog/pkg/controller.worker.func1.1(0x3037a20, 0xc4206e0120, 0xc4203d77c0, 0x1, 0xf, 0x1a17402, 0xf, 0x42c000)
	/builddir/build/BUILD/atomic-openshift-git-0.e1a30c3/cmd/service-catalog/go/src/github.com/kubernetes-incubator/service-catalog/_output/local/go/src/github.com/kubernetes-incubator/service-catalog/pkg/controller/controller.go:229 +0xe7
github.com/kubernetes-incubator/service-catalog/pkg/controller.worker.func1()
	/builddir/build/BUILD/atomic-openshift-git-0.e1a30c3/cmd/service-catalog/go/src/github.com/kubernetes-incubator/service-catalog/_output/local/go/src/github.com/kubernetes-incubator/service-catalog/pkg/controller/controller.go:246 +0x8d
github.com/kubernetes-incubator/service-catalog/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1(0xc42067e480)
	/builddir/build/BUILD/atomic-openshift-git-0.e1a30c3/cmd/service-catalog/go/src/github.com/kubernetes-incubator/service-catalog/_output/local/go/src/github.com/kubernetes-incubator/service-catalog/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133 +0x5e
github.com/kubernetes-incubator/service-catalog/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc42067e480, 0x3b9aca00, 0x0, 0xc42067e401, 0xc42049c600)
	/builddir/build/BUILD/atomic-openshift-git-0.e1a30c3/cmd/service-catalog/go/src/github.com/kubernetes-incubator/service-catalog/_output/local/go/src/github.com/kubernetes-incubator/service-catalog/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:134 +0xbd
github.com/kubernetes-incubator/service-catalog/vendor/k8s.io/apimachinery/pkg/util/wait.Until(0xc42067e480, 0x3b9aca00, 0xc42049c600)
	/builddir/build/BUILD/atomic-openshift-git-0.e1a30c3/cmd/service-catalog/go/src/github.com/kubernetes-incubator/service-catalog/_output/local/go/src/github.com/kubernetes-incubator/service-catalog/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88 +0x4d
github.com/kubernetes-incubator/service-catalog/pkg/controller.createWorker.func1(0x3037a20, 0xc4206e0120, 0x1a17402, 0xf, 0xf, 0xc4203fb901, 0xc4203d77c0, 0xc42049c600, 0xc42027f8b0)
	/builddir/build/BUILD/atomic-openshift-git-0.e1a30c3/cmd/service-catalog/go/src/github.com/kubernetes-incubator/service-catalog/_output/local/go/src/github.com/kubernetes-incubator/service-catalog/pkg/controller/controller.go:208 +0x91
created by github.com/kubernetes-incubator/service-catalog/pkg/controller.createWorker
	/builddir/build/BUILD/atomic-openshift-git-0.e1a30c3/cmd/service-catalog/go/src/github.com/kubernetes-incubator/service-catalog/_output/local/go/src/github.com/kubernetes-incubator/service-catalog/pkg/controller/controller.go:207 +0xbe
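
The trace above shows HandleCrash on the stack followed by a second panic, and the `[recovered]` marker is what the Go runtime prints when a panic is recovered and then raised again. That is consistent with a handler that logs the panic and re-raises it, which would explain why the pod ends up in CrashLoopBackOff instead of skipping the bad item. Below is a minimal standard-library sketch of that pattern. The names `item`, `idOf`, `handleCrash`, `worker`, `survives`, and the `reallyCrash` parameter are assumptions made for illustration; this is not the service-catalog or apimachinery code.

```go
package main

import "fmt"

// item is a hypothetical type used only to trigger a nil dereference.
type item struct {
	ID string
}

// idOf dereferences p without a nil check, so a nil p panics.
func idOf(p *item) string {
	return p.ID
}

// handleCrash is a hypothetical crash handler. It logs any panic and,
// when reallyCrash is true, raises it again so the panic keeps propagating.
func handleCrash(reallyCrash bool) {
	if r := recover(); r != nil {
		fmt.Println("observed a panic:", r)
		if reallyCrash {
			panic(r)
		}
	}
}

// worker stands in for one pass of a controller work loop.
func worker(reallyCrash bool) {
	defer handleCrash(reallyCrash)
	fmt.Println(idOf(nil))
}

// survives reports whether worker returned without a panic escaping it.
func survives(reallyCrash bool) (ok bool) {
	defer func() {
		if recover() != nil {
			ok = false
		}
	}()
	worker(reallyCrash)
	return true
}

func main() {
	fmt.Println("swallowing handler survives:", survives(false))
	fmt.Println("re-raising handler survives:", survives(true))
}
```

In a real process nothing plays the role of `survives`, so a re-raised panic terminates the program and the container is restarted.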



Expected results:
The project is deleted successfully and service-catalog keeps working.

Additional info:
After step 2, if I update the serviceinstance plan back to a valid one, the project can be deleted and service-catalog does not crash.
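
The observation that a valid plan avoids the crash fits a lookup that returns nil for an unknown plan and a caller that dereferences the result without checking it. The sketch below illustrates that failure mode and a guarded alternative. `Plan`, `lookupPlan`, `planIDUnsafe`, `planIDSafe`, and the sample IDs are hypothetical names chosen for this example; they are not the real service-catalog types or functions.

```go
package main

import (
	"errors"
	"fmt"
)

// Plan is a hypothetical stand-in for a ClusterServicePlan.
type Plan struct {
	ExternalID string
}

// lookupPlan is a hypothetical lister. It returns nil when no plan with the
// given name exists, as would be the case for the made-up plan "dev-123".
func lookupPlan(plans map[string]*Plan, name string) *Plan {
	return plans[name]
}

// planIDUnsafe dereferences the lookup result without checking it, so a
// missing plan causes a nil pointer dereference panic.
func planIDUnsafe(plans map[string]*Plan, name string) string {
	return lookupPlan(plans, name).ExternalID
}

// planIDSafe returns an error instead of panicking when the plan is missing.
func planIDSafe(plans map[string]*Plan, name string) (string, error) {
	p := lookupPlan(plans, name)
	if p == nil {
		return "", errors.New("plan not found: " + name)
	}
	return p.ExternalID, nil
}

func main() {
	plans := map[string]*Plan{"dev": {ExternalID: "dev-external-id"}}

	id, err := planIDSafe(plans, "dev")
	fmt.Println(id, err)

	_, err = planIDSafe(plans, "dev-123")
	fmt.Println(err)
}
```

Calling `planIDUnsafe(plans, "dev-123")` would panic with the same runtime error class as in the log above, while `planIDSafe` lets the caller report the problem and continue.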

Comment 1 Zhang Cheng 2018-03-14 09:35:37 UTC
We filed this bug for two reasons:
1. service-catalog should not crash, even though this is a negative test case.
2. We understand that a serviceinstance plan update should be limited to the plans the web console allows (such as dev/prod), but a similar bug was fixed previously: https://bugzilla.redhat.com/show_bug.cgi?id=1507595

Comment 2 Zhang Cheng 2018-03-14 09:38:19 UTC
I set the target release to 3.10 for two reasons; please correct me if you think 3.9.0 or 3.9.z is better.
1. A normal user cannot update a serviceinstance to an invalid plan from the web console.
2. The issue is not hit unless the user tries to delete the project, even if the serviceinstance was updated to an invalid plan from the backend.

Comment 3 Jay Boyd 2018-04-11 18:58:16 UTC
There has been a lot of change in the instance & binding reconciliation control loops upstream.  I tried to reproduce this in master and could not reproduce a crash.  When I deleted the namespace the instance was removed.  No crash.  I believe the root issue has been fixed; could you please see if you can reproduce in 3.10?

Comment 4 Zihan Tang 2018-04-12 01:52:39 UTC
I'll try to reproduce when the 3.10 image is ready. If it can't be reproduced, we can mark it as fixed in 3.10.

Comment 5 Zihan Tang 2018-04-13 06:30:50 UTC
Using service catalog v0.1.13, I can still reproduce the bug. The controller-manager crashed with the error:
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
	panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x15320f8]

Steps:
1. Provision an APB with the dev plan.
2. Edit the serviceinstance plan to dev123.
3. Delete the project.

Comment 6 Jay Boyd 2018-04-16 18:28:33 UTC
I reproduced.  The key is to use a Broker that implements an async delete.  I'm reviewing an upstream fix.
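
One way to picture why an async delete matters, going by the stack trace (pollServiceInstance calling prepareServiceInstanceLastOperationRequest): with an async broker the controller has to build a follow-up polling request, and that request needs a plan ID. The sketch below takes the ID from a snapshot recorded when the operation started instead of resolving whatever the spec currently names. All names here (`Properties`, `Instance`, `planIDForPoll`, `errNoOperation`, the field names, and the sample values) are assumptions for illustration, including the assumption that such a snapshot holds the plan the running operation was started with; it is not the upstream patch.

```go
package main

import (
	"errors"
	"fmt"
)

// Properties is a hypothetical snapshot of what an operation was started with.
type Properties struct {
	PlanExternalID string
}

// Instance is a hypothetical, simplified service instance.
type Instance struct {
	// SpecPlanName is what the user currently asks for; it may name a
	// plan that does not exist.
	SpecPlanName string
	// InProgress is nil when no operation is running.
	InProgress *Properties
}

var errNoOperation = errors.New("no operation in progress")

// planIDForPoll returns the plan ID to send when polling an async
// operation. It never consults SpecPlanName, so a later edit to an
// invalid plan cannot affect a request for the running operation.
func planIDForPoll(inst *Instance) (string, error) {
	if inst == nil || inst.InProgress == nil {
		return "", errNoOperation
	}
	return inst.InProgress.PlanExternalID, nil
}

func main() {
	inst := &Instance{
		SpecPlanName: "dev-123", // invalid plan written by a later edit
		InProgress:   &Properties{PlanExternalID: "dev-external-id"},
	}
	id, err := planIDForPoll(inst)
	fmt.Println(id, err)
}
```

A synchronous delete never reaches a polling step like this, which would match the earlier failure to reproduce the crash.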

Comment 7 Jay Boyd 2018-05-17 14:27:13 UTC
Will be fixed upstream by https://github.com/kubernetes-incubator/service-catalog/pull/1941
I'm hoping to get that merged today so we can pick it up in OpenShift.

Comment 8 Jay Boyd 2018-05-18 21:17:20 UTC
This is in v0.1.19 of upstream service catalog and will be picked up with OpenShift builds done AFTER May 18 17:00 US Eastern.  I believe this will be 3.10.0-0.48.0 or newer.

Comment 10 Zihan Tang 2018-05-23 09:26:05 UTC
The image is ready; changing the status to ON_QA.

Comment 11 Zihan Tang 2018-05-23 09:29:05 UTC
Verified
v3.10.0-0.50.0; Upstream: v0.1.19

Comment 13 errata-xmlrpc 2018-07-30 19:10:40 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:1816