Bug 1507595
| Summary: | Plan can't restore to the previous good state or update to another acceptable plan | ||
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Qixuan Wang <qixuan.wang> |
| Component: | Service Broker | Assignee: | Jeff Peeler <jpeeler> |
| Status: | CLOSED ERRATA | QA Contact: | Zihan Tang <zitang> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | ||
| Version: | 3.7.0 | CC: | aos-bugs, chezhang, mstaeble, pmorie, smunilla, wsun, zitang |
| Target Milestone: | --- | ||
| Target Release: | 3.9.0 | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | Bug Fix | |
| Doc Text: |
There were several problems related to updates: spec changes for instances were blocked even if there wasn't an ongoing operation, deleting a service instance that was updated to an invalid service plan would cause a crash, and instances weren't updated properly if a previous update had failed.
|
Story Points: | --- |
| Clone Of: | Environment: | ||
| Last Closed: | 2018-12-13 19:26:48 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
https://github.com/kubernetes-incubator/service-catalog/issues/1487 tracks the issue with updating a ServiceInstance after a failed update.
https://github.com/kubernetes-incubator/service-catalog/issues/1499 tracks controller-manager crashing when deleting a ServiceInstance with a plan name of a non-existent plan.

Upstream PRs:
https://github.com/kubernetes-incubator/service-catalog/pull/1501
https://github.com/kubernetes-incubator/service-catalog/pull/1502

Fixed in origin with: https://github.com/openshift/origin/pull/17166

Tested on OCP (openshift v3.7.0-0.196.0, kubernetes v1.7.6+a08f5eeb62, etcd 3.2.8, brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/ose-service-catalog:v3.7.0-0.196.0.0, brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/ose-ansible-service-broker:v3.7.0-0.196.0.0)

In steps 5 and 6 below the downgrade is not allowed, so the plan is rolled back. That's correct. However, steps 2 and 4 prevent restoring from an invalid plan. Is this expected?

1. [Edit] spec: dev->dev-123
[Describe] Message: The instance references a ClusterServicePlan that does not exist. spec: dev-123, status:dev

2. [Edit] spec: dev-123->dev/prod
# serviceinstances "rh-rhscl-postgresql-apb-mh5s2" was not valid:
# * spec: Forbidden: Another update for this service instance is in progress
[root@host-172-16-120-8 ~]# oc edit serviceinstance rh-rhscl-postgresql-apb-mh5s2
error: serviceinstances "rh-rhscl-postgresql-apb-mh5s2" is invalid
A copy of your changes has been stored to "/tmp/oc-edit-huczk.yaml"
error: Edit cancelled, no valid changes were saved.
[Describe] Message: The instance references a ClusterServicePlan that does not exist. spec: dev-123, status:dev

3. [Edit] spec: prod->prod-456
[Describe] Message: The instance references a ClusterServicePlan that does not exist. spec: prod-456, status:prod

4. [Edit] spec: prod-456->dev/prod
# serviceinstances "rh-rhscl-postgresql-apb-xq2ns" was not valid:
# * spec: Forbidden: Another update for this service instance is in progress
[root@host-172-16-120-8 ~]# oc edit serviceinstance rh-rhscl-postgresql-apb-xq2ns
error: serviceinstances "rh-rhscl-postgresql-apb-xq2ns" is invalid
A copy of your changes has been stored to "/tmp/oc-edit-zrq5w.yaml"
error: Edit cancelled, no valid changes were saved.

Downgrade and rollback
5. [Edit] spec: prod->dev
[Describe] Message: plan update not possible, spec:dev, status:prod

6. [Edit] spec: dev->prod
[Describe] Message: The instance is being updated asynchronously, spec:prod, status:prod

The failure of 2 and 4 is not expected.

This bug was unfortunately not addressed completely. The failures are captured upstream in https://github.com/kubernetes-incubator/service-catalog/issues/1533.

Version-Release number of selected component (if applicable):
openshift v3.9.0-0.19.0
kubernetes v1.9.0-beta1
etcd 3.2.8
ose-ansible-service-broker:v3.9
ose-service-catalog:v3.9

Plan rollback from a bad state (dev-123 -> dev, or prod-456 -> prod) and downgrade (prod -> dev) are now supported. However, I found that a plan can't be updated from a nonexistent one to another valid plan, for example:
1) dev-123 -> prod (x) -> dev (x)
2) prod-456 -> dev (x) -> prod (x)

I'm not finding the previous comment to be true with the latest code:
$ kubectl get serviceinstances -n test-ns -o yaml
apiVersion: v1
items:
- apiVersion: servicecatalog.k8s.io/v1beta1
  kind: ServiceInstance
  metadata:
    creationTimestamp: 2018-01-22T17:26:27Z
    finalizers:
    - kubernetes-incubator/service-catalog
    generation: 1
    name: ups-instance
    namespace: test-ns
    resourceVersion: "816"
    selfLink: /apis/servicecatalog.k8s.io/v1beta1/namespaces/test-ns/serviceinstances/ups-instance
    uid: 60b76eec-ff99-11e7-9b7f-0242ac110005
  spec:
    clusterServiceClassExternalName: user-provided-service
    clusterServicePlanExternalName: invalid-default
    externalID: 2542f01d-751b-45a5-ba5c-5d0986c42f08
    parameters:
      param-1: value-1
      param-2: value-2
    updateRequests: 0
  status:
    asyncOpInProgress: false
    conditions:
    - lastTransitionTime: 2018-01-22T17:26:27Z
      message: 'The instance references a ClusterServicePlan that does not exist.
        References a non-existent ClusterServicePlan (K8S: "" ExternalName: "invalid-default")
        on ClusterServiceClass (K8S: "4f6e6cf6-ffdd-425f-a2c7-3c9258ad2468" ExternalName:
        "user-provided-service") or there is more than one (found: 0)'
      reason: ReferencesNonexistentServicePlan
      status: "False"
      type: Ready
    deprovisionStatus: NotRequired
    orphanMitigationInProgress: false
    reconciledGeneration: 0
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""
Next, edit the instance to use the "default" plan:
$ kubectl get serviceinstances -n test-ns -o yaml
apiVersion: v1
items:
- apiVersion: servicecatalog.k8s.io/v1beta1
  kind: ServiceInstance
  metadata:
    creationTimestamp: 2018-01-22T17:26:27Z
    finalizers:
    - kubernetes-incubator/service-catalog
    generation: 2
    name: ups-instance
    namespace: test-ns
    resourceVersion: "821"
    selfLink: /apis/servicecatalog.k8s.io/v1beta1/namespaces/test-ns/serviceinstances/ups-instance
    uid: 60b76eec-ff99-11e7-9b7f-0242ac110005
  spec:
    clusterServiceClassExternalName: user-provided-service
    clusterServiceClassRef:
      name: 4f6e6cf6-ffdd-425f-a2c7-3c9258ad2468
    clusterServicePlanExternalName: default
    clusterServicePlanRef:
      name: 86064792-7ea2-467b-af93-ac9694d96d52
    externalID: 2542f01d-751b-45a5-ba5c-5d0986c42f08
    parameters:
      param-1: value-1
      param-2: value-2
    updateRequests: 0
  status:
    asyncOpInProgress: false
    conditions:
    - lastTransitionTime: 2018-01-22T17:27:42Z
      message: The instance was provisioned successfully
      reason: ProvisionedSuccessfully
      status: "True"
      type: Ready
    deprovisionStatus: Required
    externalProperties:
      clusterServicePlanExternalID: 86064792-7ea2-467b-af93-ac9694d96d52
      clusterServicePlanExternalName: default
      parameterChecksum: 4fa544b50ca7a33fe5e8bc0780f1f36aa0c2c7098242db27bc8a3e21f4b4ab55
      parameters:
        param-1: value-1
        param-2: value-2
    orphanMitigationInProgress: false
    reconciledGeneration: 2
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""
Will look at confirming with openshift next.
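The two listings above differ mainly in three places: generation versus reconciledGeneration, asyncOpInProgress, and the Ready condition. As a rough illustration only, the check below reads those fields from a dict shaped like the output of `kubectl get serviceinstance -o json`. The helper names `ready_condition` and `is_reconciled` are hypothetical and are not part of service-catalog; the sample values are copied from the listings.

```python
from typing import Any, Dict


def ready_condition(instance: Dict[str, Any]) -> Dict[str, Any]:
    """Return the condition of type Ready, or an empty dict if there is none."""
    for cond in instance.get("status", {}).get("conditions", []):
        if cond.get("type") == "Ready":
            return cond
    return {}


def is_reconciled(instance: Dict[str, Any]) -> bool:
    """True when the latest spec generation was processed and the instance is Ready."""
    status = instance.get("status", {})
    generation = instance.get("metadata", {}).get("generation")
    return (
        status.get("reconciledGeneration") == generation
        and not status.get("asyncOpInProgress", False)
        and ready_condition(instance).get("status") == "True"
    )


# Fields copied from the first listing (plan "invalid-default").
before = {
    "metadata": {"generation": 1},
    "status": {
        "asyncOpInProgress": False,
        "reconciledGeneration": 0,
        "conditions": [
            {"type": "Ready", "status": "False",
             "reason": "ReferencesNonexistentServicePlan"},
        ],
    },
}

# Fields copied from the second listing (plan "default").
after = {
    "metadata": {"generation": 2},
    "status": {
        "asyncOpInProgress": False,
        "reconciledGeneration": 2,
        "conditions": [
            {"type": "Ready", "status": "True",
             "reason": "ProvisionedSuccessfully"},
        ],
    },
}

if __name__ == "__main__":
    print(is_reconciled(before), is_reconciled(after))
```

A check like this only summarizes what the status already reports; it does not explain why an update was rejected.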
Verified using the latest downstream image.
openshift v3.9.0-0.41.0
kubernetes v1.9.1+a0ce1bc657
ASB: 1.1.9
brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/ose-ansible-service-broker:v3.9
Service-catalog: 0.1.3
brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/ose-service-catalog:v3.9
Updating an instance through an invalid plan and then on to a different valid plan now succeeds:
dev -> dev123 -> prod
prod -> prod123 -> dev
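The behaviour verified here can be pictured with a small model. This is a sketch under an assumed rule, not the service-catalog validation code: an edit to the spec is rejected only while an operation is actually running, and a plan name that cannot be resolved leaves no operation running. The function names `admit_spec_change` and `walk_plans` are hypothetical; the rejection message is the one quoted earlier in this report.

```python
from typing import Dict, List, Optional, Set, Tuple

# Message text taken from the rejected edits quoted earlier in this report.
IN_PROGRESS_MSG = "Another update for this service instance is in progress"


def admit_spec_change(status: Dict[str, object]) -> Optional[str]:
    """Return a rejection message, or None when the edit may be saved.

    Assumption: only a genuinely running operation blocks an edit.
    """
    if status.get("asyncOpInProgress") or status.get("orphanMitigationInProgress"):
        return IN_PROGRESS_MSG
    return None


def walk_plans(edits: List[str], known_plans: Set[str]) -> List[Tuple[str, bool, bool]]:
    """Apply plan edits in order and report (plan, admitted, resolvable) for each."""
    # In this model a failed plan lookup never starts an operation,
    # so the status stays idle between edits.
    status: Dict[str, object] = {"asyncOpInProgress": False}
    results = []
    for plan in edits:
        admitted = admit_spec_change(status) is None
        resolvable = plan in known_plans
        results.append((plan, admitted, resolvable))
    return results


if __name__ == "__main__":
    for row in walk_plans(["dev123", "prod"], {"dev", "prod"}):
        print(row)
```

Under this model the edit to the invalid plan is saved but cannot be resolved, and the following edit to a valid plan is still admitted, which matches the sequences listed above.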
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2018:3748
Description of problem:
The following negative tests of plan updates can crash controller-manager and leave the ServiceInstance stuck on the bad plan.
1) Update the plan to a non-existent one, then roll back to the previous good state (e.g. dev->invalid->dev)
2) Update the plan to a non-existent one, then update to another acceptable plan (e.g. dev->invalid->prod)
3) Downgrade the plan (e.g. prod->dev->prod)

Version-Release number of selected component (if applicable):
openshift v3.7.0-0.184.0
kubernetes v1.7.6+a08f5eeb62
etcd 3.2.8
ose-service-catalog v3.7.0-0.185.0.0
ose-ansible-service-broker v3.7.0-0.185.0.0

How reproducible:
Always

Steps to Reproduce:
1. Provision a PostgreSQL APB in the web UI and choose the development plan.
2. The ClusterServiceClass has Plan Updatable: true by default.
3. Edit the ServiceInstance, update the plan from dev to an invalid one, and check the ServiceInstance status and Broker log.
4. Restore the invalid plan to dev, and check the ServiceInstance status and Broker log.
5. Deprovision the ServiceInstance.

Actual results:
3.
Plan: dev -> dev-abc
[root@qe-chezhang-1030master-etcd-1 ~]# oc edit serviceinstance dh-rhscl-postgresql-apb-b7gbq
serviceinstance "dh-rhscl-postgresql-apb-b7gbq" edited

[root@qe-chezhang-1030master-etcd-1 ~]# oc describe serviceinstance dh-rhscl-postgresql-apb-b7gbq
Name: dh-rhscl-postgresql-apb-b7gbq
Namespace: qwang4
Labels: <none>
Annotations: <none>
API Version: servicecatalog.k8s.io/v1beta1
Kind: ServiceInstance
Metadata:
  Creation Timestamp: 2017-10-30T14:38:22Z
  Finalizers:
    kubernetes-incubator/service-catalog
  Generate Name: dh-rhscl-postgresql-apb-
  Generation: 2
  Resource Version: 92565
  Self Link: /apis/servicecatalog.k8s.io/v1beta1/namespaces/qwang4/serviceinstances/dh-rhscl-postgresql-apb-b7gbq
  UID: fa91d52c-bd7f-11e7-bc55-0a580a800004
Spec:
  Cluster Service Class External Name: dh-rhscl-postgresql-apb
  Cluster Service Class Ref:
    Name: 27793015fe45db2fbc1deb7372cc4036
  Cluster Service Plan External Name: dev-abc
  External ID: 6333e58b-33fc-4e4c-9670-85208a0c58b4
  Parameters From:
    Secret Key Ref:
      Key: parameters
      Name: dh-rhscl-postgresql-apb-parameterszx24a
  Update Requests: 0
  User Info:
    Groups:
      system:cluster-admins
      system:authenticated
    UID:
    Username: system:admin
Status:
  Async Op In Progress: false
  Conditions:
    Last Transition Time: 2017-10-30T14:40:39Z
    Message: The instance references a ClusterServicePlan that does not exist. References a non-existent ClusterServicePlan (K8S: "" ExternalName: "dev-abc") on ClusterServiceClass (K8S: "27793015fe45db2fbc1deb7372cc4036" ExternalName: "dh-rhscl-postgresql-apb") or there is more than one (found: 0)
    Reason: ReferencesNonexistentServicePlan
    Status: False
    Type: Ready
  External Properties:
    Cluster Service Plan External Name: dev
    Parameter Checksum: f511137c0021f5169de49e662f0ec2830219a26e50968a57f2faa280408dfaa7
    Parameters:
      Postgresql _ Database: <redacted>
      Postgresql _ User: <redacted>
      Postgresql _ Version: <redacted>
    User Info:
      Extra:
        Scopes . Authorization . Openshift . Io:
          user:full
      Groups:
        system:authenticated:oauth
        system:authenticated
      UID:
      Username: qwang
  Orphan Mitigation In Progress: false
  Reconciled Generation: 1
Events:
  FirstSeen LastSeen Count From SubObjectPath Type Reason Message
  --------- -------- ----- ---- ------------- -------- ------ -------
  2m 2m 1 service-catalog-controller-manager Warning ErrorWithParameters Failed to prepare ServiceInstance parameters nil: secrets "dh-rhscl-postgresql-apb-parameterszx24a" not found
  2m 2m 1 service-catalog-controller-manager Normal Provisioning The instance is being provisioned asynchronously
  2m 2m 1 service-catalog-controller-manager Normal ProvisionedSuccessfully The instance was provisioned successfully
  37s 18s 13 service-catalog-controller-manager Warning ReferencesNonexistentServicePlan References a non-existent ClusterServicePlan (K8S: "" ExternalName: "dev-abc") on ClusterServiceClass (K8S: "27793015fe45db2fbc1deb7372cc4036" ExternalName: "dh-rhscl-postgresql-apb") or there is more than one (found: 0)

4. Plan: dev-abc -> dev
Forbidden to update, but the spec still be updated.
[root@qe-chezhang-1030master-etcd-1 ~]# oc edit serviceinstance dh-rhscl-postgresql-apb-b7gbq
error: serviceinstances "dh-rhscl-postgresql-apb-b7gbq" is invalid
A copy of your changes has been stored to "/tmp/oc-edit-5yy6g.yaml"
error: Edit cancelled, no valid changes were saved.
[root@qe-chezhang-1030master-etcd-1 ~]# oc describe serviceinstance dh-rhscl-postgresql-apb-b7gbq
Name: dh-rhscl-postgresql-apb-b7gbq
Namespace: qwang4
Labels: <none>
Annotations: <none>
API Version: servicecatalog.k8s.io/v1beta1
Kind: ServiceInstance
Metadata:
  Creation Timestamp: 2017-10-30T14:38:22Z
  Finalizers:
    kubernetes-incubator/service-catalog
  Generate Name: dh-rhscl-postgresql-apb-
  Generation: 2
  Resource Version: 92565
  Self Link: /apis/servicecatalog.k8s.io/v1beta1/namespaces/qwang4/serviceinstances/dh-rhscl-postgresql-apb-b7gbq
  UID: fa91d52c-bd7f-11e7-bc55-0a580a800004
Spec:
  Cluster Service Class External Name: dh-rhscl-postgresql-apb
  Cluster Service Class Ref:
    Name: 27793015fe45db2fbc1deb7372cc4036
  Cluster Service Plan External Name: dev-abc
  External ID: 6333e58b-33fc-4e4c-9670-85208a0c58b4
  Parameters From:
    Secret Key Ref:
      Key: parameters
      Name: dh-rhscl-postgresql-apb-parameterszx24a
  Update Requests: 0
  User Info:
    Groups:
      system:cluster-admins
      system:authenticated
    UID:
    Username: system:admin
Status:
  Async Op In Progress: false
  Conditions:
    Last Transition Time: 2017-10-30T14:40:39Z
    Message: The instance references a ClusterServicePlan that does not exist. References a non-existent ClusterServicePlan (K8S: "" ExternalName: "dev-abc") on ClusterServiceClass (K8S: "27793015fe45db2fbc1deb7372cc4036" ExternalName: "dh-rhscl-postgresql-apb") or there is more than one (found: 0)
    Reason: ReferencesNonexistentServicePlan
    Status: False
    Type: Ready
  External Properties:
    Cluster Service Plan External Name: dev
    Parameter Checksum: f511137c0021f5169de49e662f0ec2830219a26e50968a57f2faa280408dfaa7
    Parameters:
      Postgresql _ Database: <redacted>
      Postgresql _ User: <redacted>
      Postgresql _ Version: <redacted>
    User Info:
      Extra:
        Scopes . Authorization . Openshift . Io:
          user:full
      Groups:
        system:authenticated:oauth
        system:authenticated
      UID:
      Username: qwang
  Orphan Mitigation In Progress: false
  Reconciled Generation: 1
Events:
  FirstSeen LastSeen Count From SubObjectPath Type Reason Message
  --------- -------- ----- ---- ------------- -------- ------ -------
  6m 6m 1 service-catalog-controller-manager Warning ErrorWithParameters Failed to prepare ServiceInstance parameters nil: secrets "dh-rhscl-postgresql-apb-parameterszx24a" not found
  6m 6m 1 service-catalog-controller-manager Normal Provisioning The instance is being provisioned asynchronously
  6m 6m 1 service-catalog-controller-manager Normal ProvisionedSuccessfully The instance was provisioned successfully
  4m 1m 16 service-catalog-controller-manager Warning ReferencesNonexistentServicePlan References a non-existent ClusterServicePlan (K8S: "" ExternalName: "dev-abc") on ClusterServiceClass (K8S: "27793015fe45db2fbc1deb7372cc4036" ExternalName: "dh-rhscl-postgresql-apb") or there is more than one (found: 0)

5. The project of the ServiceInstance hangs in Terminating.
Controller-manager panics like this:
E1030 15:52:42.323667 1 runtime.go:66] Observed a panic: "invalid memory address or nil pointer dereference" (runtime error: invalid memory address or nil pointer dereference)
/builddir/build/BUILD/atomic-openshift-git-0.e975556/cmd/service-catalog/go/src/github.com/kubernetes-incubator/service-catalog/_output/local/go/src/github.com/kubernetes-incubator/service-catalog/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:72
/builddir/build/BUILD/atomic-openshift-git-0.e975556/cmd/service-catalog/go/src/github.com/kubernetes-incubator/service-catalog/_output/local/go/src/github.com/kubernetes-incubator/service-catalog/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:65
/builddir/build/BUILD/atomic-openshift-git-0.e975556/cmd/service-catalog/go/src/github.com/kubernetes-incubator/service-catalog/_output/local/go/src/github.com/kubernetes-incubator/service-catalog/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:51
/usr/lib/golang/src/runtime/asm_amd64.s:514
/usr/lib/golang/src/runtime/panic.go:489
/usr/lib/golang/src/runtime/panic.go:63
/usr/lib/golang/src/runtime/signal_unix.go:290
/builddir/build/BUILD/atomic-openshift-git-0.e975556/cmd/service-catalog/go/src/github.com/kubernetes-incubator/service-catalog/_output/local/go/src/github.com/kubernetes-incubator/service-catalog/pkg/controller/controller.go:269

Expected results:
3. It would be better to guide users toward a correct update (knowing the correct plan doesn't mean the update is done correctly, so I think limiting the available operations/options is better).
4. A bad plan can be updated to a good one.
5. Controller-manager must not crash.

Additional info:
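The nil pointer dereference during deprovisioning is consistent with a plan lookup that returns nothing for the non-existent plan name, followed by use of the result without a check. The real controller is written in Go; the Python analogue below only illustrates the guard, and every name in it (`resolve_plan`, `plan_id_for_deprovision`, the sample plan IDs) is hypothetical. Falling back to the last plan recorded under External Properties is one plausible choice, assumed here for illustration rather than taken from the actual fix.

```python
from typing import Dict, Optional


def resolve_plan(plans: Dict[str, str], name: Optional[str]) -> Optional[str]:
    """Look up a plan ID by external name; None when the plan does not exist."""
    if name is None:
        return None
    return plans.get(name)


def plan_id_for_deprovision(
    plans: Dict[str, str],
    spec_plan_name: str,
    last_good_plan_name: Optional[str],
) -> Optional[str]:
    """Pick a plan ID for a deprovision request without assuming the lookup succeeds.

    Assumption: when the plan named in the spec cannot be resolved, fall back
    to the plan recorded from the last successful operation.
    """
    plan_id = resolve_plan(plans, spec_plan_name)
    if plan_id is None:
        # The spec names a plan that does not exist, so try the last good plan.
        plan_id = resolve_plan(plans, last_good_plan_name)
    return plan_id


if __name__ == "__main__":
    # Hypothetical plan catalog: external name -> plan ID.
    catalog = {"dev": "id-dev", "prod": "id-prod"}
    print(plan_id_for_deprovision(catalog, "dev-abc", "dev"))
```

The caller still has to handle a None result, for example when the instance was never provisioned with a valid plan.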