Bug 1503829
| Summary: | Cannot force delete ServiceInstances when deprovision fails | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | David Zager <dzager> |
| Component: | Service Broker | Assignee: | Paul Morie <pmorie> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Jian Zhang <jiazha> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 3.7.0 | CC: | admin, aos-bugs, chezhang, dzager, jaboyd, jdesousa, jiazha, mstaeble, pmorie, qixuan.wang, vlaad, wmeng |
| Target Milestone: | --- | | |
| Target Release: | 3.9.0 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | No Doc Update |
| Doc Text: | Closed as working as intended; we published a document about dealing with stuck resources: https://access.redhat.com/articles/3441161 | | |
| Story Points: | --- | | |
| Clone Of: | | Environment: | |
| Last Closed: | 2018-08-29 21:25:14 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description
David Zager
2017-10-18 20:49:41 UTC
This is definitely a bug. I created https://github.com/kubernetes-incubator/service-catalog/issues/1437 for this. The fix will be delivered into origin in https://github.com/openshift/origin/pull/17075.

Can users deprovision a serviceinstance in the web UI? I just found the "delete" button, and the asb log does not show a "DEPROVISION" process when I click it. Thanks for your info. I'm going to follow the steps you provided.

I can answer question #2 on @mstaeble's behalf as best I can. While nothing can be guaranteed about a ServiceInstance that ends up in the Unknown or Error state, what you are seeing with Pods and RCs remaining is related to https://bugzilla.redhat.com/show_bug.cgi?id=1508969. Presumably, with the fake command in the deprovision playbook, the broker is responding to the deprovision request with a 500 Internal Server Error status. The service catalog does not know at that point what the state of the instance is in the broker, hence the Ready/Unknown condition. The service catalog will continue to make deprovision requests for the ServiceInstance to the broker until either (1) a response is received that tells the service catalog definitively what the state of the instance is in the broker or (2) the reconciliation retry duration elapses.

@David @Matthew Thanks for the clarification! Question #2 of this bug will depend on https://bugzilla.redhat.com/show_bug.cgi?id=1508969.

This looks to me to be working as expected. When you delete the ServiceInstance, the service catalog attempts to deprovision the resource on the broker. The deprovision request is failing. The service catalog will retry the deprovision request until the reconciliation retry duration elapses. The service catalog will not remove the finalizer for the ServiceInstance unless the deprovision request completes successfully. This is done so that the user has an opportunity to see that the ServiceInstance was not deprovisioned and coordinate with the broker directly to delete whatever resources need to be deleted.
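For example, to observe this blocked state on a stuck instance, you can inspect its finalizers and status conditions; a minimal sketch, where the instance and namespace names are placeholders rather than values from this bug:

oc get serviceinstance my-instance -n my-project \
  -o jsonpath='{.metadata.finalizers}'
oc get serviceinstance my-instance -n my-project \
  -o jsonpath='{.status.conditions}'

The first command prints the kubernetes-incubator/service-catalog finalizer while it is still set; the second shows the Ready/Failed conditions discussed above.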
In Step 3, you say
> the serviceinstance was kept as expected, but I could not force-delete it
What commands were run to force-delete the ServiceInstance? And what was the output from the attempted force-delete?
@Matthew Yeah... I see, but as https://github.com/kubernetes-incubator/service-catalog/issues/1437 describes:
> Instead, we should leave the instance in a failed state until the user force-deletes it
I think "force-delete" here just means the "oc delete xxx" commands used by a user; the "force-delete" in step 3 means this too. Sorry for not being clear. So, for the above example, if I want to delete the serviceinstance, how do I do that? In my opinion, as a user or administrator, if I want to delete a resource created by provisioning, the resource should be deleted. A force-delete is an "oc delete" with the parameters "--grace-period=0" and "--force".
For example,
oc delete serviceinstance dh-rhscl-mysql-apb-dbwdr \
-n instance6 \
--grace-period=0 \
--force
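Note that even with these flags, the finalizer keeps the object in storage: the API server sets a deletionTimestamp, but the instance is not actually removed. A minimal check, reusing the names from the example above:

oc get serviceinstance dh-rhscl-mysql-apb-dbwdr \
  -n instance6 \
  -o jsonpath='{.metadata.deletionTimestamp}'

If this prints a timestamp and the instance still appears in "oc get serviceinstance", the deletion is blocked on the finalizer rather than merely slow.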
I think the failed serviceinstance should be deletable manually. The current test result does not look good to us. I'm changing the status to ASSIGNED; you can move it back if we are mistaken. Thanks.

There is an issue with force deleting ServiceInstances and ServiceBindings that is being tracked upstream with https://github.com/kubernetes-incubator/service-catalog/issues/1551. However, the basic intention of this bug is that a ServiceInstance that cannot be deprovisioned successfully should not be removed from storage until it is force deleted. That basic intention is working as expected.

Updated the title of the bug to better track the current issue.

Furthermore, I hit a similar issue, and I am not sure whether it is covered by https://github.com/kubernetes-incubator/service-catalog/issues/1551:
1. Create a serviceinstance
2. Edit the clusterservicebroker URL to an invalid value
3. Delete the serviceinstance
4. Force delete the serviceinstance
Actual result: Cannot force delete the serviceinstance
Expected result: The force delete should succeed.

Is this still a bug, given the clarification made in https://bugzilla.redhat.com/show_bug.cgi?id=1541350#c5 ?

Changed status to ON_QA per Paul's clarification. Per his description, we need to remove the "finalizers" first before deleting a failed serviceinstance.
1. A failed serviceinstance:
[root@host-172-16-120-7 ~]# oc get serviceinstance -n jian
NAME AGE
dh-hello-test-apb-tn8nk 14m
dh-hello-test-apb-xpkls 12m
[root@host-172-16-120-7 ~]# oc describe serviceinstance dh-hello-test-apb-tn8nk -n jian
Name: dh-hello-test-apb-tn8nk
Namespace: jian
Labels: <none>
Annotations: <none>
API Version: servicecatalog.k8s.io/v1beta1
Kind: ServiceInstance
Metadata:
Creation Timestamp: 2018-02-27T07:00:10Z
Finalizers:
kubernetes-incubator/service-catalog
Generate Name: dh-hello-test-apb-
Generation: 1
Resource Version: 43725
Self Link: /apis/servicecatalog.k8s.io/v1beta1/namespaces/jian/serviceinstances/dh-hello-test-apb-tn8nk
UID: d9f0ed18-1b8b-11e8-8928-0a580a800004
Spec:
Cluster Service Class External Name: dh-hello-test-apb
Cluster Service Class Ref:
Name: 0a8f417c71d090c39dc2ba73f538c148
Cluster Service Plan External Name: faildeprovision
Cluster Service Plan Ref:
Name: befb688fbb048a00128951e8b68913b4
External ID: ba841d04-713c-4b49-a758-f425de1ec28b
Update Requests: 0
User Info:
Extra:
Scopes . Authorization . Openshift . Io:
user:full
Groups:
system:authenticated:oauth
system:authenticated
UID:
Username: jiazha
Status:
Async Op In Progress: false
Conditions:
Last Transition Time: 2018-02-27T07:00:11Z
Message: Provision call failed: Error occurred during provision. Please contact administrator if it persists.
Reason: ProvisionCallFailed
Status: False
Type: Ready
Last Transition Time: 2018-02-27T07:01:13Z
Message: Provision call failed: Error occurred during provision. Please contact administrator if it persists.
Reason: ProvisionCallFailed
Status: True
Type: Failed
Deprovision Status: Required
Orphan Mitigation In Progress: false
Reconciled Generation: 1
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Provisioning 14m service-catalog-controller-manager The instance is being provisioned asynchronously
Warning ProvisionCallFailed 13m (x2 over 13m) service-catalog-controller-manager Provision call failed: Error occurred during provision. Please contact administrator if it persists.
2. Delete the "finalizers" by removing the content below:
finalizers:
- kubernetes-incubator/service-catalog
[root@host-172-16-120-7 ~]# oc edit serviceinstance dh-hello-test-apb-tn8nk -n jian
serviceinstance "dh-hello-test-apb-tn8nk" edited
3. Delete the serviceinstance:
[root@host-172-16-120-7 ~]# oc delete serviceinstance dh-hello-test-apb-tn8nk -n jian
serviceinstance "dh-hello-test-apb-tn8nk" deleted
4. Check whether it still exists:
[root@host-172-16-120-7 ~]# oc get serviceinstance -n jian
No resources found.
So, LGTM. Changed the status to "VERIFIED".
Furthermore, we have a doc bug to track this clarification: https://bugzilla.redhat.com/show_bug.cgi?id=1548618
Assigning back to Paul as I actually had nothing to do with this issue. |