Bug 1504927
Summary: | ASB Failed provision marked successful even on pod error | ||
---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | David Zager <dzager> |
Component: | Service Broker | Assignee: | Jesus M. Rodriguez <jesusr> |
Status: | CLOSED ERRATA | QA Contact: | Zhang Cheng <chezhang> |
Severity: | unspecified | Docs Contact: | |
Priority: | unspecified | ||
Version: | 3.7.0 | CC: | aos-bugs, jesusr, shurley, wmeng |
Target Milestone: | --- | ||
Target Release: | 3.7.0 | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: |
Cause:
Ignoring errors from the provisioning of APBs.
Consequence:
Failed provisions are marked as successful.
Fix:
We propagate the error to the appropriate level in the broker, which will send the error message to the service-catalog.
Result:
Failed provisions will now be marked as failed.
|
Story Points: | --- |
Clone Of: | Environment: | ||
Last Closed: | 2017-11-28 22:18:47 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description
David Zager
2017-10-20 19:09:54 UTC
Seems we are allowing an error to be marked as done: https://github.com/openshift/ansible-service-broker/blob/master/pkg/apb/ext_creds.go#L70-L72 } else if podCompleted && err.Error() == failedToExec.Error() { log.Notice("[%s] APB completed", podname) return nil, nil So we need to decide what's really a "failed" provision or not. So the ext_creds.go tries to exec into the failed pod: $ oc exec apb-7bcfd4a5-8e15-4883-8cb7-39d2511b52 b6 broker-bind-creds --namespace=dh-fail-apb-prov-gm9w1 error: cannot exec into a container in a completed pod; current phase is Failed And looking at ext_creds, we will mark the podCompleted podCompleted := strings.Contains(string(output), "current phase is Succeeded") || strings.Contains(string(output), "cannot exec into a container in a completed pod") Notice when I ran oc exec I get "completed pod". So this combined with the if condition from comment #1 we will end up marking it successful. I believe the problem is that we get this same condition when we are waiting for the pod to come up initially while it is still provisioning. So we'll have to balance that with an actual failed provision. [2017-10-21T04:46:47.425Z] [INFO] Successfully created apb sandbox: [ apb-7bcfd4a5-8e15-4883-8cb7-39d2511b52b6 ], with edit permissions in namespace dh-fail-apb-prov-gm9w1 [2017-10-21T04:46:47.426Z] [NOTICE] Creating pod "apb-7bcfd4a5-8e15-4883-8cb7-39d2511b52b6" in the dh-fail-apb-prov-gm9w1 namespace [2017-10-21T04:46:47.485Z] [DEBUG] Calling monitorOutput on apb-7bcfd4a5-8e15-4883-8cb7-39d2511b52b6 [2017-10-21T04:46:47.772Z] [INFO] error: unable to upgrade connection: pod does not exist [2017-10-21T04:46:47.772Z] [WARNING] [apb-7bcfd4a5-8e15-4883-8cb7-39d2511b52b6] Retry attempt 1: Failed to exec into the container [2017-10-21T04:46:47.772Z] [WARNING] [apb-7bcfd4a5-8e15-4883-8cb7-39d2511b52b6] Retry attempt 1: exec into apb-7bcfd4a5-8e15-4883-8cb7-39d2511b52b6 failed [2017-10-21T04:46:51.258Z] [DEBUG] service_id: 3b18d5937f0cbb99ee6693993a525be7 [2017-10-21T04:46:51.259Z] [DEBUG] plan_id: e969ce59fdde47f0fda9c871a5e2ef0b [2017-10-21T04:46:51.259Z] [DEBUG] operation: 61d15ee4-fe22-4ad9-b044-c325e9123a5d 172.17.0.3 - - [21/Oct/2017:04:46:51 +0000] "GET /ansible-service-broker/v2/service_instances/1fb8eff3-e365-46cc-af40-9891df0811f9/last_operation?operation=61d15ee4-fe22-4ad9-b044-c325e9123a 5d&plan_id=e969ce59fdde47f0fda9c871a5e2ef0b&service_id=3b18d5937f0cbb99ee6693993a525be7 HTTP/1.1" 200 29 [2017-10-21T04:46:52.997Z] [NOTICE] [apb-7bcfd4a5-8e15-4883-8cb7-39d2511b52b6] APB completed [2017-10-21T04:46:52.998Z] [INFO] Destroying APB sandbox... Fixed by PR 534: https://github.com/openshift/ansible-service-broker/pull/534 *** Bug 1470851 has been marked as a duplicate of this bug. *** PR was not merged in latest image v3.7.0-0.191.0.0, I will retest while PR is merged. Verified and LGTM with svc-catalog & asb image v3.7.0-0.194.0.0 Logs in asb: [2017-11-05T15:42:55.213Z] [NOTICE] [apb-5fce1aec-0625-4e34-bca9-1bd968361774] APB failed [2017-11-05T15:42:55.213Z] [INFO] Destroying APB sandbox... [2017-11-05T15:42:55.584Z] [ERROR] apb::provision error occurred [2017-11-05T15:42:55.584Z] [ERROR] APB failed [2017-11-05T15:42:55.584Z] [ERROR] broker::Provision error occurred. [2017-11-05T15:42:55.584Z] [ERROR] APB failed [2017-11-05T15:42:55.584Z] [ERROR] Provision job reporting error: APB failed Logs in controller-manager: I1105 15:41:34.708240 1 controller.go:326] ServiceInstance "fail-prov/rh-fail-apb-v8vdd": Creating client for ClusterServiceBroker ansible-service-broker, URL: https://asb.openshift-ansible-service-broker.svc:1338/ansible-service-broker I1105 15:41:34.708470 1 controller_instance.go:1174] ServiceInstance "fail-prov/rh-fail-apb-v8vdd": Polling last operation I1105 15:41:34.725917 1 controller_instance.go:1303] ServiceInstance "fail-prov/rh-fail-apb-v8vdd": Poll returned "in progress" : Response description: <nil> I1105 15:41:34.725951 1 controller_instance.go:1406] ServiceInstance "fail-prov/rh-fail-apb-v8vdd": Last operation not completed (still in progress) Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2017:3188 |