Description of problem:
`apb test` is failing to retrieve test results due to a change in the Kubernetes client.

How reproducible:
100%

Steps to Reproduce:
1. Create an apb test (or `git clone git:cfchase/test-apb.git`)
2. cd into the test apb directory
3. Run `apb test`

Actual results:
...<snip>...
Pod phase Failed without returning test results
Unable to retrieve test result.
Deleting project apb-test-mediawiki-apb
Project deleted

Expected results:
Test successfully passed

Additional info:
https://github.com/ansibleplaybookbundle/ansible-playbook-bundle/issues/190
https://github.com/ansibleplaybookbundle/ansible-playbook-bundle/pull/196
It looks like this failed because it was run on hello-world-apb, which does not have any tests (under playbooks/ you would have test.yml). For testing, I created an apb in my own repo https://github.com/cfchase/test-apb. You can use that one or create your own basic tests.

Steps to Reproduce:
1. Configure OpenShift, the broker, the apb tool, etc.
2. git clone git:cfchase/test-apb.git
3. cd test-apb
4. apb test
Verified that the following steps succeed:

1. git clone git:cfchase/test-apb.git
2. Build the image and push it to Docker Hub.
3. Configure the OpenShift ASB, adding the Docker Hub registry:

   - type: dockerhub
     name: zitang
     url: docker.io
     org: zitangbj
     tag:
     white_list: [.*apb$]

4. cd to test-apb
5. Run: apb test --tag docker.io/zitangbj/test-apb

It succeeds:

Successfully built APB image: docker.io/zitangbj/test-apb
Creating project apb-test-test-apb-4u866
Created project
Creating service account in apb-test-test-apb-4u866
Created service account
Creating role binding for apb-test-test-apb-4u866 in apb-test-test-apb-4u866
Created Role Binding
Creating pod with image docker.io/zitangbj/test-apb in apb-test-test-apb-4u866
Created Pod
Test successfully passed
Project deleted

If the apb is not pushed to a registry and you just run `apb test`, it fails. The test pod log:

NAME                           READY   STATUS             RESTARTS   AGE
apb-test-test-apb-dpwb2hggnf   0/1     ImagePullBackOff   0          40m

[root@host-172-16-120-49 ~]# oc logs apb-test-test-apb-dpwb2hggnf
Error from server (BadRequest): container "apb-test-test-apb-dpwb2" in pod "apb-test-test-apb-dpwb2hggnf" is waiting to start: trying and failing to pull image

[root@host-172-16-120-49 ~]# oc describe pod apb-test-test-apb-dpwb2hggnf
...
Events:
  Type     Reason                 Age                  From                     Message
  ----     ------                 ----                 ----                     -------
  Normal   Scheduled              40m                  default-scheduler        Successfully assigned apb-test-test-apb-dpwb2hggnf to 172.16.120.107
  Normal   SuccessfulMountVolume  40m                  kubelet, 172.16.120.107  MountVolume.SetUp succeeded for volume "apb-test-test-apb-dpwb2-token-2wnlj"
  Normal   Pulling                40m                  kubelet, 172.16.120.107  pulling image "test-apb"
  Warning  Failed                 40m                  kubelet, 172.16.120.107  Failed to pull image "test-apb": rpc error: code = Unknown desc = Error: image library/test-apb:latest not found
  Warning  Failed                 40m                  kubelet, 172.16.120.107  Error: ErrImagePull
  Normal   SandboxChanged         39m (x21 over 40m)   kubelet, 172.16.120.107  Pod sandbox changed, it will be killed and re-created.
  Warning  Failed                 5m (x141 over 38m)   kubelet, 172.16.120.107  Error: ImagePullBackOff
  Normal   BackOff                15s (x162 over 38m)  kubelet, 172.16.120.107  Back-off pulling image "test-apb"

I'm not sure whether you meant to run `apb test --tag`, but running plain `apb test` outside the cluster fails.
I'll look into the apb test failure. It sounds like we need it to push to the internal registry as part of the apb test command.
https://github.com/ansibleplaybookbundle/ansible-playbook-bundle/pull/226
apb test should work as expected now without using the --tag argument.
cchase, I used the latest apb image and rpm package, and `apb test` still fails.

Steps:
1. git clone the test-apb repo
2. Log in to the server
3. Run apb test

[root@localhost test-apb]# apb test
Found registry IP at: 172.30.80.74:5000
Finished writing dockerfile.
Building APB using tag: [172.30.80.74:5000/openshift/test-apb]
Successfully built APB image: 172.30.80.74:5000/openshift/test-apb
Error accessing the docker API. Is the daemon running?
Exception occurred! 500 Server Error: Internal Server Error ("Get https://172.30.80.74:5000/v1/users/: dial tcp 172.30.80.74:5000: i/o timeout")

The local docker daemon is running.
It looks like a problem with the push. Are you running APB on one host and pushing to a broker/docker registry on a remote host? I get a similar failure with minishift:

$ apb test
Found registry IP at: 172.30.1.1:5000
Finished writing dockerfile.
Building APB using tag: [172.30.1.1:5000/openshift/test-apb]
Successfully built APB image: 172.30.1.1:5000/openshift/test-apb
Error accessing the docker API. Is the daemon running?
Exception occurred! 500 Server Error: Internal Server Error ("Get http://172.30.1.1:5000/v1/users/: dial tcp 172.30.1.1:5000: getsockopt: no route to host")

If I run:

eval $(minishift docker-env)

which sets:

DOCKER_CERT_PATH=/home/jmontleo/.minishift/certs
DOCKER_HOST=tcp://192.168.42.253:2376
DOCKER_TLS_VERIFY=1

then it works:

$ apb test
Finished writing dockerfile.
Building APB using tag: [172.30.1.1:5000/openshift/test-apb]
Successfully built APB image: 172.30.1.1:5000/openshift/test-apb
Pushing the image, this could take a minute...
Successfully pushed image: 172.30.1.1:5000/openshift/test-apb
Creating project apb-test-test-apb-9qdv2
Created project
Creating service account in apb-test-test-apb-9qdv2
Created service account
Creating role binding for apb-test-test-apb-9qdv2 in apb-test-test-apb-9qdv2
Created Role Binding
Creating pod with image 172.30.1.1:5000/openshift/test-apb in apb-test-test-apb-9qdv2
Created Pod
Test successfully passed
Deleting project apb-test-test-apb-9qdv2
Project deleted

Note: at this time I don't see a way to make this work with the containerized apb tool, because it simultaneously needs to run remotely to access the docker registry and locally to access the apb.yml. I'm trying to figure out a way around this situation, but for now I would expect this to work only with the rpm install.
An alias similar to this will work:

alias apb='docker run --rm --privileged \
  -v $PWD:/mnt -v $HOME/.kube:/.kube \
  -v $HOME/.minishift/certs:/.minishift/certs \
  -e DOCKER_TLS_VERIFY="1" \
  -e DOCKER_HOST="tcp://192.168.42.253:2376" \
  -e DOCKER_CERT_PATH="/.minishift/certs" \
  -e MINISHIFT_REGISTRY="172.30.1.1:5000" \
  -u $UID docker.io/ansibleplaybookbundle/apb-tools:canary'

Erik Nelson also created a script that will fill in a lot of this for you. Run eval $(minishift docker-env) and follow the instructions for using the script: https://github.com/ansibleplaybookbundle/ansible-playbook-bundle/blob/master/docs/apb_cli.md#running-from-a-container

Be aware of https://bugzilla.redhat.com/show_bug.cgi?id=1548543 if using images other than canary at present.
(In reply to Jason Montleon from comment #11)
> It looks like a problem with the push. Are you running APB on one host and
> pushing to a broker/docker registry on a remote host?

Yes, I run apb test outside the cluster. If I run it inside the OpenShift cluster:

openshift v3.9.0-0.50.0
ASB: 1.1.13
apb: 1.1.9-1

[root@host-172-16-120-99 ~]# rpm -qa | grep apb
apb-1.1.9-1.el7.noarch

I get the following output:

[root@host-172-16-120-99 test-apb]# apb test
Found registry IP at: 172.31.240.227:5000
Finished writing dockerfile.
Building APB using tag: [172.31.240.227:5000/openshift/test-apb]
Successfully built APB image: 172.31.240.227:5000/openshift/test-apb
Exception occurred! 'NoneType' object has no attribute 'split'

Checking that the image was built successfully:

[root@host-172-16-120-99 ~]# docker images
REPOSITORY                               TAG      IMAGE ID       CREATED          SIZE
172.31.240.227:5000/openshift/test-apb   latest   6d314e4e6641   46 seconds ago   669 MB
Looks like it's failing on the `apb push` with a not-so-great error message. All failures since comment 9 appear to be related to the specific environment and `apb push`, not directly to the `apb test` command itself; they surface during `apb test` because that command now pushes automatically. To confirm this, can QA run `apb build` and `apb push` before trying `apb test`?
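To isolate which half is failing, the two steps that `apb test` performs implicitly can be run separately (a sketch; assumes the test-apb repo from earlier comments and a configured cluster, so it is environment-dependent):

```shell
cd test-apb

# Build the APB image locally; a failure here points at the Dockerfile/build.
apb build

# Push it to the cluster registry; a failure here points at registry access,
# which is the suspected environment problem, not `apb test` itself.
apb push

# Only once both succeed, run the full test (build + push + test pod).
apb test
```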
Are you testing as system:admin? This account is not going to work because it does not have a token, which apb requires. Nonetheless, we should probably have better error-handling code around authorization that asserts you are not system:admin and throws an explicit error if that is the case. I will submit a PR for this; in the meantime, can you test with a correctly permissioned user? One with cluster-admin will do.
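For reference, a minimal way to set up such a user (the user name `admin` is just an example; `oc adm policy` and `oc login` are the standard OpenShift commands, but this sketch needs a live cluster):

```shell
# Grant cluster-admin to a regular, token-bearing user.
oc adm policy add-cluster-role-to-user cluster-admin admin

# Log in as that user so the kubeconfig carries their token.
oc login -u admin

# apb reads the token from the kubeconfig; system:admin has none.
oc whoami -t   # should print a token, not an error

apb test
```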
I added the cluster role cluster-admin to user admin and ran "apb test" on apb-test. Everything ran as expected.
PR with explicit error: https://github.com/ansibleplaybookbundle/ansible-playbook-bundle/pull/233
(In reply to Erik Nelson from comment #16)
> Are you testing as system:admin?
>
> This account is not going to work because it does not have a token, which
> apb requires.

Yes, I tested as system:admin. Using a normal user with cluster-admin, it succeeded.

[root@host-172-16-120-10 test-apb]# oc config get-contexts
CURRENT   NAME                                                                          CLUSTER                                                  AUTHINFO                          NAMESPACE
          default/172-16-120-10:8443/system:admin                                       172-16-120-10:8443                                       system:admin/172-16-120-10:8443   default
*         default/172-16-120-10:8443/zitang                                             172-16-120-10:8443                                       zitang/172-16-120-10:8443         default
          default/host-8-248-181-host-centralci-eng-rdu2-redhat-com:8443/system:admin   host-8-248-181-host-centralci-eng-rdu2-redhat-com:8443   system:admin/172-16-120-10:8443   default

[root@host-172-16-120-10 test-apb]# apb test
Found registry IP at: 172.31.229.52:5000
Finished writing dockerfile.
Building APB using tag: [172.31.229.52:5000/openshift/test-apb]
Successfully built APB image: 172.31.229.52:5000/openshift/test-apb
Pushing the image, this could take a minute...
Successfully pushed image: 172.31.229.52:5000/openshift/test-apb
Creating project apb-test-test-apb-aao5e
Created project
Creating service account in apb-test-test-apb-aao5e
Created service account
Creating role binding for apb-test-test-apb-aao5e in apb-test-test-apb-aao5e
Created Role Binding
Creating pod with image 172.31.229.52:5000/openshift/test-apb in apb-test-test-apb-aao5e
Created Pod
Test successfully passed
Deleting project apb-test-test-apb-aao5e
Project deleted

Changing this to MODIFIED; waiting for apb-tools-v3.9.0-6 to be ready to double-check.
Version: apb-1.1.12

Running apb test inside the cluster succeeds as expected.

1. Using a cluster-admin user:

[root@host-172-16-120-75 test-apb]# apb test
Found registry IP at: 172.31.21.58:5000
Finished writing dockerfile.
Building APB using tag: [172.31.21.58:5000/openshift/test-apb]
Successfully built APB image: 172.31.21.58:5000/openshift/test-apb
Pushing the image, this could take a minute...
Successfully pushed image: 172.31.21.58:5000/openshift/test-apb
Creating project apb-test-test-apb-02wm5
Created project
Creating service account in apb-test-test-apb-02wm5
Created service account
Creating role binding for apb-test-test-apb-02wm5 in apb-test-test-apb-02wm5
Created Role Binding
Creating pod with image 172.31.21.58:5000/openshift/test-apb in apb-test-test-apb-02wm5
Created Pod
Test successfully passed
Deleting project apb-test-test-apb-02wm5
Project deleted

2. Using system:admin gives the message:

Exception occurred! No api key found in kubeconfig. NOTE: system:admin *cannot* be used with apb, since it does not have a token.

But outside the cluster it still fails:

Successfully built APB image: 172.31.21.58:5000/openshift/test-apb
Error accessing the docker API. Is the daemon running?
Exception occurred! 500 Server Error: Internal Server Error ("Get https://172.31.21.58:5000/v1/users/: dial tcp 172.31.21.58:5000: i/o timeout")

"172.31.21.58:5000" is the internal IP:

[root@host-172-16-120-75 test-apb]# oc get svc -n default
NAME               TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                   AGE
docker-registry    ClusterIP   172.31.21.58     <none>        5000/TCP                  16h
kubernetes         ClusterIP   172.31.0.1       <none>        443/TCP,53/UDP,53/TCP     16h
registry-console   ClusterIP   172.31.153.109   <none>        9000/TCP                  16h
router             ClusterIP   172.31.241.160   <none>        80/TCP,443/TCP,1936/TCP   16h

[root@dhcp-140-42 test-apb]# oc get route -n default
NAME              HOST/PORT                                              PATH   SERVICES          PORT    TERMINATION   WILDCARD
docker-registry   docker-registry-default.apps.0228-15x.qe.rhcloud.com          docker-registry   <all>   passthrough   None

Not sure whether using the host 'docker-registry-default.apps.0228-15x.qe.rhcloud.com' will work or not. If this bug only aims to work inside the cluster, I'll mark it as VERIFIED and open another bug to track the outside-the-cluster issue.
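A sketch of how one might try the exposed route instead of the ClusterIP (untested here; whether a push through the passthrough route succeeds depends on the registry's TLS setup, so this is only a hypothesis):

```shell
# The ClusterIP (172.31.21.58:5000) is only reachable from inside the cluster.
# From outside, the exposed route is the only candidate endpoint:
oc get route docker-registry -n default

# Hypothetical attempt: authenticate to the route host with the user's token,
# then retag and push against it instead of the internal address.
REGISTRY=docker-registry-default.apps.0228-15x.qe.rhcloud.com
docker login -u "$(oc whoami)" -p "$(oc whoami -t)" "$REGISTRY"
docker tag 172.31.21.58:5000/openshift/test-apb "$REGISTRY/openshift/test-apb"
docker push "$REGISTRY/openshift/test-apb"
```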
Can I get some clarification on the status of this bug? Reading through it, it appears to me that the PR ernelson submitted *did* resolve your problem. You *cannot* be system:admin when using the APB tooling, and Erik's PR displays an error message:

Exception occurred! No api key found in kubeconfig. NOTE: system:admin *cannot* be used with apb, since it does not have a token.

I also see your comment:

Version: apb-1.1.12
Run apb test in cluster succeed as expected.

which tells me that this is no longer a bug. You mentioned that you had trouble working outside the cluster, which is documented in https://bugzilla.redhat.com/show_bug.cgi?id=1526147. You mentioned in Comment #22 that you would mark this as VERIFIED if it is not intended to work outside the cluster, so I believe we should move this bug to VERIFIED. Can I get your thoughts?
@Dylan, thanks for the clarification. There are two parts to this bug:

1. Running `apb test` inside the cluster succeeds.
2. Running `apb test` outside the cluster fails.

Based on the recent discussion about the apb tools, we have NOT stated that the apb tools only work inside the cluster, so `apb test` is the same and we should verify step 2 as well. Per your comment #22, my understanding is that this bug depends on bug 1526147; once bug 1526147 is fixed, I can directly verify step 2. I'll add a 'depends on' for bug 1526147. If you do not need any other code change for this bug, please mark it as MODIFIED, and I'll verify once bug 1526147 is fixed. Please correct me if anything is wrong.
@zitang, you are correct, except there is one point I want to make clear. When you run `apb test`, you indirectly call `apb push` as step one to put the image onto the cluster. This is the known bug you referenced, so I agree this one should `depend_on` bz1526147. I will also add a note in that bug that `apb test` relies on `apb push` and will have the same limitations. I will move this to MODIFIED, but the code has already been merged, so I would expect this can be put back to ON_QA.
Changing the target release to 3.10 since the bug that this depends_on is targeted for 3.10 and will not be fixed in 3.9.
https://github.com/ansibleplaybookbundle/ansible-playbook-bundle/pull/233
I keep seeing "Missing PR Link" in my emails but the above PR is the correct one. Just FYI if anyone is waiting on that.
We have documented a workaround for working with remote clusters here: https://github.com/ansibleplaybookbundle/ansible-playbook-bundle/blob/master/docs/developers.md#alternative-to-using-apb-push Instead of using `apb push` the developer can follow this documentation approach to populate their image onto the OpenShift cluster.
*** This bug has been marked as a duplicate of bug 1526147 ***