Bug 1537599
| Summary: | apb test fails to retrieve test results | ||
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | cchase |
| Component: | Service Broker | Assignee: | Dylan Murray <dymurray> |
| Status: | CLOSED DUPLICATE | QA Contact: | Zihan Tang <zitang> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | ||
| Version: | 3.9.0 | CC: | aos-bugs, cchase, chezhang, dymurray, ernelson, jmatthew, jmontleo, zitang |
| Target Milestone: | --- | ||
| Target Release: | 3.10.0 | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | No Doc Update | |
| Doc Text: | | Story Points: | --- |
| Clone Of: | Environment: | ||
| Last Closed: | 2018-04-25 17:24:08 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | 1526147 | ||
| Bug Blocks: | |||
|
Description
cchase
2018-01-23 15:11:10 UTC
It looks like this failed because this was run on hello-world-apb, which does not have any tests (under playbooks you would have test.yml). For testing, I created an APB in my own repo, https://github.com/cfchase/test-apb. You can use that one or create your own basic tests.
Steps to Reproduce:
1. Configure OpenShift, the broker, the apb tool, etc.
2. git clone git:cfchase/test-apb.git
3. cd test-apb
4. apb test
Verified that the following steps succeed:
1. git clone git:cfchase/test-apb.git
2. build the image and push it to dockerhub.
3. Configure the OpenShift ASB and add the dockerhub registry:
   - type: dockerhub
     name: zitang
     url: docker.io
     org: zitangbj
     tag:
     white_list: [.*apb$]
4. then cd to test-apb
5. run: apb test --tag docker.io/zitangbj/test-apb
This succeeds:
Successfully built APB image: docker.io/zitangbj/test-apb
Creating project apb-test-test-apb-4u866
Created project
Creating service account in apb-test-test-apb-4u866
Created service account
Creating role binding for apb-test-test-apb-4u866 in apb-test-test-apb-4u866
Created Role Binding
Creating pod with image docker.io/zitangbj/test-apb in apb-test-test-apb-4u866
Created Pod
Test successfully passed
Project deleted
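The white_list entry in the registry configuration above decides which image names the broker will expose. A minimal sketch of that filtering, assuming each entry is applied as an unanchored regular expression against the image name (the function name and the sample image list are hypothetical, not taken from the broker source):

```python
import re


def filter_by_white_list(image_names, white_list):
    """Return the image names that match at least one white_list pattern.

    Assumption: patterns are unanchored regexes, so ".*apb$" only
    requires the name to end in "apb".
    """
    patterns = [re.compile(p) for p in white_list]
    return [name for name in image_names
            if any(p.search(name) for p in patterns)]


if __name__ == "__main__":
    names = ["test-apb", "hello-world-apb", "nginx"]
    print(filter_by_white_list(names, [".*apb$"]))
```

With the pattern from the configuration, test-apb and hello-world-apb pass the filter and an image such as nginx does not.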
If the APB is not pushed to a registry before running "apb test", the test fails.
The test pod status and log:
NAME READY STATUS RESTARTS AGE
apb-test-test-apb-dpwb2hggnf 0/1 ImagePullBackOff 0 40m
[root@host-172-16-120-49 ~]# oc logs apb-test-test-apb-dpwb2hggnf
Error from server (BadRequest): container "apb-test-test-apb-dpwb2" in pod "apb-test-test-apb-dpwb2hggnf" is waiting to start: trying and failing to pull image
[root@host-172-16-120-49 ~]# oc describe pod apb-test-test-apb-dpwb2hggnf
...
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 40m default-scheduler Successfully assigned apb-test-test-apb-dpwb2hggnf to 172.16.120.107
Normal SuccessfulMountVolume 40m kubelet, 172.16.120.107 MountVolume.SetUp succeeded for volume "apb-test-test-apb-dpwb2-token-2wnlj"
Normal Pulling 40m kubelet, 172.16.120.107 pulling image "test-apb"
Warning Failed 40m kubelet, 172.16.120.107 Failed to pull image "test-apb": rpc error: code = Unknown desc = Error: image library/test-apb:latest not found
Warning Failed 40m kubelet, 172.16.120.107 Error: ErrImagePull
Normal SandboxChanged 39m (x21 over 40m) kubelet, 172.16.120.107 Pod sandbox changed, it will be killed and re-created.
Warning Failed 5m (x141 over 38m) kubelet, 172.16.120.107 Error: ImagePullBackOff
Normal BackOff 15s (x162 over 38m) kubelet, 172.16.120.107 Back-off pulling image "test-apb"
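The events above show the bare name "test-apb" being pulled as library/test-apb:latest. A simplified sketch of Docker's default expansion of short image references may help explain why an image that was only built locally is looked up on Docker Hub (the function name is hypothetical, and this is an illustration of the naming rules, not the code Docker or the kubelet runs):

```python
def normalize_image(ref):
    """Expand a short image reference using Docker's default naming rules."""
    # The first path component is a registry host only if it looks like one.
    first, sep, rest = ref.partition("/")
    if sep and ("." in first or ":" in first or first == "localhost"):
        registry, path = first, rest
    else:
        registry, path = "docker.io", ref
    # Single-component Docker Hub names live under the implicit "library/" namespace.
    if registry == "docker.io" and "/" not in path:
        path = "library/" + path
    # Add the default tag when neither a tag nor a digest is present.
    last = path.rsplit("/", 1)[-1]
    if ":" not in last and "@" not in last:
        path += ":latest"
    return registry + "/" + path


if __name__ == "__main__":
    for ref in ("test-apb", "docker.io/zitangbj/test-apb",
                "172.30.1.1:5000/openshift/test-apb"):
        print(ref, "->", normalize_image(ref))
```

Under these rules, only a reference that names a registry the node can reach, such as docker.io/zitangbj/test-apb, resolves to an image that was actually pushed.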
I'm not sure whether you meant to run 'apb test --tag'. But if you just run 'apb test' outside the cluster, it fails.
I'll look into the apb test failure. It sounds like we need it to push to the internal registry as part of the apb test command.
apb test should work as expected now without using the --tag argument.
cchase,
I used the latest apb image and rpm package, and apb test still failed.
Steps:
1. git clone the test-apb;
2. login to the server;
3. run apb test.
[root@localhost test-apb]# apb test
Found registry IP at: 172.30.80.74:5000
Finished writing dockerfile.
Building APB using tag: [172.30.80.74:5000/openshift/test-apb]
Successfully built APB image: 172.30.80.74:5000/openshift/test-apb
Error accessing the docker API. Is the daemon running?
Exception occurred! 500 Server Error: Internal Server Error ("Get https://172.30.80.74:5000/v1/users/: dial tcp 172.30.80.74:5000: i/o timeout")
The local Docker daemon is running.
It looks like a problem with the push. Are you running APB on one host and pushing to a broker/docker registry on a remote host?
I get a similar error with minishift:
$ apb test
Found registry IP at: 172.30.1.1:5000
Finished writing dockerfile.
Building APB using tag: [172.30.1.1:5000/openshift/test-apb]
Successfully built APB image: 172.30.1.1:5000/openshift/test-apb
Error accessing the docker API. Is the daemon running?
Exception occurred! 500 Server Error: Internal Server Error ("Get http://172.30.1.1:5000/v1/users/: dial tcp 172.30.1.1:5000: getsockopt: no route to host")
If I run:
eval $(minishift docker-env)
Which sets:
DOCKER_CERT_PATH=/home/jmontleo/.minishift/certs
DOCKER_HOST=tcp://192.168.42.253:2376
DOCKER_TLS_VERIFY=1
It should work:
$ apb test
Finished writing dockerfile.
Building APB using tag: [172.30.1.1:5000/openshift/test-apb]
Successfully built APB image: 172.30.1.1:5000/openshift/test-apb
Pushing the image, this could take a minute...
Successfully pushed image: 172.30.1.1:5000/openshift/test-apb
Creating project apb-test-test-apb-9qdv2
Created project
Creating service account in apb-test-test-apb-9qdv2
Created service account
Creating role binding for apb-test-test-apb-9qdv2 in apb-test-test-apb-9qdv2
Created Role Binding
Creating pod with image 172.30.1.1:5000/openshift/test-apb in apb-test-test-apb-9qdv2
Created Pod
Test successfully passed
Deleting project apb-test-test-apb-9qdv2
Project deleted
Note, at this time I don't see a way to make this work with the containerized apb tool because it simultaneously needs to run remotely to access the docker registry and locally to access the apb.yml. I'm trying to figure out a way around this situation, but for now I would expect this to work only with the rpm install.
An alias similar to this will work:
alias apb='docker run --rm --privileged \
  -v $PWD:/mnt -v $HOME/.kube:/.kube \
  -v $HOME/.minishift/certs:/.minishift/certs \
  -e DOCKER_TLS_VERIFY="1" \
  -e DOCKER_HOST="tcp://192.168.42.253:2376" \
  -e DOCKER_CERT_PATH="/.minishift/certs" \
  -e MINISHIFT_REGISTRY="172.30.1.1:5000" \
  -u $UID docker.io/ansibleplaybookbundle/apb-tools:canary'
Erik Nelson also created a script that will fill in a lot of this for you. Run eval $(minishift docker-env) and follow the instructions for using the script: https://github.com/ansibleplaybookbundle/ansible-playbook-bundle/blob/master/docs/apb_cli.md#running-from-a-container
Be aware of https://bugzilla.redhat.com/show_bug.cgi?id=1548543 if using images other than canary at present.
(In reply to Jason Montleon from comment #11)
> It looks like a problem with the push. Are you running APB on one host and
> pushing to a broker/docker registry on a remote host?
>
> I get similar with minishift:
> $ apb test
> Found registry IP at: 172.30.1.1:5000
> Finished writing dockerfile.
> Building APB using tag: [172.30.1.1:5000/openshift/test-apb]
> Successfully built APB image: 172.30.1.1:5000/openshift/test-apb
> Error accessing the docker API. Is the daemon running?
> Exception occurred! 500 Server Error: Internal Server Error ("Get
> http://172.30.1.1:5000/v1/users/: dial tcp 172.30.1.1:5000: getsockopt: no
> route to host")
>
Yes, I ran apb test outside the cluster.
If I run it inside the OpenShift cluster:
openshift v3.9.0-0.50.0
ASB: 1.1.13
apb: 1.1.9-1
[root@host-172-16-120-99 ~]# rpm -qa | grep apb
apb-1.1.9-1.el7.noarch
I got the following output:
[root@host-172-16-120-99 test-apb]# apb test
Found registry IP at: 172.31.240.227:5000
Finished writing dockerfile.
Building APB using tag: [172.31.240.227:5000/openshift/test-apb]
Successfully built APB image: 172.31.240.227:5000/openshift/test-apb
Exception occurred! 'NoneType' object has no attribute 'split'
Checking that the image was built successfully:
[root@host-172-16-120-99 ~]# docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
172.31.240.227:5000/openshift/test-apb latest 6d314e4e6641 46 seconds ago 669 MB
Looks like it's failing on the apb push with a not-so-great error message.
It seems like all failures since comment 9 have been related to the specific environment and `apb push` and are not related directly to the `apb test` command. It's happening during apb test since it automatically pushes now. To confirm this, can QA try to run `apb build` and `apb push` before trying out `apb test`?
Are you testing as system:admin?
This account is not going to work because it does not have a token, which apb requires. Nonetheless, we should probably have some better error handling code around the authorization that asserts you are not system:admin, and throws an explicit error if that is the case. I will submit a PR for this; in the meantime, can you test with a correctly permissioned user? One with cluster-admin will do.
I added the cluster role cluster-admin to user admin and ran "apb test" on apb-test. Everything ran as expected.
PR with explicit error: https://github.com/ansibleplaybookbundle/ansible-playbook-bundle/pull/233
(In reply to Erik Nelson from comment #16)
> Are you testing as system:admin?
>
> This account is not going to work because it does not have a token, which
> apb requires. Nonetheless, we should probably have some better error
> handling code around the authorization that asserts you are not
> system:admin, and throws an explicit error if that is the case. I will
> submit a PR for this; in the meantime, can you test with a correctly
> permissioned user? One with cluster-admin will do.
Yes, I tested as system:admin. With a normal user that has cluster-admin, it succeeds.
[root@host-172-16-120-10 test-apb]# oc config get-contexts
CURRENT NAME CLUSTER AUTHINFO NAMESPACE
default/172-16-120-10:8443/system:admin 172-16-120-10:8443 system:admin/172-16-120-10:8443 default
* default/172-16-120-10:8443/zitang 172-16-120-10:8443 zitang/172-16-120-10:8443 default
default/host-8-248-181-host-centralci-eng-rdu2-redhat-com:8443/system:admin host-8-248-181-host-centralci-eng-rdu2-redhat-com:8443 system:admin/172-16-120-10:8443 default
[root@host-172-16-120-10 test-apb]# apb test
Found registry IP at: 172.31.229.52:5000
Finished writing dockerfile.
Building APB using tag: [172.31.229.52:5000/openshift/test-apb]
Successfully built APB image: 172.31.229.52:5000/openshift/test-apb
Pushing the image, this could take a minute...
Successfully pushed image: 172.31.229.52:5000/openshift/test-apb
Creating project apb-test-test-apb-aao5e
Created project
Creating service account in apb-test-test-apb-aao5e
Created service account
Creating role binding for apb-test-test-apb-aao5e in apb-test-test-apb-aao5e
Created Role Binding
Creating pod with image 172.31.229.52:5000/openshift/test-apb in apb-test-test-apb-aao5e
Created Pod
Test successfully passed
Deleting project apb-test-test-apb-aao5e
Project deleted
Changing this to MODIFIED; waiting for apb-tools-v3.9.0-6 to be ready so I can double check.
Version: apb-1.1.12
Running apb test in the cluster succeeds as expected.
1. Using a 'cluster-admin' user:
[root@host-172-16-120-75 test-apb]# apb test
Found registry IP at: 172.31.21.58:5000
Finished writing dockerfile.
Building APB using tag: [172.31.21.58:5000/openshift/test-apb]
Successfully built APB image: 172.31.21.58:5000/openshift/test-apb
Pushing the image, this could take a minute...
Successfully pushed image: 172.31.21.58:5000/openshift/test-apb
Creating project apb-test-test-apb-02wm5
Created project
Creating service account in apb-test-test-apb-02wm5
Created service account
Creating role binding for apb-test-test-apb-02wm5 in apb-test-test-apb-02wm5
Created Role Binding
Creating pod with image 172.31.21.58:5000/openshift/test-apb in apb-test-test-apb-02wm5
Created Pod
Test successfully passed
Deleting project apb-test-test-apb-02wm5
Project deleted
2. Using system:admin produces this message:
Exception occurred! No api key found in kubeconfig. NOTE: system:admin*cannot* be used with apb, since it does not have a token.
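The message above comes from apb finding no token for the current user. A minimal sketch of that kind of check, assuming the kubeconfig has already been parsed into a dictionary with the standard current-context, contexts, and users keys (the function name and the sample data are hypothetical, and this is not the code from the apb PR):

```python
def current_user_token(kubeconfig):
    """Return the bearer token of the current context's user, or None."""
    contexts = {c["name"]: c.get("context") or {}
                for c in kubeconfig.get("contexts", [])}
    users = {u["name"]: u.get("user") or {}
             for u in kubeconfig.get("users", [])}
    user_name = contexts.get(kubeconfig.get("current-context"), {}).get("user")
    # Certificate-based users such as system:admin have no "token" entry.
    return users.get(user_name, {}).get("token")


if __name__ == "__main__":
    # Hypothetical data modelled on the contexts listed in this thread.
    sample = {
        "current-context": "default/172-16-120-10:8443/system:admin",
        "contexts": [
            {"name": "default/172-16-120-10:8443/system:admin",
             "context": {"user": "system:admin/172-16-120-10:8443"}},
            {"name": "default/172-16-120-10:8443/zitang",
             "context": {"user": "zitang/172-16-120-10:8443"}},
        ],
        "users": [
            {"name": "system:admin/172-16-120-10:8443",
             "user": {"client-certificate-data": "...", "client-key-data": "..."}},
            {"name": "zitang/172-16-120-10:8443",
             "user": {"token": "example-token"}},
        ],
    }
    if current_user_token(sample) is None:
        print("No token for the current user; log in as a token-based user.")
```

Switching the current context to a user that logged in with oc login, and therefore holds a token, makes the lookup return a value.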
But outside the cluster it still fails:
Successfully built APB image: 172.31.21.58:5000/openshift/test-apb
Error accessing the docker API. Is the daemon running?
Exception occurred! 500 Server Error: Internal Server Error ("Get https://172.31.21.58:5000/v1/users/: dial tcp 172.31.21.58:5000: i/o timeout")
"172.31.21.58:5000" is the internal cluster IP of the registry service:
[root@host-172-16-120-75 test-apb]# oc get svc -n default
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
docker-registry ClusterIP 172.31.21.58 <none> 5000/TCP 16h
kubernetes ClusterIP 172.31.0.1 <none> 443/TCP,53/UDP,53/TCP 16h
registry-console ClusterIP 172.31.153.109 <none> 9000/TCP 16h
router ClusterIP 172.31.241.160 <none> 80/TCP,443/TCP,1936/TCP 16h
[root@dhcp-140-42 test-apb]# oc get route -n default
NAME HOST/PORT PATH SERVICES PORT TERMINATION WILDCARD
docker-registry docker-registry-default.apps.0228-15x.qe.rhcloud.com docker-registry <all> passthrough None
I'm not sure whether using the host 'docker-registry-default.apps.0228-15x.qe.rhcloud.com' will work or not.
If this bug only aims to work inside the cluster, I'll mark it as VERIFIED and open another bug to track the outside-the-cluster issue.
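The i/o timeout above is consistent with the registry's ClusterIP not being routable from a machine outside the cluster. A small sketch for checking TCP reachability of a registry address from the machine that runs apb (the function name is hypothetical; it only tests that a connection can be opened and says nothing about TLS or registry authentication):

```python
import socket


def can_reach(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port opens within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, "no route to host", and refused connections.
        return False


if __name__ == "__main__":
    # Address taken from the "Found registry IP at" line in the output above.
    print(can_reach("172.31.21.58", 5000))
```

If this prints False on the workstation but True on a cluster node, the push step of apb test cannot succeed from that workstation with the ClusterIP address.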
Can I get some clarification on the status of this bug? Reading through this, it appears to me that the PR ernelson submitted *did* resolve your problem. You *cannot* be system:admin when using the APB tooling, and Erik's PR displays an error message:
Exception occurred! No api key found in kubeconfig. NOTE: system:admin*cannot* be used with apb, since it does not have a token.
I also see your comment:
Version: apb-1.1.12
Running apb test in the cluster succeeds as expected.
Which tells me that this is no longer a bug? You mentioned that you had trouble working outside the cluster, which is documented in https://bugzilla.redhat.com/show_bug.cgi?id=1526147. You mentioned in Comment #22 that you would mark this as VERIFIED if it is not intended to work outside the cluster, so I believe we should move this bug to VERIFIED. Can I get your thoughts?
@Dylan, thanks for your clarification. There are 2 parts to this bug:
1. Running `apb test` in the cluster succeeds.
2. Running `apb test` outside the cluster fails.
Based on the recent discussion about 'apb tools', we do NOT state that 'apb tools' only works inside the cluster, so the same applies to `apb test` and we had better verify part 2. Regarding comment #22, my understanding is that this bug depends on bug 1526147; once bug 1526147 is fixed, I can directly verify part 2. I'll add a 'depends on' for bug 1526147. If you do not need any other code change for this bug, please mark it as MODIFIED; I'll hold verification until bug 1526147 is fixed. Please correct me if anything is wrong.
@zitang, you are correct, except there is one point I want to make clear. When you do `apb test` you are indirectly calling `apb push` as step one to put the image onto the cluster. This is the known bug you referenced, and so I am in agreement that this one should `depend_on` bz1526147. I will also add a note in that bug that `apb test` relies on `apb push` and will have the same limitations.
I will move this to MODIFIED, but the code has also already been merged, so I would expect that this can be put back to ON_QA.
Changing the target release to 3.10 since the bug that this depends_on is targeted for 3.10 and will not be fixed in 3.9.
I keep seeing "Missing PR Link" in my emails but the above PR is the correct one. Just FYI if anyone is waiting on that.
We have documented a workaround for working with remote clusters here: https://github.com/ansibleplaybookbundle/ansible-playbook-bundle/blob/master/docs/developers.md#alternative-to-using-apb-push
Instead of using `apb push`, the developer can follow this documentation approach to populate their image onto the OpenShift cluster.
*** This bug has been marked as a duplicate of bug 1526147 ***
|