Description of problem:
After installation, all deploy pods are in Error status.

[root@ip-172-18-6-45 ~]# oc get pods
NAME                        READY     STATUS    RESTARTS   AGE
docker-registry-1-deploy    0/1       Error     0          7m
registry-console-1-deploy   0/1       Error     0          7m
router-1-deploy             0/1       Error     0          8m

[root@ip-172-18-6-45 ~]# oc logs -f docker-registry-2-deploy
--> Scaling docker-registry-2 to 1
error: couldn't scale docker-registry-2 to 1: replicationcontrollers "docker-registry-2" is forbidden: User "system:serviceaccount:default:deployer" cannot get replicationcontrollers/scale in the namespace "default": User "system:serviceaccount:default:deployer" cannot get replicationcontrollers/scale in project "default"

This can currently be worked around with `oc policy add-role-to-user admin system:serviceaccount:default:deployer -n default`

Version-Release number of the following components:
rpm -q openshift-ansible
openshift-ansible-3.10.0-0.21.0.git.0.0b1d180.el7.noarch.rpm
rpm -q ansible
ansible-2.4.2.0-2.el7.noarch
ansible --version

How reproducible:
Always

Steps to Reproduce:
1. Set up installation
2. Check pod status in the default namespace
3.

Actual results:
Please include the entire output from the last TASK line through the end of output if an error is generated
2.
[root@ip-172-18-6-45 ~]# oc get pods
NAME                        READY     STATUS    RESTARTS   AGE
docker-registry-1-deploy    0/1       Error     0          7m
registry-console-1-deploy   0/1       Error     0          7m
router-1-deploy             0/1       Error     0          8m

Expected results:
All pods should be running normally in the default namespace.

Additional info:
Please attach logs from ansible-playbook with the -vvv flag
Similar to bug 1566357
Is this 100% reproducible, or does it only happen sometimes?
The permissions were added in https://github.com/openshift/origin/pull/19137/commits/b66098cd4978a4f2897c150d96b0da2b441a33b8#diff-1d38b34f638332aebf97978ff71d9664R533
Can we see the output of oc version? I'm 100% sure you're running API server 1.9.1 against a new deployer image that was built after the 1.10 rebase landed. That is what causes the problem. You need to upgrade the API server.
*** Bug 1566357 has been marked as a duplicate of this bug. ***
This is 100% reproducible in my cluster.

oc v3.10.0-0.21.0
kubernetes v1.10.0+b81c8f8
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://ip-172-31-33-11.us-west-2.compute.internal:8443
openshift v3.10.0-0.14.0
kubernetes v1.9.1+a0ce1bc657

Why does this puddle not have the rebase? It landed last Monday in origin.
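For anyone triaging similar reports, the mismatch in the `oc version` output above can be detected mechanically. The following is only a minimal sketch: the function names `parse_oc_version` and `has_skew` are hypothetical, and it assumes the client-side kubernetes line is a reasonable proxy for the level the deployer image was built from, which the thread does not confirm.

```python
import re

# Sample text copied from the `oc version` output in this bug.
SAMPLE = """oc v3.10.0-0.21.0
kubernetes v1.10.0+b81c8f8
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://ip-172-31-33-11.us-west-2.compute.internal:8443
openshift v3.10.0-0.14.0
kubernetes v1.9.1+a0ce1bc657
"""


def parse_oc_version(text):
    """Return the kubernetes (major, minor) for the client and server sections.

    Hypothetical helper: it assumes the 3.x-style layout shown above, where
    everything after the line starting with "Server " describes the server.
    """
    versions = {}
    section = "client"
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Server "):
            section = "server"
            continue
        match = re.match(r"kubernetes v(\d+)\.(\d+)", line)
        if match:
            versions[section] = (int(match.group(1)), int(match.group(2)))
    return versions


def has_skew(text):
    """True when both sections were found and their major.minor differ."""
    found = parse_oc_version(text)
    return "client" in found and "server" in found and found["client"] != found["server"]


if __name__ == "__main__":
    print(parse_oc_version(SAMPLE))
    print("skew detected:", has_skew(SAMPLE))
```

On the sample above this reports a 1.10 client against a 1.9 server, which matches the diagnosis that the API server predates the rebase.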
*** Bug 1568031 has been marked as a duplicate of this bug. ***
https://github.com/openshift/openshift-ansible/pull/7964/commits/d1861a0280b4f1dc4651fff0b55b9eb6177278a4 addresses this, but we still need to sort out image logistics for origin.
*** Bug 1565442 has been marked as a duplicate of this bug. ***
Marking this as urgent, since its duplicate, bug 1568031, blocks performance testing of 3.10.
Oh, I think this issue is related to https://bugzilla.redhat.com/show_bug.cgi?id=1565442 : the image in the installation script has not been updated, so the version will be the same as in https://bugzilla.redhat.com/show_bug.cgi?id=1566814#c6 . Once the image is updated, this issue should go away.
Actually, this bug should be a dup of 1565442, rather than 1565442 being closed as a dup.

This issue is blocking all testing. The installer is trying to use the "ose" image to start the master static pods, but that image tag no longer exists on the registry.

# journalctl -f -u atomic-openshift-node.service |grep E0
Apr 16 22:59:27 qe-smoke310-mrre-1 atomic-openshift-node[22075]: E0416 22:59:27.496745 22075 pod_workers.go:186] Error syncing pod a2126bf858dec6e7c76ce2bbf8325184 ("master-controllers-qe-smoke310-mrre-1_kube-system(a2126bf858dec6e7c76ce2bbf8325184)"), skipping: failed to "StartContainer" for "controllers" with ImagePullBackOff: "Back-off pulling image \"registry.reg-aws.openshift.com:443/openshift3/ose:v3.10.0-0.22.0\""
Apr 16 22:59:27 qe-smoke310-mrre-1 atomic-openshift-node[22075]: E0416 22:59:27.503245 22075 pod_workers.go:186] Error syncing pod 37410126583f9d3795a9d1dfa2c4c1fc ("master-api-qe-smoke310-mrre-1_kube-system(37410126583f9d3795a9d1dfa2c4c1fc)"), skipping: failed to "StartContainer" for "api" with ImagePullBackOff: "Back-off pulling image \"registry.reg-aws.openshift.com:443/openshift3/ose:v3.10.0-0.22.0\""
Apr 16 22:59:28 qe-smoke310-mrre-1 atomic-openshift-node[22075]: E0416 22:59:28.222772 22075 reflector.go:205] github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/kubelet/kubelet.go:460: Failed to list *v1.Node: Get https://qe-smoke310-mrre-1:8443/api/v1/nodes?fieldSelector=metadata.name%3Dqe-smoke310-mrre-1&limit=500&resourceVersion=0: dial tcp 10.240.0.21:8443: getsockopt: connection refused

# curl -H "Authorization: Bearer $(oc --config=~/.kube/reg-aws whoami -t)" https://registry.reg-aws.openshift.com/v2/openshift3/ose/tags/list | python -m json.tool | grep v3.10
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  7703    0  7703    0     0  39178      0 --:--:-- --:--:-- --:--:-- 39101
"v3.10.0-0.14.0",
"v3.10.0-0.14.0.0",
"v3.10.0-0.13.0.0",
"v3.10.0",
"v3.10",
"v3.10.0-0.13.0",

# curl -H "Authorization: Bearer $(oc --config=~/.kube/reg-aws whoami -t)" https://registry.reg-aws.openshift.com/v2/openshift3/ose-control-plane/tags/list | python -m json.tool | grep v3.10
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   211  100   211    0     0   1219      0 --:--:-- --:--:-- --:--:--  1226
"v3.10.0",
"v3.10.0-0.16.0.0",
"v3.10.0-0.20.0.0",
"v3.10.0-0.21.0",
"v3.10",
"v3.10.0-0.15.0",
"v3.10.0-0.15.0.0",
"v3.10.0-0.16.0",
"v3.10.0-0.20.0",
"v3.10.0-0.21.0.0"

# cat /etc/origin/node/pods/apiserver.yaml |grep image
image: registry.reg-aws.openshift.com:443/openshift3/ose:v3.10.0-0.22.0
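The manual check in this comment (grep the tag list, then compare against the image in apiserver.yaml) could be scripted roughly as follows. This is a sketch under assumptions, not part of the installer: `tag_available` is a hypothetical helper, it expects the image reference to carry an explicit tag, and the JSON bodies below are built from the grep-filtered v3.10 tags quoted above rather than from the full registry responses.

```python
import json

# Image reference taken from /etc/origin/node/pods/apiserver.yaml in this bug.
IMAGE = "registry.reg-aws.openshift.com:443/openshift3/ose:v3.10.0-0.22.0"

# Stand-ins for the /v2/<name>/tags/list response bodies, limited to the
# v3.10 tags shown in the comment (the real responses contain more tags).
OSE_TAGS = json.dumps({
    "name": "openshift3/ose",
    "tags": ["v3.10.0-0.14.0", "v3.10.0-0.14.0.0", "v3.10.0-0.13.0.0",
             "v3.10.0", "v3.10", "v3.10.0-0.13.0"],
})
CONTROL_PLANE_TAGS = json.dumps({
    "name": "openshift3/ose-control-plane",
    "tags": ["v3.10.0", "v3.10.0-0.16.0.0", "v3.10.0-0.20.0.0",
             "v3.10.0-0.21.0", "v3.10", "v3.10.0-0.15.0",
             "v3.10.0-0.15.0.0", "v3.10.0-0.16.0", "v3.10.0-0.20.0",
             "v3.10.0-0.21.0.0"],
})


def tag_available(tags_json, image_ref):
    """Return True if the tag of image_ref is listed in a tags/list body.

    Assumes image_ref ends in ":<tag>"; a reference without a tag but with
    a registry port would be misparsed by this simple split.
    """
    tag = image_ref.rsplit(":", 1)[1]
    tags = json.loads(tags_json).get("tags") or []
    return tag in tags


if __name__ == "__main__":
    print("ose has tag:", tag_available(OSE_TAGS, IMAGE))
    print("ose-control-plane has tag:", tag_available(CONTROL_PLANE_TAGS, IMAGE))
```

With the tags quoted in this comment, v3.10.0-0.22.0 is absent from both repositories, which is consistent with the ImagePullBackOff errors in the node journal.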
https://github.com/openshift/openshift-ansible/pull/8003
Fixed.

openshift-ansible-3.10.0-0.27.0.git.0.abed3b7.el7

Operating System: Red Hat Enterprise Linux Server 7.5 (Maipo)
CPE OS Name: cpe:/o:redhat:enterprise_linux:7.5:GA:server
Kernel: Linux 3.10.0-862.el7.x86_64