Bug 1560820

Summary: OCP `oc cluster up` uses origin image for some processes and returns origin version info
Product: OpenShift Container Platform Reporter: Xingxing Xia <xxia>
Component: Master Assignee: Michal Fojtik <mfojtik>
Status: CLOSED ERRATA QA Contact: Wang Haoran <haowang>
Severity: medium Docs Contact:
Priority: medium    
Version: 3.10.0 CC: aos-bugs, jokerman, mmccomas, spadgett, wewang, wzheng, xiuwang, xxia
Target Milestone: ---   
Target Release: 3.10.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-07-30 19:11:31 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description                        Flags
Get this log when set Loglevel=5   none
oc_cluster_up_error                none

Description Xingxing Xia 2018-03-27 03:15:07 UTC
Description of problem:
oc cluster up fails and shows "Error: could not run "install-router": Docker run error rc=255".

Version-Release number of selected component (if applicable):
oc v3.10.0-alpha.0+d1afaef-376 (latest Origin version)
kubernetes v1.9.1+a0ce1bc657
features: Basic-Auth GSSAPI Kerberos SPNEGO

How reproducible:
Always

Steps to Reproduce:
1. Launch Origin instance in EC2
2. Pull origin image (due to bug 1559291)
3. Run the following (the same error occurs with or without --service-catalog):
# oc cluster up --public-hostname=ec2-*aws.com --service-catalog

Actual results:
3. It outputs:
Starting OpenShift using openshift/origin:v3.10 ...
I0327 02:39:56.380347    1429 config.go:38] Running "create-master-config"
I0327 02:40:02.137918    1429 config.go:45] Running "create-node-config"
I0327 02:40:04.566231    1429 flags.go:31] Running "create-kubelet-flags"
I0327 02:40:06.745955    1429 run_kubelet.go:48] Running "start-kubelet"
I0327 02:40:07.428353    1429 run_self_hosted.go:157] Waiting for the kube-apiserver to be ready.
I0327 02:40:32.432984    1429 apply_template.go:77] Installing "kube-proxy"
I0327 02:40:32.432985    1429 apply_template.go:77] Installing "openshift-apiserver"
I0327 02:40:32.433000    1429 apply_template.go:77] Installing "kube-dns"
I0327 02:40:41.905788    1429 interface.go:41] Finished installing "kube-proxy" "kube-dns" "openshift-apiserver"
I0327 02:41:20.996580    1429 run_self_hosted.go:186] openshift-apiserver available
I0327 02:41:20.997171    1429 apply_template.go:77] Installing "openshift-controller-manager"
I0327 02:41:24.983664    1429 interface.go:41] Finished installing "openshift-controller-manager"
I0327 02:41:25.174541    1429 apply_list.go:48] Installing "openshift/centos7"
I0327 02:41:25.174648    1429 apply_list.go:48] Installing "openshift/jenkins pipeline persistent"
I0327 02:41:25.174707    1429 apply_list.go:48] Installing "openshift/mongodb"
I0327 02:41:25.177798    1429 apply_list.go:48] Installing "openshift/nodejs quickstart"
I0327 02:41:25.179535    1429 apply_list.go:48] Installing "openshift/mariadb"
I0327 02:41:25.179759    1429 apply_list.go:48] Installing "openshift/cakephp quickstart"
I0327 02:41:25.179867    1429 apply_list.go:48] Installing "openshift/django quickstart"
I0327 02:41:25.179970    1429 apply_list.go:48] Installing "openshift/rails quickstart"
I0327 02:41:25.180066    1429 apply_list.go:48] Installing "openshift/mysql"
I0327 02:41:25.180157    1429 apply_list.go:48] Installing "openshift/postgresql"
I0327 02:41:25.180252    1429 apply_list.go:48] Installing "openshift/dancer quickstart"
I0327 02:41:25.180502    1429 apply_list.go:48] Installing "openshift-infra/template service broker registration"
I0327 02:41:25.180604    1429 apply_list.go:48] Installing "openshift/sample pipeline"
I0327 02:41:25.180719    1429 apply_list.go:48] Installing "kube-system/heapster standalone"
I0327 02:41:25.180837    1429 apply_list.go:48] Installing "kube-system/prometheus"
I0327 02:41:25.180994    1429 apply_list.go:48] Installing "openshift-infra/template service broker rbac"
I0327 02:41:25.181093    1429 apply_list.go:48] Installing "openshift-infra/service catalog"
I0327 02:41:25.181220    1429 apply_list.go:48] Installing "openshift-infra/web console server template"
I0327 02:41:25.181335    1429 apply_list.go:48] Installing "openshift-infra/template service broker apiserver"
I0327 02:41:25.199356    1429 registry_install.go:56] Running "openshift-image-registry"
scc "privileged" added to: ["system:serviceaccount:default:registry"]
I0327 02:42:26.291100    1429 interface.go:41] Finished installing "openshift/centos7" "openshift/jenkins pipeline persistent" "openshift/mongodb" "openshift/mariadb" "openshift/cakephp quickstart" "openshift/django quickstart" "openshift/rails quickstart" "openshift/mysql" "openshift/postgresql" "openshift/dancer quickstart" "openshift/nodejs quickstart" "openshift/sample pipeline" "kube-system/heapster standalone" "kube-system/prometheus" "openshift-infra/template service broker registration" "openshift-infra/service catalog" "openshift-infra/template service broker rbac" "openshift-infra/web console server template" "openshift-infra/template service broker apiserver" "openshift-image-registry"
I0327 02:42:26.294981    1429 admin.go:51] Running "install-router"
Error: FAIL
   Error: could not run "install-router": Docker run error rc=255
   Caused By:
     Error: Docker run error rc=255
     Details:
       Image: openshift/origin:v3.10
       Entrypoint: [oc]
       Command: [adm router --host-ports=true --loglevel=8 --config=/var/lib/origin/openshift.local.config/master/admin.kubeconfig --host-network=true --images=openshift/origin-${component}:v3.10 --default-cert=/var/lib/origin/openshift.local.config/master/router.pem]

PS: After the failure, run:
# oc cluster down
then run:
# oc cluster up --use-existing-config
This succeeds, but no router pod (or web console pod) is created, which indicates that the earlier failure is indeed related to router installation:
# oc get po --all-namespaces --config openshift.local.clusterup/oc-cluster-up-openshift-apiserver/admin.kubeconfig
NAMESPACE                      NAME                                 READY     STATUS      RESTARTS   AGE
default                        docker-registry-1-8rz7h              1/1       Running     0          8m
default                        persistent-volume-setup-hs5rt        0/1       Completed   0          10m
kube-dns                       kube-dns-7jfxb                       1/1       Running     0          10m
kube-proxy                     kube-proxy-hb8hw                     1/1       Running     0          11m
kube-system                    kube-controller-manager-localhost    1/1       Running     0          9m
kube-system                    kube-scheduler-localhost             1/1       Running     0          10m
kube-system                    master-api-localhost                 1/1       Running     0          10m
kube-system                    master-etcd-localhost                1/1       Running     0          10m
openshift-apiserver            openshift-apiserver-nx9w5            1/1       Running     0          11m
openshift-controller-manager   openshift-controller-manager-bvhfp   1/1       Running     0          10m
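
The absence of the router pod in a listing like the one above can also be checked mechanically. The helper below is a minimal sketch, not part of `oc`: `has_pod` is a hypothetical name, and the `router-` prefix used in the usage note assumes the router pods keep the default `router` deployment name.

```shell
# Hypothetical helper: succeeds if a pod listing read from stdin
# (the output of `oc get po --all-namespaces`) contains a pod whose
# NAME column starts with the given prefix.
has_pod() {
    prefix="$1"
    # With --all-namespaces, column 2 is NAME; NR > 1 skips the header row.
    awk -v p="$prefix" 'NR > 1 && index($2, p) == 1 { found = 1 } END { exit !found }'
}
```

For example, `oc get po --all-namespaces --config openshift.local.clusterup/oc-cluster-up-openshift-apiserver/admin.kubeconfig | has_pod router- || echo "router pod missing"` would print the message for the listing above.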

Expected results:
3. It should succeed.

Additional info:

Comment 1 Xingxing Xia 2018-04-04 06:06:28 UTC
OCP oc v3.10.0-0.15.0 hits the same issue.

Comment 2 Michal Fojtik 2018-04-05 11:04:55 UTC
Can you locate the `openshift.local.clusterup/logs/install-router-001.*` and attach them to this BZ? So we can see the error output from the command.

I can't reproduce this on OSX or on linux.
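
For anyone gathering the requested files, a minimal sketch follows. It assumes the `openshift.local.clusterup` directory is in the current working directory; the function name `collect_router_logs` and the default archive name are illustrative, not part of `oc`.

```shell
# Hypothetical helper: bundle the install-router logs into one archive
# that can be attached to the bug.
# $1: cluster-up base directory (assumed default: openshift.local.clusterup)
# $2: output archive name (arbitrary default)
collect_router_logs() {
    base_dir="${1:-openshift.local.clusterup}"
    out="${2:-install-router-logs.tar.gz}"
    # Expand the glob from this comment into the positional parameters.
    set -- "$base_dir"/logs/install-router-001.*
    # If nothing matched, the pattern is left unexpanded and does not exist.
    if [ ! -e "$1" ]; then
        echo "no install-router logs under $base_dir/logs" >&2
        return 1
    fi
    tar -czf "$out" "$@"
    echo "$out"
}
```

Running `collect_router_logs` from the directory where `oc cluster up` was invoked prints the archive name on success and returns non-zero if no matching log files exist.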

Comment 3 Xingxing Xia 2018-04-08 09:39:35 UTC
Hmm, today I hit several other errors while trying to answer the needinfo. Will continue checking tomorrow :)

Comment 5 Wenjing Zheng 2018-04-10 06:10:14 UTC
I can reproduce the same issue as in comment #4; attaching the log now.

Comment 6 Wenjing Zheng 2018-04-10 06:11:12 UTC
Created attachment 1419695 [details]
Get this log when set Loglevel=5

Comment 7 XiuJuan Wang 2018-04-25 06:59:02 UTC
Created attachment 1426452 [details]
oc_cluster_up_error

Comment 8 XiuJuan Wang 2018-04-25 07:18:15 UTC
The error has changed with:
oc v3.10.0-0.28.0
kubernetes v1.10.0+b81c8f8
At the bottom of the error log, it reports that there is no 'write-flags' option for the command 'openshift start node', but 'openshift start node' in both v3.10 and v3.9 includes this option.

Comment 12 errata-xmlrpc 2018-07-30 19:11:31 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:1816