Bug 1564809
| Summary: | install failed due to sdn pods crash | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Weihua Meng <wmeng> |
| Component: | Installer | Assignee: | Scott Dodson <sdodson> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Weihua Meng <wmeng> |
| Severity: | high | Docs Contact: | |
| Priority: | high | | |
| Version: | 3.10.0 | CC: | aos-bugs, ccoleman, dma, ghuang, jokerman, mifiedle, mmccomas, sdodson |
| Target Milestone: | --- | | |
| Target Release: | 3.10.0 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | No Doc Update |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2018-04-13 12:34:08 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
**Description** (Weihua Meng, 2018-04-08 02:17:57 UTC)
Since we now use kubelet dynamic configuration, the file /etc/origin/node/node-config.yaml is expected to be generated by the openshift-node sync pod, but it is not.

1. The log of the sync pod in openshift-node is empty, which makes it difficult to debug why the node-config file is not generated:

        [root@ip-172-18-14-207 node]# oc get po -n openshift-node
        NAME         READY     STATUS    RESTARTS   AGE
        sync-b4bgl   1/1       Running   0          5m
        sync-gv4bm   1/1       Running   0          21m
        [root@ip-172-18-14-207 node]# oc logs sync-b4bgl
        [root@ip-172-18-14-207 node]#

2. As a workaround to get the sdn pod running, we can create the file manually with:

        oc extract --config=/etc/origin/node/node.kubeconfig "cm/${BOOTSTRAP_CONFIG_NAME}" -n openshift-node --to=/etc/origin/node --confirm

    The "cm/${BOOTSTRAP_CONFIG_NAME}" is one of:

        [root@ip-172-18-14-207 node]# oc get configmap
        NAME                  DATA      AGE
        node-config-compute   1         1h
        node-config-infra     1         1h
        node-config-master    1         1h

In sync.yaml, "BOOTSTRAP_CONFIG_NAME" is defined in "/etc/sysconfig/atomic-openshift-node" rather than "/etc/sysconfig/origin-node".

This is blocking a lot of installations; users have to intervene in the installation and run the workaround command manually before the installer exhausts its retries. So I added the TestBlocker keyword back to request this bug be fixed ASAP.

(In reply to DeShuai Ma from comment #2)
> In sync.yaml, "BOOTSTRAP_CONFIG_NAME" is defined in
> "/etc/sysconfig/atomic-openshift-node" rather than
> "/etc/sysconfig/origin-node"

Modifying /etc/sysconfig/origin-node to /etc/sysconfig/atomic-openshift-node in roles/openshift_node_group/files/sync.yaml is not enough; /etc/sysconfig/atomic-openshift-node also has to be mounted into the sync pod.

*** Bug 1565494 has been marked as a duplicate of this bug. ***

sdn pods are running now, and nodes are in Ready status.

    openshift-ansible-3.10.0-0.20.0.git.0.37bab0f.el7.noarch.rpm
    # openshift version
    openshift v3.10.0-0.20.0
    kubernetes v1.10.0+b81c8f8
    etcd 3.2.16

    # oc get pods -n openshift-sdn
    NAME        READY     STATUS    RESTARTS   AGE
    ovs-57pp4   1/1       Running   0          14m
    ovs-6fvt9   1/1       Running   0          14m
    ovs-n2t8x   1/1       Running   0          20m
    sdn-bxbs6   1/1       Running   0          14m
    sdn-mp7vc   1/1       Running   0          20m
    sdn-pdvnh   1/1       Running   0          14m

    # oc get nodes
    NAME                              STATUS    ROLES     AGE       VERSION
    qe-wmeng20r75n1al-master-etcd-1   Ready     master    33m       v1.10.0+b81c8f8
    qe-wmeng20r75n1al-nrr-1           Ready     compute   27m       v1.10.0+b81c8f8
    qe-wmeng20r75n1al-nrr-2           Ready     compute   27m       v1.10.0+b81c8f8

Fixed.

    openshift-ansible-3.10.0-0.20.0.git.0.37bab0f.el7.noarch.rpm
    # openshift version
    openshift v3.10.0-0.20.0
    kubernetes v1.10.0+b81c8f8
    etcd 3.2.16
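
For reference, the fix described in the comments above (mounting /etc/sysconfig/atomic-openshift-node into the sync pod so BOOTSTRAP_CONFIG_NAME can be read from it) would amount to something like the following fragment of the sync DaemonSet pod spec in roles/openshift_node_group/files/sync.yaml. This is a sketch under assumptions, not the exact merged patch; the volume name `host-sysconfig-node` and the surrounding structure are illustrative.

```yaml
# Hypothetical excerpt of the sync DaemonSet pod spec (sync.yaml).
# The sysconfig path referenced by the sync script must both exist on
# RHEL hosts (atomic-openshift-node, not origin-node) and actually be
# mounted into the pod, otherwise BOOTSTRAP_CONFIG_NAME is never set
# and node-config.yaml is never generated.
spec:
  containers:
  - name: sync
    volumeMounts:
    # expose the host's node sysconfig file inside the sync container
    - name: host-sysconfig-node
      mountPath: /etc/sysconfig/atomic-openshift-node
      readOnly: true
  volumes:
  # hostPath volume backing the mount above
  - name: host-sysconfig-node
    hostPath:
      path: /etc/sysconfig/atomic-openshift-node
```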