Bug 1565494

Summary: Nodes didn't get ready for installation on Atomic Host
Product: OpenShift Container Platform Reporter: Gan Huang <ghuang>
Component: Installer Assignee: Scott Dodson <sdodson>
Status: CLOSED DUPLICATE QA Contact: Johnny Liu <jialiu>
Severity: high Docs Contact:
Priority: high    
Version: 3.10.0 CC: aos-bugs, dma, jokerman, mmccomas, wmeng
Target Milestone: --- Keywords: TestBlocker
Target Release: 3.10.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-04-10 09:18:04 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:

Description Gan Huang 2018-04-10 06:47:02 UTC
Description of problem:
When an installation is triggered against Atomic Host, the installer fails at the task "Verify that the web console is running" because the nodes never become Ready.

Version-Release number of the following components:
openshift-ansible-3.10.0-0.16.0.git.0.8925606.el7.noarch.rpm

How reproducible:
always

Steps to Reproduce:
1. Trigger installation against Atomic Host

Actual results:
# oc get nodes
NAME                                     STATUS     ROLES     AGE       VERSION
qe-ghuangatomic-master-etcd-1            NotReady   master    2h        v1.9.1+a0ce1bc657
qe-ghuangatomic-node-registry-router-1   NotReady   compute   2h        v1.9.1+a0ce1bc657

# oc describe node qe-ghuangatomic-node-registry-router-1
Name:               qe-ghuangatomic-node-registry-router-1
Roles:              compute
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/instance-type=n1-standard-1
                    beta.kubernetes.io/os=linux
                    failure-domain.beta.kubernetes.io/region=us-central1
                    failure-domain.beta.kubernetes.io/zone=us-central1-a
                    kubernetes.io/hostname=qe-ghuangatomic-node-registry-router-1
                    node-role.kubernetes.io/compute=true
Annotations:        volumes.kubernetes.io/controller-managed-attach-detach=true
Taints:             <none>
CreationTimestamp:  Tue, 10 Apr 2018 03:58:05 +0000
Conditions:
  Type                 Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----                 ------  -----------------                 ------------------                ------                       -------
  NetworkUnavailable   False   Mon, 01 Jan 0001 00:00:00 +0000   Tue, 10 Apr 2018 03:58:05 +0000   RouteCreated                 openshift-sdn cleared kubelet-set NoRouteCreated
  OutOfDisk            False   Tue, 10 Apr 2018 06:26:59 +0000   Tue, 10 Apr 2018 03:58:05 +0000   KubeletHasSufficientDisk     kubelet has sufficient disk space available
  MemoryPressure       False   Tue, 10 Apr 2018 06:26:59 +0000   Tue, 10 Apr 2018 03:58:05 +0000   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure         False   Tue, 10 Apr 2018 06:26:59 +0000   Tue, 10 Apr 2018 03:58:05 +0000   KubeletHasNoDiskPressure     kubelet has no disk pressure
  Ready                False   Tue, 10 Apr 2018 06:26:59 +0000   Tue, 10 Apr 2018 03:58:05 +0000   KubeletNotReady              runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
Addresses:
  InternalIP:  10.240.0.43
  ExternalIP:  35.192.180.81
  Hostname:    qe-ghuangatomic-node-registry-router-1
Capacity:
 cpu:     1
 memory:  3623136Ki
 pods:    10
Allocatable:
 cpu:     1
 memory:  3520736Ki
 pods:    10
System Info:
 Machine ID:                 3d593dcf9954c2bd97b2285b1e49e1f3
 System UUID:                3D593DCF-9954-C2BD-97B2-285B1E49E1F3
 Boot ID:                    477ae558-50a7-4fca-9659-87e44f76b42c
 Kernel Version:             3.10.0-855.el7.x86_64
 OS Image:                   Red Hat Enterprise Linux Server 7.4 (Maipo)
 Operating System:           linux
 Architecture:               amd64
 Container Runtime Version:  docker://1.13.1
 Kubelet Version:            v1.9.1+a0ce1bc657
 Kube-Proxy Version:         v1.9.1+a0ce1bc657
ExternalID:                  6962681093331980054
Non-terminated Pods:         (3 in total)
  Namespace                  Name          CPU Requests  CPU Limits  Memory Requests  Memory Limits
  ---------                  ----          ------------  ----------  ---------------  -------------
  openshift-node             sync-2zcjc    0 (0%)        0 (0%)      0 (0%)           0 (0%)
  openshift-sdn              ovs-cqkhs     100m (10%)    200m (20%)  200Mi (5%)       300Mi (8%)
  openshift-sdn              sdn-xhpvb     100m (10%)    0 (0%)      200Mi (5%)       0 (0%)
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  CPU Requests  CPU Limits  Memory Requests  Memory Limits
  ------------  ----------  ---------------  -------------
  200m (20%)    200m (20%)  400Mi (11%)      300Mi (8%)
Events:         <none>

Logs and inventory will be attached.

Expected results:

Additional info:
Please attach logs from ansible-playbook with the -vvv flag

Comment 3 Gan Huang 2018-04-10 09:18:04 UTC
# oc get pods --all-namespaces
NAMESPACE               NAME                                               READY     STATUS             RESTARTS   AGE
default                 docker-registry-1-deploy                           0/1       Pending            0          5h
default                 registry-console-1-deploy                          0/1       Pending            0          5h
default                 router-1-deploy                                    0/1       Pending            0          5h
kube-system             master-api-qe-ghuangatomic-master-etcd-1           1/1       Running            0          5h
kube-system             master-controllers-qe-ghuangatomic-master-etcd-1   1/1       Running            0          5h
kube-system             master-etcd-qe-ghuangatomic-master-etcd-1          1/1       Running            0          5h
openshift-node          sync-2zcjc                                         1/1       Running            0          5h
openshift-node          sync-bmln8                                         1/1       Running            0          5h
openshift-sdn           ovs-cqkhs                                          1/1       Running            0          5h
openshift-sdn           ovs-s6nzz                                          1/1       Running            0          5h
openshift-sdn           sdn-ml44w                                          0/1       CrashLoopBackOff   69         5h
openshift-sdn           sdn-xhpvb                                          0/1       CrashLoopBackOff   64         5h
openshift-web-console   webconsole-6bd4c96bf5-f7j86                        0/1       Pending            0          5h
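
For triage, a listing like the one above can be filtered down to the pods that are not fully ready. The following is a minimal Python sketch, not part of the original report: the function name `unhealthy_pods` is hypothetical, and it assumes the whitespace-separated column layout that `oc get pods --all-namespaces` prints here.

```python
from typing import List, Tuple


def unhealthy_pods(table: str) -> List[Tuple[str, str, str]]:
    """Return (namespace, name, status) for pods that are not fully ready.

    Assumes the first non-empty line is the header row and that no
    column value contains whitespace.
    """
    lines = [ln for ln in table.splitlines() if ln.strip()]
    if not lines:
        return []
    header = lines[0].split()
    idx = {col: header.index(col)
           for col in ("NAMESPACE", "NAME", "READY", "STATUS")}
    bad = []
    for ln in lines[1:]:
        cols = ln.split()
        ready, total = cols[idx["READY"]].split("/")
        status = cols[idx["STATUS"]]
        # Anything other than Running with all containers ready is reported.
        # Note that this also reports pods in the Completed state.
        if status != "Running" or ready != total:
            bad.append((cols[idx["NAMESPACE"]], cols[idx["NAME"]], status))
    return bad


# A few rows copied from the output in this comment.
SAMPLE = """\
NAMESPACE               NAME                                               READY     STATUS             RESTARTS   AGE
default                 router-1-deploy                                    0/1       Pending            0          5h
openshift-sdn           ovs-cqkhs                                          1/1       Running            0          5h
openshift-sdn           sdn-ml44w                                          0/1       CrashLoopBackOff   69         5h
"""

if __name__ == "__main__":
    for namespace, name, status in unhealthy_pods(SAMPLE):
        print(namespace, name, status)
```

In this report the pods of interest are the two sdn pods in CrashLoopBackOff; the Pending pods are most likely waiting on nodes that never become Ready.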

# oc logs sdn-ml44w -n openshift-sdn
User "sa" set.
Context "default/qe-ghuangatomic-master-etcd-1:8443/system:admin" modified.
I0410 09:05:38.532034       1 start_node.go:310] Reading node configuration from /etc/origin/node/node-config.yaml
F0410 09:05:38.532198       1 start_node.go:162] open /etc/origin/node/node-config.yaml: no such file or directory
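
The last two log lines above use the glog-style header (severity letter, month and day, time, thread ID, source file and line). As a rough sketch for pulling the fatal entries out of a longer log, assuming that header layout, something like the following Python could be used. The names `LogLine`, `parse_line`, and `fatal_messages` are hypothetical and not part of any OpenShift tooling.

```python
import re
from typing import List, NamedTuple, Optional


class LogLine(NamedTuple):
    severity: str  # one of I, W, E, F
    source: str    # source file and line, e.g. "start_node.go:162"
    message: str


# Header layout: Lmmdd hh:mm:ss.uuuuuu threadid file:line] msg
_HEADER = re.compile(
    r"^(?P<sev>[IWEF])\d{4} \d{2}:\d{2}:\d{2}\.\d{6}\s+"
    r"\d+ (?P<src>[^\]]+)\] (?P<msg>.*)$"
)


def parse_line(line: str) -> Optional[LogLine]:
    """Parse one log line; return None if it has no glog-style header."""
    m = _HEADER.match(line)
    if m is None:
        return None
    return LogLine(m.group("sev"), m.group("src"), m.group("msg"))


def fatal_messages(log: str) -> List[LogLine]:
    """Return every entry logged at fatal (F) severity."""
    parsed = (parse_line(ln) for ln in log.splitlines())
    return [p for p in parsed if p is not None and p.severity == "F"]


# Lines copied from the sdn pod log in this comment.
SAMPLE = """\
User "sa" set.
I0410 09:05:38.532034       1 start_node.go:310] Reading node configuration from /etc/origin/node/node-config.yaml
F0410 09:05:38.532198       1 start_node.go:162] open /etc/origin/node/node-config.yaml: no such file or directory
"""

if __name__ == "__main__":
    for entry in fatal_messages(SAMPLE):
        print(entry.source, entry.message)
```

Lines without the header, such as `User "sa" set.`, are skipped rather than treated as errors.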

From the logs, this looks like a duplicate of BZ#1564809; closing now.

*** This bug has been marked as a duplicate of bug 1564809 ***