1568585 – CRI-O Install fails - Failed to start node service

Summary: CRI-O Install fails - Failed to start node service

Keywords:
Status:	CLOSED DUPLICATE of bug 1564805
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	Installer
Sub Component:
Version:	3.9.0
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	high
Target Milestone:	---
Target Release:	3.9.z
Assignee:	Scott Dodson
QA Contact:	Vikas Laad
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2018-04-17 20:21 UTC by Vikas Laad
Modified:	2018-04-18 20:47 UTC
CC List:	3 users
Fixed In Version:
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:	2018-04-18 20:47:20 UTC
Target Upstream Version:
Embargoed:

Attachments
ansible log with -vvv (2.29 MB, text/plain) 2018-04-17 20:25 UTC, Vikas Laad	no flags

Description Vikas Laad 2018-04-17 20:21:35 UTC

Description of problem:
The install fails when I run the deploy_cluster playbook from the release 3.9 branch with the attached inventory. The node service fails to start; journal output from the failing node follows.

-- Unit run-7075.scope has begun starting up.
Apr 17 20:19:06 ip-172-31-18-92.us-west-2.compute.internal atomic-openshift-node[7062]: I0417 20:19:06.802162    7062 mount_linux.go:210] Detected OS with systemd
Apr 17 20:19:06 ip-172-31-18-92.us-west-2.compute.internal atomic-openshift-node[7062]: W0417 20:19:06.802293    7062 cni.go:171] Unable to update cni config: No networks found in /etc/cni/net.d
Apr 17 20:19:06 ip-172-31-18-92.us-west-2.compute.internal atomic-openshift-node[7062]: I0417 20:19:06.805094    7062 node.go:294] Starting openshift-sdn network plugin
Apr 17 20:19:06 ip-172-31-18-92.us-west-2.compute.internal atomic-openshift-node[7062]: I0417 20:19:06.807020    7062 server.go:236] Version: v1.9.1+a0ce1bc657
Apr 17 20:19:06 ip-172-31-18-92.us-west-2.compute.internal atomic-openshift-node[7062]: I0417 20:19:06.807069    7062 feature_gate.go:220] feature gates: &{{} map[]}
Apr 17 20:19:06 ip-172-31-18-92.us-west-2.compute.internal atomic-openshift-node[7062]: W0417 20:19:06.807151    7062 server.go:334] --require-kubeconfig is deprecated. Set --kubeconfig without using --require-kubeconfig.
Apr 17 20:19:06 ip-172-31-18-92.us-west-2.compute.internal atomic-openshift-node[7062]: I0417 20:19:06.807280    7062 aws.go:1000] Building AWS cloudprovider
Apr 17 20:19:06 ip-172-31-18-92.us-west-2.compute.internal atomic-openshift-node[7062]: W0417 20:19:06.827039    7062 subnets.go:55] Could not find an allocated subnet for node: ip-172-31-18-92.us-west-2.compute.internal, Waiting...
Apr 17 20:19:06 ip-172-31-18-92.us-west-2.compute.internal atomic-openshift-node[7062]: I0417 20:19:06.943091    7062 tags.go:76] AWS cloud filtering on ClusterID: vlaad-3922
Apr 17 20:19:06 ip-172-31-18-92.us-west-2.compute.internal atomic-openshift-node[7062]: I0417 20:19:06.943154    7062 server.go:359] Successfully initialized cloud provider: "aws" from the config file: "/etc/origin/cloudprovider/aws.conf"
Apr 17 20:19:06 ip-172-31-18-92.us-west-2.compute.internal atomic-openshift-node[7062]: I0417 20:19:06.943168    7062 server.go:596] cloud provider determined current node name to be ip-172-31-18-92.us-west-2.compute.internal
Apr 17 20:19:06 ip-172-31-18-92.us-west-2.compute.internal atomic-openshift-node[7062]: I0417 20:19:06.945632    7062 manager.go:151] cAdvisor running in container: "/sys/fs/cgroup/cpu,cpuacct/system.slice/atomic-openshift-node.service"
Apr 17 20:19:06 ip-172-31-18-92.us-west-2.compute.internal atomic-openshift-master-api[18026]: I0417 20:19:06.949168   18026 get.go:238] Starting watch for /api/v1/pods, rv=896 labels= fields=spec.nodeName=ip-172-31-33-104.us-west-2.compute.internal tim
Apr 17 20:19:06 ip-172-31-18-92.us-west-2.compute.internal atomic-openshift-master-api[18026]: I0417 20:19:06.949831   18026 get.go:238] Starting watch for /api/v1/services, rv=896 labels= fields= timeout=7m38s
Apr 17 20:19:06 ip-172-31-18-92.us-west-2.compute.internal atomic-openshift-master-api[18026]: I0417 20:19:06.949926   18026 get.go:238] Starting watch for /api/v1/nodes, rv= labels= fields=metadata.name=ip-172-31-33-104.us-west-2.compute.internal timeo
Apr 17 20:19:06 ip-172-31-18-92.us-west-2.compute.internal atomic-openshift-node[7062]: I0417 20:19:06.966258    7062 fs.go:140] Filesystem UUIDs: map[de4def96-ff72-4eb9-ad5e-0847257d1866:/dev/xvda2 fd645616-08ab-4598-b9b8-e467fa5353e9:/dev/dm-0]
Apr 17 20:19:06 ip-172-31-18-92.us-west-2.compute.internal atomic-openshift-node[7062]: I0417 20:19:06.966282    7062 fs.go:141] Filesystem partitions: map[/dev/xvda2:{mountpoint:/ major:202 minor:2 fsType:xfs blockSize:0} /dev/mapper/docker_vg-lvol0:{m
Apr 17 20:19:06 ip-172-31-18-92.us-west-2.compute.internal atomic-openshift-node[7062]: I0417 20:19:06.970300    7062 manager.go:225] Machine: {NumCores:4 CpuFrequency:2300190 MemoryCapacity:16656269312 HugePages:[{PageSize:1048576 NumPages:0} {PageSize
Apr 17 20:19:06 ip-172-31-18-92.us-west-2.compute.internal atomic-openshift-node[7062]: I0417 20:19:06.972205    7062 manager.go:231] Version: {KernelVersion:3.10.0-862.el7.x86_64 ContainerOsVersion:Red Hat Enterprise Linux Server 7.4 (Maipo) DockerVers
Apr 17 20:19:06 ip-172-31-18-92.us-west-2.compute.internal atomic-openshift-node[7062]: I0417 20:19:06.972874    7062 server.go:482] --cgroups-per-qos enabled, but --cgroup-root was not specified.  defaulting to /
Apr 17 20:19:06 ip-172-31-18-92.us-west-2.compute.internal atomic-openshift-node[7062]: I0417 20:19:06.973797    7062 container_manager_linux.go:242] container manager verified user specified cgroup-root exists: /
Apr 17 20:19:06 ip-172-31-18-92.us-west-2.compute.internal atomic-openshift-node[7062]: I0417 20:19:06.973823    7062 container_manager_linux.go:247] Creating Container Manager object based on Node Config: {RuntimeCgroupsName: SystemCgroupsName: Kubelet
Apr 17 20:19:06 ip-172-31-18-92.us-west-2.compute.internal atomic-openshift-node[7062]: I0417 20:19:06.973955    7062 container_manager_linux.go:266] Creating device plugin manager: false
Apr 17 20:19:06 ip-172-31-18-92.us-west-2.compute.internal atomic-openshift-node[7062]: I0417 20:19:06.974010    7062 server.go:596] cloud provider determined current node name to be ip-172-31-18-92.us-west-2.compute.internal
Apr 17 20:19:06 ip-172-31-18-92.us-west-2.compute.internal atomic-openshift-node[7062]: I0417 20:19:06.974032    7062 server.go:757] Using root directory: /var/lib/origin/openshift.local.volumes
Apr 17 20:19:06 ip-172-31-18-92.us-west-2.compute.internal atomic-openshift-node[7062]: I0417 20:19:06.974084    7062 kubelet.go:401] cloud provider determined current node name to be ip-172-31-18-92.us-west-2.compute.internal
Apr 17 20:19:06 ip-172-31-18-92.us-west-2.compute.internal atomic-openshift-node[7062]: I0417 20:19:06.974117    7062 kubelet.go:314] Watching apiserver
Apr 17 20:19:06 ip-172-31-18-92.us-west-2.compute.internal atomic-openshift-node[7062]: W0417 20:19:06.979375    7062 kubelet_network.go:132] Hairpin mode set to "promiscuous-bridge" but container runtime is "remote", ignoring
Apr 17 20:19:06 ip-172-31-18-92.us-west-2.compute.internal atomic-openshift-node[7062]: I0417 20:19:06.979398    7062 kubelet.go:572] Hairpin mode set to "none"
Apr 17 20:19:06 ip-172-31-18-92.us-west-2.compute.internal atomic-openshift-node[7062]: W0417 20:19:06.979456    7062 cni.go:171] Unable to update cni config: No networks found in /etc/cni/net.d
Apr 17 20:19:06 ip-172-31-18-92.us-west-2.compute.internal atomic-openshift-node[7062]: I0417 20:19:06.979469    7062 plugins.go:190] Loaded network plugin "cni"
Apr 17 20:19:06 ip-172-31-18-92.us-west-2.compute.internal atomic-openshift-master-api[18026]: I0417 20:19:06.977388   18026 get.go:238] Starting watch for /api/v1/pods, rv=896 labels= fields=spec.nodeName=ip-172-31-18-92.us-west-2.compute.internal time
Apr 17 20:19:06 ip-172-31-18-92.us-west-2.compute.internal atomic-openshift-master-api[18026]: I0417 20:19:06.978318   18026 get.go:238] Starting watch for /api/v1/nodes, rv= labels= fields=metadata.name=ip-172-31-18-92.us-west-2.compute.internal timeou
Apr 17 20:19:06 ip-172-31-18-92.us-west-2.compute.internal atomic-openshift-master-api[18026]: I0417 20:19:06.979156   18026 get.go:238] Starting watch for /api/v1/services, rv=896 labels= fields= timeout=9m39s
Apr 17 20:19:06 ip-172-31-18-92.us-west-2.compute.internal atomic-openshift-node[7062]: I0417 20:19:06.979529    7062 remote_runtime.go:43] Connecting to runtime service /var/run/crio/crio.sock
Apr 17 20:19:06 ip-172-31-18-92.us-west-2.compute.internal atomic-openshift-node[7062]: W0417 20:19:06.979555    7062 util_unix.go:75] Using "/var/run/crio/crio.sock" as endpoint is deprecated, please consider using full url format "unix:///var/run/crio
Apr 17 20:19:06 ip-172-31-18-92.us-west-2.compute.internal atomic-openshift-node[7062]: W0417 20:19:06.979630    7062 util_unix.go:75] Using "/var/run/crio/crio.sock" as endpoint is deprecated, please consider using full url format "unix:///var/run/crio
Apr 17 20:19:06 ip-172-31-18-92.us-west-2.compute.internal atomic-openshift-node[7062]: E0417 20:19:06.979848    7062 remote_runtime.go:69] Version from runtime service failed: rpc error: code = Unavailable desc = grpc: the connection is unavailable
Apr 17 20:19:06 ip-172-31-18-92.us-west-2.compute.internal atomic-openshift-node[7062]: E0417 20:19:06.979898    7062 kuberuntime_manager.go:172] Get runtime version failed: rpc error: code = Unavailable desc = grpc: the connection is unavailable
Apr 17 20:19:06 ip-172-31-18-92.us-west-2.compute.internal atomic-openshift-node[7062]: F0417 20:19:06.979912    7062 server.go:173] failed to run Kubelet: failed to create kubelet: rpc error: code = Unavailable desc = grpc: the connection is unavailabl
Apr 17 20:19:06 ip-172-31-18-92.us-west-2.compute.internal systemd[1]: atomic-openshift-node.service: main process exited, code=exited, status=255/n/a
Apr 17 20:19:06 ip-172-31-18-92.us-west-2.compute.internal dnsmasq[29124]: setting upstream servers from DBus
Apr 17 20:19:06 ip-172-31-18-92.us-west-2.compute.internal dnsmasq[29124]: using nameserver 172.31.0.2#53
Apr 17 20:19:06 ip-172-31-18-92.us-west-2.compute.internal dbus[27747]: [system] Rejected send message, 0 matched rules; type="method_return", sender=":1.11" (uid=0 pid=29124 comm="/usr/sbin/dnsmasq -k ") interface="(unset)" member="(unset)" error name=
Apr 17 20:19:06 ip-172-31-18-92.us-west-2.compute.internal systemd[1]: Failed to start OpenShift Node.
-- Subject: Unit atomic-openshift-node.service has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- Unit atomic-openshift-node.service has failed.
-- 
-- The result is failed.
Apr 17 20:19:06 ip-172-31-18-92.us-west-2.compute.internal systemd[1]: Unit atomic-openshift-node.service entered failed state.
Apr 17 20:19:06 ip-172-31-18-92.us-west-2.compute.internal systemd[1]: atomic-openshift-node.service failed.
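
The fatal message in the log above is the kubelet failing to reach the runtime service at /var/run/crio/crio.sock. A minimal sketch, assuming Python 3 is available on the node, for checking whether anything is listening on that socket. The function name is hypothetical and not part of any OpenShift tooling, and the check only covers the socket transport, not the gRPC version call that the kubelet makes.

```python
import socket


def unix_socket_accepts_connections(path, timeout=2.0):
    """Return True if something is listening on the Unix socket at `path`."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect(path)
        return True
    except OSError:
        # Covers a missing socket file, a refused connection, and a timeout.
        return False
    finally:
        sock.close()


if __name__ == "__main__":
    # Socket path taken from the kubelet log above.
    crio_socket = "/var/run/crio/crio.sock"
    reachable = unix_socket_accepts_connections(crio_socket)
    print(crio_socket, "reachable:", reachable)
```

A False result would be consistent with the "connection is unavailable" errors, but it does not by itself say why the runtime was not listening.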

Version-Release number of the following components:
openshift-ansible head is 9c78eedfcc5fabfd44703fd2dafa1ba1e1b071d8

rpm -q ansible
ansible-2.4.3.0-1.el7ae.noarch

ansible --version
ansible 2.4.3.0
  config file = /root/openshift-ansible/ansible.cfg
  configured module search path = [u'/root/.ansible/plugins/modules', u'/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/lib/python2.7/site-packages/ansible
  executable location = /usr/bin/ansible
  python version = 2.7.5 (default, Feb 20 2018, 09:19:12) [GCC 4.8.5 20150623 (Red Hat 4.8.5-28)]

How reproducible:
Reproducible when the container runtime is CRI-O.

Steps to Reproduce:
1. Run the deploy_cluster playbook with the attached inventory.

Actual results:
Please include the entire output from the last TASK line through the end of output if an error is generated

Expected results:
Playbook should complete.

Additional info:
Attached.

Comment 2 Vikas Laad 2018-04-17 20:25:43 UTC

Created attachment 1423251
ansible log with -vvv

Comment 7 Scott Dodson 2018-04-18 20:47:20 UTC


*** This bug has been marked as a duplicate of bug 1564805 ***
