Bug 1377175

Summary: Atomic-openshift-node service fails to start due to missing config file
Product: OpenShift Container Platform
Reporter: Bhaskarakiran <byarlaga>
Component: Installer
Assignee: Devan Goodwin <dgoodwin>
Status: CLOSED INSUFFICIENT_DATA
QA Contact: Johnny Liu <jialiu>
Severity: medium
Docs Contact:
Priority: medium
Version: 3.3.0
CC: aos-bugs, byarlaga, jokerman, mmccomas, mzywusko, torres.paul
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-10-27 14:22:42 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:

Description Bhaskarakiran 2016-09-19 05:43:48 UTC
Description of problem:
=======================

While setting up an OpenShift 3.3 cluster, the atomic-openshift-node service fails to start because the /etc/origin/node/node-config.yaml file is missing.

[root@dhcp41-215 ~]# systemctl start atomic-openshift-node
Job for atomic-openshift-node.service failed because the control process exited with error code. See "systemctl status atomic-openshift-node.service" and "journalctl -xe" for details.
[root@dhcp41-215 ~]# 
[root@dhcp41-215 ~]# systemctl status atomic-openshift-node
● atomic-openshift-node.service - Atomic OpenShift Node
   Loaded: loaded (/usr/lib/systemd/system/atomic-openshift-node.service; disabled; vendor preset: disabled)
   Active: activating (auto-restart) (Result: exit-code) since Mon 2016-09-19 11:12:25 IST; 480ms ago
     Docs: https://github.com/openshift/origin
  Process: 31473 ExecStart=/usr/bin/openshift start node --config=${CONFIG_FILE} $OPTIONS (code=exited, status=255)
 Main PID: 31473 (code=exited, status=255)

Sep 19 11:12:25 dhcp41-215.lab.eng.blr.redhat.com systemd[1]: Failed to start Atomic OpenShift Node.
Sep 19 11:12:25 dhcp41-215.lab.eng.blr.redhat.com systemd[1]: Unit atomic-openshift-node.service entered failed state.
Sep 19 11:12:25 dhcp41-215.lab.eng.blr.redhat.com systemd[1]: atomic-openshift-node.service failed.
[root@dhcp41-215 ~]# rpm -qa |grep atomic-openshift-node
tuned-profiles-atomic-openshift-node-3.3.0.31-1.git.0.aede597.el7.x86_64
atomic-openshift-node-3.3.0.31-1.git.0.aede597.el7.x86_64
[root@dhcp41-215 ~]# 
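
The ExecStart line in the status output passes --config=${CONFIG_FILE}, so it can help to confirm which path that variable resolves to and whether the file exists. The following is a minimal sketch, not part of the original report. It assumes bash, and it assumes CONFIG_FILE is set as a plain KEY=value line in an environment file such as /etc/sysconfig/atomic-openshift-node (the package ships that file, but its contents are not shown here). The function names config_path_from_env and check_node_config are hypothetical.

```shell
#!/usr/bin/env bash

# config_path_from_env: print the value of CONFIG_FILE from a KEY=value
# environment file. Hypothetical helper; assumes a plain assignment line.
config_path_from_env() {
    local env_file=$1
    # Keep the last uncommented CONFIG_FILE= assignment, strip optional quotes.
    sed -n 's/^[[:space:]]*CONFIG_FILE=//p' "$env_file" | tail -n 1 | tr -d "\"'"
}

# check_node_config: report whether the resolved config file exists.
# Returns 0 if present, 1 if missing, 2 if CONFIG_FILE is not set.
check_node_config() {
    local env_file=$1 cfg
    cfg=$(config_path_from_env "$env_file")
    if [ -z "$cfg" ]; then
        echo "CONFIG_FILE is not set in $env_file"
        return 2
    elif [ -f "$cfg" ]; then
        echo "config present: $cfg"
    else
        echo "config missing: $cfg"
        return 1
    fi
}
```

A possible invocation on the affected host would be: check_node_config /etc/sysconfig/atomic-openshift-node. Running "systemctl cat atomic-openshift-node" shows the unit file, including any environment files it actually reads.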

The RPMs were picked up from:

http://download-node-02.eng.bos.redhat.com/rcm-guest/puddles/RHAOS/AtomicOpenShift-errata/3.3/latest/

Version-Release number of selected component (if applicable):
=============================================================
3.3

How reproducible:
================
100%

Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Comment 1 Bhaskarakiran 2016-09-19 05:46:34 UTC
[root@dhcp41-215 ~]# rpm -ql atomic-openshift-node
/etc/origin/.config_managed
/etc/origin/node
/etc/origin/node/node-config.yaml
/etc/sysconfig/atomic-openshift-node
/etc/systemd/system.conf.d/origin-accounting.conf
/usr/lib/systemd/system/atomic-openshift-node.service
[root@dhcp41-215 ~]# ls -l /etc/origin/node/node-config.yaml
ls: cannot access /etc/origin/node/node-config.yaml: No such file or directory
[root@dhcp41-215 ~]#
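
The mismatch above (rpm -ql lists node-config.yaml, but ls cannot find it) can be checked for every packaged path at once. The following is a minimal sketch, not part of the original report. It assumes bash, and report_missing is a hypothetical helper name, not part of any OpenShift or RPM tooling.

```shell
#!/usr/bin/env bash

# report_missing: read one path per line on stdin and print every path
# that does not exist on disk. Hypothetical helper for illustration.
report_missing() {
    local path
    while IFS= read -r path; do
        # Skip blank lines; -e is true for files and directories alike.
        [ -z "$path" ] && continue
        [ -e "$path" ] || printf 'MISSING: %s\n' "$path"
    done
}
```

A possible invocation would be: rpm -ql atomic-openshift-node | report_missing. The built-in "rpm -V atomic-openshift-node" also reports files that the package database expects but that are absent. Note that rpm -ql lists %ghost entries as well, which a package owns but does not install; whether that applies to node-config.yaml is not established in this report.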

Comment 2 Bhaskarakiran 2016-09-19 08:55:35 UTC
I am not trying to start the service manually. The same error occurs when running atomic-openshift-installer.

Comment 4 Scott Dodson 2016-10-11 15:11:41 UTC
Can you provide the logs from the install process?

Comment 5 Devan Goodwin 2016-10-14 13:08:50 UTC
I'm afraid we need more information here. Can you attach the full Ansible install log, the inventory file, and any known steps to reproduce?

Comment 6 Paul Torres 2017-04-26 22:35:09 UTC
Hello! I have the same issue. Here is the output of "journalctl -xe":



-- Unit atomic-openshift-node.service has failed.
-- 
-- The result is failed.
Apr 26 18:33:39 instance-test.localdomain systemd[1]: Unit atomic-openshift-node.service entered failed state.
Apr 26 18:33:39 instance-test.localdomain systemd[1]: atomic-openshift-node.service failed.
Apr 26 18:33:44 instance-test.localdomain systemd[1]: atomic-openshift-node.service holdoff time over, scheduling restart.
Apr 26 18:33:45 instance-test.localdomain systemd[1]: Starting Atomic OpenShift Node...
-- Subject: Unit atomic-openshift-node.service has begun start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- Unit atomic-openshift-node.service has begun starting up.
Apr 26 18:33:45 instance-test.localdomain atomic-openshift-node[38371]: I0426 18:33:45.169779   38371 plugins.go:71] No cloud provider specified.
Apr 26 18:33:45 instance-test.localdomain atomic-openshift-node[38371]: I0426 18:33:45.169840   38371 common.go:54] Starting with configured hostname 'instance-test' (IP
Apr 26 18:33:45 instance-test.localdomain atomic-openshift-node[38371]: I0426 18:33:45.169997   38371 common.go:79] Initializing single-tenant plugin for instance-test (
Apr 26 18:33:45 instance-test.localdomain atomic-openshift-node[38371]: I0426 18:33:45.170033   38371 start_node.go:288] Starting node instance-test (v3.2.1.30)
Apr 26 18:33:45 instance-test.localdomain atomic-openshift-node[38371]: I0426 18:33:45.170913   38371 start_node.go:297] Connecting to API server https://instance-test:8
Apr 26 18:33:45 instance-test.localdomain docker-current[6356]: time="2017-04-26T18:33:45.173086358-04:00" level=info msg="{Action=_ping, LoginUID=4294967295, PID=38371}
Apr 26 18:33:45 instance-test.localdomain atomic-openshift-node[38371]: I0426 18:33:45.173932   38371 node.go:131] Connecting to Docker at unix:///var/run/docker.sock
Apr 26 18:33:45 instance-test.localdomain atomic-openshift-node[38371]: I0426 18:33:45.198732   38371 manager.go:132] cAdvisor running in container: "/"
Apr 26 18:33:45 instance-test.localdomain atomic-openshift-node[38371]: W0426 18:33:45.230419   38371 subnets.go:112] Could not find an allocated subnet for node: instan
Apr 26 18:33:45 instance-test.localdomain atomic-openshift-node[38371]: I0426 18:33:45.325207   38371 fs.go:109] Filesystem partitions: map[/dev/vda1:{mountpoint:/ major
Apr 26 18:33:45 instance-test.localdomain atomic-openshift-node[38371]: E0426 18:33:45.327181   38371 fs.go:264] Stat fs failed. Error: Invalid dmsetup status output: 0 
Apr 26 18:33:45 instance-test.localdomain atomic-openshift-node[38371]: W0426 18:33:45.731872   38371 subnets.go:112] Could not find an allocated subnet for node: instan
Apr 26 18:33:46 instance-test.localdomain atomic-openshift-node[38371]: W0426 18:33:46.233008   38371 subnets.go:112] Could not find an allocated subnet for node: instan
Apr 26 18:33:46 instance-test.localdomain atomic-openshift-node[38371]: W0426 18:33:46.734254   38371 subnets.go:112] Could not find an allocated subnet for node: instan
Apr 26 18:33:47 instance-test.localdomain atomic-openshift-node[38371]: W0426 18:33:47.235628   38371 subnets.go:112] Could not find an allocated subnet for node: instan
Apr 26 18:33:47 instance-test.localdomain atomic-openshift-node[38371]: W0426 18:33:47.736966   38371 subnets.go:112] Could not find an allocated subnet for node: instan
Apr 26 18:33:48 instance-test.localdomain atomic-openshift-node[38371]: I0426 18:33:48.079182   38371 manager.go:169] Machine: {NumCores:4 CpuFrequency:2394454 MemoryCap
Apr 26 18:33:48 instance-test.localdomain atomic-openshift-node[38371]: I0426 18:33:48.104801   38371 manager.go:175] Version: {KernelVersion:3.10.0-327.10.1.el7.x86_64 
Apr 26 18:33:48 instance-test.localdomain atomic-openshift-node[38371]: I0426 18:33:48.105445   38371 server.go:344] Using root directory: /var/lib/origin/openshift.loca
Apr 26 18:33:48 instance-test.localdomain atomic-openshift-node[38371]: I0426 18:33:48.105601   38371 server.go:683] Watching apiserver
Apr 26 18:33:48 instance-test.localdomain atomic-openshift-node[38371]: I0426 18:33:48.129869   38371 plugins.go:127] Loaded network plugin "redhat/openshift-ovs-subnet"
Apr 26 18:33:48 instance-test.localdomain atomic-openshift-node[38371]: I0426 18:33:48.129907   38371 kubelet.go:380] Hairpin mode set to "none"
Apr 26 18:33:48 instance-test.localdomain atomic-openshift-node[38371]: W0426 18:33:48.241143   38371 subnets.go:112] Could not find an allocated subnet for node: instan
Apr 26 18:33:48 instance-test.localdomain atomic-openshift-node[38371]: I0426 18:33:48.250646   38371 manager.go:212] Setting dockerRoot to /var/lib/docker
Apr 26 18:33:48 instance-test.localdomain atomic-openshift-node[38371]: I0426 18:33:48.250667   38371 plugins.go:56] Registering credential provider: .dockercfg
Apr 26 18:33:48 instance-test.localdomain atomic-openshift-node[38371]: W0426 18:33:48.742438   38371 subnets.go:112] Could not find an allocated subnet for node: instan
Apr 26 18:33:49 instance-test.localdomain atomic-openshift-node[38371]: W0426 18:33:49.243748   38371 subnets.go:112] Could not find an allocated subnet for node: instan
Apr 26 18:33:49 instance-test.localdomain atomic-openshift-node[38371]: W0426 18:33:49.744993   38371 subnets.go:112] Could not find an allocated subnet for node: insta
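
In the excerpt above, one warning from subnets.go:112 repeats about every half second and crowds out the other messages. Grouping journal lines by severity and source location makes such repeats easier to spot. The following is a minimal sketch, not part of the original report. It assumes bash and awk, and it assumes the line layout seen above (severity letter plus MMDD, time, PID, then file:line followed by a closing bracket). The function name summarize_glog is hypothetical.

```shell
#!/usr/bin/env bash

# summarize_glog: read journal lines on stdin and print
# "<severity> <file:line> <count>" for each distinct message origin.
# Lines without the assumed header layout are ignored.
summarize_glog() {
    awk '
        # Header layout: severity+MMDD, time, PID, then "file:line]".
        match($0, /[IWEF][0-9][0-9][0-9][0-9] [0-9:.]+ +[0-9]+ [^ ]+:[0-9]+\]/) {
            hdr = substr($0, RSTART, RLENGTH)
            n = split(hdr, f, / +/)
            loc = f[n]
            sub(/\]$/, "", loc)              # drop the trailing bracket
            count[substr(hdr, 1, 1) " " loc]++
        }
        END { for (k in count) print k, count[k] }
    ' | sort
}
```

A possible invocation would be: journalctl -u atomic-openshift-node --no-pager | summarize_glog. The summary only counts messages; it does not explain them, so the full install log is still needed to diagnose the failure.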