Bug 1510358 - [3.9] Setting multiple values for OPTIONS with containerized installation fails to start node
Status: CLOSED CURRENTRELEASE
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 3.6.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: high
Target Milestone: ---
Target Release: 3.9.0
Assignee: Fabian von Feilitzsch
QA Contact: Johnny Liu
Blocks: 1539091 1539094
 
Reported: 2017-11-07 09:37 UTC by Kenjiro Nakayama
Modified: 2018-06-18 18:27 UTC (History)
7 users

Fixed In Version: openshift-ansible-3.9.0-0.31.0.git.0.e0a0ad8.el7
Doc Type: Bug Fix
Doc Text:
The OPTIONS value in /etc/sysconfig/atomic-openshift-node now works properly when multiple options are specified with containerized installations.
Clones: 1539091 1539094
Last Closed: 2018-06-18 18:01:14 UTC



Description Kenjiro Nakayama 2017-11-07 09:37:03 UTC
Description of problem:
---
- When multiple options are configured via openshift_node_env_vars, e.g. `openshift_node_env_vars={"OPTIONS": "--loglevel=3 --disable=proxy"}`, the installation fails because the OpenShift Node service fails to start.
- More precisely, whenever multiple options are set in OPTIONS in /etc/sysconfig/atomic-openshift-node, the containerized OpenShift Node fails to start.
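
The failure mode comes down to argument quoting, which can be illustrated with a minimal shell sketch (an illustration only, not the actual service unit or node code): whether the OPTIONS value is word-split decides if the node binary receives two flags or one malformed flag.

```shell
# Minimal quoting sketch (illustration only, not the real unit file).
OPTIONS="--loglevel=3 --disable=proxy"

# Unquoted expansion: the shell word-splits the value into two
# separate arguments, which a flag parser handles fine.
set -- $OPTIONS
echo "unquoted: $# arguments"

# Quoted expansion: the whole value arrives as ONE argument, which a
# flag parser then rejects (compare the journal error further below).
set -- "$OPTIONS"
echo "quoted: $# arguments"
```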

Version-Release number of the following components:
---
  # openshift version
  openshift v3.6.173.0.49
  kubernetes v1.6.1+5115d708d7
  etcd 3.2.1

  (installer)
  ansible-2.3.1.0-3.el7.noarch
  openshift-ansible-3.6.173.0.48-1.git.0.1609d30.el7.noarch

How reproducible: 100%

Steps to Reproduce:
---
1. Configure the following variables in the Ansible inventory file.

~~~
[OSEv3:vars]
containerized=true
openshift_node_env_vars={"OPTIONS": "--loglevel=3 --disable=proxy"}
~~~

2. Start the installation
  # ansible-playbook     /usr/share/ansible/openshift-ansible/playbooks/byo/config.yml

Actual results:
---

The installation fails with the following error:

Ansible:
~~~
  1. Hosts:    knakayam-ose36-master1.example.com
     Play:     Configure containerized nodes
     Task:     restart node
     Message:  Unable to restart service atomic-openshift-node: Job for atomic-openshift-node.service failed because the control process exited with error code. See "systemctl status atomic-openshift-node.service" and "journalctl -xe" for details.
~~~

OpenShift Node journal log:
~~~
Nov 07 13:13:37 knakayam-ose36-master1 atomic-openshift-node[27780]: Error: invalid argument "3 --disable=proxy" for --loglevel=3 --disable=proxy: strconv.ParseInt: parsing "3 --disable=proxy": invalid syntax
Nov 07 18:21:35 knakayam-ose36-master1 atomic-openshift-node[6797]: Usage:
Nov 07 18:21:35 knakayam-ose36-master1 atomic-openshift-node[6797]: openshift start node [options]
... snip ...
~~~
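
The 'invalid argument "3 --disable=proxy"' message is consistent with an unsplit OPTIONS string: the parser strips the "--loglevel=" prefix from the single combined token and then tries to read the remainder as an integer. A toy shell re-enactment (hypothetical, not the actual Go parser):

```shell
# Toy re-enactment of the parse failure (not the real node code):
# the whole OPTIONS string arrives as one token, so the "value" of
# --loglevel is everything after "--loglevel=".
arg='--loglevel=3 --disable=proxy'
value=${arg#--loglevel=}

# A numeric flag parser rejects anything containing non-digits.
case $value in
    *[!0-9]*) echo "Error: invalid argument \"$value\" for --loglevel" ;;
    *)        echo "loglevel=$value" ;;
esac
```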

Expected results:
---
- The OpenShift Node starts and the installation finishes successfully.

Additional info:
---
Proposed patch: https://github.com/openshift/origin/pull/17212

Comment 1 Kenjiro Nakayama 2017-11-10 08:00:49 UTC
As for the fix: the customer hit this on OCP 3.6, so I would like to request a backport to 3.6.

Comment 3 Kenjiro Nakayama 2018-01-03 08:43:48 UTC
Is there any update on this fix that I can pass along to the customer?

Comment 4 Scott Dodson 2018-01-25 15:31:42 UTC
The fix via https://github.com/openshift/origin/pull/17212 for 3.8/3.9 needs to be tested and then backported to OSE 3.7 and 3.6 if still necessary.

Comment 5 Johnny Liu 2018-01-29 07:51:28 UTC
Verified this bug with openshift-ansible-3.9.0-0.31.0.git.0.e0a0ad8.el7.noarch, and PASS.


Set openshift_node_env_vars={"OPTIONS": "--loglevel=3 --disable=proxy"} in the inventory host file and trigger a Docker containerized install. The setting is written into /etc/sysconfig/atomic-openshift-node.

# cat /etc/sysconfig/atomic-openshift-node
OPTIONS=--loglevel=3 --disable=proxy
CONFIG_FILE=/etc/origin/node/node-config.yaml
IMAGE_VERSION=v3.9.0

The node service is running well.

BTW, once the above option is set in the node configuration, pods fail to deploy (presumably because --disable=proxy turns off the kube-proxy component, leaving service IPs such as 172.31.0.1 unreachable).

# oc get po
NAME                       READY     STATUS              RESTARTS   AGE
docker-registry-1-deploy   0/1       Error               0          15m
router-1-deploy            0/1       Error               0          16m


# oc logs router-1-deploy
error: couldn't get deployment router-1: Get https://172.31.0.1:443/api/v1/namespaces/default/replicationcontrollers/router-1: dial tcp 172.31.0.1:443: getsockopt: no route to host

# oc logs docker-registry-1-deploy
error: couldn't get deployment docker-registry-1: Get https://172.31.0.1:443/api/v1/namespaces/default/replicationcontrollers/docker-registry-1: dial tcp 172.31.0.1:443: getsockopt: no route to host

