Bug 1510358 - [3.9] Setting multiple values for OPTIONS with containerized installation fails to start node
Status: CLOSED CURRENTRELEASE
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 3.6.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: high
Target Milestone: ---
Target Release: 3.9.0
Assignee: Fabian von Feilitzsch
QA Contact: Johnny Liu
Blocks: 1539091 1539094
 
Reported: 2017-11-07 09:37 UTC by Kenjiro Nakayama
Modified: 2018-06-18 18:27 UTC (History)
7 users

Fixed In Version: openshift-ansible-3.9.0-0.31.0.git.0.e0a0ad8.el7
Doc Type: Bug Fix
Doc Text:
The OPTIONS value in /etc/sysconfig/atomic-openshift-node now works properly when multiple options are specified with containerized installations.
Clones: 1539091 1539094
Last Closed: 2018-06-18 18:01:14 UTC



Description Kenjiro Nakayama 2017-11-07 09:37:03 UTC
Description of problem:
---
- When multiple options are configured via openshift_node_env_vars, e.g. `openshift_node_env_vars={"OPTIONS": "--loglevel=3 --disable=proxy"}`, the installation fails because the OpenShift Node service fails to start.
- More precisely, whenever multiple options are set in OPTIONS in /etc/sysconfig/atomic-openshift-node, the containerized OpenShift Node fails to start.
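
The failure mode comes down to argument quoting, which can be illustrated with a minimal shell sketch (an illustration only, not the actual service unit or node code): whether the OPTIONS value is word-split decides if the node binary receives two flags or one malformed flag.

```shell
# Minimal quoting sketch (illustration only, not the real unit file).
OPTIONS="--loglevel=3 --disable=proxy"

# Unquoted expansion: the shell word-splits the value into two
# separate arguments, which a flag parser handles fine.
set -- $OPTIONS
echo "unquoted: $# arguments"

# Quoted expansion: the whole value arrives as ONE argument, which a
# flag parser then rejects (compare the journal error further below).
set -- "$OPTIONS"
echo "quoted: $# arguments"
```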

Version-Release number of the following components:
---
  # openshift version
  openshift v3.6.173.0.49
  kubernetes v1.6.1+5115d708d7
  etcd 3.2.1

  (installer)
  ansible-2.3.1.0-3.el7.noarch
  openshift-ansible-3.6.173.0.48-1.git.0.1609d30.el7.noarch

How reproducible: 100%

Steps to Reproduce:
---
1. Configure the following variables in the Ansible inventory file.

~~~
[OSEv3:vars]
containerized=true
openshift_node_env_vars={"OPTIONS": "--loglevel=3 --disable=proxy"}
~~~

2. Start the installation
  # ansible-playbook     /usr/share/ansible/openshift-ansible/playbooks/byo/config.yml

Actual results:
---

The installation fails with the following error:

Ansible:
~~~
  1. Hosts:    knakayam-ose36-master1.example.com
     Play:     Configure containerized nodes
     Task:     restart node
     Message:  Unable to restart service atomic-openshift-node: Job for atomic-openshift-node.service failed because the control process exited with error code. See "systemctl status atomic-openshift-node.service" and "journalctl -xe" for details.
~~~

OpenShift Node journal log:
~~~
Nov 07 13:13:37 knakayam-ose36-master1 atomic-openshift-node[27780]: Error: invalid argument "3 --disable=proxy" for --loglevel=3 --disable=proxy: strconv.ParseInt: parsing "3 --disable=proxy": invalid syntax
Nov 07 18:21:35 knakayam-ose36-master1 atomic-openshift-node[6797]: Usage:
Nov 07 18:21:35 knakayam-ose36-master1 atomic-openshift-node[6797]: openshift start node [options]
... snip ...
~~~
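
The 'invalid argument "3 --disable=proxy"' message is consistent with an unsplit OPTIONS string: the parser strips the "--loglevel=" prefix from the single combined token and then tries to read the remainder as an integer. A toy shell re-enactment (hypothetical, not the actual Go parser):

```shell
# Toy re-enactment of the parse failure (not the real node code):
# the whole OPTIONS string arrives as one token, so the "value" of
# --loglevel is everything after "--loglevel=".
arg='--loglevel=3 --disable=proxy'
value=${arg#--loglevel=}

# A numeric flag parser rejects anything containing non-digits.
case $value in
    *[!0-9]*) echo "Error: invalid argument \"$value\" for --loglevel" ;;
    *)        echo "loglevel=$value" ;;
esac
```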

Expected results:
---
- The OpenShift Node starts and the installation finishes successfully.

Additional info:
---
Proposed patch: https://github.com/openshift/origin/pull/17212

Comment 1 Kenjiro Nakayama 2017-11-10 08:00:49 UTC
As for the fix: the customer hit this on OCP 3.6, so I would like to request a backport to 3.6.

Comment 3 Kenjiro Nakayama 2018-01-03 08:43:48 UTC
Is there any update on this fix that I can pass along to the customer?

Comment 4 Scott Dodson 2018-01-25 15:31:42 UTC
The fix via https://github.com/openshift/origin/pull/17212 for 3.8/3.9 needs to be tested and then backported to OSE 3.7 and 3.6 if still necessary.

Comment 5 Johnny Liu 2018-01-29 07:51:28 UTC
Verified this bug with openshift-ansible-3.9.0-0.31.0.git.0.e0a0ad8.el7.noarch, and PASS.


Set openshift_node_env_vars={"OPTIONS": "--loglevel=3 --disable=proxy"} in the inventory host file and trigger a Docker containerized install. The setting is written into /etc/sysconfig/atomic-openshift-node.

# cat /etc/sysconfig/atomic-openshift-node
OPTIONS=--loglevel=3 --disable=proxy
CONFIG_FILE=/etc/origin/node/node-config.yaml
IMAGE_VERSION=v3.9.0

The node service is running well.

BTW, once the above option is set in the node configuration, pods fail to deploy (presumably because --disable=proxy turns off the kube-proxy component, leaving service IPs such as 172.31.0.1 unreachable).

# oc get po
NAME                       READY     STATUS              RESTARTS   AGE
docker-registry-1-deploy   0/1       Error               0          15m
router-1-deploy            0/1       Error               0          16m


# oc logs router-1-deploy
error: couldn't get deployment router-1: Get https://172.31.0.1:443/api/v1/namespaces/default/replicationcontrollers/router-1: dial tcp 172.31.0.1:443: getsockopt: no route to host

# oc logs docker-registry-1-deploy
error: couldn't get deployment docker-registry-1: Get https://172.31.0.1:443/api/v1/namespaces/default/replicationcontrollers/docker-registry-1: dial tcp 172.31.0.1:443: getsockopt: no route to host

