Bug 1510358 - [3.9] Setting multiple values for OPTIONS with containerized installation fails to start node
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 3.6.0
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Target Release: 3.9.0
Assignee: Fabian von Feilitzsch
QA Contact: Johnny Liu
Depends On:
Blocks: 1539091 1539094
Reported: 2017-11-07 09:37 UTC by Kenjiro Nakayama
Modified: 2018-06-18 18:27 UTC

Fixed In Version: openshift-ansible-3.9.0-0.31.0.git.0.e0a0ad8.el7
Doc Type: Bug Fix
Doc Text:
The OPTIONS value in /etc/sysconfig/atomic-openshift-node now works properly when multiple options are specified with containerized installations.
Clone Of:
: 1539091 1539094
Last Closed: 2018-06-18 18:01:14 UTC
Target Upstream Version:

Description Kenjiro Nakayama 2017-11-07 09:37:03 UTC
Description of problem:
- When multiple options are set for OPTIONS in openshift_node_env_vars, for example `openshift_node_env_vars={"OPTIONS": "--loglevel=3 --disable=proxy"}`, the installer fails because the OpenShift node service fails to start.
- More precisely, when OPTIONS in /etc/sysconfig/atomic-openshift-node contains multiple options, the containerized OpenShift node fails to start.

Version-Release number of the following components:
  # openshift version
  openshift v3.
  kubernetes v1.6.1+5115d708d7
  etcd 3.2.1


How reproducible: 100%

Steps to Reproduce:
1. Configure the following variable in the Ansible inventory file:

openshift_node_env_vars={"OPTIONS": "--loglevel=3 --disable=proxy"}

2. Start the installation:
  # ansible-playbook /usr/share/ansible/openshift-ansible/playbooks/byo/config.yml

Actual results:

The installation fails with the following error:

  1. Hosts:    knakayam-ose36-master1.example.com
     Play:     Configure containerized nodes
     Task:     restart node
     Message:  Unable to restart service atomic-openshift-node: Job for atomic-openshift-node.service failed because the control process exited with error code. See "systemctl status atomic-openshift-node.service" and "journalctl -xe" for details.

OpenShift node journal log:
Nov 07 13:13:37 knakayam-ose36-master1 atomic-openshift-node[27780]: Error: invalid argument "3 --disable=proxy" for --loglevel=3 --disable=proxy: strconv.ParseInt: parsing "3 --disable=proxy": invalid syntax
Nov 07 18:21:35 knakayam-ose36-master1 atomic-openshift-node[6797]: Usage:
Nov 07 18:21:35 knakayam-ose36-master1 atomic-openshift-node[6797]: openshift start node [options]
... snip ...
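
For illustration: the journal error above is consistent with the whole OPTIONS value reaching the node binary as a single argument instead of being split on whitespace. The following is a minimal shell sketch of that difference, assuming default word splitting (unmodified IFS). `count_args` is a hypothetical helper, and this does not reproduce the installer's actual unit file or the change in the proposed patch.

```shell
#!/bin/sh
# Hypothetical helper: prints how many arguments it received.
count_args() { echo "$#"; }

# Same value as in the inventory example.
OPTIONS="--loglevel=3 --disable=proxy"

# Quoted expansion: the whole string is passed as ONE argument, so a flag
# parser would see "--loglevel" with the value "3 --disable=proxy".
quoted=$(count_args "$OPTIONS")

# Unquoted expansion: the shell splits on whitespace, giving TWO arguments,
# "--loglevel=3" and "--disable=proxy".
unquoted=$(count_args $OPTIONS)

echo "quoted=$quoted unquoted=$unquoted"
```

With a single option in OPTIONS both forms yield one argument, which would explain why the failure only shows up when multiple options are set.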

Expected results:
- The OpenShift node starts and the installation finishes successfully.

Additional info:
Proposed patch: https://github.com/openshift/origin/pull/17212

Comment 1 Kenjiro Nakayama 2017-11-10 08:00:49 UTC
The customer hit this issue on OCP 3.6, so we would like to request a backport of the fix to 3.6.

Comment 3 Kenjiro Nakayama 2018-01-03 08:43:48 UTC
Is there any update on this fix that we can share with the customer?

Comment 4 Scott Dodson 2018-01-25 15:31:42 UTC
Fix via https://github.com/openshift/origin/pull/17212 for 3.8/3.9 needs to be tested and then backported to ose 3.7 and 3.6 if still necessary.

Comment 5 Johnny Liu 2018-01-29 07:51:28 UTC
Verified this bug with openshift-ansible-3.9.0-0.31.0.git.0.e0a0ad8.el7.noarch: PASS.

Set openshift_node_env_vars={"OPTIONS": "--loglevel=3 --disable=proxy"} in the inventory hosts file and triggered a Docker containerized install. The setting was written to /etc/sysconfig/atomic-openshift-node.

# cat /etc/sysconfig/atomic-openshift-node
OPTIONS=--loglevel=3 --disable=proxy

The node service is running.

BTW, once the above options are set in the node config file, pods fail to deploy.

# oc get po
NAME                       READY     STATUS              RESTARTS   AGE
docker-registry-1-deploy   0/1       Error               0          15m
router-1-deploy            0/1       Error               0          16m

# oc logs router-1-deploy
error: couldn't get deployment router-1: Get dial tcp getsockopt: no route to host

# oc logs docker-registry-1-deploy
error: couldn't get deployment docker-registry-1: Get dial tcp getsockopt: no route to host
