Bug 1569476 - openshift_node_kubelet_args does not take effect in node service
Summary: openshift_node_kubelet_args does not take effect in node service
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 3.10.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 3.10.0
Assignee: Scott Dodson
QA Contact: Weihua Meng
URL:
Whiteboard:
Duplicates: 1575051 1584090 1589629 1589941
Depends On:
Blocks: 1639958
Reported: 2018-04-19 11:05 UTC by Weihua Meng
Modified: 2019-02-01 21:28 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1639958
Environment:
Last Closed: 2018-07-09 20:20:01 UTC
Target Upstream Version:
Embargoed:


Description Weihua Meng 2018-04-19 11:05:53 UTC
Description of problem:
openshift_node_kubelet_args does not take effect in node service

https://docs.openshift.com/container-platform/3.9/install_config/install/advanced_install.html#configuring-host-variables

Version-Release number of the following components:
openshift-ansible-3.10.0-0.22.0.git.0.b6ec617.el7

How reproducible:
Always

Steps to Reproduce:
1. Install OCP 3.10 with the following inventory variable:
openshift_node_kubelet_args={'pods-per-core': ['10'], 'max-pods': ['250'], 'image-gc-high-threshold': ['90'], 'image-gc-low-threshold': ['80']}

2. Check node-config.yaml on the host.

Actual results:
2. The arguments are not set in node-config.yaml, unlike in previous OCP versions.

Expected results:
The arguments are set in node-config.yaml, as in previous OCP versions.

Comment 1 Scott Dodson 2018-04-19 13:34:57 UTC
Can you please provide your inventory input? We need to know if bootstrapping is in use or not.

Comment 3 Russell Teague 2018-04-19 20:51:00 UTC
Default bootstrapping is in effect.  Looks like these kubelet args need to get into the bootstrap-node-config.yaml.

Comment 4 Weihua Meng 2018-04-23 07:45:16 UTC
I hope this issue is addressed soon. Without garbage collection, the node disk is filled up very quickly by automated tests, leaving the node unusable.

Thanks.

Comment 10 Scott Dodson 2018-05-30 13:45:44 UTC
*** Bug 1584090 has been marked as a duplicate of this bug. ***

Comment 11 Scott Dodson 2018-05-30 14:51:48 UTC
*** Bug 1575051 has been marked as a duplicate of this bug. ***

Comment 12 Johnny Liu 2018-05-31 02:32:04 UTC
Has this bug been changed to a doc bug? How should comment 7 be resolved?

Comment 15 Scott Dodson 2018-05-31 12:51:48 UTC
We've got to write docs on this. The new way to customize node configuration is to build out node groups, which can be defined like this:

https://github.com/openshift/release/blob/master/cluster/test-deploy/gcp/vars.yaml#L26

Beyond that there's no ability to customize node configuration.

Comment 18 Johnny Liu 2018-06-01 03:23:03 UTC
I converted the YAML settings into JSON format and overwrote the whole openshift_node_groups with the following value:
openshift_node_groups=[ {"name": "node-config-master", "labels": ["node-role.kubernetes.io/master=true"], "edits": []}, {"name": "node-config-infra", "labels": ["node-role.kubernetes.io/infra=true"], "edits": []}, {"name": "node-config-compute", "labels": ["node-role.kubernetes.io/compute=true"], "edits": [ {"key": "kubeletArguments.pods-per-core", "value": ["0"]}, {"key": "kubeletArguments.minimum-container-ttl-duration", "value": ["10s"]}, {"key": "kubeletArguments.maximum-dead-containers-per-container", "value": ["1"]},{"key": "kubeletArguments.maximum-dead-containers", "value": ["20"]}, {"key": "kubeletArguments.image-gc-high-threshold", "value": ["80"]}, {"key": "kubeletArguments.image-gc-low-threshold", "value": ["70"]} ]} ]
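A minimal sketch of how such a one-line value could be generated rather than typed by hand. It assumes each legacy openshift_node_kubelet_args entry maps to one edit whose key is "kubeletArguments.<flag>", as in the value above. The helper name kubelet_args_to_edits is hypothetical and is not part of openshift-ansible.

```python
import json


def kubelet_args_to_edits(kubelet_args):
    """Turn a legacy {'flag': ['value']} dict into node-group edit entries.

    Hypothetical helper for illustration; not part of openshift-ansible.
    """
    return [
        {"key": f"kubeletArguments.{flag}", "value": [str(v) for v in values]}
        for flag, values in kubelet_args.items()
    ]


# Legacy value taken from the bug description.
legacy = {
    "pods-per-core": ["10"],
    "max-pods": ["250"],
    "image-gc-high-threshold": ["90"],
    "image-gc-low-threshold": ["80"],
}

compute_group = {
    "name": "node-config-compute",
    "labels": ["node-role.kubernetes.io/compute=true"],
    "edits": kubelet_args_to_edits(legacy),
}

# One-line form that can be pasted into an INI inventory.
inventory_line = "openshift_node_groups=" + json.dumps([compute_group])
print(inventory_line)
```

Only one group is built here to keep the sketch short. Since the variable overwrites the whole list, a real inventory would need to include every node group that is in use.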

Comment 19 Scott Dodson 2018-06-01 13:24:27 UTC
Johnny,

Yeah, that should work.

Comment 20 Johnny Liu 2018-06-01 14:58:49 UTC
(In reply to Scott Dodson from comment #19)
> Johnny,
> 
> Yeah, that should work.

I tried that, but the settings do not appear in the node config file after installation.

Users at least need a working way (even if it is not friendly) to set kubelet options. I have no idea how to use the vars YAML from comment 15 in an INI inventory file, and there is no guide for that. I think customers would hit the same issue as I did.

Comment 22 Scott Dodson 2018-06-12 18:11:34 UTC
*** Bug 1589629 has been marked as a duplicate of this bug. ***

Comment 23 Vadim Rutkovsky 2018-06-13 11:43:35 UTC
(In reply to Johnny Liu from comment #20)
> (In reply to Scott Dodson from comment #19)
> > Johnny,
> > 
> > Yeah, that should work.
> 
> I tried that, the setting is not shown in node config file after
> installation. 
> 
> User at least need a working way (even it is not friendly) to set kubelet
> options, I have no idea how to use the var yaml in comment 15 in INI
> inventory file, there is no any guide for that, I think customer also would
> hit the same issue like me.

The docs should recommend using YAML format for this kind of complicated config.

The feature seems to work fine here. I specified the following in inventory/group_vars/OSEv3:

openshift_node_groups:
- name: node-config-all-in-one
  labels:
  - node-role.kubernetes.io/master=true
  - node-role.kubernetes.io/infra=true
  - node-role.kubernetes.io/compute=true
  edits:
  - key: kubeletArguments.pods-per-core
    value:
    - '10'
  - key: kubeletArguments.minimum-container-ttl-duration
    value:
    - '10s'
  - key: kubeletArguments.maximum-dead-containers-per-container
    value:
    - '1'
  - key: kubeletArguments.maximum-dead-containers
    value:
    - '20'
  - key: kubeletArguments.image-gc-high-threshold
    value:
    - '80'
  - key: kubeletArguments.image-gc-low-threshold
    value:
    - '70'

and added "openshift_node_group_name='node-config-all-in-one'" to the inventory.

After the deploy, /etc/origin/node/node-config.yaml has:

kubeletArguments:
  bootstrap-kubeconfig:
  - /etc/origin/node/bootstrap.kubeconfig
  cert-dir:
  - /etc/origin/node/certificates
  cloud-config:
  - /etc/origin/cloudprovider/aws.conf
  cloud-provider:
  - aws
  enable-controller-attach-detach:
  - 'true'
  feature-gates:
  - RotateKubeletClientCertificate=true,RotateKubeletServerCertificate=true
  image-gc-high-threshold:
  - '80'
  image-gc-low-threshold:
  - '70'
  maximum-dead-containers:
  - '20'
...

Did the sync DS get scheduled on this node? What does 'oc logs ds/sync -n openshift-node' show?
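To illustrate how the "edits" entries relate to the resulting kubeletArguments section, here is a rough sketch that applies dotted-key edits to a nested dict. It is not the installer's actual implementation. It assumes "." only separates nesting levels, and the function name apply_edits is hypothetical.

```python
import copy


def apply_edits(config, edits):
    """Return a copy of config with each dotted-key edit applied.

    Hypothetical illustration; assumes '.' only separates nesting levels.
    """
    result = copy.deepcopy(config)
    for edit in edits:
        *parents, leaf = edit["key"].split(".")
        node = result
        for part in parents:
            # Create intermediate sections that do not exist yet.
            node = node.setdefault(part, {})
        node[leaf] = edit["value"]
    return result


# A reduced base config, using values from the node-config.yaml excerpt above.
base = {
    "kubeletArguments": {
        "cloud-provider": ["aws"],
        "enable-controller-attach-detach": ["true"],
    }
}
edits = [
    {"key": "kubeletArguments.image-gc-high-threshold", "value": ["80"]},
    {"key": "kubeletArguments.image-gc-low-threshold", "value": ["70"]},
    {"key": "kubeletArguments.maximum-dead-containers", "value": ["20"]},
]

merged = apply_edits(base, edits)
for flag in sorted(merged["kubeletArguments"]):
    print(flag, merged["kubeletArguments"][flag])
```

Each edit adds or replaces a single key, so defaults that have no matching edit (here cloud-provider) are left as they are.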

Comment 24 Scott Dodson 2018-06-13 12:26:53 UTC
*** Bug 1589941 has been marked as a duplicate of this bug. ***

Comment 25 Johnny Liu 2018-06-15 03:54:02 UTC
Because QE's existing automated installation is based on the INI inventory format, it is not easy to change it right away, so I converted the YAML into a JSON-like format.


qe_kubelet_args="{'key': 'kubeletArguments.pods-per-core', 'value': ['0']}, {'key': 'kubeletArguments.minimum-container-ttl-duration', 'value': ['10s']}, {'key': 'kubeletArguments.maximum-dead-containers-per-container', 'value': ['1']}, {'key': 'kubeletArguments.maximum-dead-containers', 'value': ['20']}, {'key': 'kubeletArguments.image-gc-high-threshold', 'value': ['80']}, {'key': 'kubeletArguments.image-gc-low-threshold', 'value': ['70']}"

qe_node_group_registry_router="{ 'name': 'qe-registry-router', 'labels': ['role=node', 'registry=enabled', 'router=enabled'], 'edits': [ {{ qe_kubelet_args }} ] }"

qe_node_group_master="{ 'name': 'qe-master', 'labels': ['node-role.kubernetes.io/master=true'], 'edits': [ {{ qe_kubelet_args }} ] }"

openshift_node_groups=[ {{ qe_node_group_registry_router }}, {{ qe_node_group_master }} ]


[nodes]
master openshift_node_group_name='qe-master'
node openshift_node_group_name='qe-registry-router'

Tested the above settings with openshift-ansible-3.10.0-0.69.0.git.127.3ca07e5.el7.noarch; the kubelet arguments are set successfully in the node config file.
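Nested one-line values like these are easy to get wrong, so a quick pre-flight check before running the installer may help. The sketch below only approximates what Ansible does: it substitutes {{ name }} references with a regular expression instead of Jinja2, then parses the result as a Python literal. The expand function and the shortened variable values are assumptions made for illustration.

```python
import ast
import re


def expand(value, variables, depth=5):
    """Substitute {{ name }} references, repeating so nested references resolve.

    Crude stand-in for Jinja2 templating, for illustration only.
    """
    pattern = re.compile(r"\{\{\s*(\w+)\s*\}\}")
    for _ in range(depth):
        expanded = pattern.sub(lambda m: variables[m.group(1)], value)
        if expanded == value:
            break
        value = expanded
    return value


# Shortened versions of the inventory variables from this comment.
variables = {
    "qe_kubelet_args": (
        "{'key': 'kubeletArguments.image-gc-high-threshold', 'value': ['80']}, "
        "{'key': 'kubeletArguments.image-gc-low-threshold', 'value': ['70']}"
    ),
    "qe_node_group_master": (
        "{ 'name': 'qe-master', "
        "'labels': ['node-role.kubernetes.io/master=true'], "
        "'edits': [ {{ qe_kubelet_args }} ] }"
    ),
}
raw = "[ {{ qe_node_group_master }} ]"

# literal_eval raises an exception on unbalanced brackets or quotes.
groups = ast.literal_eval(expand(raw, variables))
for group in groups:
    for edit in group["edits"]:
        assert isinstance(edit["value"], list), edit
    print(group["name"], [e["key"] for e in group["edits"]])
```

A missing bracket or quote in any of the variables makes ast.literal_eval raise an exception, which is cheaper to hit here than partway through an install.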

Comment 26 Johnny Liu 2018-06-15 03:56:14 UTC
(In reply to Vadim Rutkovsky from comment #23)
> The docs should recommend using yaml format for this kind of complicated
> config.
That would be a big change for the whole doc, because the INI inventory file is currently used everywhere in the doc.

Comment 27 Johnny Liu 2018-06-15 07:08:14 UTC
If openshift_node_kubelet_args is not applicable, we should remove the related code.
 
$ grep -r "openshift_node_kubelet_args" *
inventory/hosts.example:#openshift_node_kubelet_args={'pods-per-core': ['10'], 'max-pods': ['250'], 'image-gc-high-threshold': ['85'], 'image-gc-low-threshold': ['80']}
openshift-ansible.spec:- Add vsphere section for openshift_node_kubelet_args_dict (ghuang)
roles/openshift_node/tasks/config.yml:    dest: "{{ l2_openshift_node_kubelet_args['config'] }}"
roles/openshift_node/tasks/config.yml:  when: ('config' in l2_openshift_node_kubelet_args) | bool
roles/openshift_node/defaults/main.yml:openshift_node_kubelet_args_dict:
roles/openshift_node/defaults/main.yml:l_node_kubelet_args_default: "{{ openshift_node_kubelet_args_dict[openshift_cloudprovider_kind | default('undefined')] }}"
roles/openshift_node/defaults/main.yml:l_openshift_node_kubelet_args: "{{ openshift_node_kubelet_args | default({}) }}"
roles/openshift_node/defaults/main.yml:# with user-supplied openshift_node_kubelet_args.
roles/openshift_node/defaults/main.yml:# openshift_node_kubelet_args will override the defaults, if keys and/or subkeys
roles/openshift_node/defaults/main.yml:l2_openshift_node_kubelet_args: "{{ l_node_kubelet_args_default | combine(l_openshift_node_kubelet_args, recursive=True) }}"
roles/openshift_node/templates/node.yaml.v1.j2:kubeletArguments: {{  l2_openshift_node_kubelet_args  | default(None) | lib_utils_to_padded_yaml(level=1) }}

Comment 28 Vadim Rutkovsky 2018-06-15 12:19:33 UTC
(In reply to Johnny Liu from comment #27)
> If openshift_node_kubelet_args is not applicable, we should remove related
> piece of code.

It's the same story as with node labels: openshift_node_kubelet_args is applicable, but it would be rewritten by the sync DS using the info stored in config maps. So in the end it would be best to set these parameters in the configmap so that they persist.

Scott, should we consider deprecating these params in favor of configmaps?


(In reply to Johnny Liu from comment #26)
> (In reply to Vadim Rutkovsky from comment #23)
> > The docs should recommend using yaml format for this kind of complicated
> > config.
> That would be big change for the whole doc, because now INI inventory file
> is used in everywhere in the doc.

Well, both INI and YAML would work; it's just recommended to convert OSEv3:vars and such into YAML so that syntax errors are easier to find.

Comment 29 Scott Dodson 2018-06-15 12:31:54 UTC
(In reply to Vadim Rutkovsky from comment #28)
> (In reply to Johnny Liu from comment #27)
> > If openshift_node_kubelet_args is not applicable, we should remove related
> > piece of code.
> 
> Its the same story as with node labels - openshift_node_kubelet_args is
> applicable, but it would be rewritten by sync DS using info stored in config
> maps. So in the end it would be best to set these parameters in the
> configmap so it would persist.
> 
> Scott, should we consider deprecating these params in favor of configmaps?

Yeah, I think we should remove them and add them to the list of deprecations.

Comment 30 Vadim Rutkovsky 2018-06-20 13:26:29 UTC
Created a PR to remove references and deprecate kubelet_args (for master): https://github.com/openshift/openshift-ansible/pull/8866

Comment 31 Vadim Rutkovsky 2018-06-25 08:29:34 UTC
Fix is available in openshift-ansible-3.10.7-1

Comment 32 Weihua Meng 2018-06-25 09:45:57 UTC
It would be better to give an example of setting parameters such as "image-gc-high-threshold" for v3.10 in the hosts.example file.

In the released 3.9, there is an example:
#openshift_node_kubelet_args={'pods-per-core': ['10'], 'max-pods': ['250'], 'image-gc-high-threshold': ['85'], 'image-gc-low-threshold': ['80']}

In 3.10, I did not find how to set those parameters, only "#openshift_node_kubelet_args is deprecated, use node config edits instead".


We used the following parameters to set up clusters, and it works.

qe_kubelet_args="[{'key': 'kubeletArguments.pods-per-core', 'value': ['0']}, {'key': 'kubeletArguments.minimum-container-ttl-duration', 'value': ['10s']}, {'key': 'kubeletArguments.maximum-dead-containers-per-container', 'value': ['1']}, {'key': 'kubeletArguments.maximum-dead-containers', 'value': ['20']}, {'key': 'kubeletArguments.image-gc-high-threshold', 'value': ['80']}, {'key': 'kubeletArguments.image-gc-low-threshold', 'value': ['70']}]"

qe_node_group_master="{ 'name': 'qe-master', 'labels': ['node-role.kubernetes.io/master=true'], 'edits': {{ qe_kubelet_args }} }"

ec2-54-174-105-125.compute-1.amazonaws.com ansible_user=root ansible_ssh_user=root ansible_ssh_private_key_file="/home/slave4/workspace/Launch Environment Flexy/private/config/keys/libra.pem" openshift_public_hostname=ec2-54-174-105-125.compute-1.amazonaws.com openshift_node_group_name='qe-master'



root     28020  3.3  0.3 1119836 101940 ?      Ssl  6月24  20:01 /usr/bin/hyperkube kubelet --v=5 --address=0.0.0.0 --allow-privileged=true --anonymous-auth=true --authentication-token-webhook=true --authentication-token-webhook-cache-ttl=5m --authorization-mode=Webhook --authorization-webhook-cache-authorized-ttl=5m --authorization-webhook-cache-unauthorized-ttl=5m --bootstrap-kubeconfig=/etc/origin/node/bootstrap.kubeconfig --cadvisor-port=0 --cert-dir=/etc/origin/node/certificates --cgroup-driver=systemd --client-ca-file=/etc/origin/node/client-ca.crt --cloud-config=/etc/origin/cloudprovider/aws.conf --cloud-provider=aws --cluster-dns=172.18.17.80 --cluster-domain=cluster.local --container-runtime-endpoint=/var/run/dockershim.sock --containerized=true --enable-controller-attach-detach=true --experimental-dockershim-root-directory=/var/lib/dockershim --fail-swap-on=false --feature-gates=RotateKubeletClientCertificate=true,RotateKubeletServerCertificate=true --file-check-frequency=0s --healthz-bind-address= --healthz-port=0 --host-ipc-sources=api --host-ipc-sources=file --host-network-sources=api --host-network-sources=file --host-pid-sources=api --host-pid-sources=file --hostname-override= --http-check-frequency=0s --image-gc-high-threshold=80 --image-gc-low-threshold=70 --image-service-endpoint=/var/run/dockershim.sock --iptables-masquerade-bit=0 --kubeconfig=/etc/origin/node/node.kubeconfig --max-pods=250 --maximum-dead-containers=20 --maximum-dead-containers-per-container=1 --minimum-container-ttl-duration=10s --network-plugin=cni --node-ip= --node-labels=node-role.kubernetes.io/master=true --pod-infra-container-image=registry.reg-aws.openshift.com:443/openshift3/ose-pod:v3.10.6 --pod-manifest-path=/etc/origin/node/pods --pods-per-core=0 --port=10250 --read-only-port=0 --register-node=true --root-dir=/var/lib/origin/openshift.local.volumes --rotate-certificates=true --tls-cert-file= --tls-cipher-suites=TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305 
--tls-cipher-suites=TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305 --tls-cipher-suites=TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256 --tls-cipher-suites=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 --tls-cipher-suites=TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384 --tls-cipher-suites=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 --tls-cipher-suites=TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256 --tls-cipher-suites=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256 --tls-cipher-suites=TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA --tls-cipher-suites=TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA --tls-cipher-suites=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA --tls-cipher-suites=TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA --tls-cipher-suites=TLS_RSA_WITH_AES_128_GCM_SHA256 --tls-cipher-suites=TLS_RSA_WITH_AES_256_GCM_SHA384 --tls-cipher-suites=TLS_RSA_WITH_AES_128_CBC_SHA --tls-cipher-suites=TLS_RSA_WITH_AES_256_CBC_SHA --tls-min-version=VersionTLS12 --tls-private-key-file=
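A small sketch for checking a kubelet command line like the one above against the edits that were requested, instead of reading the flags by eye. The parse_flags helper is hypothetical, and the command line is shortened to a few of the flags shown above.

```python
import shlex
from collections import defaultdict


def parse_flags(cmdline):
    """Collect --flag=value tokens; repeated flags accumulate in order."""
    flags = defaultdict(list)
    for token in shlex.split(cmdline):
        if token.startswith("--") and "=" in token:
            name, value = token[2:].split("=", 1)
            flags[name].append(value)
    return dict(flags)


# Shortened excerpt of the kubelet command line from the ps output.
cmdline = (
    "/usr/bin/hyperkube kubelet --v=5 --image-gc-high-threshold=80 "
    "--image-gc-low-threshold=70 --max-pods=250 --maximum-dead-containers=20 "
    "--maximum-dead-containers-per-container=1 "
    "--minimum-container-ttl-duration=10s --pods-per-core=0 "
    "--host-pid-sources=api --host-pid-sources=file"
)

# Values requested through qe_kubelet_args in this comment.
expected = {
    "pods-per-core": ["0"],
    "minimum-container-ttl-duration": ["10s"],
    "maximum-dead-containers-per-container": ["1"],
    "maximum-dead-containers": ["20"],
    "image-gc-high-threshold": ["80"],
    "image-gc-low-threshold": ["70"],
}

flags = parse_flags(cmdline)
mismatches = {k: flags.get(k) for k, v in expected.items() if flags.get(k) != v}
print("all edits applied" if not mismatches else f"mismatches: {mismatches}")
```

Values are kept as lists because some kubelet flags (for example host-pid-sources) appear more than once, which also matches the list-valued format used in the edits.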

Comment 33 Weihua Meng 2018-06-26 08:43:13 UTC
As I said in comment 32, it would be helpful for users if an example were provided in v3.10, as in previous OCP versions.

Please consider that.

Tested with openshift-ansible-3.10.8; it works with the method in comment 32.

Kernel Version: 3.10.0-693.5.2.el7.x86_64
Red Hat Enterprise Linux Atomic Host release 7.4

Comment 34 Vadim Rutkovsky 2018-06-26 08:56:34 UTC
(In reply to Weihua Meng from comment #33)
> As I said in comment 32, It is helpful for users if example provided in
> v3.10 as previous OCP versions.

This is now mentioned in README.md - see https://github.com/openshift/openshift-ansible/commit/41e4c8dc5b0d47d56b1b7c7e7cbf5dd81b0202c6

