1839517 – we incorrectly pass --allocate-node-cidrs=true to kcm, but it unexpectedly ignores it

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	kube-controller-manager
Sub Component:
Version:	4.5
Hardware:	Unspecified
OS:	Unspecified
Priority:	urgent
Severity:	urgent
Target Milestone:	---
Target Release:	4.5.0
Assignee:	Maciej Szulik
QA Contact:	zhou ying
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2020-05-24 13:36 UTC by Dan Winship
Modified:	2020-07-13 17:41 UTC
CC List:	2 users
Fixed In Version:
Doc Type:	No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed:	2020-07-13 17:41:26 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Github	openshift origin pull 25037	0	None	closed	Bug 1839517: UPSTREAM: <carry>: kube-controller-manager: allow running bare kube-controller-manager	2020-12-07 06:46:03 UTC
Red Hat Product Errata	RHBA-2020:2409	0	None	None	None	2020-07-13 17:41:39 UTC

Description Dan Winship 2020-05-24 13:36:28 UTC

We run kcm like:

     kube-controller-manager
       --openshift-config=/etc/kubernetes/static-pod-resources/configmaps/config/config.yaml
       --kubeconfig=/etc/kubernetes/static-pod-resources/configmaps/controller-manager-kubeconfig/kubeconfig
       --authentication-kubeconfig=/etc/kubernetes/static-pod-resources/configmaps/controller-manager-kubeconfig/kubeconfig
       --authorization-kubeconfig=/etc/kubernetes/static-pod-resources/configmaps/controller-manager-kubeconfig/kubeconfig
       --client-ca-file=/etc/kubernetes/static-pod-certs/configmaps/client-ca/ca-bundle.crt
       --requestheader-client-ca-file=/etc/kubernetes/static-pod-certs/configmaps/aggregator-client-ca/ca-bundle.crt
       -v=2
       --tls-cert-file=/etc/kubernetes/static-pod-resources/secrets/serving-cert/tls.crt
       --tls-private-key-file=/etc/kubernetes/static-pod-resources/secrets/serving-cert/tls.key

Where the config.yaml is:

    {
      "apiVersion": "kubecontrolplane.config.openshift.io/v1",
      "extendedArguments": {
        "allocate-node-cidrs": [
          "true"
        ],
        "cert-dir": [
          "/var/run/kubernetes"
        ],
        "cloud-provider": [
          "aws"
        ],
        "cluster-cidr": [
          "10.128.0.0/14"
        ],
        "cluster-name": [
          "ci-ln-m910rdb-d5d6b-hthnt"
        ],
        "cluster-signing-cert-file": [
          "/etc/kubernetes/static-pod-certs/secrets/csr-signer/tls.crt"
        ],
        "cluster-signing-key-file": [
          "/etc/kubernetes/static-pod-certs/secrets/csr-signer/tls.key"
        ],
        "configure-cloud-routes": [
          "false"
        ],
        "controllers": [
          "*",
          "-ttl",
          "-bootstrapsigner",
          "-tokencleaner"
        ],
        "enable-dynamic-provisioning": [
          "true"
        ],
        "experimental-cluster-signing-duration": [
          "720h"
        ],
        "feature-gates": [
          "APIPriorityAndFairness=true",
          "RotateKubeletServerCertificate=true",
          "SupportPodPidsLimit=true",
          "NodeDisruptionExclusion=true",
          "ServiceNodeExclusion=true",
          "SCTPSupport=true",
          "LegacyNodeRoleBehavior=false"
        ],
        "flex-volume-plugin-dir": [
          "/etc/kubernetes/kubelet-plugins/volume/exec"
        ],
        "kube-api-burst": [
          "300"
        ],
        "kube-api-qps": [
          "150"
        ],
        "leader-elect": [
          "true"
        ],
        "leader-elect-resource-lock": [
          "configmaps"
        ],
        "leader-elect-retry-period": [
          "3s"
        ],
        "port": [
          "0"
        ],
        "root-ca-file": [
          "/etc/kubernetes/static-pod-resources/configmaps/serviceaccount-ca/ca-bundle.crt"
        ],
        "secure-port": [
          "10257"
        ],
        "service-account-private-key-file": [
          "/etc/kubernetes/static-pod-resources/secrets/service-account-private-key/service-account.key"
        ],
        "service-cluster-ip-range": [
          "172.30.0.0/16"
        ],
        "use-service-account-credentials": [
          "true"
        ]
      },
      "kind": "KubeControllerManagerConfig",
      "serviceServingCert": {
        "certFile": "/etc/kubernetes/static-pod-resources/configmaps/service-ca/ca-bundle.crt"
      }
    }
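
For readers unfamiliar with the extendedArguments layout, the following is a minimal Python sketch of how such a map could be flattened into command-line style flags. It assumes each value becomes its own --key=value flag; the helper name extended_arguments_to_flags is hypothetical, and the actual translation done inside kcm is not shown in this report and may differ.

```python
import json


def extended_arguments_to_flags(config_text):
    """Flatten an extendedArguments map into a list of --key=value strings.

    Assumption: every value of a key is emitted as a separate flag.
    """
    config = json.loads(config_text)
    flags = []
    # Sort keys so the output order is stable.
    for name, values in sorted(config.get("extendedArguments", {}).items()):
        for value in values:
            flags.append(f"--{name}={value}")
    return flags


# A trimmed-down excerpt of the config.yaml shown above.
sample = """
{
  "extendedArguments": {
    "allocate-node-cidrs": ["true"],
    "configure-cloud-routes": ["false"],
    "controllers": ["*", "-ttl"]
  }
}
"""

for flag in extended_arguments_to_flags(sample):
    print(flag)
```

Under that assumption, the config above asks for --allocate-node-cidrs=true and --configure-cloud-routes=false.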

At startup, kcm logs the value of every command-line argument. All of the ones passed in config.yaml show up with the expected values, EXCEPT:

    I0524 12:42:28.881527       1 flags.go:33] FLAG: --allocate-node-cidrs="false"

    I0524 12:42:28.881545       1 flags.go:33] FLAG: --configure-cloud-routes="true"

which are both the opposite of the values we passed. Later logs confirm that those are actually the values it's seeing:

    I0524 12:45:10.966479       1 controllermanager.go:538] Starting "route"
    I0524 12:45:10.966486       1 core.go:239] Will not configure cloud provider routes for allocate-node-cidrs: false, configure-cloud-routes: true.
    W0524 12:45:10.966492       1 controllermanager.go:545] Skipping "route"

    I0524 12:45:11.125267       1 controllermanager.go:538] Starting "nodeipam"
    W0524 12:45:11.125273       1 controllermanager.go:545] Skipping "nodeipam"


So:

  1) We should not be passing --allocate-node-cidrs=true to kcm by default;
     openshift-sdn and ovn-kubernetes both do their own CIDR allocation.

  2) But if we _do_ pass --allocate-node-cidrs=true then kcm ought to be
     obeying it?

Comment 3 zhou ying 2020-06-01 02:19:22 UTC

oc exec  kube-controller-manager-xxxxxx.compute.internal cat  /etc/kubernetes/static-pod-resources/configmaps/config/config.yaml |json_reformat 
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl kubectl exec [POD] -- [COMMAND] instead.
Defaulting container name to kube-controller-manager.
Use 'oc describe pod/kube-controller-manager-ip-10-0-133-192.us-east-2.compute.internal -n openshift-kube-controller-manager' to see all of the containers in this pod.
{
    "apiVersion": "kubecontrolplane.config.openshift.io/v1",
    "extendedArguments": {
        "allocate-node-cidrs": [
            "true"
        ],



[root@dhcp-140-138 ~]# oc logs kube-controller-manager-ip-10-0-133-192.us-east-2.compute.internal -c kube-controller-manager 
Copying system trust bundle
......
I0601 00:53:25.715136       1 flags.go:33] FLAG: --address="0.0.0.0"
I0601 00:53:25.715140       1 flags.go:33] FLAG: --allocate-node-cidrs="true"



Confirmed with payload: 4.5.0-0.nightly-2020-05-30-025738
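
A check like the one above can be automated. The following is a rough Python sketch that parses the "FLAG:" lines from a kcm log and compares them with the single-valued entries of extendedArguments. The helper names are hypothetical, and multi-valued entries (such as "controllers") are skipped because this report does not show how kcm logs them.

```python
import re

# Matches lines such as:
#   I0601 00:53:25.715140  1 flags.go:33] FLAG: --allocate-node-cidrs="true"
FLAG_LINE = re.compile(r'FLAG: --([A-Za-z0-9-]+)="(.*)"\s*$')


def parse_logged_flags(log_text):
    """Collect flag name -> logged value from kcm startup log lines."""
    flags = {}
    for line in log_text.splitlines():
        match = FLAG_LINE.search(line)
        if match:
            flags[match.group(1)] = match.group(2)
    return flags


def find_mismatches(extended_arguments, logged_flags):
    """Return {flag: (configured, logged)} for single-valued entries that differ."""
    mismatches = {}
    for name, values in extended_arguments.items():
        if len(values) != 1:
            continue  # multi-valued flags are out of scope for this sketch
        logged = logged_flags.get(name)
        if logged is not None and logged != values[0]:
            mismatches[name] = (values[0], logged)
    return mismatches


configured = {"allocate-node-cidrs": ["true"], "configure-cloud-routes": ["false"]}

# Log lines from the original description (before the fix).
broken_log = '''
I0524 12:42:28.881527       1 flags.go:33] FLAG: --allocate-node-cidrs="false"
I0524 12:42:28.881545       1 flags.go:33] FLAG: --configure-cloud-routes="true"
'''

# Log lines from this comment (after the fix).
fixed_log = '''
I0601 00:53:25.715136       1 flags.go:33] FLAG: --address="0.0.0.0"
I0601 00:53:25.715140       1 flags.go:33] FLAG: --allocate-node-cidrs="true"
'''

print(find_mismatches(configured, parse_logged_flags(broken_log)))
print(find_mismatches(configured, parse_logged_flags(fixed_log)))
```

Flags that are configured but never appear in the log are not reported, so an empty result only means that no logged value contradicts the config.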

Comment 4 errata-xmlrpc 2020-07-13 17:41:26 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2409