Bug 1839517 - we incorrectly pass --allocate-node-cidrs=true to kcm, but it unexpectedly ignores it
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: kube-controller-manager
Version: 4.5
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: 4.5.0
Assignee: Maciej Szulik
QA Contact: zhou ying
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-05-24 13:36 UTC by Dan Winship
Modified: 2020-07-13 17:41 UTC (History)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-07-13 17:41:26 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Github openshift origin pull 25037 0 None closed Bug 1839517: UPSTREAM: <carry>: kube-controller-manager: allow running bare kube-controller-manager 2020-12-07 06:46:03 UTC
Red Hat Product Errata RHBA-2020:2409 0 None None None 2020-07-13 17:41:39 UTC

Description Dan Winship 2020-05-24 13:36:28 UTC
We run kcm like:

     kube-controller-manager
       --openshift-config=/etc/kubernetes/static-pod-resources/configmaps/config/config.yaml
       --kubeconfig=/etc/kubernetes/static-pod-resources/configmaps/controller-manager-kubeconfig/kubeconfig
       --authentication-kubeconfig=/etc/kubernetes/static-pod-resources/configmaps/controller-manager-kubeconfig/kubeconfig
       --authorization-kubeconfig=/etc/kubernetes/static-pod-resources/configmaps/controller-manager-kubeconfig/kubeconfig
       --client-ca-file=/etc/kubernetes/static-pod-certs/configmaps/client-ca/ca-bundle.crt
       --requestheader-client-ca-file=/etc/kubernetes/static-pod-certs/configmaps/aggregator-client-ca/ca-bundle.crt
       -v=2
       --tls-cert-file=/etc/kubernetes/static-pod-resources/secrets/serving-cert/tls.crt
       --tls-private-key-file=/etc/kubernetes/static-pod-resources/secrets/serving-cert/tls.key

Where the config.yaml is:

    {
      "apiVersion": "kubecontrolplane.config.openshift.io/v1",
      "extendedArguments": {
        "allocate-node-cidrs": [
          "true"
        ],
        "cert-dir": [
          "/var/run/kubernetes"
        ],
        "cloud-provider": [
          "aws"
        ],
        "cluster-cidr": [
          "10.128.0.0/14"
        ],
        "cluster-name": [
          "ci-ln-m910rdb-d5d6b-hthnt"
        ],
        "cluster-signing-cert-file": [
          "/etc/kubernetes/static-pod-certs/secrets/csr-signer/tls.crt"
        ],
        "cluster-signing-key-file": [
          "/etc/kubernetes/static-pod-certs/secrets/csr-signer/tls.key"
        ],
        "configure-cloud-routes": [
          "false"
        ],
        "controllers": [
          "*",
          "-ttl",
          "-bootstrapsigner",
          "-tokencleaner"
        ],
        "enable-dynamic-provisioning": [
          "true"
        ],
        "experimental-cluster-signing-duration": [
          "720h"
        ],
        "feature-gates": [
          "APIPriorityAndFairness=true",
          "RotateKubeletServerCertificate=true",
          "SupportPodPidsLimit=true",
          "NodeDisruptionExclusion=true",
          "ServiceNodeExclusion=true",
          "SCTPSupport=true",
          "LegacyNodeRoleBehavior=false"
        ],
        "flex-volume-plugin-dir": [
          "/etc/kubernetes/kubelet-plugins/volume/exec"
        ],
        "kube-api-burst": [
          "300"
        ],
        "kube-api-qps": [
          "150"
        ],
        "leader-elect": [
          "true"
        ],
        "leader-elect-resource-lock": [
          "configmaps"
        ],
        "leader-elect-retry-period": [
          "3s"
        ],
        "port": [
          "0"
        ],
        "root-ca-file": [
          "/etc/kubernetes/static-pod-resources/configmaps/serviceaccount-ca/ca-bundle.crt"
        ],
        "secure-port": [
          "10257"
        ],
        "service-account-private-key-file": [
          "/etc/kubernetes/static-pod-resources/secrets/service-account-private-key/service-account.key"
        ],
        "service-cluster-ip-range": [
          "172.30.0.0/16"
        ],
        "use-service-account-credentials": [
          "true"
        ]
      },
      "kind": "KubeControllerManagerConfig",
      "serviceServingCert": {
        "certFile": "/etc/kubernetes/static-pod-resources/configmaps/service-ca/ca-bundle.crt"
      }
    }
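
For readers unfamiliar with the extendedArguments layout, here is a minimal sketch of how each key and value list could be read as ordinary command-line flags. It is only an illustration: the function name render_flags is hypothetical, the one-flag-per-value flattening is an assumption rather than the way kcm actually consumes the file, and the sample is a three-key excerpt of the config above.

```python
import json


def render_flags(config_text):
    """Flatten extendedArguments into --name=value strings, one per value.

    Assumption: every key maps to a list of strings, as in the config above.
    The file is named config.yaml but its content is JSON, so json.loads works.
    """
    config = json.loads(config_text)
    flags = []
    for name, values in sorted(config.get("extendedArguments", {}).items()):
        for value in values:
            flags.append(f"--{name}={value}")
    return flags


# Three keys excerpted from the config shown above.
sample = """
{
  "extendedArguments": {
    "allocate-node-cidrs": ["true"],
    "configure-cloud-routes": ["false"],
    "controllers": ["*", "-ttl"]
  }
}
"""

for flag in render_flags(sample):
    print(flag)
```

Read this way, the config asks for --allocate-node-cidrs=true and --configure-cloud-routes=false.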

At startup, kcm logs the values of all its command-line arguments. All of the ones passed in config.yaml show up with the expected values, EXCEPT:

    I0524 12:42:28.881527       1 flags.go:33] FLAG: --allocate-node-cidrs="false"

    I0524 12:42:28.881545       1 flags.go:33] FLAG: --configure-cloud-routes="true"

which are both the opposite of the values we passed. Later logs confirm that those are actually the values it's seeing:

    I0524 12:45:10.966479       1 controllermanager.go:538] Starting "route"
    I0524 12:45:10.966486       1 core.go:239] Will not configure cloud provider routes for allocate-node-cidrs: false, configure-cloud-routes: true.
    W0524 12:45:10.966492       1 controllermanager.go:545] Skipping "route"

    I0524 12:45:11.125267       1 controllermanager.go:538] Starting "nodeipam"
    W0524 12:45:11.125273       1 controllermanager.go:545] Skipping "nodeipam"
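
The log lines above are consistent with the following gating, sketched here as a small model. It is inferred from the log messages only, not copied from the kcm source, and both function names are hypothetical.

```python
def route_controller_starts(allocate_node_cidrs, configure_cloud_routes):
    # Assumed from the "Will not configure cloud provider routes" message:
    # the route controller runs only when both settings are true.
    return allocate_node_cidrs and configure_cloud_routes


def nodeipam_controller_starts(allocate_node_cidrs):
    # Assumed from the "Skipping nodeipam" line: nodeipam runs only when
    # node CIDR allocation is enabled.
    return allocate_node_cidrs


# Values kcm reported in the logs above: both controllers are skipped.
observed = (False, True)
print(route_controller_starts(*observed), nodeipam_controller_starts(observed[0]))

# Values from config.yaml: route is still skipped, but nodeipam would start.
configured = (True, False)
print(route_controller_starts(*configured), nodeipam_controller_starts(configured[0]))
```

Under this model, honoring the configured values would start nodeipam alongside the network plugin's own CIDR allocation.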


So:

  1) We should not be passing --allocate-node-cidrs=true to kcm by default;
     openshift-sdn and ovn-kubernetes both do their own CIDR allocation.

  2) But if we _do_ pass --allocate-node-cidrs=true then kcm ought to be
     obeying it?

Comment 3 zhou ying 2020-06-01 02:19:22 UTC
oc exec  kube-controller-manager-xxxxxx.compute.internal cat  /etc/kubernetes/static-pod-resources/configmaps/config/config.yaml |json_reformat 
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl kubectl exec [POD] -- [COMMAND] instead.
Defaulting container name to kube-controller-manager.
Use 'oc describe pod/kube-controller-manager-ip-10-0-133-192.us-east-2.compute.internal -n openshift-kube-controller-manager' to see all of the containers in this pod.
{
    "apiVersion": "kubecontrolplane.config.openshift.io/v1",
    "extendedArguments": {
        "allocate-node-cidrs": [
            "true"
        ],



[root@dhcp-140-138 ~]# oc logs kube-controller-manager-ip-10-0-133-192.us-east-2.compute.internal -c kube-controller-manager 
Copying system trust bundle
......
I0601 00:53:25.715136       1 flags.go:33] FLAG: --address="0.0.0.0"
I0601 00:53:25.715140       1 flags.go:33] FLAG: --allocate-node-cidrs="true"



Confirmed with payload: 4.5.0-0.nightly-2020-05-30-025738
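
The check above compares one key by eye. A rough sketch of doing the same comparison for every single-valued key follows. The helper names are hypothetical, the regular expression assumes the FLAG line shape shown in this bug, and multi-valued keys such as controllers are skipped because their logged form is not shown here.

```python
import json
import re

# Matches lines such as: ... flags.go:33] FLAG: --allocate-node-cidrs="true"
FLAG_LINE = re.compile(r'FLAG: --([A-Za-z0-9-]+)="(.*)"\s*$')


def logged_flags(log_text):
    """Collect flag name -> logged value from kcm startup log lines."""
    flags = {}
    for line in log_text.splitlines():
        match = FLAG_LINE.search(line)
        if match:
            flags[match.group(1)] = match.group(2)
    return flags


def mismatches(config_text, log_text):
    """Return {name: (configured, logged)} for single-valued keys that differ."""
    wanted = json.loads(config_text).get("extendedArguments", {})
    seen = logged_flags(log_text)
    result = {}
    for name, values in wanted.items():
        if len(values) != 1 or name not in seen:
            continue  # skip multi-valued keys and flags absent from the log
        if seen[name] != values[0]:
            result[name] = (values[0], seen[name])
    return result


config = '{"extendedArguments": {"allocate-node-cidrs": ["true"], "configure-cloud-routes": ["false"]}}'

# Log lines quoted in the description (before the fix).
before = '''I0524 12:42:28.881527       1 flags.go:33] FLAG: --allocate-node-cidrs="false"
I0524 12:42:28.881545       1 flags.go:33] FLAG: --configure-cloud-routes="true"'''

# Log line quoted in this comment (after the fix).
after = 'I0601 00:53:25.715140       1 flags.go:33] FLAG: --allocate-node-cidrs="true"'

print(mismatches(config, before))
print(mismatches(config, after))
```

An empty result only means that no logged flag contradicted the config; flags missing from the log excerpt are not checked.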

Comment 4 errata-xmlrpc 2020-07-13 17:41:26 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2409

