Bug 1676784 - [cloud-CA] machineautoscaler couldn't add annotations to machineset
Summary: [cloud-CA] machineautoscaler couldn't add annotations to machineset
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Cloud Compute
Version: 4.1.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.1.0
Assignee: Jan Chaloupka
QA Contact: sunzhaohua
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-02-13 08:33 UTC by sunzhaohua
Modified: 2019-06-04 10:44 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-06-04 10:44:00 UTC
Target Upstream Version:
Embargoed:


Attachments
cluster-autoscaler-operator log (270.96 KB, text/plain)
2019-02-14 05:47 UTC, sunzhaohua


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2019:0758 0 None None None 2019-06-04 10:44:07 UTC

Description sunzhaohua 2019-02-13 08:33:14 UTC
Description of problem:
The MachineAutoscaler does not add annotations to the target MachineSet.

Version-Release number of selected component (if applicable):
$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE     STATUS
version   4.0.0-0.nightly-2019-02-12-150919   True        False         93m       Cluster version is 4.0.0-0.nightly-2019-02-12-150919

How reproducible:
Always

Steps to Reproduce:
1. Create machineautoscaler.
apiVersion: autoscaling.openshift.io/v1alpha1
kind: MachineAutoscaler
metadata:
  finalizers:
  - machinetarget.autoscaling.openshift.io
  name: autoscale-us-east-2b
  namespace: openshift-machine-api
spec:
  maxReplicas: 5
  minReplicas: 1
  scaleTargetRef:
    apiVersion: machine.openshift.io/v1beta1
    kind: MachineSet
    name: qe-piqin5-worker-us-east-2b
    
2. Check machineset annotations
3. Check autoscaler logs
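
Step 2 can be scripted. The following is a minimal sketch, not the operator's own logic: it assumes the MachineSet has been exported as JSON (for example with `oc get machineset <name> -o json`) and that the min/max annotation keys end in the suffixes shown. The suffixes and the `example.invalid/` prefix in the sample data are assumptions for illustration; only the suffix is matched because the key prefix is not stated in this report.

```python
import json

# ASSUMPTION: annotation keys end with these suffixes; the prefix is not
# given in this report, so only the suffix is matched.
MIN_SUFFIX = "cluster-api-autoscaler-node-group-min-size"
MAX_SUFFIX = "cluster-api-autoscaler-node-group-max-size"


def autoscaler_bounds(machineset):
    """Return (min, max) annotation values as strings; None for a missing bound."""
    annotations = machineset.get("metadata", {}).get("annotations") or {}
    lo = hi = None
    for key, value in annotations.items():
        if key.endswith(MIN_SUFFIX):
            lo = value
        elif key.endswith(MAX_SUFFIX):
            hi = value
    return lo, hi


# Hypothetical sample objects shaped like MachineSet JSON output.
unannotated = json.loads('{"metadata": {"name": "qe-piqin5-worker-us-east-2b"}}')
annotated = {
    "metadata": {
        "name": "qe-piqin5-worker-us-east-2b",
        "annotations": {
            "example.invalid/" + MIN_SUFFIX: "1",  # hypothetical key prefix
            "example.invalid/" + MAX_SUFFIX: "5",  # hypothetical key prefix
        },
    }
}

print(autoscaler_bounds(unannotated))
print(autoscaler_bounds(annotated))
```

A result with both bounds missing corresponds to the failure reported here: the MachineSet carries no autoscaler annotations.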

Actual results:
The MachineSet has no annotations, and the autoscaler log keeps reporting "group size not found":

$ oc logs -f cluster-autoscaler-default-74c7c95b79-bjp8n
E0213 08:06:29.959590       1 utils.go:467] Error while checking node group size openshift-machine-api/qe-piqin5-worker-us-east-2c: group size not found
E0213 08:06:39.979172       1 utils.go:467] Error while checking node group size openshift-machine-api/qe-piqin5-worker-us-east-2a: group size not found
E0213 08:06:39.979257       1 utils.go:467] Error while checking node group size openshift-machine-api/qe-piqin5-worker-us-east-2b: group size not found
E0213 08:06:39.980028       1 utils.go:467] Error while checking node group size openshift-machine-api/qe-piqin5-worker-us-east-2c: group size not found
E0213 08:06:50.000643       1 utils.go:467] Error while checking node group size openshift-machine-api/qe-piqin5-worker-us-east-2a: group size not found
E0213 08:06:50.000730       1 utils.go:467] Error while checking node group size openshift-machine-api/qe-piqin5-worker-us-east-2b: group size not found
E0213 08:06:50.000808       1 utils.go:467] Error while checking node group size openshift-machine-api/qe-piqin5-worker-us-east-2c: group size not found
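
To see which node groups are affected without reading the log by eye, the error lines can be tallied per MachineSet. This is a small sketch that assumes the log has been saved as text and that the message format matches the lines above; the function name is hypothetical.

```python
import re
from collections import Counter

# Matches the error message format shown in the log excerpt above.
PATTERN = re.compile(
    r"Error while checking node group size (?P<group>[^\s:]+): group size not found"
)


def groups_without_size(log_text):
    """Count 'group size not found' errors per namespace/name node group."""
    return Counter(m.group("group") for m in PATTERN.finditer(log_text))


# Three lines copied from the log excerpt in this report.
sample = """\
E0213 08:06:39.979172       1 utils.go:467] Error while checking node group size openshift-machine-api/qe-piqin5-worker-us-east-2a: group size not found
E0213 08:06:39.979257       1 utils.go:467] Error while checking node group size openshift-machine-api/qe-piqin5-worker-us-east-2b: group size not found
E0213 08:06:50.000730       1 utils.go:467] Error while checking node group size openshift-machine-api/qe-piqin5-worker-us-east-2b: group size not found
"""

counts = groups_without_size(sample)
for group, n in sorted(counts.items()):
    print(group, n)
```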


Expected results:
The MachineAutoscaler adds annotations to the target MachineSet.

Additional info:

Comment 1 Jan Chaloupka 2019-02-13 14:02:31 UTC
sunzhaohua,

Can you share the following data?
- list of all machineset objects across all namespaces
- full log from the cluster-autoscaler-operator (from the start of the controller)
- list of all CRDs deployed

Thank you

Comment 2 sunzhaohua 2019-02-14 05:44:46 UTC
Jan Chaloupka,

I couldn't reproduce this bug in the newer build; the autoscaler works as expected.
$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE     STATUS
version   4.0.0-0.nightly-2019-02-13-204401   True        False         28m       Cluster version is 4.0.0-0.nightly-2019-02-13-204401

The problem is still reproducible in the older build. The cluster-autoscaler-operator log is attached.
$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE     STATUS
version   4.0.0-0.nightly-2019-02-12-150919   True        False         23h       Cluster version is 4.0.0-0.nightly-2019-02-12-150919

$ oc get machinesets.machine.openshift.io --all-namespaces
NAMESPACE               NAME                          DESIRED   CURRENT   READY     AVAILABLE   AGE
openshift-machine-api   qe-piqin5-worker-us-east-2a   1         1         1         1           23h
openshift-machine-api   qe-piqin5-worker-us-east-2b   1         1         1         1           23h
openshift-machine-api   qe-piqin5-worker-us-east-2c   1         1         1         1           23h

$ oc get crd
NAME                                                                     CREATED AT
alertmanagers.monitoring.coreos.com                                      2019-02-13T06:21:15Z
apiservers.config.openshift.io                                           2019-02-13T06:06:13Z
authentications.config.openshift.io                                      2019-02-13T06:06:14Z
authentications.operator.openshift.io                                    2019-02-13T06:20:33Z
builds.config.openshift.io                                               2019-02-13T06:06:15Z
catalogsourceconfigs.marketplace.redhat.com                              2019-02-13T06:20:37Z
catalogsources.operators.coreos.com                                      2019-02-13T06:17:50Z
clusterautoscalers.autoscaling.openshift.io                              2019-02-13T06:17:47Z
clusterdnses.dns.openshift.io                                            2019-02-13T06:07:41Z
clusteringresses.ingress.openshift.io                                    2019-02-13T06:20:33Z
clusternetworks.network.openshift.io                                     2019-02-13T06:06:43Z
clusteroperators.config.openshift.io                                     2019-02-13T06:05:59Z
clusteroperators.operatorstatus.openshift.io                             2019-02-13T06:06:12Z
clusters.cluster.k8s.io                                                  2019-02-13T06:16:38Z
clusters.clusterregistry.k8s.io                                          2019-02-14T03:34:51Z
clusters.machine.openshift.io                                            2019-02-13T06:16:44Z
clusterserviceversions.operators.coreos.com                              2019-02-13T06:17:47Z
clusterversions.config.openshift.io                                      2019-02-13T06:05:59Z
configs.imageregistry.operator.openshift.io                              2019-02-13T06:20:33Z
configs.samples.operator.openshift.io                                    2019-02-13T06:20:33Z
consoles.config.openshift.io                                             2019-02-13T06:06:16Z
consoles.operator.openshift.io                                           2019-02-13T06:20:33Z
controllerconfigs.machineconfiguration.openshift.io                      2019-02-13T06:16:55Z
credentialsrequests.cloudcredential.openshift.io                         2019-02-13T06:20:33Z
dnsendpoints.multiclusterdns.federation.k8s.io                           2019-02-14T03:34:54Z
dnses.config.openshift.io                                                2019-02-13T06:06:00Z
egressnetworkpolicies.network.openshift.io                               2019-02-13T06:06:44Z
features.config.openshift.io                                             2019-02-13T06:06:17Z
federatedclusters.core.federation.k8s.io                                 2019-02-14T03:34:51Z
federatedconfigmapoverrides.core.federation.k8s.io                       2019-02-14T03:34:51Z
federatedconfigmapplacements.core.federation.k8s.io                      2019-02-14T03:34:51Z
federatedconfigmaps.core.federation.k8s.io                               2019-02-14T03:34:51Z
federateddeploymentoverrides.core.federation.k8s.io                      2019-02-14T03:34:51Z
federateddeploymentplacements.core.federation.k8s.io                     2019-02-14T03:34:51Z
federateddeployments.core.federation.k8s.io                              2019-02-14T03:34:51Z
federatedingresses.core.federation.k8s.io                                2019-02-14T03:34:51Z
federatedingressplacements.core.federation.k8s.io                        2019-02-14T03:34:51Z
federatedjoboverrides.core.federation.k8s.io                             2019-02-14T03:34:51Z
federatedjobplacements.core.federation.k8s.io                            2019-02-14T03:34:52Z
federatedjobs.core.federation.k8s.io                                     2019-02-14T03:34:51Z
federatednamespaceplacements.core.federation.k8s.io                      2019-02-14T03:34:52Z
federatedreplicasetoverrides.core.federation.k8s.io                      2019-02-14T03:34:52Z
federatedreplicasetplacements.core.federation.k8s.io                     2019-02-14T03:34:52Z
federatedreplicasets.core.federation.k8s.io                              2019-02-14T03:34:52Z
federatedsecretoverrides.core.federation.k8s.io                          2019-02-14T03:34:53Z
federatedsecretplacements.core.federation.k8s.io                         2019-02-14T03:34:53Z
federatedsecrets.core.federation.k8s.io                                  2019-02-14T03:34:53Z
federatedserviceaccountplacements.core.federation.k8s.io                 2019-02-14T03:34:54Z
federatedserviceaccounts.core.federation.k8s.io                          2019-02-14T03:34:53Z
federatedserviceplacements.core.federation.k8s.io                        2019-02-14T03:34:54Z
federatedservices.core.federation.k8s.io                                 2019-02-14T03:34:53Z
federatedtypeconfigs.core.federation.k8s.io                              2019-02-14T03:34:54Z
hostsubnets.network.openshift.io                                         2019-02-13T06:06:43Z
images.config.openshift.io                                               2019-02-13T06:06:18Z
infrastructures.config.openshift.io                                      2019-02-13T06:06:00Z
ingresses.config.openshift.io                                            2019-02-13T06:06:00Z
installplans.operators.coreos.com                                        2019-02-13T06:17:48Z
kubeapiservers.operator.openshift.io                                     2019-02-13T06:08:09Z
kubecontrollermanagers.operator.openshift.io                             2019-02-13T06:13:33Z
kubeletconfigs.machineconfiguration.openshift.io                         2019-02-13T06:16:57Z
kubescheduleroperatorconfigs.kubescheduler.operator.openshift.io         2019-02-13T06:12:04Z
machineautoscalers.autoscaling.openshift.io                              2019-02-13T06:17:48Z
machineclasses.cluster.k8s.io                                            2019-02-13T06:16:39Z
machineclasses.machine.openshift.io                                      2019-02-13T06:16:45Z
machineconfigpools.machineconfiguration.openshift.io                     2019-02-13T06:16:56Z
machineconfigs.machineconfiguration.openshift.io                         2019-02-13T06:16:54Z
machinedeployments.cluster.k8s.io                                        2019-02-13T06:16:37Z
machinedeployments.machine.openshift.io                                  2019-02-13T06:16:43Z
machinehealthchecks.healthchecking.openshift.io                          2019-02-13T06:16:40Z
machines.cluster.k8s.io                                                  2019-02-13T06:16:35Z
machines.machine.openshift.io                                            2019-02-13T06:16:41Z
machinesets.cluster.k8s.io                                               2019-02-13T06:16:36Z
machinesets.machine.openshift.io                                         2019-02-13T06:16:42Z
mcoconfigs.machineconfiguration.openshift.io                             2019-02-13T06:16:35Z
multiclusteringressdnsrecords.multiclusterdns.federation.k8s.io          2019-02-14T03:34:55Z
multiclusterservicednsrecords.multiclusterdns.federation.k8s.io          2019-02-14T03:34:55Z
netnamespaces.network.openshift.io                                       2019-02-13T06:06:44Z
network-attachment-definitions.k8s.cni.cncf.io                           2019-02-13T06:06:41Z
networkconfigs.networkoperator.openshift.io                              2019-02-13T06:06:22Z
networks.config.openshift.io                                             2019-02-13T06:06:00Z
oauths.config.openshift.io                                               2019-02-13T06:06:19Z
openshiftapiservers.operator.openshift.io                                2019-02-13T06:18:04Z
openshiftcontrollermanagers.operator.openshift.io                        2019-02-13T06:19:59Z
operatorgroups.operators.coreos.com                                      2019-02-13T06:17:52Z
operatorsources.marketplace.redhat.com                                   2019-02-13T06:20:37Z
projects.config.openshift.io                                             2019-02-13T06:06:21Z
prometheuses.monitoring.coreos.com                                       2019-02-13T06:21:15Z
prometheusrules.monitoring.coreos.com                                    2019-02-13T06:21:15Z
propagatedversions.core.federation.k8s.io                                2019-02-14T03:34:54Z
replicaschedulingpreferences.scheduling.federation.k8s.io                2019-02-14T03:34:55Z
servicecertsigneroperatorconfigs.servicecertsigner.config.openshift.io   2019-02-13T06:08:08Z
servicemonitors.monitoring.coreos.com                                    2019-02-13T06:21:15Z
subscriptions.operators.coreos.com                                       2019-02-13T06:17:49Z
tuneds.tuned.openshift.io                                                2019-02-13T06:20:33Z

Comment 3 sunzhaohua 2019-02-14 05:47:03 UTC
Created attachment 1534677
cluster-autoscaler-operator log

Comment 4 Jan Chaloupka 2019-02-18 15:10:51 UTC
Thank you for the verification. Based on the logs, it looks like the cluster autoscaler operator was looking for machines in a different API group. This was caused by the machine group pivoting we recently performed.
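
The CRD list in comment 2 shows the same resource names registered under more than one API group (for example `machinesets.cluster.k8s.io` and `machinesets.machine.openshift.io`), which is the situation in which a controller can end up watching the wrong group. The following is an illustrative sketch for spotting such overlaps from `oc get crd` names; it is not the operator's code, and the function name is hypothetical.

```python
from collections import defaultdict


def resources_in_multiple_groups(crd_names):
    """Map resource plural -> sorted API groups, for plurals found in more than one group."""
    groups = defaultdict(set)
    for name in crd_names:
        # A CRD name has the form <plural>.<group>; split at the first dot.
        plural, _, group = name.partition(".")
        groups[plural].add(group)
    return {p: sorted(g) for p, g in groups.items() if len(g) > 1}


# A subset of the CRD names listed in comment 2.
crds = [
    "machines.cluster.k8s.io",
    "machines.machine.openshift.io",
    "machinesets.cluster.k8s.io",
    "machinesets.machine.openshift.io",
    "machineautoscalers.autoscaling.openshift.io",
    "clusterautoscalers.autoscaling.openshift.io",
]

dupes = resources_in_multiple_groups(crds)
for plural, api_groups in sorted(dupes.items()):
    print(plural, api_groups)
```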

Comment 7 errata-xmlrpc 2019-06-04 10:44:00 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0758

