Bug 1691119 - machine-api-operator is not reporting failure using clusteroperator
Summary: machine-api-operator is not reporting failure using clusteroperator
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Cloud Compute
Version: 4.1.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 4.1.0
Assignee: Jan Chaloupka
QA Contact: Jianwei Hou
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-03-20 22:39 UTC by Abhinav Dahiya
Modified: 2019-10-28 09:52 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-06-04 10:46:16 UTC
Target Upstream Version:



Links
System                   ID               Priority   Status   Summary   Last Updated
Red Hat Product Errata   RHBA-2019:0758   None       None     None      2019-06-04 10:46:23 UTC

Description Abhinav Dahiya 2019-03-20 22:39:41 UTC
Description of problem:

While working on bare-metal UPI (https://github.com/openshift/installer/pull/1416), I installed a cluster using the None platform.

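The Machine API Operator failed because it does not recognize None as a platform. It failed silently, without reporting the error to its ClusterOperator: the error appears only in the operator logs, and no machine-api entry shows up in the ClusterOperator list.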

oc --config dev-metal/auth/kubeconfig logs machine-api-operator-5f8d8dc78c-5k7dh -n openshift-machine-api
I0320 16:19:06.600839       1 start.go:39] Version: 0.1.0-256-g92bef467-dirty
I0320 16:19:06.602673       1 leaderelection.go:205] attempting to acquire leader lease  openshift-machine-api/machine-api-operator...
I0320 16:19:06.629429       1 leaderelection.go:214] successfully acquired lease openshift-machine-api/machine-api-operator
I0320 16:19:06.631028       1 operator.go:106] Starting Machine API Operator
I0320 16:19:06.731370       1 operator.go:114] Synced up caches
E0320 16:19:06.734565       1 operator.go:176] Failed getting operator config: no platform provider found on install config
E0320 16:19:06.742681       1 operator.go:176] Failed getting operator config: no platform provider found on install config
E0320 16:19:06.755565       1 operator.go:176] Failed getting operator config: no platform provider found on install config
E0320 16:19:06.778517       1 operator.go:176] Failed getting operator config: no platform provider found on install config
E0320 16:19:06.822930       1 operator.go:176] Failed getting operator config: no platform provider found on install config
E0320 16:19:06.906692       1 operator.go:176] Failed getting operator config: no platform provider found on install config
E0320 16:19:07.069804       1 operator.go:176] Failed getting operator config: no platform provider found on install config
E0320 16:19:07.392849       1 operator.go:176] Failed getting operator config: no platform provider found on install config
E0320 16:19:07.608933       1 operator.go:176] Failed getting operator config: no platform provider found on install config
E0320 16:19:08.036477       1 operator.go:176] Failed getting operator config: no platform provider found on install config
E0320 16:19:10.599568       1 operator.go:176] Failed getting operator config: no platform provider found on install config
E0320 16:19:15.725471       1 operator.go:176] Failed getting operator config: no platform provider found on install config
E0320 16:19:25.975529       1 operator.go:176] Failed getting operator config: no platform provider found on install config
E0320 16:19:46.459374       1 operator.go:176] Failed getting operator config: no platform provider found on install config
E0320 16:20:27.428836       1 operator.go:176] Failed getting operator config: no platform provider found on install config
E0320 16:21:49.359209       1 operator.go:176] Failed getting operator config: no platform provider found on install config
E0320 16:21:49.359491       1 operator.go:162] no platform provider found on install config

oc --config dev-metal/auth/kubeconfig get co
NAME                                  VERSION                           AVAILABLE   PROGRESSING   FAILING   SINCE
authentication                        4.0.0-0.alpha-2019-03-20-094557   True        False         False     50m
cluster-autoscaler                                                      True        False         True      2s
console                               4.0.0-0.alpha-2019-03-20-094557   True        False         False     50m
dns                                   4.0.0-0.alpha-2019-03-20-094557   True        False         False     6h19m
image-registry                        4.0.0-0.alpha-2019-03-20-094557   True        False         False     51m
ingress                               4.0.0-0.alpha-2019-03-20-094557   True        False         False     3h14m
kube-apiserver                        4.0.0-0.alpha-2019-03-20-094557   True        False         False     6h15m
kube-controller-manager               4.0.0-0.alpha-2019-03-20-094557   True        False         False     6h13m
kube-scheduler                        4.0.0-0.alpha-2019-03-20-094557   True        False         False     6h12m
machine-config                        4.0.0-0.alpha-2019-03-20-094557   False       True          True      6h19m
marketplace-operator                  4.0.0-0.alpha-2019-03-20-094557   True        False         False     6h15m
monitoring                            4.0.0-0.alpha-2019-03-20-094557   True        False         False     16m
network                               4.0.0-0.alpha-2019-03-20-094557   True        False         False     6h19m
node-tuning                           4.0.0-0.alpha-2019-03-20-094557   True        False         False     6h15m
openshift-apiserver                   4.0.0-0.alpha-2019-03-20-094557   True        False         False     17m
openshift-cloud-credential-operator   4.0.0-0.alpha-2019-03-20-094557   True        False         False     6h19m
openshift-controller-manager          4.0.0-0.alpha-2019-03-20-094557   True        False         False     50m
openshift-samples                     4.0.0-0.alpha-2019-03-20-094557   True        False         False     6h14m
operator-lifecycle-manager            4.0.0-0.alpha-2019-03-20-094557   True        False         False     6h19m
service-ca                            4.0.0-0.alpha-2019-03-20-094557   True        False         False     137m
service-catalog-apiserver             4.0.0-0.alpha-2019-03-20-094557   True        False         False     50m
service-catalog-controller-manager    4.0.0-0.alpha-2019-03-20-094557   True        False         False     137m
storage                                                                 True        False         False     6h15m


oc --config dev-metal/auth/kubeconfig get co | grep machine-api


Expected: the Machine API Operator should always report its status through its ClusterOperator whenever the API is available.

Comment 2 Jan Chaloupka 2019-03-26 10:56:18 UTC
PR merged

Comment 4 Wei Sun 2019-04-10 02:58:17 UTC
Please check whether this can be verified against the latest build.

Comment 5 sunzhaohua 2019-04-23 09:26:49 UTC
$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.1.0-0.nightly-2019-04-22-005054   True        False         76m     Cluster version is 4.1.0-0.nightly-2019-04-22-005054



$ oc logs -f machine-api-operator-7d58d4ddbd-f7d9b
I0423 07:39:20.284433       1 start.go:39] Version: 4.1.0-201904211700-dirty
I0423 07:39:20.287034       1 leaderelection.go:205] attempting to acquire leader lease  openshift-machine-api/machine-api-operator...
I0423 07:39:20.299364       1 leaderelection.go:214] successfully acquired lease openshift-machine-api/machine-api-operator
I0423 07:39:20.302118       1 operator.go:121] Starting Machine API Operator
I0423 07:39:20.402328       1 operator.go:129] Synced up caches
I0423 07:39:20.408679       1 status.go:172] machine-api clusterOperator status does not exist, creating &{{ } {machine-api      0 0001-01-01 00:00:00 +0000 UTC <nil> <nil> map[] map[] [] nil [] } {} {[] [{operator 4.1.0-0.nightly-2019-04-22-005054}] [{ namespaces  openshift-machine-api}] {[] <nil>}}}
I0423 07:39:20.415343       1 event.go:221] Event(v1.ObjectReference{Kind:"ClusterOperator", Namespace:"", Name:"machine-api", UID:"e80dc31f-659a-11e9-a21d-801844eef6b8", APIVersion:"config.openshift.io/v1", ResourceVersion:"2528", FieldPath:""}): type: 'Normal' reason: 'Status upgrade' Progressing towards operator: 4.1.0-0.nightly-2019-04-22-005054
E0423 07:42:37.267388       1 streamwatcher.go:109] Unable to decode an event from the watch stream: http2: server sent GOAWAY and closed the connection; LastStreamID=63, ErrCode=NO_ERROR, debug=""
W0423 07:42:37.313638       1 reflector.go:270] k8s.io/client-go/informers/factory.go:132: watch of *v1.Deployment ended with: too old resource version: 2562 (4450)
E0423 07:46:48.442485       1 streamwatcher.go:109] Unable to decode an event from the watch stream: http2: server sent GOAWAY and closed the connection; LastStreamID=85, ErrCode=NO_ERROR, debug=""
E0423 07:58:33.095111       1 streamwatcher.go:109] Unable to decode an event from the watch stream: http2: server sent GOAWAY and closed the connection; LastStreamID=161, ErrCode=NO_ERROR, debug=""
W0423 07:58:33.123442       1 reflector.go:270] k8s.io/client-go/informers/factory.go:132: watch of *v1.Deployment ended with: too old resource version: 15665 (17029)

$ oc get clusteroperator
NAME                                 VERSION                             AVAILABLE   PROGRESSING   FAILING   SINCE
authentication                       4.1.0-0.nightly-2019-04-22-005054   True        False         False     76m
cloud-credential                     4.1.0-0.nightly-2019-04-22-005054   True        False         False     94m
cluster-autoscaler                   4.1.0-0.nightly-2019-04-22-005054   True        False         False     94m
console                              4.1.0-0.nightly-2019-04-22-005054   True        False         False     78m
dns                                  4.1.0-0.nightly-2019-04-22-005054   True        False         False     94m
image-registry                       4.1.0-0.nightly-2019-04-22-005054   True        False         False     79m
ingress                              4.1.0-0.nightly-2019-04-22-005054   True        False         False     80m
kube-apiserver                       4.1.0-0.nightly-2019-04-22-005054   True        False                   89m
kube-controller-manager              4.1.0-0.nightly-2019-04-22-005054   True        False                   90m
kube-scheduler                       4.1.0-0.nightly-2019-04-22-005054   True        False                   90m
machine-api                          4.1.0-0.nightly-2019-04-22-005054   True        False         False     94m

Comment 7 errata-xmlrpc 2019-06-04 10:46:16 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0758

