Bug 1988992 - Worker machine object updated too many times [Azure]
Summary: Worker machine object updated too many times [Azure]
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Cloud Compute
Version: 4.9
Hardware: Unspecified
OS: Unspecified
medium
medium
Target Milestone: ---
: 4.9.0
Assignee: Mike Fedosin
QA Contact: sunzhaohua
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2021-08-02 08:50 UTC by Jan Chaloupka
Modified: 2021-10-29 12:26 UTC (History)
3 users (show)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-10-29 12:26:40 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Github openshift cluster-api-provider-azure pull 227 0 None Merged Bug 1988992: Create event only if the machine was modified 2021-11-02 22:39:52 UTC
Red Hat Product Errata RHSA-2021:3759 0 None None None 2021-10-18 17:43:58 UTC

Description Jan Chaloupka 2021-08-02 08:50:13 UTC
The recent CI runs of https://testgrid.k8s.io/redhat-openshift-ocp-release-4.9-informing#periodic-ci-openshift-release-master-ci-4.9-e2e-azure-serial fail with (https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.9-e2e-azure-serial/1421911764836028416):

```
1 events happened too frequently

event happened 21 times, something is wrong: ns/openshift-machine-api machine/ci-op-c8k19g5q-52e47-6kzkj-worker-centralus1-xwg7l - reason/Updated Updated machine "ci-op-c8k19g5q-52e47-6kzkj-worker-centralus1-xwg7l"
```

The failures originates from "[sig-arch] events should not repeat pathologically" test introduced in https://github.com/openshift/origin/pull/26323. The test fails if the same event is repeated too many times in a row (at least 20 occurrences).

It might be a valid failure. Or, at least motivation to decrease the number of "Updated machine" events.

Checking machine controller logs from https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.9-e2e-azure-serial/1421911764836028416/artifacts/e2e-azure-serial/gather-extra/artifacts/pods/openshift-machine-api_machine-api-controllers-d8cbc5f4d-m7jhp_machine-controller.log:
```
I0801 20:10:35.665776       1 recorder.go:104] controller-runtime/manager/events "msg"="Normal"  "message"="Updated machine \"ci-op-c8k19g5q-52e47-6kzkj-worker-centralus1-xwg7l\"" "object"={"kind":"Machine","namespace":"openshift-machine-api","name":"ci-op-c8k19g5q-52e47-6kzkj-worker-centralus1-xwg7l","uid":"1413a2eb-c904-4abe-97e8-1c806bb38240","apiVersion":"machine.openshift.io/v1beta1","resourceVersion":"49186"} "reason"="Updated"
I0801 20:10:40.718697       1 recorder.go:104] controller-runtime/manager/events "msg"="Normal"  "message"="Updated machine \"ci-op-c8k19g5q-52e47-6kzkj-worker-centralus1-xwg7l\"" "object"={"kind":"Machine","namespace":"openshift-machine-api","name":"ci-op-c8k19g5q-52e47-6kzkj-worker-centralus1-xwg7l","uid":"1413a2eb-c904-4abe-97e8-1c806bb38240","apiVersion":"machine.openshift.io/v1beta1","resourceVersion":"49199"} "reason"="Updated"
I0801 20:10:42.773245       1 recorder.go:104] controller-runtime/manager/events "msg"="Normal"  "message"="Updated machine \"ci-op-c8k19g5q-52e47-6kzkj-worker-centralus1-xwg7l\"" "object"={"kind":"Machine","namespace":"openshift-machine-api","name":"ci-op-c8k19g5q-52e47-6kzkj-worker-centralus1-xwg7l","uid":"1413a2eb-c904-4abe-97e8-1c806bb38240","apiVersion":"machine.openshift.io/v1beta1","resourceVersion":"49236"} "reason"="Updated"
I0801 20:10:55.379793       1 recorder.go:104] controller-runtime/manager/events "msg"="Normal"  "message"="Updated machine \"ci-op-c8k19g5q-52e47-6kzkj-worker-centralus1-xwg7l\"" "object"={"kind":"Machine","namespace":"openshift-machine-api","name":"ci-op-c8k19g5q-52e47-6kzkj-worker-centralus1-xwg7l","uid":"1413a2eb-c904-4abe-97e8-1c806bb38240","apiVersion":"machine.openshift.io/v1beta1","resourceVersion":"49236"} "reason"="Updated"
I0801 20:11:25.684447       1 recorder.go:104] controller-runtime/manager/events "msg"="Normal"  "message"="Updated machine \"ci-op-c8k19g5q-52e47-6kzkj-worker-centralus1-xwg7l\"" "object"={"kind":"Machine","namespace":"openshift-machine-api","name":"ci-op-c8k19g5q-52e47-6kzkj-worker-centralus1-xwg7l","uid":"1413a2eb-c904-4abe-97e8-1c806bb38240","apiVersion":"machine.openshift.io/v1beta1","resourceVersion":"49236"} "reason"="Updated"
I0801 20:11:55.961125       1 recorder.go:104] controller-runtime/manager/events "msg"="Normal"  "message"="Updated machine \"ci-op-c8k19g5q-52e47-6kzkj-worker-centralus1-xwg7l\"" "object"={"kind":"Machine","namespace":"openshift-machine-api","name":"ci-op-c8k19g5q-52e47-6kzkj-worker-centralus1-xwg7l","uid":"1413a2eb-c904-4abe-97e8-1c806bb38240","apiVersion":"machine.openshift.io/v1beta1","resourceVersion":"49236"} "reason"="Updated"
I0801 20:12:26.296860       1 recorder.go:104] controller-runtime/manager/events "msg"="Normal"  "message"="Updated machine \"ci-op-c8k19g5q-52e47-6kzkj-worker-centralus1-xwg7l\"" "object"={"kind":"Machine","namespace":"openshift-machine-api","name":"ci-op-c8k19g5q-52e47-6kzkj-worker-centralus1-xwg7l","uid":"1413a2eb-c904-4abe-97e8-1c806bb38240","apiVersion":"machine.openshift.io/v1beta1","resourceVersion":"49236"} "reason"="Updated"
I0801 20:12:26.611072       1 recorder.go:104] controller-runtime/manager/events "msg"="Normal"  "message"="Updated machine \"ci-op-c8k19g5q-52e47-6kzkj-worker-centralus1-xwg7l\"" "object"={"kind":"Machine","namespace":"openshift-machine-api","name":"ci-op-c8k19g5q-52e47-6kzkj-worker-centralus1-xwg7l","uid":"1413a2eb-c904-4abe-97e8-1c806bb38240","apiVersion":"machine.openshift.io/v1beta1","resourceVersion":"49967"} "reason"="Updated"
I0801 20:12:56.635267       1 recorder.go:104] controller-runtime/manager/events "msg"="Normal"  "message"="Updated machine \"ci-op-c8k19g5q-52e47-6kzkj-worker-centralus1-xwg7l\"" "object"={"kind":"Machine","namespace":"openshift-machine-api","name":"ci-op-c8k19g5q-52e47-6kzkj-worker-centralus1-xwg7l","uid":"1413a2eb-c904-4abe-97e8-1c806bb38240","apiVersion":"machine.openshift.io/v1beta1","resourceVersion":"49967"} "reason"="Updated"
I0801 20:13:26.959965       1 recorder.go:104] controller-runtime/manager/events "msg"="Normal"  "message"="Updated machine \"ci-op-c8k19g5q-52e47-6kzkj-worker-centralus1-xwg7l\"" "object"={"kind":"Machine","namespace":"openshift-machine-api","name":"ci-op-c8k19g5q-52e47-6kzkj-worker-centralus1-xwg7l","uid":"1413a2eb-c904-4abe-97e8-1c806bb38240","apiVersion":"machine.openshift.io/v1beta1","resourceVersion":"49967"} "reason"="Updated"
I0801 20:13:57.334110       1 recorder.go:104] controller-runtime/manager/events "msg"="Normal"  "message"="Updated machine \"ci-op-c8k19g5q-52e47-6kzkj-worker-centralus1-xwg7l\"" "object"={"kind":"Machine","namespace":"openshift-machine-api","name":"ci-op-c8k19g5q-52e47-6kzkj-worker-centralus1-xwg7l","uid":"1413a2eb-c904-4abe-97e8-1c806bb38240","apiVersion":"machine.openshift.io/v1beta1","resourceVersion":"49967"} "reason"="Updated"
I0801 20:14:27.681824       1 recorder.go:104] controller-runtime/manager/events "msg"="Normal"  "message"="Updated machine \"ci-op-c8k19g5q-52e47-6kzkj-worker-centralus1-xwg7l\"" "object"={"kind":"Machine","namespace":"openshift-machine-api","name":"ci-op-c8k19g5q-52e47-6kzkj-worker-centralus1-xwg7l","uid":"1413a2eb-c904-4abe-97e8-1c806bb38240","apiVersion":"machine.openshift.io/v1beta1","resourceVersion":"49967"} "reason"="Updated"
I0801 20:14:58.058939       1 recorder.go:104] controller-runtime/manager/events "msg"="Normal"  "message"="Updated machine \"ci-op-c8k19g5q-52e47-6kzkj-worker-centralus1-xwg7l\"" "object"={"kind":"Machine","namespace":"openshift-machine-api","name":"ci-op-c8k19g5q-52e47-6kzkj-worker-centralus1-xwg7l","uid":"1413a2eb-c904-4abe-97e8-1c806bb38240","apiVersion":"machine.openshift.io/v1beta1","resourceVersion":"49967"} "reason"="Updated"
I0801 20:15:22.036123       1 recorder.go:104] controller-runtime/manager/events "msg"="Normal"  "message"="Updated machine \"ci-op-c8k19g5q-52e47-6kzkj-worker-centralus1-xwg7l\"" "object"={"kind":"Machine","namespace":"openshift-machine-api","name":"ci-op-c8k19g5q-52e47-6kzkj-worker-centralus1-xwg7l","uid":"1413a2eb-c904-4abe-97e8-1c806bb38240","apiVersion":"machine.openshift.io/v1beta1","resourceVersion":"51350"} "reason"="Updated"
I0801 20:15:22.355123       1 recorder.go:104] controller-runtime/manager/events "msg"="Normal"  "message"="Updated machine \"ci-op-c8k19g5q-52e47-6kzkj-worker-centralus1-xwg7l\"" "object"={"kind":"Machine","namespace":"openshift-machine-api","name":"ci-op-c8k19g5q-52e47-6kzkj-worker-centralus1-xwg7l","uid":"1413a2eb-c904-4abe-97e8-1c806bb38240","apiVersion":"machine.openshift.io/v1beta1","resourceVersion":"51409"} "reason"="Updated"
I0801 20:15:28.349501       1 recorder.go:104] controller-runtime/manager/events "msg"="Normal"  "message"="Updated machine \"ci-op-c8k19g5q-52e47-6kzkj-worker-centralus1-xwg7l\"" "object"={"kind":"Machine","namespace":"openshift-machine-api","name":"ci-op-c8k19g5q-52e47-6kzkj-worker-centralus1-xwg7l","uid":"1413a2eb-c904-4abe-97e8-1c806bb38240","apiVersion":"machine.openshift.io/v1beta1","resourceVersion":"51409"} "reason"="Updated"
I0801 20:16:12.400193       1 recorder.go:104] controller-runtime/manager/events "msg"="Normal"  "message"="Updated machine \"ci-op-c8k19g5q-52e47-6kzkj-worker-centralus1-xwg7l\"" "object"={"kind":"Machine","namespace":"openshift-machine-api","name":"ci-op-c8k19g5q-52e47-6kzkj-worker-centralus1-xwg7l","uid":"1413a2eb-c904-4abe-97e8-1c806bb38240","apiVersion":"machine.openshift.io/v1beta1","resourceVersion":"52207"} "reason"="Updated"
I0801 20:17:33.264669       1 recorder.go:104] controller-runtime/manager/events "msg"="Normal"  "message"="Updated machine \"ci-op-c8k19g5q-52e47-6kzkj-worker-centralus1-xwg7l\"" "object"={"kind":"Machine","namespace":"openshift-machine-api","name":"ci-op-c8k19g5q-52e47-6kzkj-worker-centralus1-xwg7l","uid":"1413a2eb-c904-4abe-97e8-1c806bb38240","apiVersion":"machine.openshift.io/v1beta1","resourceVersion":"52207"} "reason"="Updated"
I0801 20:27:02.482935       1 recorder.go:104] controller-runtime/manager/events "msg"="Normal"  "message"="Updated machine \"ci-op-c8k19g5q-52e47-6kzkj-worker-centralus1-xwg7l\"" "object"={"kind":"Machine","namespace":"openshift-machine-api","name":"ci-op-c8k19g5q-52e47-6kzkj-worker-centralus1-xwg7l","uid":"1413a2eb-c904-4abe-97e8-1c806bb38240","apiVersion":"machine.openshift.io/v1beta1","resourceVersion":"52207"} "reason"="Updated"
I0801 20:37:23.248666       1 recorder.go:104] controller-runtime/manager/events "msg"="Normal"  "message"="Updated machine \"ci-op-c8k19g5q-52e47-6kzkj-worker-centralus1-xwg7l\"" "object"={"kind":"Machine","namespace":"openshift-machine-api","name":"ci-op-c8k19g5q-52e47-6kzkj-worker-centralus1-xwg7l","uid":"1413a2eb-c904-4abe-97e8-1c806bb38240","apiVersion":"machine.openshift.io/v1beta1","resourceVersion":"52207"} "reason"="Updated"
I0801 20:47:46.701092       1 recorder.go:104] controller-runtime/manager/events "msg"="Normal"  "message"="Updated machine \"ci-op-c8k19g5q-52e47-6kzkj-worker-centralus1-xwg7l\"" "object"={"kind":"Machine","namespace":"openshift-machine-api","name":"ci-op-c8k19g5q-52e47-6kzkj-worker-centralus1-xwg7l","uid":"1413a2eb-c904-4abe-97e8-1c806bb38240","apiVersion":"machine.openshift.io/v1beta1","resourceVersion":"52207"} "reason"="Updated"
```

and the corresponding timestamps:
```
20:10:35.665776
20:10:40.718697
20:10:42.773245
20:10:55.379793
20:11:25.684447
20:11:55.961125
20:12:26.296860
20:12:26.611072
20:12:56.635267
20:13:26.959965
20:13:57.334110
20:14:27.681824
20:14:58.058939
20:15:22.036123
20:15:22.355123
20:15:28.349501
20:16:12.400193
20:17:33.264669
20:27:02.482935
20:37:23.248666
20:47:46.701092
```

Please let us know if the amount of machine object updates is acceptable (and thus we need to update the test to ignore the events for the machine controller or increase the threshold) or if the amount can be reduced.

Comment 2 sunzhaohua 2021-08-11 06:49:52 UTC
Verified
clusterversion: 4.9.0-0.nightly-2021-08-07-175228

$ oc logs machine-api-controllers-86cbb699b6-cqqs7 -c machine-controller | grep "Updated machine"
$  

After delete a machine, logs show "Updated machine"
$  oc delete machine zhsun-811-px5nm-worker-centralus1-jwdww
machine.machine.openshift.io "zhsun-811-px5nm-worker-centralus1-jwdww" deleted
$  oc logs machine-api-controllers-86cbb699b6-cqqs7 -c machine-controller | grep "Updated machine"
I0811 06:35:41.473296       1 recorder.go:104] controller-runtime/manager/events "msg"="Normal"  "message"="Updated machine \"zhsun-811-px5nm-worker-centralus1-6wc42\"" "object"={"kind":"Machine","namespace":"openshift-machine-api","name":"zhsun-811-px5nm-worker-centralus1-6wc42","uid":"336d9822-243d-48b6-9453-c8f1525478b6","apiVersion":"machine.openshift.io/v1beta1","resourceVersion":"98150"} "reason"="Updated"
I0811 06:36:42.145008       1 recorder.go:104] controller-runtime/manager/events "msg"="Normal"  "message"="Updated machine \"zhsun-811-px5nm-worker-centralus1-6wc42\"" "object"={"kind":"Machine","namespace":"openshift-machine-api","name":"zhsun-811-px5nm-worker-centralus1-6wc42","uid":"336d9822-243d-48b6-9453-c8f1525478b6","apiVersion":"machine.openshift.io/v1beta1","resourceVersion":"98280"} "reason"="Updated"

Comment 5 errata-xmlrpc 2021-10-18 17:43:46 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.9.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:3759

Comment 6 Surender Yadav 2021-10-25 15:41:13 UTC
We can see similar failure with recent CI upgrade runs of OCP 4.9 to 4.10 and 4.8 to 4.9 for s390x. 


: [sig-arch] events should not repeat pathologically expand_less	0s
2 events happened too frequently

event happened 34 times, something is wrong: ns/openshift-machine-api machine/libvirt-s390x-0-0-708-wltbm-worker-0-ck7tg - reason/Updated Updated Machine libvirt-s390x-0-0-708-wltbm-worker-0-ck7tg
event happened 34 times, something is wrong: ns/openshift-machine-api machine/libvirt-s390x-0-0-708-wltbm-worker-0-trzd6 - reason/Updated Updated Machine libvirt-s390x-0-0-708-wltbm-worker-0-trzd6

1) 4.9 to 4.10 upgrade --> https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-multiarch-master-nightly-4.10-upgrade-from-nightly-4.9-ocp-remote-libvirt-s390x/1452575786342027264

2) 4.8 to 4.9 upgrade --> https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-multiarch-master-nightly-4.9-upgrade-from-nightly-4.8-ocp-remote-libvirt-s390x/1452288892060307456

Search results --> https://search.ci.openshift.org/?search=reason%2FUpdated+Updated+Machine&maxAge=48h&context=1&type=bug%2Bjunit&name=s390x&excludeName=&maxMatches=5&maxBytes=20971520&groupBy=job

Comment 7 Mike Fedosin 2021-10-29 12:26:40 UTC
Folks, I think the issue you see is not related to Azure platform. The bug for Azure has been fixed in cluster-api-provider-azure in 4.9 by this commit: https://github.com/openshift/cluster-api-provider-azure/pull/227

Since you see a similar issue for libvirt, I recommend to create another bz and specify Libvirt platform in the description, so it can be fixed in https://github.com/openshift/cluster-api-provider-libvirt


Note You need to log in before you can comment on or make changes to this bug.