Bug 2018517 - [sig-arch] events should not repeat pathologically failures - s390x CI
Summary: [sig-arch] events should not repeat pathologically failures - s3...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Cloud Compute
Version: 4.10
Hardware: s390x
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.11.0
Assignee: Mike Fedosin
QA Contact: Huali Liu
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-10-29 15:04 UTC by Lakshmi Ravichandran
Modified: 2022-08-10 10:39 UTC (History)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-08-10 10:39:06 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift cluster-api-provider-libvirt pull 230 | 0 | None | open | Bug 2018517: Create event only if the machine was modified | 2021-11-02 19:12:34 UTC
Github | openshift cluster-api-provider-libvirt pull 235 | 0 | None | open | Bug 2018517: fix the check that machine has been modified | 2022-03-11 11:57:15 UTC
Red Hat Product Errata | RHSA-2022:5069 | 0 | None | None | None | 2022-08-10 10:39:33 UTC

Description Lakshmi Ravichandran 2021-10-29 15:04:39 UTC
Description of problem:
We see failures with "[sig-arch] events should not repeat pathologically" in recent CI upgrade runs of OCP 4.9 to 4.10 and 4.8 to 4.9 for s390x.


: [sig-arch] events should not repeat pathologically	0s
2 events happened too frequently

event happened 34 times, something is wrong: ns/openshift-machine-api machine/libvirt-s390x-0-0-708-wltbm-worker-0-ck7tg - reason/Updated Updated Machine libvirt-s390x-0-0-708-wltbm-worker-0-ck7tg
event happened 34 times, something is wrong: ns/openshift-machine-api machine/libvirt-s390x-0-0-708-wltbm-worker-0-trzd6 - reason/Updated Updated Machine libvirt-s390x-0-0-708-wltbm-worker-0-trzd6
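
For readers unfamiliar with this check, the failure output above amounts to counting identical events and flagging any that repeat too often. Below is a minimal sketch of that idea in Python. The function name, the key layout, and the threshold of 20 are assumptions for illustration only; they are not taken from this report or from the test's source, and the object names are shortened placeholders.

```python
from collections import Counter


def pathological_events(events, threshold=20):
    """Return {event_key: count} for events seen more than `threshold` times.

    `events` is an iterable of hashable keys such as
    (namespace, involved_object, reason, message). The default threshold
    is an assumption for illustration, not a value taken from the report.
    """
    counts = Counter(events)
    return {key: n for key, n in counts.items() if n > threshold}


# Hypothetical data: one event repeated 34 times, another seen once.
key = ("openshift-machine-api", "machine/worker-0", "Updated", "Updated Machine worker-0")
other = ("openshift-machine-api", "machine/worker-1", "Created", "Created Machine worker-1")
observed = [key] * 34 + [other]

flagged = pathological_events(observed)
for (ns, obj, reason, msg), n in flagged.items():
    print(f"event happened {n} times, something is wrong: ns/{ns} {obj} - reason/{reason} {msg}")
```

With this sample data only the repeated "Updated" event is flagged; the single "Created" event stays below the assumed threshold.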

1) 4.9 to 4.10 upgrade --> https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-multiarch-master-nightly-4.10-upgrade-from-nightly-4.9-ocp-remote-libvirt-s390x/1452575786342027264

2) 4.8 to 4.9 upgrade --> https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-multiarch-master-nightly-4.9-upgrade-from-nightly-4.8-ocp-remote-libvirt-s390x/1452288892060307456

Search results --> https://search.ci.openshift.org/?search=reason%2FUpdated+Updated+Machine&maxAge=48h&context=1&type=bug%2Bjunit&name=s390x&excludeName=&maxMatches=5&maxBytes=20971520&groupBy=job

Version-Release number of selected component (if applicable):
ocp 4.10, 4.9

How reproducible:
Monitor the s390x CI jobs here https://prow.ci.openshift.org/?job=periodic*remote*libvirt*s390x for the upgrade jobs.

Steps to Reproduce:
-

Actual results:
The test fails.

Expected results:
The tests should pass.


Additional info:
- Observed on the libvirt platform.
- Related BZ: https://bugzilla.redhat.com/show_bug.cgi?id=1988992, where the same failure was reported earlier on the Azure platform.
- Based on Slack triage, it looks like the event should be created only after the update has succeeded, at https://github.com/openshift/cluster-api-provider-libvirt/blob/master/pkg/cloud/libvirt/actuators/machine/actuator.go#L209
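
The triage note and the linked pull request titles ("Create event only if the machine was modified") suggest a guard of roughly the following shape. This is only a sketch of the idea in Python; the actual provider is written in Go, and the `Machine`, `EventRecorder`, and `reconcile_update` names here are hypothetical stand-ins, not the provider's API or the code from the pull requests.

```python
from dataclasses import dataclass, field


@dataclass
class Machine:
    """Hypothetical stand-in for a Machine object; not the real API type."""
    name: str
    spec: dict = field(default_factory=dict)
    status: dict = field(default_factory=dict)


class EventRecorder:
    """Hypothetical recorder that just collects (reason, message) pairs."""

    def __init__(self):
        self.events = []

    def event(self, reason, message):
        self.events.append((reason, message))


def reconcile_update(current, desired, recorder):
    """Apply `desired` onto `current`; emit an event only on a real change."""
    modified = current.spec != desired.spec or current.status != desired.status
    if not modified:
        return False  # nothing changed, so stay silent
    current.spec = dict(desired.spec)
    current.status = dict(desired.status)
    # The event is recorded only after the update has been applied.
    recorder.event("Updated", f"Updated Machine {current.name}")
    return True


recorder = EventRecorder()
machine = Machine("worker-0", spec={"cpu": 2})

# 34 reconcile passes with no actual change: no events should be recorded.
for _ in range(34):
    reconcile_update(machine, Machine("worker-0", spec={"cpu": 2}), recorder)

# One pass with a real change: exactly one event.
reconcile_update(machine, Machine("worker-0", spec={"cpu": 4}), recorder)
print(len(recorder.events))
```

Without the `modified` guard, every reconcile pass would record an "Updated" event, which is the repetition pattern the failing test reports.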

Comment 6 Surender Yadav 2022-02-10 14:52:48 UTC
We still see failures with "[sig-arch] events should not repeat pathologically" in recent CI upgrade runs of OCP 4.10 to 4.11, 4.9 to 4.10 and 4.8 to 4.9 on s390x. 


: [sig-arch] events should not repeat pathologically
2 events happened too frequently

event happened 51 times, something is wrong: ns/openshift-machine-api machine/libvirt-s390x-1-1-1e5-btbz4-worker-0-jbj7n - reason/Updated Updated Machine libvirt-s390x-1-1-1e5-btbz4-worker-0-jbj7n
event happened 47 times, something is wrong: ns/openshift-machine-api machine/libvirt-s390x-1-1-1e5-btbz4-worker-0-sxf6q - reason/Updated Updated Machine libvirt-s390x-1-1-1e5-btbz4-worker-0-sxf6q


1) 4.8 to 4.9 upgrade --> https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-multiarch-master-nightly-4.9-upgrade-from-nightly-4.8-ocp-remote-libvirt-s390x/1491245642264088576

2) 4.9 to 4.10 upgrade --> https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-multiarch-master-nightly-4.10-upgrade-from-nightly-4.9-ocp-remote-libvirt-s390x/1491487255926149120

3) 4.10 to 4.11 upgrade --> https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-multiarch-master-nightly-4.11-upgrade-from-nightly-4.10-ocp-remote-libvirt-s390x/1491532562751819776


Search results --> https://search.ci.openshift.org/?search=reason%2FUpdated+Updated+Machine&maxAge=336h&context=1&type=bug%2Bjunit&name=s390x&excludeName=&maxMatches=5&maxBytes=20971520&groupBy=job

Version-Release number of selected component (if applicable):
ocp 4.11, 4.10, 4.9

How reproducible:
Monitor the s390x CI jobs here https://prow.ci.openshift.org/?job=periodic*remote*libvirt*s390x for the upgrade jobs.

Steps to Reproduce:
-
Actual results:
The test fails

Expected results:
The tests should pass.

Comment 12 Huali Liu 2022-04-11 01:44:46 UTC
Hi Joel, do you think this should be backported to 4.10 and 4.9 to resolve the "[sig-arch] events should not repeat pathologically" failures in the 4.9 to 4.10 and 4.8 to 4.9 upgrade jobs?

Comment 14 errata-xmlrpc 2022-08-10 10:39:06 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5069

