Bug 2106216 - Cluster upgrade.[sig-network-edge] Verify DNS availability during and after upgrade success
Keywords:
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.11
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.12.0
Assignee: Suleyman Akbas
QA Contact: Hongan Li
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2022-07-12 04:39 UTC by W. Trevor King
Modified: 2022-08-31 10:37 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-08-31 10:37:44 UTC
Target Upstream Version:
Embargoed:



Description W. Trevor King 2022-07-12 04:39:00 UTC
Cluster upgrade.[sig-network-edge] Verify DNS availability during and after upgrade success

is failing frequently in CI, see:
https://sippy.ci.openshift.org/sippy-ng/tests/4.12/analysis?test=Cluster%20upgrade.%5Bsig-network-edge%5D%20Verify%20DNS%20availability%20during%20and%20after%20upgrade%20success

and:

$ w3m -dump -cols 200 'https://search.ci.openshift.org/?type=junit&maxAge=24h&search=success+rate+is+less+than+99.+on+the+node' | grep 'failures match' | sort
periodic-ci-openshift-multiarch-master-nightly-4.12-upgrade-from-stable-4.11-ocp-e2e-aws-arm64 (all) - 3 runs, 33% failed, 100% of failures match = 33% impact
periodic-ci-openshift-release-master-ci-4.12-e2e-aws-sdn-upgrade (all) - 1 runs, 100% failed, 100% of failures match = 100% impact
periodic-ci-openshift-release-master-ci-4.12-upgrade-from-stable-4.11-from-stable-4.10-e2e-aws-sdn-upgrade (all) - 1 runs, 100% failed, 100% of failures match = 100% impact
periodic-ci-openshift-release-master-nightly-4.11-e2e-aws-upgrade (all) - 20 runs, 20% failed, 25% of failures match = 5% impact
periodic-ci-openshift-release-master-nightly-4.12-upgrade-from-stable-4.11-e2e-aws-sdn-upgrade (all) - 4 runs, 50% failed, 50% of failures match = 25% impact
pull-ci-openshift-cluster-monitoring-operator-master-e2e-agnostic-upgrade (all) - 6 runs, 50% failed, 33% of failures match = 17% impact
release-openshift-origin-installer-e2e-aws-upgrade (all) - 2 runs, 50% failed, 100% of failures match = 50% impact
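
The "impact" column in the search output above appears to be roughly the failure rate multiplied by the share of failures that match the query. The following is a minimal sketch that parses one such line and recomputes that estimate. The formula is an assumption inferred from the numbers shown, not a documented definition from search.ci, and the names `parse_line` and `estimated_impact` are hypothetical helpers.

```python
import re

# Pattern for lines such as:
#   <job> (all) - 20 runs, 20% failed, 25% of failures match = 5% impact
LINE_RE = re.compile(
    r"^(?P<job>\S+) \(all\) - (?P<runs>\d+) runs, (?P<failed>\d+)% failed, "
    r"(?P<match>\d+)% of failures match = (?P<impact>\d+)% impact$"
)


def parse_line(line):
    """Split one 'failures match' summary line into its numeric fields."""
    m = LINE_RE.match(line.strip())
    if m is None:
        raise ValueError("unrecognized summary line: %r" % line)
    fields = m.groupdict()
    return {
        "job": fields["job"],
        "runs": int(fields["runs"]),
        "failed_pct": int(fields["failed"]),
        "match_pct": int(fields["match"]),
        "impact_pct": int(fields["impact"]),
    }


def estimated_impact(failed_pct, match_pct):
    """Assumed relation: impact ~= (% of runs failed) * (% of failures matching)."""
    return failed_pct * match_pct / 100.0


sample = (
    "periodic-ci-openshift-release-master-nightly-4.11-e2e-aws-upgrade (all) - "
    "20 runs, 20% failed, 25% of failures match = 5% impact"
)
row = parse_line(sample)
print(row["job"], estimated_impact(row["failed_pct"], row["match_pct"]))
```

Because the displayed percentages are already rounded, the product can differ slightly from the reported impact. For the cluster-monitoring-operator job, 50% × 33% gives 16.5 while the report shows 17%, presumably because the report is computed from raw run counts (1 matching failure in 6 runs is about 16.7%).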

Looks like it impacts 4.11 and 4.12, although [1] is still looking pretty happy for this test case.  The 4.11.0-rc.1 to rc.2 update [2] hit this:

fail [github.com/openshift/origin/test/e2e/upgrade/dns/dns.go:138]: Unexpected error:
    <*errors.errorString | 0xc0021e5200>: {
        s: "success rate is less than 99% on the node ip-10-0-157-24.us-east-2.compute.internal: [98.39]",

which sounds like a failure for:

  disruption_tests: [sig-network-edge] Verify DNS availability during and after upgrade success

even though it was reported under:

  : [sig-arch][Feature:ClusterUpgrade] Cluster should remain functional during upgrade [Disruptive] [Serial]

and [3] shows some failures too, although without much deep history.
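
For readers unfamiliar with the failure message, the check it describes can be sketched as a per-node threshold on the percentage of successful DNS lookups. This is only an illustration of the 99% rule quoted in the error, not the actual logic in dns.go. The function names, the input shape, and the lookup counts are all hypothetical (61 successes out of 62 lookups is used only because it happens to round to 98.39).

```python
def success_rate(successes, total):
    """Percentage of successful lookups; total must be positive."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * successes / total


def nodes_below_threshold(counts, threshold=99.0):
    """Return {node: rate} for nodes whose success rate is under threshold.

    counts maps a node name to a (successes, total) tuple. This input
    shape is an assumption made for the sketch.
    """
    failing = {}
    for node, (ok, total) in counts.items():
        rate = success_rate(ok, total)
        if rate < threshold:
            failing[node] = round(rate, 2)
    return failing


# Hypothetical counts, not taken from the CI logs.
counts = {
    "node-a": (61, 62),    # about 98.39%, below the 99% threshold
    "node-b": (100, 100),  # 100%, passes
}
for node, rate in nodes_below_threshold(counts).items():
    print("success rate is less than 99%% on the node %s: [%.2f]" % (node, rate))
```

With a threshold this tight, a single failed lookup among fewer than 100 samples is enough to fail a node, which may be relevant when judging how noisy the test is.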

[1]: https://testgrid.k8s.io/redhat-openshift-ocp-release-4.11-informing#periodic-ci-openshift-release-master-nightly-4.11-e2e-aws-upgrade
[2]: https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-upgrade/1546606843319554048
[3]: https://testgrid.k8s.io/redhat-openshift-ocp-release-4.12-informing#periodic-ci-openshift-release-master-ci-4.12-e2e-aws-sdn-upgrade

Comment 1 Miciah Dashiel Butler Masters 2022-07-12 14:19:00 UTC
Right now, Sippy is showing that the success rate has dropped from 98.9% to 98.4%.  This is a relatively new test, so we don't have much historical data to judge how significant this drop is.  So I'm marking this BZ blocker-, and the team can prioritize it after Shift Week.

