Bug 2084463 - 5 control plane replica tests fail on ephemeral volumes
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Storage
Version: 4.11
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 4.11.0
Assignee: Fabio Bertinatto
QA Contact: Wei Duan
URL:
Whiteboard:
Duplicates: 1999964
Depends On:
Blocks:
Reported: 2022-05-12 08:13 UTC by Thomas Jungblut
Modified: 2022-08-10 11:11 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-08-10 11:11:30 UTC
Target Upstream Version:
Embargoed:



Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift release pull 28830 | 0 | None | open | Bug 2084463: Disable Generic Ephemeral tests etcd disruptive job | 2022-05-26 00:23:32 UTC
Red Hat Product Errata | RHSA-2022:5069 | 0 | None | None | None | 2022-08-10 11:11:42 UTC

Description Thomas Jungblut 2022-05-12 08:13:14 UTC
While looking into the 5 control plane replica tests, I saw some failures related to in-tree ephemeral volumes:

Failing tests:
[sig-storage] In-tree Volumes [Driver: gcepd] [Testpattern: Generic Ephemeral-volume (default fs) (immediate-binding)] ephemeral should create read-only inline ephemeral volume [Suite:openshift/conformance/parallel] [Suite:k8s]
[sig-storage] In-tree Volumes [Driver: gcepd] [Testpattern: Generic Ephemeral-volume (default fs) (immediate-binding)] ephemeral should create read/write inline ephemeral volume [Suite:openshift/conformance/parallel] [Suite:k8s]
[sig-storage] In-tree Volumes [Driver: gcepd] [Testpattern: Generic Ephemeral-volume (default fs) (immediate-binding)] ephemeral should support two pods which have the same volume definition [Suite:openshift/conformance/parallel] [Suite:k8s]


Example build:
https://prow.ci.openshift.org/view/gs/origin-ci-test/pr-logs/pull/openshift_cluster-etc[…]er-e2e-gcp-five-control-plane-replicas/1522236579484012544

You can grab any PR build in CEO (cluster-etcd-operator) at https://github.com/openshift/cluster-etcd-operator/pulls and check; this has been failing permanently since Jan 21st.

Alternatively, use CI search:
https://search.ci.openshift.org/?search=ephemeral+should+create+read-only+inline+ephemeral+volume&maxAge=168h&context=1&type=junit&name=.*five-control-plane-replicas&excludeName=&maxMatches=5&maxBytes=20971520&groupBy=job
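For checking the other failing tests the same way, the query above can be rebuilt programmatically. This is only a sketch: the parameter names and values are copied from the URL above, while the helper name `ci_search_url` and its defaults are hypothetical and not part of any CI tooling.

```python
from urllib.parse import urlencode

CI_SEARCH_BASE = "https://search.ci.openshift.org/"


def ci_search_url(test_name, job_pattern=".*five-control-plane-replicas", max_age="168h"):
    """Build a CI search URL using the same query parameters as the link above.

    Hypothetical helper: it only mirrors the parameters visible in that URL.
    """
    params = {
        "search": test_name,
        "maxAge": max_age,
        "context": 1,
        "type": "junit",
        "name": job_pattern,
        "excludeName": "",
        "maxMatches": 5,
        "maxBytes": 20971520,
        "groupBy": "job",
    }
    # safe="*" keeps the ".*" in the job pattern unescaped, as in the original link.
    return CI_SEARCH_BASE + "?" + urlencode(params, safe="*")


url = ci_search_url("ephemeral should create read-only inline ephemeral volume")
print(url)
```

Passing a different test name, for example "ephemeral should support two pods which have the same volume definition", gives the equivalent query for that test.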

Comment 1 Thomas Jungblut 2022-05-12 08:15:03 UTC
*** Bug 1999964 has been marked as a duplicate of this bug. ***

Comment 2 Fabio Bertinatto 2022-05-19 20:12:31 UTC
These tests are failing because pods cannot start due to NodeAffinity requirements in the volumes.

This is triggered by two events common to this job:

1. There are master nodes in all available zones, but that's not the case for worker nodes.
2. The StorageClass field volumeBindingMode is set to Immediate.

This can cause volumes to be provisioned in a zone where there are no worker nodes available. As a result, the pod referring to that volume will be unable to start.
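To make the failure mode concrete, here is a small Python model of it. It is a sketch under stated assumptions, not the scheduler's or provisioner's actual logic: the zone and node names are hypothetical, it assumes that with Immediate binding a volume may land in any zone that has a node, and it assumes the test pods can only run on worker nodes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    zone: str
    role: str  # "master" or "worker"


# Hypothetical layout: masters cover every zone, workers do not.
NODES = [
    Node("master-0", "zone-a", "master"),
    Node("master-1", "zone-b", "master"),
    Node("master-2", "zone-c", "master"),
    Node("master-3", "zone-a", "master"),
    Node("master-4", "zone-b", "master"),
    Node("worker-0", "zone-a", "worker"),
    Node("worker-1", "zone-b", "worker"),
]


def zones_with(nodes, role=None):
    """Zones that contain at least one node (optionally of a given role)."""
    return {n.zone for n in nodes if role is None or n.role == role}


def stranded_zones(nodes):
    """Zones where an Immediate-bound volume could be provisioned
    (assumption: any zone with a node) but where no worker exists."""
    return sorted(zones_with(nodes) - zones_with(nodes, "worker"))


def pod_can_start(nodes, volume_zone):
    """The volume's NodeAffinity pins the pod to volume_zone, so the pod
    needs a worker there (assumption: pods do not run on masters)."""
    return volume_zone in zones_with(nodes, "worker")


print(stranded_zones(NODES))
print(pod_can_start(NODES, "zone-c"), pod_can_start(NODES, "zone-a"))
```

In this model, any volume provisioned in a stranded zone leaves its pod unschedulable. If the volume's zone were chosen only after the pod was scheduled onto a worker, the volume would follow the worker's zone and that mismatch would not occur.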

Comment 6 errata-xmlrpc 2022-08-10 11:11:30 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5069

