Bug 1694182 - [rebase] Pod readiness gate test is failing
Summary: [rebase] Pod readiness gate test is failing
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Node
Version: 4.1.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 4.1.0
Assignee: Seth Jennings
QA Contact: Weinan Liu
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-03-29 17:27 UTC by Clayton Coleman
Modified: 2019-06-04 10:46 UTC (History)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-06-04 10:46:34 UTC
Target Upstream Version:



Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2019:0758 None None None 2019-06-04 10:46:42 UTC

Description Clayton Coleman 2019-03-29 17:27:31 UTC
fail [k8s.io/kubernetes/test/e2e/common/pods.go:737]: Expected error:
    <*errors.errorString | 0xc42029b580>: {
        s: "timed out waiting for the condition",
    }
    timed out waiting for the condition
not to have occurred

Is the readiness flag gate even on?  If not, why is this test running?  If it is on, please:

a. verify it should be on
b. ensure the test isn't flaky

Setting severity to high because we need to know why the gate is on or whether it should be off. Once that's resolved it can be dropped to medium, but it still impacts CI with a 1/12 flake rate.

https://openshift-gce-devel.appspot.com/build/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-4.0/6254#openshift-tests-k8sio-pods-should-support-pod-readiness-gates-nodefeaturepodreadinessgate-suiteopenshiftconformanceparallel-suitek8s
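
For context, the failing test exercises pod readiness gates: a pod is reported Ready only when its containers are ready and every condition listed in spec.readinessGates has status "True". A minimal sketch of that rule follows. The function name and the condition type are hypothetical, and this is not the kubelet's actual code.

```python
from typing import Dict, List


def pod_is_ready(
    containers_ready: bool,
    readiness_gates: List[str],
    conditions: Dict[str, str],
) -> bool:
    """Return True if the pod should be reported Ready.

    containers_ready: whether all containers passed their readiness checks.
    readiness_gates:  condition types named in spec.readinessGates.
    conditions:       map of condition type -> status ("True", "False", "Unknown").
    """
    if not containers_ready:
        return False
    # A gate whose condition is missing from the status counts as not satisfied.
    return all(conditions.get(gate) == "True" for gate in readiness_gates)


# Hypothetical condition type, used only for illustration.
gate = "example.com/gate-a"
print(pod_is_ready(True, [gate], {}))
print(pod_is_ready(True, [gate], {gate: "True"}))
```

If the component that evaluates the gates and the test that sets the conditions disagree about this rule, the pod never reaches the expected state and the test times out waiting for it.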

Comment 1 Seth Jennings 2019-04-04 15:29:55 UTC
https://github.com/kubernetes/kubernetes/pull/69303

This PR introduced a change in both the kubelet and the e2e test.  The current version skew between the e2e suite (1.13) and the kubelet in RHCOS (1.12) is causing this failure.  Once the kubelet in RHCOS is 1.13 (which it already is, but as of yesterday the pivot takes it back to 1.12), this will go away.
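
A skew like this can be flagged by comparing the major.minor part of the two version strings. The sketch below assumes both versions are available as strings in the usual "vMAJOR.MINOR.PATCH" form. The function names and sample version strings are hypothetical, not taken from this cluster.

```python
import re
from typing import Tuple


def major_minor(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from a string such as 'v1.13.4+abcdef'."""
    match = re.match(r"v?(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def has_minor_skew(e2e_version: str, kubelet_version: str) -> bool:
    """Return True if the test suite and the kubelet differ in major.minor."""
    return major_minor(e2e_version) != major_minor(kubelet_version)


# Sample values for illustration only.
print(has_minor_skew("v1.13.4", "v1.12.4+0123abc"))
print(has_minor_skew("v1.13.4", "v1.13.4+0123abc"))
```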

Comment 2 Seth Jennings 2019-04-04 16:52:35 UTC
ART is pushing the new OS container that has the 1.13-based hyperkube right now.  Once this is done we can deploy/upgrade a cluster and verify this is fixed.

Comment 3 Seth Jennings 2019-04-04 18:30:57 UTC
Moving this to POST as a high level indicator that the fix is merged and verification is pending.  Don't want to dump this on QE.  If it works, I'll just close as this was a transient issue caused by rebase version skew.

Comment 4 Seth Jennings 2019-04-04 18:35:38 UTC
PR to re-enable test
https://github.com/openshift/origin/pull/22486

Comment 5 Seth Jennings 2019-04-04 22:59:39 UTC
This is a NodeConformance test.  Confirmed blocker.

Comment 6 Seth Jennings 2019-04-09 17:39:20 UTC
Origin CI release build 4.0.0-0.alpha-2019-04-09-164546 moved machine-os-content to a 1.13 base.
https://origin-release.svc.ci.openshift.org/releasestream/4.0.0-0.alpha/release/4.0.0-0.alpha-2019-04-09-164546

Comment 9 Weinan Liu 2019-04-22 10:00:01 UTC
(In reply to Seth Jennings from comment #3)
> Moving this to POST as a high level indicator that the fix is merged and
> verification is pending.  Don't want to dump this on QE.  If it works, I'll
> just close as this was a transient issue caused by rebase version skew.

Hi Seth,
Do you still need QE involved in the verification? If not and you have already verified the rebase issue, would you mind pushing it to VERIFIED?
Thanks!

Comment 11 errata-xmlrpc 2019-06-04 10:46:34 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0758

