Bug 1868760 - [4.4] node client cert requests armoring: deny pod's access to /config/master API endpoint
Keywords:
Status: VERIFIED
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Cloud Compute
Version: 4.4
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 4.6.0
Assignee: Michael McCune
QA Contact: sunzhaohua
URL:
Whiteboard: non-multi-arch
Duplicates: 1868464
Depends On:
Blocks: 1876931 1868464
Reported: 2020-08-13 18:17 UTC by Micah Abbott
Modified: 2020-09-14 08:23 UTC (History)
CC List: 7 users

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of: 1868464
Environment:
node client cert requests armoring: [Top Level] node client cert requests armoring: deny pod's access to /config/master API endpoint [Suite:openshift/conformance/parallel]
Last Closed:
Target Upstream Version:




Links
System: Github
ID: openshift origin pull 25480
Priority: None
Status: closed
Summary: Bug 1868760: Drop quay.io/fedora/fedora:32-x86_64 in favor of docker.io/fedora:32
Last Updated: 2020-09-14 01:11:13 UTC

Description Micah Abbott 2020-08-13 18:17:16 UTC
+++ This bug was initially created as a clone of Bug #1868464 +++

test:
node client cert requests armoring: deny pod's access to /config/master API endpoint 

is failing frequently in CI, see search results:
https://search.ci.openshift.org/?maxAge=168h&context=1&type=bug%2Bjunit&name=&maxMatches=5&maxBytes=20971520&groupBy=job&search=node+client+cert+requests+armoring%3A+deny+pod%27s+access+to+%2Fconfig%2Fmaster+API+endpoint

fail [github.com/openshift/origin/test/extended/csrapprover/csrapprover.go:48]: Unexpected error:
    <*errors.errorString | 0xc0002981c0>: {
        s: "timed out waiting for the condition",
    }
    timed out waiting for the condition
occurred

https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/release-openshift-origin-installer-e2e-remote-libvirt-s390x-4.3/1293578108904935424
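
The "timed out waiting for the condition" message is what a polling helper returns when its condition never becomes true. The pod table in the log quoted in this report shows the test pod in phase Failed, which a run-to-completion pod never leaves, so a wait for it to run can only end in a timeout. A minimal sketch of a poll loop that reports a terminal phase immediately instead of timing out. It assumes the test waits on the pod phase, and pollPhase, getPhase, and errTimeout are hypothetical names, not the helper used in csrapprover.go:

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errTimeout mirrors the message seen in the failure; the name is hypothetical.
var errTimeout = errors.New("timed out waiting for the condition")

// pollPhase calls getPhase every interval until the pod is Running,
// reaches a terminal phase, or the timeout elapses.
func pollPhase(getPhase func() string, interval, timeout time.Duration) error {
	deadline := time.Now().Add(timeout)
	for {
		switch phase := getPhase(); phase {
		case "Running":
			return nil
		case "Failed", "Succeeded":
			// With --restart Never these phases are final, so stop polling.
			return fmt.Errorf("pod reached terminal phase %q before running", phase)
		}
		if time.Now().After(deadline) {
			return errTimeout
		}
		time.Sleep(interval)
	}
}

func main() {
	// Simulated phase sequence: Pending twice, then Failed.
	phases := []string{"Pending", "Pending", "Failed"}
	i := 0
	getPhase := func() string {
		p := phases[i]
		if i < len(phases)-1 {
			i++
		}
		return p
	}
	fmt.Println(pollPhase(getPhase, 10*time.Millisecond, time.Second))
}
```

With a check like this, the failure message would name the pod phase rather than a generic timeout, which points more directly at the container crash.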

--- Additional comment from Seth Jennings on 2020-08-12 18:51:14 UTC ---

failure context

=============
[It] deny pod's access to /config/master API endpoint [Suite:openshift/conformance/parallel]
  /go/src/github.com/openshift/origin/_output/local/go/src/github.com/openshift/origin/test/extended/csrapprover/csrapprover.go:36
Aug 12 17:31:23.259: INFO: Running 'oc --namespace=e2e-test-cluster-client-cert-bn47n --config=/tmp/configfile787210623 run get-bootstrap-creds --labels name=get-bootstrap-creds --image quay.io/fedora/fedora:32-x86_64 --restart Never --command -- /bin/bash -c sleep infinity'
[AfterEach] node client cert requests armoring:
  /go/src/github.com/openshift/origin/_output/local/go/src/github.com/openshift/origin/test/extended/util/client.go:101
STEP: Collecting events from namespace "e2e-test-cluster-client-cert-bn47n".
STEP: Found 5 events.
Aug 12 17:34:25.311: INFO: At 0001-01-01 00:00:00 +0000 UTC - event for get-bootstrap-creds: {default-scheduler } Scheduled: Successfully assigned e2e-test-cluster-client-cert-bn47n/get-bootstrap-creds to ci-op-pbbtjczd-416f4-lv9g6-worker-0-hbwd6
Aug 12 17:34:25.311: INFO: At 2020-08-12 17:31:26 +0000 UTC - event for get-bootstrap-creds: {kubelet ci-op-pbbtjczd-416f4-lv9g6-worker-0-hbwd6} Pulling: Pulling image "quay.io/fedora/fedora:32-x86_64"
Aug 12 17:34:25.311: INFO: At 2020-08-12 17:31:38 +0000 UTC - event for get-bootstrap-creds: {kubelet ci-op-pbbtjczd-416f4-lv9g6-worker-0-hbwd6} Pulled: Successfully pulled image "quay.io/fedora/fedora:32-x86_64"
Aug 12 17:34:25.311: INFO: At 2020-08-12 17:31:38 +0000 UTC - event for get-bootstrap-creds: {kubelet ci-op-pbbtjczd-416f4-lv9g6-worker-0-hbwd6} Created: Created container get-bootstrap-creds
Aug 12 17:34:25.311: INFO: At 2020-08-12 17:31:38 +0000 UTC - event for get-bootstrap-creds: {kubelet ci-op-pbbtjczd-416f4-lv9g6-worker-0-hbwd6} Started: Started container get-bootstrap-creds
Aug 12 17:34:25.451: INFO: POD                  NODE                                       PHASE   GRACE  CONDITIONS
Aug 12 17:34:25.451: INFO: get-bootstrap-creds  ci-op-pbbtjczd-416f4-lv9g6-worker-0-hbwd6  Failed         [{Initialized True 0001-01-01 00:00:00 +0000 UTC 2020-08-12 17:31:24 +0000 UTC  } {Ready False 0001-01-01 00:00:00 +0000 UTC 2020-08-12 17:31:24 +0000 UTC ContainersNotReady containers with unready status: [get-bootstrap-creds]} {ContainersReady False 0001-01-01 00:00:00 +0000 UTC 2020-08-12 17:31:24 +0000 UTC ContainersNotReady containers with unready status: [get-bootstrap-creds]} {PodScheduled True 0001-01-01 00:00:00 +0000 UTC 2020-08-12 17:31:24 +0000 UTC  }]
Aug 12 17:34:25.451: INFO: 
Aug 12 17:34:25.596: INFO: get-bootstrap-creds[e2e-test-cluster-client-cert-bn47n].container[get-bootstrap-creds].log
standard_init_linux.go:211: exec user process caused "exec format error"

Aug 12 17:34:25.731: INFO: skipping dumping cluster info - cluster too large
Aug 12 17:34:25.934: INFO: Deleted {user.openshift.io/v1, Resource=users  e2e-test-cluster-client-cert-bn47n-user}, err: <nil>
Aug 12 17:34:26.152: INFO: Deleted {oauth.openshift.io/v1, Resource=oauthclients  e2e-client-e2e-test-cluster-client-cert-bn47n}, err: <nil>
Aug 12 17:34:26.339: INFO: Deleted {oauth.openshift.io/v1, Resource=oauthaccesstokens  P8J7qchYTRC8PB-c4PbdZQAAAAAAAAAA}, err: <nil>
[AfterEach] node client cert requests armoring:
  /go/src/github.com/openshift/origin/_output/local/go/src/github.com/openshift/origin/vendor/k8s.io/kubernetes/test/e2e/framework/framework.go:152
Aug 12 17:34:26.339: INFO: Waiting up to 7m0s for all (but 100) nodes to be ready
STEP: Destroying namespace "e2e-test-cluster-client-cert-bn47n" for this suite.
Aug 12 17:34:26.682: INFO: Running AfterSuite actions on all nodes
Aug 12 17:34:26.682: INFO: Running AfterSuite actions on node 1
fail [github.com/openshift/origin/test/extended/csrapprover/csrapprover.go:48]: Unexpected error:
    <*errors.errorString | 0xc0002981c0>: {
        s: "timed out waiting for the condition",
    }
    timed out waiting for the condition
occurred

failed: (3m11s) 2020-08-12T17:34:26 "node client cert requests armoring: deny pod's access to /config/master API endpoint [Suite:openshift/conformance/parallel]"
=============

in particular 

standard_init_linux.go:211: exec user process caused "exec format error"

The test suite is e2e-remote-libvirt-s390x-4.3, so this is an s390x node trying to exec an x86_64 binary.
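
For illustration, the mismatch behind "exec format error" can be expressed as a comparison of ELF machine types: the kernel refuses to exec a binary whose ELF header names a different machine than the host. A minimal sketch using Go's standard debug/elf constants. The helper names machineForGOARCH and canExec are hypothetical, and the architecture table is deliberately incomplete:

```go
package main

import (
	"debug/elf"
	"fmt"
)

// machineForGOARCH maps a Go architecture name to the ELF machine type
// that a native binary for that architecture carries in its header.
// Only a few architectures are listed; the table is illustrative.
func machineForGOARCH(goarch string) (elf.Machine, bool) {
	switch goarch {
	case "amd64":
		return elf.EM_X86_64, true
	case "s390x":
		return elf.EM_S390, true
	case "arm64":
		return elf.EM_AARCH64, true
	case "ppc64le":
		return elf.EM_PPC64, true
	}
	return elf.EM_NONE, false
}

// canExec reports whether a binary built for binArch is native to hostArch.
// Unknown architectures are treated as not executable.
func canExec(binArch, hostArch string) bool {
	bin, okBin := machineForGOARCH(binArch)
	host, okHost := machineForGOARCH(hostArch)
	return okBin && okHost && bin == host
}

func main() {
	// The x86_64 image contents on an s390x worker, as in the failing job.
	fmt.Println(canExec("amd64", "s390x")) // false
	fmt.Println(canExec("s390x", "s390x")) // true
}
```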

--- Additional comment from Sohan Kunkerkar on 2020-08-12 18:56:57 UTC ---



--- Additional comment from Seth Jennings on 2020-08-12 18:59:04 UTC ---

changed in
4.6 https://github.com/openshift/origin/pull/25087

backported in
4.5 https://bugzilla.redhat.com/show_bug.cgi?id=1846091
4.4 https://bugzilla.redhat.com/show_bug.cgi?id=1862171
4.3 https://bugzilla.redhat.com/show_bug.cgi?id=1867402

xref https://bugzilla.redhat.com/show_bug.cgi?id=1845792

The Node team did the backports to 4.4 and 4.3 in response to https://bugzilla.redhat.com/show_bug.cgi?id=1867613, but the change originated with the Cloud team.

--- Additional comment from Seth Jennings on 2020-08-12 19:00:46 UTC ---

Failing against all releases that run this test
https://deck-ci.apps.ci.l2s4.p1.openshiftapps.com/?job=*e2e-remote-libvirt-s390x*

--- Additional comment from Michael McCune on 2020-08-12 20:36:37 UTC ---

I don't think this bug belongs to the Cloud Compute component; it should probably be assigned to the Node team.

--- Additional comment from Seth Jennings on 2020-08-13 14:23:41 UTC ---

Assigned to Cloud because https://bugzilla.redhat.com/show_bug.cgi?id=1845792, the change that introduced this break, was assigned to Cloud and Alberto.

--- Additional comment from Michael McCune on 2020-08-13 14:35:38 UTC ---

ack, thanks Seth. i'll spend a little more time reviewing those.

Comment 4 Michael McCune 2020-09-04 19:33:00 UTC
*** Bug 1868464 has been marked as a duplicate of this bug. ***

Comment 7 sunzhaohua 2020-09-14 08:23:53 UTC
The test history shows no recurrence of this failure, so moving this bug to VERIFIED.
https://prow.ci.openshift.org/pr-history/?org=openshift&repo=origin&pr=25480
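
The linked pull request replaces an image tag that carries an architecture suffix (quay.io/fedora/fedora:32-x86_64) with one that does not (docker.io/fedora:32). A minimal sketch of a check that flags such suffixed tags in test code. The helper pinnedArch and the suffix list are assumptions for illustration, not part of openshift/origin:

```go
package main

import (
	"fmt"
	"strings"
)

// archSuffixes lists architecture names that may appear as a tag suffix.
// The list is an assumption for illustration, not a registry convention.
var archSuffixes = []string{"x86_64", "amd64", "aarch64", "arm64", "ppc64le", "s390x"}

// pinnedArch returns the architecture suffix embedded in an image tag, if any.
func pinnedArch(image string) (string, bool) {
	// The tag is the text after the last ':' when that ':' follows the last '/'.
	// Otherwise the ':' belongs to a registry port and there is no tag.
	slash := strings.LastIndex(image, "/")
	colon := strings.LastIndex(image, ":")
	if colon <= slash {
		return "", false
	}
	tag := image[colon+1:]
	for _, arch := range archSuffixes {
		if tag == arch || strings.HasSuffix(tag, "-"+arch) {
			return arch, true
		}
	}
	return "", false
}

func main() {
	for _, image := range []string{
		"quay.io/fedora/fedora:32-x86_64",
		"docker.io/fedora:32",
	} {
		arch, pinned := pinnedArch(image)
		fmt.Printf("%s pinned=%v arch=%q\n", image, pinned, arch)
	}
}
```

A check like this only inspects the tag text. Whether an unsuffixed tag actually resolves to an image for the node's architecture depends on the registry publishing a multi-arch manifest for it.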

