Bug 2049671 - system:serviceaccount:openshift-cluster-csi-drivers:aws-ebs-csi-driver-operator trying to GET and DELETE /api/v1/namespaces/openshift-cluster-csi-drivers/configmaps/kube-cloud-config which does not exist
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Storage
Version: 4.9
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.11.0
Assignee: Fabio Bertinatto
QA Contact: Penghao Wang
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2022-02-02 14:25 UTC by Simon Reber
Modified: 2022-08-10 10:46 UTC

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-08-10 10:46:22 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift aws-ebs-csi-driver-operator pull 151 | 0 | None | open | Bug 2049671: avoid excessive GET and DELETE in ResourcesSync controller | 2022-03-16 11:07:26 UTC
Github | openshift library-go pull 1328 | 0 | None | open | Bug 2049671: resourcesynccontroller: avoid requests to non-existent target | 2022-03-15 12:22:51 UTC
Red Hat Knowledge Base (Solution) | 6696371 | 0 | None | None | None | 2022-02-02 14:42:50 UTC
Red Hat Product Errata | RHSA-2022:5069 | 0 | None | None | None | 2022-08-10 10:46:54 UTC

Description Simon Reber 2022-02-02 14:25:53 UTC
Description of problem:

On a freshly installed OpenShift Container Platform 4.9.15 cluster (IPI installation method on AWS), we are seeing 3898 failed requests from "system:serviceaccount:openshift-cluster-csi-drivers:aws-ebs-csi-driver-operator" to `/api/v1/namespaces/openshift-cluster-csi-drivers/configmaps/kube-cloud-config` using the GET and DELETE methods.

$ oc dev_tool audit -f kube-apiserver/ -otop --by=resource --user="system:serviceaccount:openshift-cluster-csi-drivers:aws-ebs-csi-driver-operator"  --failed-only
had 133483 line read failures
count: 3899, first: 2022-02-01T04:35:29+01:00, last: 2022-02-02T14:06:41+01:00, duration: 33h31m12.448447s
3898x                v1/configmaps
1x                   storage.k8s.io/v1/csidrivers

$ oc dev_tool audit -f kube-apiserver/ -otop --by=verb --user="system:serviceaccount:openshift-cluster-csi-drivers:aws-ebs-csi-driver-operator"  --resource="configmaps" --failed-only
had 133483 line read failures
count: 3898, first: 2022-02-01T04:35:29+01:00, last: 2022-02-02T14:06:41+01:00, duration: 33h31m12.448447s

Top 10 "DELETE" (of 1949 total hits):
   1949x [  3.090904ms] [404-1948] /api/v1/namespaces/openshift-cluster-csi-drivers/configmaps/kube-cloud-config [system:serviceaccount:openshift-cluster-csi-drivers:aws-ebs-csi-driver-operator]

Top 10 "GET" (of 1949 total hits):
   1949x [  4.863701ms] [404-1948] /api/v1/namespaces/openshift-config-managed/configmaps/kube-cloud-config [system:serviceaccount:openshift-cluster-csi-drivers:aws-ebs-csi-driver-operator]
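
For readers without access to the audit tooling used above, a similar count can be approximated directly from the raw kube-apiserver audit log. The following is a minimal sketch, assuming the log is in the standard JSON-lines audit event format (fields `stage`, `verb`, `requestURI`, `user.username`, `responseStatus.code`). The function name `count_failed_requests` is made up for this illustration and is not part of any tool mentioned in this report. Note that raw audit events record verbs in lowercase ("get", "delete").

```python
import json
from collections import Counter


def count_failed_requests(lines, username):
    """Count failed requests (HTTP status >= 400) per (verb, requestURI) for one user.

    `lines` is any iterable of audit log lines (one JSON event per line).
    Hypothetical helper for illustration only.
    """
    counts = Counter()
    for line in lines:
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that cannot be parsed
        if not isinstance(event, dict):
            continue
        # Only look at completed responses so each request is counted once.
        if event.get("stage") != "ResponseComplete":
            continue
        if (event.get("user") or {}).get("username") != username:
            continue
        code = (event.get("responseStatus") or {}).get("code", 0)
        if code >= 400:
            counts[(event.get("verb"), event.get("requestURI"))] += 1
    return counts
```

It could be fed an open file handle, for example `count_failed_requests(open("audit.log"), "system:serviceaccount:openshift-cluster-csi-drivers:aws-ebs-csi-driver-operator")`, where `audit.log` is a placeholder path.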

Since `openshift-config-managed/configmaps/kube-cloud-config` is only created and needed when custom service endpoints are used, we should provide a solution that prevents these GET and DELETE requests and only triggers them when the ConfigMap has actually been created.
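
One way to picture the requested behavior is a sync step that consults a local, watch-backed cache before touching the API server, so that nothing is sent when neither ConfigMap exists. The sketch below is only an illustration under that assumption. `plan_sync` and the plain set standing in for the cache are hypothetical, and this is not claimed to be how the linked pull requests implement the fix.

```python
SOURCE = "openshift-config-managed/kube-cloud-config"
TARGET = "openshift-cluster-csi-drivers/kube-cloud-config"


def plan_sync(cache, source=SOURCE, target=TARGET):
    """Return the API actions one sync pass would issue.

    `cache` is a set of "namespace/name" keys known to exist locally
    (a stand-in for an informer-style cache; an assumption of this sketch).
    """
    if source in cache:
        # Source exists: copy it into the target namespace.
        return [("copy", source, target)]
    if target in cache:
        # Source is gone but a stale copy remains: remove it.
        return [("delete", target)]
    # Neither exists: nothing to do, so no request reaches the API server.
    return []
```

With an empty cache, `plan_sync(set())` returns an empty list, which corresponds to the fresh-install case described above where no custom service endpoints are configured.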

Version-Release number of selected component (if applicable):

 - OpenShift Container Platform 4.9.15

How reproducible:

 - Always

Steps to Reproduce:
1. Run `openshift-install create cluster --dir ocpX --log-level debug` (basically https://docs.openshift.com/container-platform/4.9/installing/installing_aws/installing-aws-default.html#installing-aws-default)
2. Add a custom PKI certificate as per https://docs.openshift.com/container-platform/4.9/networking/configuring-a-custom-pki.html. No proxy was configured. Not sure whether that has an impact, but I doubt it.

Actual results:

Everything works as expected, but there is a large number of failed API requests from "system:serviceaccount:openshift-cluster-csi-drivers:aws-ebs-csi-driver-operator" to `/api/v1/namespaces/openshift-cluster-csi-drivers/configmaps/kube-cloud-config` because the ConfigMap does not exist.

Expected results:

The "aws-ebs-csi-driver-operator" should only try to access `/api/v1/namespaces/openshift-cluster-csi-drivers/configmaps/kube-cloud-config` if custom service endpoints are being used and therefore `/api/v1/namespaces/openshift-cluster-csi-drivers/configmaps/kube-cloud-config` is created. Otherwise it should not try to GET and DELETE the `/api/v1/namespaces/openshift-cluster-csi-drivers/configmaps/kube-cloud-config` resource.

Additional Data:

If you wish, I can upload a `must-gather` containing the configuration as well as the Audit logs. But it's very easy to verify with a simple, fresh installation.

Comment 1 Fabio Bertinatto 2022-02-10 17:50:22 UTC
@Simon, thanks for reporting this.

It seems there is some confusion between the ConfigMap in the "openshift-config-managed" namespace and the one in the "openshift-cluster-csi-drivers" namespace.

When the "openshift-config-managed/kube-cloud-config" ConfigMap exists, the operator copies it to the "openshift-cluster-csi-drivers" namespace. On the other hand, when the ConfigMap doesn't exist in the "openshift-config-managed" namespace, the operator _needs_ to make sure that it's absent from the "openshift-cluster-csi-drivers" namespace as well.

There are 2 ways of doing that:

1. Perform a GET and, if the ConfigMap is present, perform a DELETE.
2. Directly perform a DELETE (saving one GET request when the ConfigMap is present).

Currently, the operator follows the second option.

If I understand correctly, you're suggesting the operator should go with the first option?
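
To make the trade-off between the two options concrete, here is a small sketch that replays both against an in-memory stand-in for the API server and records every request. `FakeConfigMapAPI` and both `ensure_absent_*` functions are hypothetical names invented for this illustration; they are not the operator's code.

```python
TARGET = "openshift-cluster-csi-drivers/kube-cloud-config"


class FakeConfigMapAPI:
    """In-memory stand-in for the API server that logs each request."""

    def __init__(self, existing=()):
        self.objects = set(existing)
        self.requests = []  # list of (verb, key, status)

    def get(self, key):
        status = 200 if key in self.objects else 404
        self.requests.append(("GET", key, status))
        return status

    def delete(self, key):
        status = 200 if key in self.objects else 404
        self.objects.discard(key)
        self.requests.append(("DELETE", key, status))
        return status


def ensure_absent_get_then_delete(api, key):
    """Option 1: GET first, DELETE only if the object is present."""
    if api.get(key) == 200:
        api.delete(key)


def ensure_absent_delete_only(api, key):
    """Option 2: DELETE unconditionally and tolerate a 404."""
    api.delete(key)
```

In this model, when the target ConfigMap is absent, each option issues exactly one request per sync and that request returns 404 (a GET for option 1, a DELETE for option 2). When the target is present, option 1 needs two requests and option 2 needs one. So switching between the two options alone would not remove the failed request from the audit log; it would only change its verb.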

Comment 12 errata-xmlrpc 2022-08-10 10:46:22 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5069

