Bug 1703229

Summary:	High write rate from operator updating config maps, needs to be fixed
Product:	OpenShift Container Platform
Reporter:	Clayton Coleman <ccoleman>
Component:	apiserver-auth
Assignee:	Matt Rogers <mrogers>
Status:	CLOSED ERRATA
QA Contact:	Chuan Yu <chuyu>
Severity:	high
Docs Contact:
Priority:	unspecified
Version:	4.1.0
CC:	aos-bugs, eparis, gblomqui, mkhan, nagrawal, vichoudh
Target Milestone:	---
Target Release:	4.1.0
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:	If this bug is not fixed by Friday, May 3, we will move this to 4.2.
Fixed In Version:
Doc Type:	If docs needed, set a value
Doc Text:
Story Points:	---
Clone Of:
Environment:
Last Closed:	2019-06-04 10:48:02 UTC
Type:	Bug
Regression:	---
Mount Type:	---
Documentation:	---
CRM:
Verified Versions:
Category:	---
oVirt Team:	---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---
Target Upstream Version:
Embargoed:

Description Clayton Coleman 2019-04-25 20:36:07 UTC

{client="service-ca-operator/v0.0.0 (linux/amd64) kubernetes/$Format",resource="configmaps",scope="namespace",verb="GET"}	1.6111111111111112
{client="service-ca-operator/v0.0.0 (linux/amd64) kubernetes/$Format",resource="configmaps",scope="namespace",verb="PUT"}	1.5370370370370372

These are reads and writes per second; it looks like a hot loop. This must be fixed for GA because it adds to the total write load on the cluster.
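
To put those per-second rates in perspective, here is a minimal sketch that scales them to requests per day. The function name `requests_per_day` is hypothetical; the two rates are copied from the samples above.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def requests_per_day(rate_per_second: float) -> float:
    """Scale a steady per-second request rate to a daily request count."""
    return rate_per_second * SECONDS_PER_DAY


# Rates reported above for service-ca-operator against configmaps.
observed = {"GET": 1.6111111111111112, "PUT": 1.5370370370370372}

for verb, rate in observed.items():
    print(f"{verb}: {requests_per_day(rate):,.0f} requests/day")
```

If the PUT rate held steady, that would be roughly 132,800 config map writes per day from this one client.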

Comment 2 Vikas Choudhary 2019-04-29 08:48:13 UTC

Looks like the same issue: https://bugzilla.redhat.com/show_bug.cgi?id=1703232

Working to fix it.

Comment 3 Vikas Choudhary 2019-04-29 13:17:26 UTC

https://github.com/kubernetes/kubernetes/pull/77204
https://github.com/kubernetes-sigs/controller-runtime/pull/412

Comment 4 Vikas Choudhary 2019-04-29 13:20:16 UTC

Currently, controller-runtime runs leader election with hard-coded, very aggressive values. With the above PRs, the leader election settings become configurable. We would then pass longer durations from cluster-autoscaler-operator using the options those PRs add.
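
As a rough illustration of why leader election timings matter for write load, the sketch below assumes the active leader renews its lock once per retry period, so the renewal write rate is about 1 / retry period. The `LeaderElectionTimings` class and all durations are hypothetical; they are not taken from controller-runtime or from the PRs above.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class LeaderElectionTimings:
    """Hypothetical container for the three usual leader election durations."""

    lease_duration: timedelta
    renew_deadline: timedelta
    retry_period: timedelta

    def renew_writes_per_second(self) -> float:
        # Assumption: the leader issues one lock update per retry period.
        return 1.0 / self.retry_period.total_seconds()


# Illustrative values only: an aggressive setting versus a relaxed one.
aggressive = LeaderElectionTimings(
    timedelta(seconds=15), timedelta(seconds=10), timedelta(seconds=2)
)
relaxed = LeaderElectionTimings(
    timedelta(seconds=90), timedelta(seconds=60), timedelta(seconds=30)
)

print(aggressive.renew_writes_per_second())
print(relaxed.renew_writes_per_second())
```

Under this assumption, lengthening the retry period from 2 s to 30 s cuts renewal writes from 0.5/s to about 0.03/s per leader.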

Comment 5 Mo 2019-04-29 14:10:12 UTC

(In reply to Vikas Choudhary from comment #4)
> Currently controller-runtime repo runs leader election with hard-coded, very
> aggressive values. With above PRs, leader election configuration will become
> configurable. Then we would pass higher time durations from
> cluster-autoscaler-operator using the options which above PRs are adding.

Service CA does not use controller-runtime.

Comment 9 Chuan Yu 2019-05-05 07:38:22 UTC

Verified on 4.1.0-0.nightly-2019-05-04-210601.

{client="service-ca-operator/v0.0.0 (linux/amd64) kubernetes/$Format",code="200",contentType="application/json",endpoint="https",instance="10.0.148.133:6443",job="apiserver",namespace="default",resource="configmaps",scope="namespace",service="kubernetes",verb="GET"}	0.1
{client="service-ca-operator/v0.0.0 (linux/amd64) kubernetes/$Format",code="200",contentType="application/json",endpoint="https",instance="10.0.148.133:6443",job="apiserver",namespace="default",resource="configmaps",scope="namespace",service="kubernetes",verb="PUT"}	0.1
{client="service-ca-operator/v0.0.0 (linux/amd64) kubernetes/$Format",code="200",contentType="application/json",endpoint="https",instance="10.0.172.173:6443",job="apiserver",namespace="default",resource="configmaps",scope="namespace",service="kubernetes",verb="GET"}	0.3
{client="service-ca-operator/v0.0.0 (linux/amd64) kubernetes/$Format",code="200",contentType="application/json",endpoint="https",instance="10.0.172.173:6443",job="apiserver",namespace="default",resource="configmaps",scope="namespace",service="kubernetes",verb="PUT"}	0.3
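
One way to compare these figures with the original report is to sum the per-instance rates and compute the relative drop. This sketch assumes the rates in the description were already aggregated across apiserver instances, which the bug does not state; the helper name `relative_reduction` is hypothetical.

```python
def relative_reduction(before: float, after: float) -> float:
    """Return the fractional drop from `before` to `after` (0.75 means 75% lower)."""
    return 1.0 - after / before


# PUT rate from the description (assumed cluster-wide) and from verification (per instance).
put_before = 1.5370370370370372
put_after_by_instance = {"10.0.148.133:6443": 0.1, "10.0.172.173:6443": 0.3}

put_after_total = sum(put_after_by_instance.values())
put_drop = relative_reduction(put_before, put_after_total)
print(f"PUT: {put_after_total:.1f}/s after the fix, {put_drop:.0%} lower")
```

On that assumption, the combined PUT rate falls from about 1.54/s to 0.4/s, a drop of roughly 74%.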

Comment 11 errata-xmlrpc 2019-06-04 10:48:02 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0758