1441748 – AWS quota problems in Openshift 3.5

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	Storage
Sub Component:
Version:	3.5.0
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	high
Target Milestone:	---
Target Release:	3.5.z
Assignee:	Jan Safranek
QA Contact:	Chao Yang
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2017-04-12 15:31 UTC by Hemant Kumar
Modified:	2017-04-26 05:37 UTC
CC List:	2 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:	Cause: On AWS, OpenShift's persistent volume attach/detach logic queried the status of each attach/detach operation with a separate API call per persistent volume, so OpenShift could exhaust its AWS API call quota and be throttled by AWS. Consequence: Attach/detach operations could become slow when multiple volumes were attached or detached at the same time. Fix: OpenShift now uses a bulk query to determine the status of all attach/detach operations at once. Result: OpenShift attaches and detaches volumes faster.
Clone Of:
Environment:
Last Closed:	2017-04-26 05:37:41 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHBA-2017:1129	0	normal	SHIPPED_LIVE	OpenShift Container Platform 3.5, 3.4, 3.3, and 3.2 bug fix update	2017-04-26 09:35:35 UTC

Description Hemant Kumar 2017-04-12 15:31:22 UTC

Description of problem:

We are overshooting the AWS API request quota in 3.5. I saw the following message in the controller logs:


Apr 11 18:11:37 ip-172-31-56-130.ec2.internal atomic-openshift-master-controllers[31261]: W0411 18:11:37.130373   31261 retry_handler.go:87] Got RequestLimitExceeded error on AWS request (ec2::DescribeInstances)


We should backport https://github.com/kubernetes/kubernetes/pull/41306 to 3.5.

Comment 1 Jan Safranek 2017-04-13 15:30:24 UTC

> retry_handler.go:87] Got RequestLimitExceeded error on AWS request (ec2::DescribeInstances)

This hurts OpenShift Online because everything related to attach/detach on AWS becomes slow and customers complain that they're waiting too long for their pods to start.

I backported https://github.com/kubernetes/kubernetes/pull/41306 as https://github.com/openshift/ose/pull/702#issuecomment-293927784
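To illustrate why a bulk status query eases the pressure on the API quota, here is a minimal sketch. It is not the code from the backported PR: `FakeEC2`, `describe_volumes`, `status_per_volume`, and `status_bulk` are hypothetical names, and the in-memory class only counts calls instead of talking to AWS. The assumption is that every invocation counts as one request against the quota, regardless of how many volumes it asks about.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass
class FakeEC2:
    """Hypothetical in-memory stand-in for the cloud API; counts every call."""
    attachment_state: Dict[str, str]
    calls: int = 0

    def describe_volumes(self, volume_ids: Iterable[str]) -> Dict[str, str]:
        # Assumption: one invocation is one request against the quota,
        # no matter how many volume IDs it covers.
        self.calls += 1
        return {v: self.attachment_state.get(v, "unknown") for v in volume_ids}


def status_per_volume(api: FakeEC2, volume_ids: List[str]) -> Dict[str, str]:
    """Per-volume pattern: one request for each volume."""
    result: Dict[str, str] = {}
    for vid in volume_ids:
        result.update(api.describe_volumes([vid]))
    return result


def status_bulk(api: FakeEC2, volume_ids: List[str]) -> Dict[str, str]:
    """Bulk pattern: a single request covering all volumes."""
    return api.describe_volumes(volume_ids)


if __name__ == "__main__":
    volumes = [f"vol-{i:03d}" for i in range(50)]
    states = {v: "attached" for v in volumes}

    per_volume_api = FakeEC2(states)
    bulk_api = FakeEC2(states)

    # Both patterns return the same answer; only the request count differs.
    assert status_per_volume(per_volume_api, volumes) == status_bulk(bulk_api, volumes)
    print(per_volume_api.calls, bulk_api.calls)  # prints: 50 1
```

With 50 volumes the per-volume pattern issues 50 requests per polling round and the bulk pattern issues one, which is the difference that matters when AWS throttles by request rate.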

Comment 2 Jan Safranek 2017-04-20 14:57:05 UTC

Merged into enterprise-3.5.

Comment 3 Troy Dawson 2017-04-20 21:29:45 UTC

This has been merged into OCP and is in OCP v3.5.5.8 or newer.

Comment 5 Chao Yang 2017-04-21 06:42:30 UTC

Test passed on OCP v3.5.5.8:
openshift v3.5.5.8
kubernetes v1.5.2+43a9be4
etcd 3.1.0

Created 50 pods using EBS volumes; no RequestLimitExceeded error appeared in the log.

loglevel=5
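A check like the one above can be scripted. The following is a rough sketch, not the procedure QA actually used: it counts RequestLimitExceeded warnings per AWS operation in a set of controller log lines. The function name `count_throttled` and the regular expression are assumptions based only on the single log line quoted in this report.

```python
import re
from collections import Counter
from typing import Iterable

# Pattern assumed from the one warning quoted in this bug report.
THROTTLE_RE = re.compile(
    r"Got RequestLimitExceeded error on AWS request \((\w+)::(\w+)\)"
)


def count_throttled(lines: Iterable[str]) -> Counter:
    """Count throttling warnings, keyed by 'service::Operation'."""
    counts: Counter = Counter()
    for line in lines:
        match = THROTTLE_RE.search(line)
        if match:
            counts[f"{match.group(1)}::{match.group(2)}"] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "W0411 18:11:37.130373   31261 retry_handler.go:87] "
        "Got RequestLimitExceeded error on AWS request (ec2::DescribeInstances)",
        "an unrelated log line",
    ]
    print(dict(count_throttled(sample)))  # prints: {'ec2::DescribeInstances': 1}
```

An empty result for a run that attaches many volumes would match the outcome reported in this comment.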

Comment 7 errata-xmlrpc 2017-04-26 05:37:41 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:1129
