Bug 1441748 - AWS quota problems in Openshift 3.5
Summary: AWS quota problems in Openshift 3.5
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Storage
Version: 3.5.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 3.5.z
Assignee: Jan Safranek
QA Contact: Chao Yang
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-04-12 15:31 UTC by Hemant Kumar
Modified: 2017-04-26 05:37 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: The OpenShift persistent volume attach/detach logic on AWS queried the status of each attach/detach operation with a separate API call per persistent volume, so OpenShift could exhaust its AWS API call quota and be throttled by AWS.
Consequence: Attach/detach operations could become slow when multiple volumes were attached or detached at the same time.
Fix: OpenShift now uses a bulk query to determine the status of all attach/detach operations at once.
Result: OpenShift attaches and detaches volumes faster.
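
The difference between per-volume polling and a bulk query can be illustrated with a minimal sketch. This is not the code from the backported pull request. FakeEC2, describe_status, poll_per_volume and poll_bulk are hypothetical names, and the fake client only counts requests so that the quota impact is visible.

```python
from dataclasses import dataclass


@dataclass
class FakeEC2:
    """Hypothetical stand-in for the AWS API that counts requests made."""
    attachments: dict  # volume ID -> attachment state, e.g. "attached"
    calls: int = 0

    def describe_status(self, volume_ids):
        # One API request is charged per call, however many IDs are passed.
        self.calls += 1
        return {vid: self.attachments[vid] for vid in volume_ids}


def poll_per_volume(ec2, volume_ids):
    """Behaviour before the fix: one status request per volume."""
    return {vid: ec2.describe_status([vid])[vid] for vid in volume_ids}


def poll_bulk(ec2, volume_ids):
    """Behaviour after the fix: one request covering all volumes."""
    return ec2.describe_status(list(volume_ids))
```

With 50 volumes, the per-volume variant issues 50 requests per polling round while the bulk variant issues one, which is why the bulk form is far less likely to hit the request limit.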
Clone Of:
Environment:
Last Closed: 2017-04-26 05:37:41 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2017:1129 0 normal SHIPPED_LIVE OpenShift Container Platform 3.5, 3.4, 3.3, and 3.2 bug fix update 2017-04-26 09:35:35 UTC

Description Hemant Kumar 2017-04-12 15:31:22 UTC
Description of problem:

We are exceeding the AWS API request quota in 3.5. I saw the following message in the controller logs:


Apr 11 18:11:37 ip-172-31-56-130.ec2.internal atomic-openshift-master-controllers[31261]: W0411 18:11:37.130373   31261 retry_handler.go:87] Got RequestLimitExceeded error on AWS request (ec2::DescribeInstances)


We should backport https://github.com/kubernetes/kubernetes/pull/41306 to 3.5.

Comment 1 Jan Safranek 2017-04-13 15:30:24 UTC
> retry_handler.go:87] Got RequestLimitExceeded error on AWS request (ec2::DescribeInstances)

This hurts OpenShift Online because everything related to attach/detach on AWS becomes slow and customers complain that they're waiting too long for their pods to start.

I backported https://github.com/kubernetes/kubernetes/pull/41306 as https://github.com/openshift/ose/pull/702#issuecomment-293927784

Comment 2 Jan Safranek 2017-04-20 14:57:05 UTC
Merged into enterprise-3.5.

Comment 3 Troy Dawson 2017-04-20 21:29:45 UTC
This has been merged into ocp and is in OCP v3.5.5.8 or newer.

Comment 5 Chao Yang 2017-04-21 06:42:30 UTC
The test passed on OCP v3.5.5.8:
openshift v3.5.5.8
kubernetes v1.5.2+43a9be4
etcd 3.1.0

Created 50 pods using EBS volumes; no RequestLimitExceeded error appeared in the log.

loglevel=5
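
A check like the one above could be scripted along these lines. This is a minimal sketch, assuming the controller log is available as lines of text in the format quoted in the description. THROTTLE_PATTERN and count_throttled_calls are hypothetical names and are not part of any OpenShift tooling.

```python
import re

# Matches the warning text quoted in the description and captures the
# AWS call name, e.g. "ec2::DescribeInstances".
THROTTLE_PATTERN = re.compile(
    r"RequestLimitExceeded error on AWS request \((?P<call>[^)]+)\)"
)


def count_throttled_calls(log_lines):
    """Return a dict mapping each throttled AWS call to its occurrence count."""
    counts = {}
    for line in log_lines:
        match = THROTTLE_PATTERN.search(line)
        if match:
            call = match.group("call")
            counts[call] = counts.get(call, 0) + 1
    return counts
```

An empty result for the log collected during the test run corresponds to the "no RequestLimitExceeded error" outcome reported here.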

Comment 7 errata-xmlrpc 2017-04-26 05:37:41 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:1129

