Bug 1461865

Summary: AWS getInstancesByNodeNames is broken for large clusters
Product: OpenShift Container Platform Reporter: Hemant Kumar <hekumar>
Component: Node Assignee: Hemant Kumar <hekumar>
Status: CLOSED ERRATA QA Contact: Mike Fiedler <mifiedle>
Severity: high Docs Contact:
Priority: unspecified    
Version: 3.6.0 CC: aos-bugs, bchilds, dma, eparis, hekumar, jokerman, mifiedle, mmccomas, sdodson, smunilla
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1460388 Environment:
Last Closed: 2017-08-10 05:28:09 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1460388    
Bug Blocks:    

Description Hemant Kumar 2017-06-15 13:25:52 UTC
+++ This bug was initially created as a clone of Bug #1460388 +++

From CloudTrail logs:

    "errorMessage": "The maximum number of filter values specified on a single call is 200",

This should affect all features (such as load balancers and storage) that rely on this function.
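
For context, a minimal sketch of the failing pattern, assuming aws-sdk-go and a pre-configured EC2 client. The helper name describeAllNodes and the private-dns-name filter are illustrative of what the cloud provider does, not a copy of the actual code:

    package awssketch

    import (
        "fmt"

        "github.com/aws/aws-sdk-go/aws"
        "github.com/aws/aws-sdk-go/service/ec2"
        "github.com/aws/aws-sdk-go/service/ec2/ec2iface"
    )

    // describeAllNodes issues a single DescribeInstances call with one
    // filter value per node name. With more than 200 nodes, AWS rejects
    // the request with the CloudTrail error quoted above.
    // (Result pagination via NextToken is elided for brevity.)
    func describeAllNodes(client ec2iface.EC2API, nodeNames []string) ([]*ec2.Instance, error) {
        values := make([]*string, 0, len(nodeNames))
        for _, name := range nodeNames {
            values = append(values, aws.String(name))
        }
        out, err := client.DescribeInstances(&ec2.DescribeInstancesInput{
            Filters: []*ec2.Filter{{
                Name:   aws.String("private-dns-name"),
                Values: values, // 200+ values here is what trips the limit
            }},
        })
        if err != nil {
            return nil, fmt.Errorf("DescribeInstances failed: %v", err)
        }
        var instances []*ec2.Instance
        for _, reservation := range out.Reservations {
            instances = append(instances, reservation.Instances...)
        }
        return instances, nil
    }

Because every node name becomes one filter value on a single DescribeInstances call, the request is rejected as soon as the cluster passes 200 nodes.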


I have opened an upstream ticket as well - 
https://github.com/kubernetes/kubernetes/issues/47271

--- Additional comment from Hemant Kumar on 2017-06-09 17:59:37 EDT ---

Assigning this to myself for now.

--- Additional comment from Hemant Kumar on 2017-06-09 18:01:43 EDT ---

We will have to backport the fix to both 3.5 and 3.6 once available.

Comment 1 Hemant Kumar 2017-06-15 13:32:47 UTC
PR opened - https://github.com/openshift/origin/pull/14669

Comment 3 DeShuai Ma 2017-06-29 02:16:17 UTC
How large does a cluster have to be for this to happen? 200 instances? 
QE doesn't have a cluster that large. Could you help verify the bug? Thanks.

Comment 5 Hemant Kumar 2017-06-29 11:50:29 UTC
Yeah, this would be tricky to test directly on a cluster. I verified it myself by changing the pagination limit to a smaller value (the default is 150) and making sure it works as expected.

I don't have a good answer, since the pagination limit can't be tuned via a command-line parameter and requires a code change. 

Scott - do we have a dev or opstest cluster which is large enough for testing this?

Comment 6 Hemant Kumar 2017-06-29 11:53:41 UTC
Just to clarify - yes, the bug happens on a 200+ node cluster. But the key thing to test here is that we are now fetching instance information in batches of 150, and those batches get aggregated into a single list and returned as if the caller had made a single call (i.e. the whole batching mechanism is opaque to the caller).
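
A minimal sketch of that batching mechanism, reusing the hypothetical describeAllNodes helper sketched in the description above (the constant name filterNodeLimit is illustrative; 150 is the batch size mentioned in this bug):

    // Sketch of the fix: split the node names into chunks of at most
    // filterNodeLimit, issue one DescribeInstances call per chunk, and
    // concatenate the results so the caller still sees a single list.
    const filterNodeLimit = 150

    func describeNodesBatched(client ec2iface.EC2API, nodeNames []string) ([]*ec2.Instance, error) {
        var all []*ec2.Instance
        for start := 0; start < len(nodeNames); start += filterNodeLimit {
            end := start + filterNodeLimit
            if end > len(nodeNames) {
                end = len(nodeNames)
            }
            // Each batch stays safely under the 200-filter-value AWS limit.
            batch, err := describeAllNodes(client, nodeNames[start:end])
            if err != nil {
                return nil, err
            }
            all = append(all, batch...)
        }
        return all, nil
    }

The caller's signature and return value are unchanged, which is why the batching is transparent to the load balancer and storage code paths.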

Comment 10 Hemant Kumar 2017-06-30 14:05:58 UTC
Creating dummy nodes will not work for the volume check that fails because of the 200 limit. Those nodes also have to have volumes attached to them in order to be considered in the getInstancesByNodeNames request.

So once the cluster exceeds the 200-node limit, what breaks is this: if you terminate a node in the cluster, the volumes that were attached to it will not detach correctly.


But the bug also affects load balancers. Here is the upstream bug - https://github.com/kubernetes/kubernetes/issues/45050 . The upstream bug seems to suggest that once a cluster has 200+ nodes, the AWS-created load balancer does not get any healthy nodes behind it.

Comment 13 Mike Fiedler 2017-07-06 13:54:17 UTC
@hekumar Can you take a look at comment 12?

Comment 14 Hemant Kumar 2017-07-06 14:03:21 UTC
From a storage perspective there are two ways to verify:

1. There should be errors logged in the controller logs.

2. For a 200+ node cluster, with each node running at least one pod with a volume: go to the AWS console and force detach one of the volumes being used inside a pod. If fetching node information is working as expected, the detached volume will automatically be attached back. If not, it will remain detached.

Comment 15 Mike Fiedler 2017-07-06 17:38:42 UTC
Thanks.   Here is the test case for verification:

- 1 master + 1 infra + 200 nodes on AWS us-west-2b
- Run 1 pod on every node with a volume/volumeMount referencing an existing PVC (a sketch of such a pod follows this list)
- Verify PVCs are bound to PVs.   
- Verify all volumes are in use by instances in the AWS EC2 console
- Force mount volumes from instances.   
- Verify volumes transition to Available and then are automatically re-attached to the instance.
- Verify PVs and PVCs remain bound in OpenShift
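
A sketch of the per-node test pod for the "Run 1 pod on every node" step, expressed with client-go types (the image, names, and node pinning via spec.nodeName are illustrative, not taken from the actual test run):

    package testpods

    import (
        corev1 "k8s.io/api/core/v1"
        metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    )

    // testPodForNode returns a pod pinned to the given node that mounts an
    // existing PVC, so each node ends up with at least one attached EBS volume.
    func testPodForNode(nodeName, pvcName string) *corev1.Pod {
        return &corev1.Pod{
            ObjectMeta: metav1.ObjectMeta{Name: "ebs-test-" + nodeName},
            Spec: corev1.PodSpec{
                NodeName: nodeName, // pin the pod so every node runs exactly one
                Containers: []corev1.Container{{
                    Name:    "sleeper",
                    Image:   "registry.access.redhat.com/rhel7",
                    Command: []string{"sleep", "infinity"},
                    VolumeMounts: []corev1.VolumeMount{{
                        Name:      "data",
                        MountPath: "/data",
                    }},
                }},
                Volumes: []corev1.Volume{{
                    Name: "data",
                    VolumeSource: corev1.VolumeSource{
                        PersistentVolumeClaim: &corev1.PersistentVolumeClaimVolumeSource{
                            ClaimName: pvcName, // existing PVC bound to an EBS-backed PV
                        },
                    },
                }},
            },
        }
    }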

Comment 16 Mike Fiedler 2017-07-06 17:39:47 UTC
Step above should be "Force Detach" volumes from instances.

Comment 17 Mike Fiedler 2017-07-07 00:56:47 UTC
Verified on 3.6.133

Executed test in comment 15.   After force detach, volumes are re-attached successfully:

Jul  6 20:53:46 ip-172-31-8-32 atomic-openshift-master: I0706 20:53:46.750530   39189 node_status_updater.go:136] Updating status for node "ip-172-31-27-235.us-west-2.compute.internal" succeeded. patchBytes: "{\"status\":{\"volumesAttached\":[{\"devicePath\":\"/dev/xvdcd\",\"name\":\"kubernetes.io/aws-ebs/aws://us-west-2b/vol-04bb7b19cddccb436\"},{\"devicePath\":\"/dev/xvdbn\",\"name\":\"kubernetes.io/aws-ebs/aws://us-west-2b/vol-0c838ab3eecc2d6c0\"}]}}" VolumesAttached: [{kubernetes.io/aws-ebs/aws://us-west-2b/vol-04bb7b19cddccb436 /dev/xvdcd} {kubernetes.io/aws-ebs/aws://us-west-2b/vol-0c838ab3eecc2d6c0 /dev/xvdbn}]
Jul  6 20:53:46 ip-172-31-8-32 atomic-openshift-master: I0706 20:53:46.755358   39189 node_status_updater.go:136] Updating status for node "ip-172-31-25-70.us-west-2.compute.internal" succeeded. patchBytes: "{\"status\":{\"volumesAttached\":[{\"devicePath\":\"/dev/xvdbj\",\"name\":\"kubernetes.io/aws-ebs/aws://us-west-2b/vol-0223d80541a74c9b7\"},{\"devicePath\":\"/dev/xvdcy\",\"name\":\"kubernetes.io/aws-ebs/aws://us-west-2b/vol-07a774bb0a889af1b\"}]}}" VolumesAttached: [{kubernetes.io/aws-ebs/aws://us-west-2b/vol-0223d80541a74c9b7 /dev/xvdbj} {kubernetes.io/aws-ebs/aws://us-west-2b/vol-07a774bb0a889af1b /dev/xvdcy}]
Jul  6 20:53:46 ip-172-31-8-32 atomic-openshift-master: I0706 20:53:46.760077   39189 node_status_updater.go:136] Updating status for node "ip-172-31-3-250.us-west-2.compute.internal" succeeded. patchBytes: "{\"status\":{\"volumesAttached\":[{\"devicePath\":\"/dev/xvdba\",\"name\":\"kubernetes.io/aws-ebs/aws://us-west-2b/vol-0182b1a36bda8511e\"},{\"devicePath\":\"/dev/xvdbz\",\"name\":\"kubernetes.io/aws-ebs/aws://us-west-2b/vol-0902b6ce264f8f2c5\"}]}}" VolumesAttached: [{kubernetes.io/aws-ebs/aws://us-west-2b/vol-0182b1a36bda8511e /dev/xvdba} {kubernetes.io/aws-ebs/aws://us-west-2b/vol-0902b6ce264f8f2c5 /dev/xvdbz}]
Jul  6 20:53:46 ip-172-31-8-32 atomic-openshift-master: I0706 20:53:46.772668   39189 node_status_updater.go:136] Updating status for node "ip-172-31-22-5.us-west-2.compute.internal" succeeded. patchBytes: "{\"status\":{\"volumesAttached\":[{\"devicePath\":\"/dev/xvdce\",\"name\":\"kubernetes.io/aws-ebs/aws://us-west-2b/vol-04cdac9a650e119a7\"},{\"devicePath\":\"/dev/xvdcx\",\"name\":\"kubernetes.io/aws-ebs/aws://us-west-2b/vol-043479f3dbb958922\"}]}}" VolumesAttached: [{kubernetes.io/aws-ebs/aws://us-west-2b/vol-04cdac9a650e119a7 /dev/xvdce} {kubernetes.io/aws-ebs/aws://us-west-2b/vol-043479f3dbb958922 /dev/xvdcx}]

Comment 19 errata-xmlrpc 2017-08-10 05:28:09 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:1716