Bug 1461865 - AWS getInstancesByNodeNames is broken for large clusters
Status: CLOSED ERRATA
Product: OpenShift Container Platform
Classification: Red Hat
Component: Pod
Version: 3.6.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: ---
Assigned To: Hemant Kumar
QA Contact: Mike Fiedler
Depends On: 1460388
Blocks:
 
Reported: 2017-06-15 09:25 EDT by Hemant Kumar
Modified: 2017-08-16 15 EDT
CC: 10 users

See Also:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1460388
Environment:
Last Closed: 2017-08-10 01:28:09 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments: None
Description Hemant Kumar 2017-06-15 09:25:52 EDT
+++ This bug was initially created as a clone of Bug #1460388 +++

From CloudTrail logs:

    "errorMessage": "The maximum number of filter values specified on a single call is 200",

This should affect all features (such as Load Balancers and Storage) that rely on this function in order to work.
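
For context, here is a minimal sketch of the call pattern that hits this limit. The aws-sdk-go v1 client and the private-dns-name filter are assumptions for illustration, not taken from the actual cloud-provider code: every node name becomes one value of a single filter on a single DescribeInstances call, so any cluster with more than 200 nodes trips the error above.

// Minimal sketch of the failing pattern (not the actual cloud-provider
// code): all node names go into one filter on a single DescribeInstances
// call, which AWS rejects once there are more than 200 filter values.
package awsbatch

import (
	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/ec2"
)

func describeAllAtOnce(nodeNames []string) (*ec2.DescribeInstancesOutput, error) {
	svc := ec2.New(session.Must(session.NewSession()))
	input := &ec2.DescribeInstancesInput{
		Filters: []*ec2.Filter{{
			Name:   aws.String("private-dns-name"),
			Values: aws.StringSlice(nodeNames), // more than 200 values triggers the error above
		}},
	}
	return svc.DescribeInstances(input)
}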


I have opened an upstream ticket as well - 
https://github.com/kubernetes/kubernetes/issues/47271

--- Additional comment from Hemant Kumar on 2017-06-09 17:59:37 EDT ---

Assigning this to myself for now.

--- Additional comment from Hemant Kumar on 2017-06-09 18:01:43 EDT ---

We will have to backport the fix to both 3.5 and 3.6 once available.
Comment 1 Hemant Kumar 2017-06-15 09:32:47 EDT
PR opened - https://github.com/openshift/origin/pull/14669
Comment 3 DeShuai Ma 2017-06-28 22:16:17 EDT
How large does a cluster have to be for this to happen? 200 instances?
QE doesn't have a cluster that large. Could you help verify the bug? Thanks.
Comment 5 Hemant Kumar 2017-06-29 07:50:29 EDT
Yeah, this would be tricky to test directly on a cluster. I verified it myself by changing the pagination limit to a smaller value (the default is 150) and making sure it works as expected.

I don't have a good answer, since the pagination limit can't be tuned via a command-line parameter and requires a code change.

Scott - do we have a dev or opstest cluster which is large enough for testing this?
Comment 6 Hemant Kumar 2017-06-29 07:53:41 EDT
Just to clarify - yes, the bug happens on a 200+ node cluster. But the key thing to test here is that we are now fetching instance information in batches of 150, and those batches get aggregated into a single list and returned as if the caller had made a single call (i.e. the whole batching mechanism is opaque to the caller).
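To illustrate the batching described above, here is a minimal sketch. It is not the code from the PR; the aws-sdk-go v1 client, the function name, and the private-dns-name filter are assumptions - only the batch size of 150 comes from the comments above. The node names are split into chunks of 150, one DescribeInstances call is made per chunk, and the results are appended to a single slice, so the caller still receives one aggregated list.

// Hedged sketch of the batching described above (not the PR's code):
// split the node names into chunks of 150, issue one DescribeInstances
// call per chunk, and return one aggregated instance list so the
// batching stays opaque to the caller.
package awsbatch

import (
	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/service/ec2"
)

const filterBatchSize = 150 // stays under the 200-filter-value AWS limit

func describeInstancesBatched(svc *ec2.EC2, nodeNames []string) ([]*ec2.Instance, error) {
	var instances []*ec2.Instance
	for start := 0; start < len(nodeNames); start += filterBatchSize {
		end := start + filterBatchSize
		if end > len(nodeNames) {
			end = len(nodeNames)
		}
		input := &ec2.DescribeInstancesInput{
			Filters: []*ec2.Filter{{
				Name:   aws.String("private-dns-name"),
				Values: aws.StringSlice(nodeNames[start:end]),
			}},
		}
		out, err := svc.DescribeInstances(input)
		if err != nil {
			return nil, err
		}
		for _, reservation := range out.Reservations {
			instances = append(instances, reservation.Instances...)
		}
	}
	return instances, nil
}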
Comment 10 Hemant Kumar 2017-06-30 10:05:58 EDT
Creating dummy nodes will not work for the volume check that fails because of the 200 limit. Those nodes also have to have volumes attached to them in order to be considered for the getInstancesByNodeNames request.

So once the cluster goes past the 200-node limit, what breaks is this: if you terminate a node in the cluster, the volumes that were attached to it will not detach correctly.


But the bug also affects load balancers. Here is the upstream bug - https://github.com/kubernetes/kubernetes/issues/45050 . The upstream bug seems to suggest that once a cluster has 200+ nodes, the AWS-created load balancer doesn't get any healthy nodes behind it.
Comment 13 Mike Fiedler 2017-07-06 09:54:17 EDT
@hekumar Can you take a look at comment 12?
Comment 14 Hemant Kumar 2017-07-06 10:03:21 EDT
From a storage perspective there are two ways:

1. There should be errors logged in the controller logs.

2. For a 200+ node cluster, with each node running at least one pod with a volume: go to the AWS console and force detach one of the volumes being used inside a pod. If fetching node information is working as expected, the detached volume will automatically be attached back. If not, it will remain detached.
Comment 15 Mike Fiedler 2017-07-06 13:38:42 EDT
Thanks.   Here is the test case for verification:

- 1 master + 1 infra + 200 nodes on AWS us-west-2b
- Run 1 pod on every node with a volume/volumeMount referencing an existing PVC
- Verify PVCs are bound to PVs.   
- Verify all volumes in use by instances in the AWS EC2 console
- Force mount volumes from instances.   
- Verify volumes transition to Available and then are automatically re-attached to the instance.
- Verify PV and PVCs remain bound in OpenShift
Comment 16 Mike Fiedler 2017-07-06 13:39:47 EDT
Step above should be "Force Detach" volumes from instances.
Comment 17 Mike Fiedler 2017-07-06 20:56:47 EDT
Verified on 3.6.133

Executed test in comment 15.   After force detach, volumes are re-attached successfully:

Jul  6 20:53:46 ip-172-31-8-32 atomic-openshift-master: I0706 20:53:46.750530   39189 node_status_updater.go:136] Updating status for node "ip-172-31-27-235.us-west-2.compute.internal" succeeded. patchBytes: "{\"status\":{\"volumesAttached\":[{\"devicePath\":\"/dev/xvdcd\",\"name\":\"kubernetes.io/aws-ebs/aws://us-west-2b/vol-04bb7b19cddccb436\"},{\"devicePath\":\"/dev/xvdbn\",\"name\":\"kubernetes.io/aws-ebs/aws://us-west-2b/vol-0c838ab3eecc2d6c0\"}]}}" VolumesAttached: [{kubernetes.io/aws-ebs/aws://us-west-2b/vol-04bb7b19cddccb436 /dev/xvdcd} {kubernetes.io/aws-ebs/aws://us-west-2b/vol-0c838ab3eecc2d6c0 /dev/xvdbn}]
Jul  6 20:53:46 ip-172-31-8-32 atomic-openshift-master: I0706 20:53:46.755358   39189 node_status_updater.go:136] Updating status for node "ip-172-31-25-70.us-west-2.compute.internal" succeeded. patchBytes: "{\"status\":{\"volumesAttached\":[{\"devicePath\":\"/dev/xvdbj\",\"name\":\"kubernetes.io/aws-ebs/aws://us-west-2b/vol-0223d80541a74c9b7\"},{\"devicePath\":\"/dev/xvdcy\",\"name\":\"kubernetes.io/aws-ebs/aws://us-west-2b/vol-07a774bb0a889af1b\"}]}}" VolumesAttached: [{kubernetes.io/aws-ebs/aws://us-west-2b/vol-0223d80541a74c9b7 /dev/xvdbj} {kubernetes.io/aws-ebs/aws://us-west-2b/vol-07a774bb0a889af1b /dev/xvdcy}]
Jul  6 20:53:46 ip-172-31-8-32 atomic-openshift-master: I0706 20:53:46.760077   39189 node_status_updater.go:136] Updating status for node "ip-172-31-3-250.us-west-2.compute.internal" succeeded. patchBytes: "{\"status\":{\"volumesAttached\":[{\"devicePath\":\"/dev/xvdba\",\"name\":\"kubernetes.io/aws-ebs/aws://us-west-2b/vol-0182b1a36bda8511e\"},{\"devicePath\":\"/dev/xvdbz\",\"name\":\"kubernetes.io/aws-ebs/aws://us-west-2b/vol-0902b6ce264f8f2c5\"}]}}" VolumesAttached: [{kubernetes.io/aws-ebs/aws://us-west-2b/vol-0182b1a36bda8511e /dev/xvdba} {kubernetes.io/aws-ebs/aws://us-west-2b/vol-0902b6ce264f8f2c5 /dev/xvdbz}]
Jul  6 20:53:46 ip-172-31-8-32 atomic-openshift-master: I0706 20:53:46.772668   39189 node_status_updater.go:136] Updating status for node "ip-172-31-22-5.us-west-2.compute.internal" succeeded. patchBytes: "{\"status\":{\"volumesAttached\":[{\"devicePath\":\"/dev/xvdce\",\"name\":\"kubernetes.io/aws-ebs/aws://us-west-2b/vol-04cdac9a650e119a7\"},{\"devicePath\":\"/dev/xvdcx\",\"name\":\"kubernetes.io/aws-ebs/aws://us-west-2b/vol-043479f3dbb958922\"}]}}" VolumesAttached: [{kubernetes.io/aws-ebs/aws://us-west-2b/vol-04cdac9a650e119a7 /dev/xvdce} {kubernetes.io/aws-ebs/aws://us-west-2b/vol-043479f3dbb958922 /dev/xvdcx}]
Comment 19 errata-xmlrpc 2017-08-10 01:28:09 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:1716
