Bug 1691208

Summary: [cloud] For volume io1, the minimum of iops should be 100 instead of 1
Product: OpenShift Container Platform Reporter: sunzhaohua <zhsun>
Component: Cloud Compute Assignee: Jan Chaloupka <jchaloup>
Status: CLOSED ERRATA QA Contact: sunzhaohua <zhsun>
Severity: low Docs Contact:
Priority: low    
Version: 4.1.0 CC: agarcial, aos-cloud, jchaloup, mgugino
Target Milestone: ---   
Target Release: 4.1.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2019-06-04 10:46:16 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description sunzhaohua 2019-03-21 06:49:46 UTC
Description of problem:
For io1 volumes, the minimum iops value should be 100 instead of 1.

Version-Release number of selected component (if applicable):
clusterversion:4.0.0-0.nightly-2019-03-20-153904

How reproducible:
Always

Steps to Reproduce:
1. Create a machine specifying "iops: 0": 
      blockDevices:
      - ebs:
          iops: 0
          volumeSize: 120
          volumeType: io1

2. Create a machine specifying "iops: 1":
      blockDevices:
      - ebs:
          iops: 1
          volumeSize: 120
          volumeType: io1

3. Create a machine specifying "iops: 100":
      blockDevices:
      - ebs:
          iops: 100
          volumeSize: 120
          volumeType: io1

Actual results:
Step 1: the log shows "Volume iops of 0 is too low; minimum is 1".
Step 2: the machine stays in pending; instances are created and terminated one after another.
Step 3: works as expected.

step 1 log:
I0321 05:50:09.682284       1 controller.go:131] Running reconcile Machine for zhsun1-jbxgv-worker-ap-southeast-1a-io1
I0321 05:50:09.682340       1 actuator.go:367] Checking if machine exists
I0321 05:50:09.720379       1 actuator.go:375] Instance does not exist
I0321 05:50:09.720402       1 controller.go:221] Reconciling machine object zhsun1-jbxgv-worker-ap-southeast-1a-io1 triggers idempotent create.
I0321 05:50:09.720409       1 actuator.go:107] creating machine
I0321 05:50:09.733921       1 instances.go:44] No stopped instances found for machine zhsun1-jbxgv-worker-ap-southeast-1a-io1
I0321 05:50:09.733946       1 instances.go:141] Using AMI ami-0bda8cde9fb69a545
I0321 05:50:09.733952       1 instances.go:73] Describing security groups based on filters
I0321 05:50:09.828074       1 instances.go:118] Describing subnets based on filters
E0321 05:50:10.196648       1 instances.go:310] Error creating EC2 instance: VolumeIOPSLimit: Volume iops of 0 is too low; minimum is 1.
        status code: 400, request id: fc076dfb-dc8e-4c28-8fa0-1448e40be92d
E0321 05:50:10.196705       1 actuator.go:101] Machine error: error launching instance: error creating EC2 instance: VolumeIOPSLimit: Volume iops of 0 is too low; minimum is 1.
        status code: 400, request id: fc076dfb-dc8e-4c28-8fa0-1448e40be92d
E0321 05:50:10.196729       1 actuator.go:110] error creating machine: error launching instance: error creating EC2 instance: VolumeIOPSLimit: Volume iops of 0 is too low; minimum is 1.
        status code: 400, request id: fc076dfb-dc8e-4c28-8fa0-1448e40be92d
I0321 05:50:10.196738       1 actuator.go:159] updating machine conditions
I0321 05:50:10.196958       1 actuator.go:141] machine status has changed, updating

step 2 log:
$ oc get machine
NAME                                        INSTANCE              STATE     TYPE        REGION           ZONE              AGE
zhsun1-jbxgv-master-0                       i-0e71e1808c1d55ef9   running   m4.xlarge   ap-southeast-1   ap-southeast-1a   3h18m
zhsun1-jbxgv-master-1                       i-0e77ae3b4ac8b67a6   running   m4.xlarge   ap-southeast-1   ap-southeast-1b   3h18m
zhsun1-jbxgv-master-2                       i-0146088ea5d3ae520   running   m4.xlarge   ap-southeast-1   ap-southeast-1c   3h18m
zhsun1-jbxgv-worker-ap-southeast-1a-4cp8f   i-03bb275320719e21f   running   m4.large    ap-southeast-1   ap-southeast-1a   3h15m
zhsun1-jbxgv-worker-ap-southeast-1a-io1     i-03336a59db79da85a   pending   m4.large    ap-southeast-1   ap-southeast-1a   114s
zhsun1-jbxgv-worker-ap-southeast-1b-l7sc4   i-076b424b9662484f2   running   m4.large    ap-southeast-1   ap-southeast-1b   3h15m

I0321 06:00:17.781016       1 controller.go:131] Running reconcile Machine for zhsun1-jbxgv-worker-ap-southeast-1a-io1
I0321 06:00:17.781067       1 actuator.go:367] Checking if machine exists
I0321 06:00:17.829003       1 actuator.go:375] Instance does not exist
I0321 06:00:17.829027       1 controller.go:221] Reconciling machine object zhsun1-jbxgv-worker-ap-southeast-1a-io1 triggers idempotent create.
I0321 06:00:17.829035       1 actuator.go:107] creating machine
I0321 06:00:17.841460       1 instances.go:44] No stopped instances found for machine zhsun1-jbxgv-worker-ap-southeast-1a-io1
I0321 06:00:17.841494       1 instances.go:141] Using AMI ami-0bda8cde9fb69a545
I0321 06:00:17.841500       1 instances.go:73] Describing security groups based on filters
I0321 06:00:18.012982       1 instances.go:118] Describing subnets based on filters
I0321 06:00:18.904232       1 actuator.go:464] Updating status
I0321 06:00:18.904260       1 actuator.go:508] finished calculating AWS status
I0321 06:00:18.904334       1 actuator.go:141] machine status has changed, updating
I0321 06:00:18.913650       1 actuator.go:526] Instance state still pending, returning an error to requeue
W0321 06:00:18.913798       1 controller.go:223] unable to create machine zhsun1-jbxgv-worker-ap-southeast-1a-io1: requeue in: 20s
I0321 06:00:18.913907       1 controller.go:225] Actuator returned requeue-after error: requeue in: 20s
I0321 06:00:18.914019       1 controller.go:131] Running reconcile Machine for zhsun1-jbxgv-worker-ap-southeast-1a-io1
I0321 06:00:18.914162       1 actuator.go:367] Checking if machine exists
I0321 06:00:19.135677       1 actuator.go:380] Instance exists as "i-03a58991d1f0b0b9d"
I0321 06:00:19.135706       1 controller.go:210] Reconciling machine object zhsun1-jbxgv-worker-ap-southeast-1a-io1 triggers idempotent update.
I0321 06:00:19.135715       1 actuator.go:293] updating machine
I0321 06:00:19.135806       1 actuator.go:301] obtaining EC2 client for region
I0321 06:00:19.176404       1 actuator.go:318] found 1 instances for machine
I0321 06:00:19.176432       1 actuator.go:337] instance found
I0321 06:00:19.176474       1 actuator.go:464] Updating status
I0321 06:00:19.176552       1 actuator.go:508] finished calculating AWS status
I0321 06:00:19.176690       1 actuator.go:150] status unchanged
I0321 06:00:19.176710       1 actuator.go:526] Instance state still pending, returning an error to requeue
I0321 06:00:19.176724       1 controller.go:213] Actuator returned requeue-after error: requeue in: 20s
I0321 06:00:38.914252       1 controller.go:131] Running reconcile Machine for zhsun1-jbxgv-worker-ap-southeast-1a-io1
I0321 06:00:38.914327       1 actuator.go:367] Checking if machine exists
I0321 06:00:39.008950       1 actuator.go:375] Instance does not exist
I0321 06:00:39.008980       1 controller.go:221] Reconciling machine object zhsun1-jbxgv-worker-ap-southeast-1a-io1 triggers idempotent create.
I0321 06:00:39.008993       1 actuator.go:107] creating machine
I0321 06:00:39.045242       1 instances.go:44] No stopped instances found for machine zhsun1-jbxgv-worker-ap-southeast-1a-io1
I0321 06:00:39.045289       1 instances.go:141] Using AMI ami-0bda8cde9fb69a545
I0321 06:00:39.045300       1 instances.go:73] Describing security groups based on filters
I0321 06:00:39.208317       1 instances.go:118] Describing subnets based on filters
I0321 06:00:40.318635       1 actuator.go:464] Updating status
I0321 06:00:40.318736       1 actuator.go:508] finished calculating AWS status
I0321 06:00:40.318833       1 actuator.go:141] machine status has changed, updating
E0321 06:00:40.322245       1 event.go:203] Server rejected event '&v1.Event{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"zhsun1-jbxgv-worker-ap-southeast-1a-io1.158de3e4ee6326d7", GenerateName:"", Namespace:"openshift-machine-api", SelfLink:"", UID:"", ResourceVersion:"123926", Generation:0, CreationTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, DeletionTimestamp:(*v1.Time)(nil), DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Initializers:(*v1.Initializers)(nil), Finalizers:[]string(nil), ClusterName:""}, InvolvedObject:v1.ObjectReference{Kind:"Machine", Namespace:"openshift-machine-api", Name:"zhsun1-jbxgv-worker-ap-southeast-1a-io1", UID:"9a556f89-4b9e-11e9-a503-0a0cc71331d6", APIVersion:"machine.openshift.io/v1beta1", ResourceVersion:"123925", FieldPath:""}, Reason:"Created", Message:"Created Machine zhsun1-jbxgv-worker-ap-southeast-1a-io1", Source:v1.EventSource{Component:"aws-controller", Host:""}, FirstTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:63688744818, loc:(*time.Location)(0x22c77c0)}}, LastTimestamp:v1.Time{Time:time.Time{wall:0xbf1ce80212fda607, ext:11620495461066, loc:(*time.Location)(0x22c77c0)}}, Count:2, Type:"Normal", EventTime:v1.MicroTime{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, Series:(*v1.EventSeries)(nil), Action:"", Related:(*v1.ObjectReference)(nil), ReportingController:"", ReportingInstance:""}': 'events "zhsun1-jbxgv-worker-ap-southeast-1a-io1.158de3e4ee6326d7" is forbidden: User "system:serviceaccount:openshift-machine-api:default" cannot patch resource "events" in API group "" in the namespace "openshift-machine-api"' (will not retry!)
I0321 06:00:40.331189       1 actuator.go:526] Instance state still pending, returning an error to requeue
W0321 06:00:40.331225       1 controller.go:223] unable to create machine zhsun1-jbxgv-worker-ap-southeast-1a-io1: requeue in: 20s
I0321 06:00:40.331255       1 controller.go:225] Actuator returned requeue-after error: requeue in: 20s
I0321 06:00:40.337581       1 controller.go:131] Running reconcile Machine for zhsun1-jbxgv-worker-ap-southeast-1a-io1
I0321 06:00:40.337966       1 actuator.go:367] Checking if machine exists
I0321 06:00:40.519919       1 actuator.go:380] Instance exists as "i-001abeaa1ad8edefa"
I0321 06:00:40.519947       1 controller.go:210] Reconciling machine object zhsun1-jbxgv-worker-ap-southeast-1a-io1 triggers idempotent update.
I0321 06:00:40.519956       1 actuator.go:293] updating machine
I0321 06:00:40.520045       1 actuator.go:301] obtaining EC2 client for region
I0321 06:00:40.555840       1 actuator.go:318] found 1 instances for machine
I0321 06:00:40.555892       1 actuator.go:337] instance found
I0321 06:00:40.555932       1 actuator.go:464] Updating status
I0321 06:00:40.556022       1 actuator.go:508] finished calculating AWS status
I0321 06:00:40.556199       1 actuator.go:150] status unchanged
I0321 06:00:40.556261       1 actuator.go:526] Instance state still pending, returning an error to requeue
I0321 06:00:40.556335       1 controller.go:213] Actuator returned requeue-after error: requeue in: 20s
E0321 06:00:40.558280       1 event.go:203] Server rejected event '&v1.Event{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"zhsun1-jbxgv-worker-ap-southeast-1a-io1.158de3e4fe9d2f32", GenerateName:"", Namespace:"openshift-machine-api", SelfLink:"", UID:"", ResourceVersion:"123929", Generation:0, CreationTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, DeletionTimestamp:(*v1.Time)(nil), DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Initializers:(*v1.Initializers)(nil), Finalizers:[]string(nil), ClusterName:""}, InvolvedObject:v1.ObjectReference{Kind:"Machine", Namespace:"openshift-machine-api", Name:"zhsun1-jbxgv-worker-ap-southeast-1a-io1", UID:"9a556f89-4b9e-11e9-a503-0a0cc71331d6", APIVersion:"machine.openshift.io/v1beta1", ResourceVersion:"124127", FieldPath:""}, Reason:"Updated", Message:"Updated machine zhsun1-jbxgv-worker-ap-southeast-1a-io1", Source:v1.EventSource{Component:"aws-controller", Host:""}, FirstTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:63688744819, loc:(*time.Location)(0x22c77c0)}}, LastTimestamp:v1.Time{Time:time.Time{wall:0xbf1ce802212284cb, ext:11620732758374, loc:(*time.Location)(0x22c77c0)}}, Count:2, Type:"Normal", EventTime:v1.MicroTime{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, Series:(*v1.EventSeries)(nil), Action:"", Related:(*v1.ObjectReference)(nil), ReportingController:"", ReportingInstance:""}': 'events "zhsun1-jbxgv-worker-ap-southeast-1a-io1.158de3e4fe9d2f32" is forbidden: User "system:serviceaccount:openshift-machine-api:default" cannot patch resource "events" in API group "" in the namespace "openshift-machine-api"' (will not retry!)
I0321 06:01:00.331529       1 controller.go:131] Running reconcile Machine for zhsun1-jbxgv-worker-ap-southeast-1a-io1
I0321 06:01:00.331609       1 actuator.go:367] Checking if machine exists
I0321 06:01:00.440083       1 actuator.go:375] Instance does not exist
I0321 06:01:00.440110       1 controller.go:221] Reconciling machine object zhsun1-jbxgv-worker-ap-southeast-1a-io1 triggers idempotent create.
I0321 06:01:00.440119       1 actuator.go:107] creating machine
I0321 06:01:00.486922       1 instances.go:44] No stopped instances found for machine zhsun1-jbxgv-worker-ap-southeast-1a-io1
I0321 06:01:00.486993       1 instances.go:141] Using AMI ami-0bda8cde9fb69a545
I0321 06:01:00.487004       1 instances.go:73] Describing security groups based on filters
I0321 06:01:00.664741       1 instances.go:118] Describing subnets based on filters
I0321 06:01:01.827972       1 actuator.go:464] Updating status
I0321 06:01:01.828059       1 actuator.go:508] finished calculating AWS status
I0321 06:01:01.828167       1 actuator.go:141] machine status has changed, updating

Expected results:
In step 1, the log should show "Volume iops of 0 is too low; minimum is 100".

Additional info:

Comment 1 Jan Chaloupka 2019-03-21 12:47:22 UTC
> E0321 05:50:10.196648       1 instances.go:310] Error creating EC2 instance: VolumeIOPSLimit: Volume iops of 0 is too low; minimum is 1.

We do not have control over the error message. It is reported by the AWS client itself. We only pass the message through to the logs.

Why is the minimum expected to be 100 instead of 1?

Comment 3 sunzhaohua 2019-03-22 03:18:13 UTC
Also, the Create Volume page in the AWS web console shows that the minimum is 100: https://ap-southeast-1.console.aws.amazon.com/ec2/v2/home?region=ap-southeast-1#CreateVolume:

Comment 4 Michael Gugino 2019-03-22 13:27:50 UTC
I think the docs are wrong in this case; we should probably go with what the API thinks is valid. I'd prefer not to have to code around these types of things because that code may get stale in the future. What if they decide to say 25 is the minimum at some later point? Better to let the upstream API make the determination. Perhaps we should update our code comment to reflect that.

Comment 5 Jan Chaloupka 2019-03-26 11:03:33 UTC
The same comment is also in the latest openshift/cluster-api-provider-aws: https://github.com/openshift/cluster-api-provider-aws/blob/master/pkg/apis/awsproviderconfig/v1beta1/awsmachineproviderconfig_types.go#L193

Yeah, we need to re-phrase the comment and just re-direct a reader to the upstream docs.
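
A sketch of what such a re-phrased comment could look like, assuming a trimmed-down block device type. The struct below is illustrative only and is not copied from the linked file; the JSON field names follow the YAML keys used in the reproduction steps.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// EBSBlockDeviceSpec is an illustrative, trimmed-down type (an assumption for
// this sketch, not the definition from the repository).
type EBSBlockDeviceSpec struct {
	// Iops is the number of I/O operations per second requested for the
	// volume. The valid range depends on the volume type and is defined by
	// AWS; consult the upstream Amazon EBS documentation for current limits
	// instead of relying on numbers stated here.
	Iops *int64 `json:"iops,omitempty"`
	// VolumeSize is the size of the volume in GiB.
	VolumeSize *int64 `json:"volumeSize,omitempty"`
	// VolumeType is the EBS volume type, for example io1.
	VolumeType *string `json:"volumeType,omitempty"`
}

// int64Ptr and stringPtr return pointers to their arguments.
func int64Ptr(v int64) *int64    { return &v }
func stringPtr(v string) *string { return &v }

// marshalSpec renders a spec as JSON so the field names can be inspected.
func marshalSpec(spec EBSBlockDeviceSpec) (string, error) {
	out, err := json.Marshal(spec)
	if err != nil {
		return "", err
	}
	return string(out), nil
}

func main() {
	// Same values as step 3 of the reproduction steps.
	spec := EBSBlockDeviceSpec{
		Iops:       int64Ptr(100),
		VolumeSize: int64Ptr(120),
		VolumeType: stringPtr("io1"),
	}
	out, err := marshalSpec(spec)
	if err != nil {
		fmt.Println("marshal error:", err)
		return
	}
	fmt.Println(out)
}
```

The comment names no concrete minimum, so it does not go stale if the limit changes upstream.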

Comment 12 sunzhaohua 2019-04-10 02:27:39 UTC
Verified.

clusterversion: 4.0.0-0.9

Comment 14 errata-xmlrpc 2019-06-04 10:46:16 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0758