Bug 1372059 - Dynamic provisioned volumes fail in AWS due to incorrect zone
Summary: Dynamic provisioned volumes fail in AWS due to incorrect zone
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 3.3.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: medium
Target Milestone: ---
Target Release: 3.7.0
Assignee: Kenny Woodson
QA Contact: Johnny Liu
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2016-08-31 19:49 UTC by Matt Wringe
Modified: 2017-11-28 21:51 UTC
CC List: 5 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Kubernetes requires that all resources under management be labeled with KubernetesCluster (deprecated) or kubernetes.io/cluster/xxxx so that persistent volumes are attached to, and detached from, the correct instance in the correct zone.
Clone Of:
Environment:
Last Closed: 2017-11-28 21:51:43 UTC
Target Upstream Version:
Embargoed:


Attachments


Links
System ID: Red Hat Product Errata RHSA-2017:3188
Priority: normal
Status: SHIPPED_LIVE
Summary: Moderate: Red Hat OpenShift Container Platform 3.7 security, bug, and enhancement update
Last Updated: 2017-11-29 02:34:54 UTC

Description Matt Wringe 2016-08-31 19:49:52 UTC
Description of problem:
When running on AWS, dynamically provisioned volumes currently fail because the volume is created in a different availability zone from the instance that needs it, and EBS volumes cannot be attached across zones.

Error message from the origin logs:

"E0831 19:31:17.926294    1430 factory.go:514] Error scheduling openshift-infra hawkular-cassandra-1-xxor6: pod (hawkular-cassandra-1-xxor6) failed to fit in any node
fit failure on node (ip-172-18-10-196.ec2.internal): NoVolumeZoneConflict"

The AWS instance is running in zone 'us-east-1d', but the PV and volume are running in 'us-east-1c'.


# oc get pv -o yaml
apiVersion: v1
items:
- apiVersion: v1
  kind: PersistentVolume
  metadata:
    annotations:
      kubernetes.io/createdby: aws-ebs-dynamic-provisioner
      pv.kubernetes.io/bound-by-controller: "yes"
      pv.kubernetes.io/provisioned-by: kubernetes.io/aws-ebs
    creationTimestamp: 2016-08-31T19:26:10Z
    labels:
      failure-domain.beta.kubernetes.io/region: us-east-1
      failure-domain.beta.kubernetes.io/zone: us-east-1c
    name: pvc-c4cd4906-6fb0-11e6-8991-0e3b730bb317
    resourceVersion: "625"
    selfLink: /api/v1/persistentvolumes/pvc-c4cd4906-6fb0-11e6-8991-0e3b730bb317
    uid: c5b67afb-6fb0-11e6-8991-0e3b730bb317
  spec:
    accessModes:
    - ReadWriteOnce
    awsElasticBlockStore:
      fsType: ext4
      volumeID: aws://us-east-1c/vol-b0b2161c
    capacity:
      storage: 10Gi
    claimRef:
      apiVersion: v1
      kind: PersistentVolumeClaim
      name: metrics-cassandra-1
      namespace: openshift-infra
      resourceVersion: "584"
      uid: c4cd4906-6fb0-11e6-8991-0e3b730bb317
    persistentVolumeReclaimPolicy: Delete
  status:
    phase: Bound
kind: List
metadata: {}
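The PV output above shows the mismatch directly: the volume lives in us-east-1c (both in the failure-domain label and encoded in the volumeID `aws://us-east-1c/vol-b0b2161c`), while the node is in us-east-1d. A minimal sketch of the scheduler's NoVolumeZoneConflict check, under that volumeID format (this is an illustration, not the scheduler's actual code; the function names are hypothetical):

```python
def volume_zone(volume_id: str) -> str:
    """Extract the availability zone from an EBS volumeID of the
    form 'aws://<zone>/<vol-id>', as seen in the PV spec above."""
    # "aws://us-east-1c/vol-b0b2161c".split("/") ->
    # ["aws:", "", "us-east-1c", "vol-b0b2161c"]
    return volume_id.split("/")[2]

def fits_node(pv_labels: dict, node_labels: dict) -> bool:
    """Rough approximation of the NoVolumeZoneConflict predicate:
    a pod using this PV only fits nodes in the same failure domain."""
    zone_key = "failure-domain.beta.kubernetes.io/zone"
    return pv_labels.get(zone_key) == node_labels.get(zone_key)

# Values taken from this bug report: PV in us-east-1c, node in us-east-1d.
pv = {"failure-domain.beta.kubernetes.io/zone":
      volume_zone("aws://us-east-1c/vol-b0b2161c")}
node = {"failure-domain.beta.kubernetes.io/zone": "us-east-1d"}
print(fits_node(pv, node))  # False -> NoVolumeZoneConflict
```

With mismatched zones the predicate fails on every node in us-east-1d, which is exactly the "failed to fit in any node" error in the log above.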

Version-Release number of selected component (if applicable):
Origin master; also reproducible in v1.3.0-alpha.3.

How reproducible:
Always

Comment 2 Jianwei Hou 2016-09-01 02:31:52 UTC
@Matt, please see if https://bugzilla.redhat.com/show_bug.cgi?id=1365398#c6 works for you.

Comment 3 Eric Paris 2016-09-01 14:18:17 UTC
Brad, can you make sure the docs mention the AWS labeling requirement?

Comment 4 Eric Paris 2016-09-01 14:19:32 UTC
Brad: aka https://bugzilla.redhat.com/show_bug.cgi?id=1367617

Comment 5 Eric Paris 2016-09-02 15:44:02 UTC
Moving this to the installer component, to potentially be addressed in 3.4 or later. For 3.3 we have a docs PR, https://github.com/openshift/openshift-docs/pull/2783, that explains how this can be accomplished manually.

Comment 6 Jason DeTiberus 2016-09-22 20:36:37 UTC
I'm not sure this is currently a bug, because we don't yet support provisioning in AWS. That said, we do plan to support AWS provisioning in 3.4, so we will use this bug to track ensuring that we set this properly.

Comment 7 Scott Dodson 2017-02-10 02:07:18 UTC
AWS provisioning is not included in 3.5.

Comment 8 Scott Dodson 2017-08-24 18:50:07 UTC
This happens because instances aren't labeled properly. The AWS provisioning work that Kenny is doing will ensure that instances are labeled, though it appears to use the format that was preferred prior to 3.6. We need to update it to use this format:

Key: kubernetes.io/cluster/xxxx
Value: SomeUniqueClusterId

where xxxx is a unique string

See https://trello.com/c/PWwHHUc0/154-retrofit-existing-clusters-with-the-tags-needed-for-the-the-provisioner#comment-595cee234251514e70b52a32 for reference
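The two tag formats in play (the deprecated KubernetesCluster tag from the Doc Text and the 3.6+ per-cluster format from the comment above) can be sketched as follows; the helper names are hypothetical, and the exact Key/Value pairing follows this report's description:

```python
def cluster_tag(cluster_id: str) -> dict:
    """3.6+ per-cluster tag as described above: the key embeds a
    unique cluster string ('xxxx'); the same id is used as the
    value here for illustration (SomeUniqueClusterId)."""
    return {"Key": f"kubernetes.io/cluster/{cluster_id}",
            "Value": cluster_id}

def legacy_tag(cluster_id: str) -> dict:
    """Deprecated pre-3.6 format mentioned in the Doc Text field."""
    return {"Key": "KubernetesCluster", "Value": cluster_id}

print(cluster_tag("mycluster"))
# {'Key': 'kubernetes.io/cluster/mycluster', 'Value': 'mycluster'}
```

Every AWS resource the cluster manages (instances, volumes, security groups) would carry such a tag so the cloud provider can tell its resources apart from other clusters' when attaching and detaching volumes.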

Comment 9 Scott Dodson 2017-10-13 13:15:32 UTC
New AWS provisioning playbooks in openshift-ansible master branch should properly tag all resources.

Comment 10 Johnny Liu 2017-10-16 09:33:05 UTC
This has been validated through several rounds of testing; adding the "KubernetesCluster" tag makes the cluster work correctly.


But one more thing needs to be highlighted:
1. When all instances in the cluster are running in the same zone, adding the "KubernetesCluster" tag works well.
2. When the instances in the cluster are running in multiple zones, adding the "KubernetesCluster" tag does not resolve all issues; you will encounter BZ#1491761.
3. Regarding the installer enhancement: according to https://bugzilla.redhat.com/show_bug.cgi?id=1491399#c9, it seems the openshift-ansible tools would do the instance tag check only for upgrades. Why not do the same check for a fresh install?


@Scott, based on item 3 above, I am moving this bug to ASSIGNED status.

Comment 11 Scott Dodson 2017-10-19 14:15:19 UTC
We intend to check during 3.7 installs and 3.6-to-3.7 upgrades. Let's track those other bugs separately; the scope of this bug is ensuring that the new provisioning work properly sets a cluster ID.

Comment 12 Johnny Liu 2017-10-20 02:31:55 UTC
Based on comment 10 and comment 11, move this bug to VERIFIED.

Comment 13 Mike Fiedler 2017-10-20 11:34:48 UTC
+1 to comment 12, as this prevents a fresh install from succeeding. In 3.6, the OpenShift documentation barely mentions the cluster ID. The only reference to it that I could find is in a side note[1] in the table of persistent volume types. There is nothing in the installation documentation.

[1] - https://docs.openshift.com/container-platform/3.6/install_config/persistent_storage/dynamically_provisioning_pvs.html#available-dynamically-provisioned-plug-ins

Comment 17 errata-xmlrpc 2017-11-28 21:51:43 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:3188

