Description of problem:
When running on AWS, dynamic volumes currently fail because the dynamic volume is being created within another availability zone, which is not permitted.

Error message from the origin logs:

E0831 19:31:17.926294 1430 factory.go:514] Error scheduling openshift-infra hawkular-cassandra-1-xxor6: pod (hawkular-cassandra-1-xxor6) failed to fit in any node
fit failure on node (ip-172-18-10-196.ec2.internal): NoVolumeZoneConflict

The AWS instance is running in zone 'us-east-1d', but the PV and volume are in 'us-east-1c'.

# oc get pv -o yaml
apiVersion: v1
items:
- apiVersion: v1
  kind: PersistentVolume
  metadata:
    annotations:
      kubernetes.io/createdby: aws-ebs-dynamic-provisioner
      pv.kubernetes.io/bound-by-controller: "yes"
      pv.kubernetes.io/provisioned-by: kubernetes.io/aws-ebs
    creationTimestamp: 2016-08-31T19:26:10Z
    labels:
      failure-domain.beta.kubernetes.io/region: us-east-1
      failure-domain.beta.kubernetes.io/zone: us-east-1c
    name: pvc-c4cd4906-6fb0-11e6-8991-0e3b730bb317
    resourceVersion: "625"
    selfLink: /api/v1/persistentvolumes/pvc-c4cd4906-6fb0-11e6-8991-0e3b730bb317
    uid: c5b67afb-6fb0-11e6-8991-0e3b730bb317
  spec:
    accessModes:
    - ReadWriteOnce
    awsElasticBlockStore:
      fsType: ext4
      volumeID: aws://us-east-1c/vol-b0b2161c
    capacity:
      storage: 10Gi
    claimRef:
      apiVersion: v1
      kind: PersistentVolumeClaim
      name: metrics-cassandra-1
      namespace: openshift-infra
      resourceVersion: "584"
      uid: c4cd4906-6fb0-11e6-8991-0e3b730bb317
    persistentVolumeReclaimPolicy: Delete
  status:
    phase: Bound
kind: List
metadata: {}

Version-Release number of selected component (if applicable):
Origin master, but is also reproducible in v1.3.0-alpha.3

How reproducible:
Always
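A quick way to confirm the mismatch (just a sketch; use the node and PV names from your own cluster) is to compare the zone label on the nodes with the zone label on the dynamically provisioned PV:

# oc get nodes -L failure-domain.beta.kubernetes.io/zone
# oc get pv -L failure-domain.beta.kubernetes.io/zone

The scheduler's NoVolumeZoneConflict predicate only places the pod on a node whose failure-domain.beta.kubernetes.io/zone label matches the PV's label, so a PV provisioned in us-east-1c can never be used by a node in us-east-1d.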
@Matt, pls see if https://bugzilla.redhat.com/show_bug.cgi?id=1365398#c6 works for you.
Brad, can you make sure the docs mention the AWS labeling requirement?
Brad: aka https://bugzilla.redhat.com/show_bug.cgi?id=1367617
Moving this to installer to potentially be addressed in 3.4 or later. For 3.3 we have a docs PR https://github.com/openshift/openshift-docs/pull/2783 to explain how this can be accomplished manually.
I'm not sure this is a bug, because we don't currently support provisioning in AWS. That said, we do plan on supporting AWS provisioning for 3.4, so we'll use this bug to track ensuring we set this properly.
AWS provisioning is not included in 3.5
This happens because instances aren't labeled properly. The AWS provisioning work that Kenny is doing will ensure that instances are labeled, though it appears to use the format that was preferred prior to 3.6. We need to update it to use this format:

Key: kubernetes.io/cluster/xxxx
Value: SomeUniqueClusterId

where xxxx is a unique string. See https://trello.com/c/PWwHHUc0/154-retrofit-existing-clusters-with-the-tags-needed-for-the-the-provisioner#comment-595cee234251514e70b52a32 for reference.
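For illustration only (the instance ID and cluster id below are placeholders, not values from this cluster), applying the new-style tag to an existing instance from the AWS CLI would look roughly like:

# aws ec2 create-tags --resources i-0123456789abcdef0 \
    --tags Key=kubernetes.io/cluster/mycluster,Value=mycluster

Every resource the cloud provider needs to discover (instances, security groups, etc.) would need the same cluster tag.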
New AWS provisioning playbooks in openshift-ansible master branch should properly tag all resources.
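For inventory-based installs, my understanding (worth double-checking against the current hosts.example in openshift-ansible) is that the cluster id is supplied through inventory variables along these lines, so the playbooks can tag the instances consistently:

[OSEv3:vars]
openshift_cloudprovider_kind=aws
openshift_clusterid=mycluster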
This has been validated over several rounds of testing; adding the "KubernetesCluster" tag makes the cluster work well. A few things need to be highlighted, though (see the verification example after this list):

1. When all instances in the cluster are running in the same zone, adding the "KubernetesCluster" tag works well.
2. When the instances in the cluster are spread across multiple zones, adding the "KubernetesCluster" tag does not resolve all issues; we still hit BZ#1491761.
3. For the installer enhancement, according to https://bugzilla.redhat.com/show_bug.cgi?id=1491399#c9, it seems the openshift-ansible tools only check instance tags during upgrade. Why not do the same check for a fresh install?

@Scott, based on #3, I am moving this bug to ASSIGNED.
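For reference, the way we verified which tags an instance actually carries during testing was roughly (instance ID is a placeholder):

# aws ec2 describe-tags --filters "Name=resource-id,Values=i-0123456789abcdef0"

The output should list the "KubernetesCluster" (or kubernetes.io/cluster/xxxx) tag on every instance in the cluster before dynamic provisioning is attempted.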
We intend to check during 3.7 install and 3.6 to 3.7 upgrade. Let's track those other bugs separately; the scope of this bug is ensuring that the new provisioning work properly sets a cluster id.
Based on comment 10 and comment 11, moving this bug to VERIFIED.
+1 to comment 12, as this issue prevents a fresh install from succeeding. In 3.6, the OpenShift documentation barely mentions cluster ID. The only reference to it that I could find is a side note[1] in the table of persistent volume types. Nothing in the Installation docs. [1] - https://docs.openshift.com/container-platform/3.6/install_config/persistent_storage/dynamically_provisioning_pvs.html#available-dynamically-provisioned-plug-ins
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2017:3188