Bug 1365398 - Dynamic provisioned volume is not in the same AZ with instance
Summary: Dynamic provisioned volume is not in the same AZ with instance
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Documentation
Version: 3.6.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Vikram Goyal
QA Contact: Vikram Goyal
Docs Contact: Vikram Goyal
URL:
Whiteboard:
Duplicates: 1468756
Depends On:
Blocks:
Reported: 2016-08-09 07:48 UTC by Chao Yang
Modified: 2021-03-11 14:38 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-09-19 06:05:33 UTC
Target Upstream Version:
Embargoed:



Links
System ID Private Priority Status Summary Last Updated
Red Hat Bugzilla 1468579 0 unspecified CLOSED Missing Kubernetes Cluster ID tag from openshift cluster resources 2021-02-22 00:41:40 UTC

Internal Links: 1468579

Description Chao Yang 2016-08-09 07:48:52 UTC
Description of problem:
Set up a cluster with 2 nodes in us-east-1d and used PVCs to dynamically provision volumes. Some volumes were provisioned in us-east-1c instead. This happens randomly, but if you keep provisioning volumes you will eventually see one provisioned in the wrong AZ.

Version-Release number of selected component (if applicable):
openshift v3.3.0.17
kubernetes v1.3.0+507d3a7
etcd 2.3.0+git

How reproducible:
80%

Steps to Reproduce:
1. Install master and node instances on AWS, both in us-east-1d.
2. Create a dynamic PVC using this file https://raw.githubusercontent.com/openshift-qe/v3-testfiles/master/persistent-volumes/misc/pvc.json with name pvc1 and size 1Gi.
3. Create a dynamic PVC using this file https://raw.githubusercontent.com/openshift-qe/v3-testfiles/master/persistent-volumes/misc/pvc.json with name pvc2 and size 2Gi.
4. Check the PVs:
pvc-76c63740-5dff-11e6-84a2-0ef1f4b2f333   1Gi        RWO           Bound     jhou/pvc1                       6m
pvc-bbe9982f-5dff-11e6-84a2-0ef1f4b2f333   2Gi        RWX           Bound     jhou/pvc2                       5m

5. Run oc describe pv on each PV:
[root@ip-172-18-11-134 ~]# oc describe pv pvc-76c63740-5dff-11e6-84a2-0ef1f4b2f333
Name:		pvc-76c63740-5dff-11e6-84a2-0ef1f4b2f333
Labels:		failure-domain.beta.kubernetes.io/region=us-east-1
		failure-domain.beta.kubernetes.io/zone=us-east-1d
Status:		Bound
Claim:		jhou/pvc1
Reclaim Policy:	Delete
Access Modes:	RWO
Capacity:	1Gi
Message:	
Source:
    Type:	AWSElasticBlockStore (a Persistent Disk resource in AWS)
    VolumeID:	aws://us-east-1d/vol-4c8d98e8
    FSType:	ext4
    Partition:	0
    ReadOnly:	false
No events.

[root@ip-172-18-11-134 ~]# oc describe pv pvc-bbe9982f-5dff-11e6-84a2-0ef1f4b2f333
Name:		pvc-bbe9982f-5dff-11e6-84a2-0ef1f4b2f333
Labels:		failure-domain.beta.kubernetes.io/region=us-east-1
		failure-domain.beta.kubernetes.io/zone=us-east-1c
Status:		Bound
Claim:		jhou/pvc2
Reclaim Policy:	Delete
Access Modes:	RWX
Capacity:	2Gi
Message:	
Source:
    Type:	AWSElasticBlockStore (a Persistent Disk resource in AWS)
    VolumeID:	aws://us-east-1c/vol-65e8abc8
    FSType:	ext4
    Partition:	0
    ReadOnly:	false
No events.
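
A quick way to spot the mismatch in output like the above is to compare the zone embedded in each VolumeID with the zone the instances run in. The following Python sketch assumes the aws://<zone>/<volume-id> form shown in the oc describe pv output; the helper names are illustrative and are not part of any OpenShift tooling.

```python
from urllib.parse import urlparse


def zone_from_volume_id(volume_id: str) -> str:
    """Return the availability zone from an 'aws://<zone>/<vol-id>' string."""
    parsed = urlparse(volume_id)
    if parsed.scheme != "aws" or not parsed.netloc:
        raise ValueError(f"unexpected VolumeID format: {volume_id!r}")
    return parsed.netloc


def volumes_outside_zone(volume_ids, node_zone):
    """List the VolumeIDs whose zone differs from the zone the nodes run in."""
    return [v for v in volume_ids if zone_from_volume_id(v) != node_zone]


# VolumeIDs copied from the two PVs described above.
pv_volume_ids = [
    "aws://us-east-1d/vol-4c8d98e8",
    "aws://us-east-1c/vol-65e8abc8",
]
mismatched = volumes_outside_zone(pv_volume_ids, node_zone="us-east-1d")
print(mismatched)
```

With the two PVs from this report, only the pvc2 volume is flagged.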


Actual results:
One of the PVs is created in us-east-1c, not in the same AZ as the instances.

Expected results:
Both PVs should be in the same AZ as the instances.

Additional info:
Aug  9 03:06:03 ip-172-18-11-134 docker: I0809 03:06:03.623402       1 aws.go:972] Found instances in zones map[us-east-1c:{} us-east-1d:{}]
Aug  9 03:06:03 ip-172-18-11-134 docker: I0809 03:06:03.623442       1 util.go:248] Creating volume for PVC "pvc2"; chose zone="us-east-1c" from zones=["us-east-1c" "us-east-1d"]
Aug  9 03:06:03 ip-172-18-11-134 atomic-openshift-node: I0809 03:06:03.889799   25041 generic.go:181] GenericPLEG: Relisting
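
When checking many provisioning attempts, the "chose zone" message above can be extracted from the logs mechanically. This is a minimal Python sketch that assumes the message keeps exactly the wording shown in this report; the regular expression and function name are illustrative only.

```python
import re

# Pattern for the 'chose zone=... from zones=[...]' message shown above.
CHOSE_ZONE_RE = re.compile(
    r'Creating volume for PVC "(?P<pvc>[^"]+)"; '
    r'chose zone="(?P<zone>[^"]+)" from zones=\[(?P<zones>[^\]]*)\]'
)


def parse_zone_choice(line: str):
    """Return (pvc, chosen_zone, candidate_zones), or None if no match."""
    match = CHOSE_ZONE_RE.search(line)
    if match is None:
        return None
    candidates = re.findall(r'"([^"]+)"', match.group("zones"))
    return match.group("pvc"), match.group("zone"), candidates


log_line = (
    'Aug  9 03:06:03 ip-172-18-11-134 docker: I0809 03:06:03.623442       1 '
    'util.go:248] Creating volume for PVC "pvc2"; chose zone="us-east-1c" '
    'from zones=["us-east-1c" "us-east-1d"]'
)
print(parse_zone_choice(log_line))
```

A candidate list that contains any zone other than the one the cluster nodes run in indicates that instances outside the cluster are being counted.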

Comment 1 Jan Safranek 2016-08-09 09:04:01 UTC
I saw it once or twice. Looking at the code, Kubernetes lists all running AWS instances and randomly selects a zone that is used by one of them. It happens only on your shared AWS account. It should work if Kubernetes is installed on a dedicated AWS project where all AWS instances are Kubernetes nodes.

Filed https://github.com/kubernetes/kubernetes/issues/30265 about it.
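
The behavior described in this comment can be modeled roughly as follows. This Python sketch is a simplified illustration, not the Kubernetes code: the instance records are hypothetical, and random.choice stands in for whatever selection the real provisioner performs.

```python
import random


def candidate_zones(instances):
    """Collect the zones of every running instance visible in the account."""
    return sorted({i["zone"] for i in instances if i["state"] == "running"})


def choose_zone(instances, rng=random):
    """Pick one candidate zone at random (a stand-in for the real selection)."""
    return rng.choice(candidate_zones(instances))


# Hypothetical shared account: two cluster nodes plus one unrelated instance.
instances = [
    {"id": "node-1", "zone": "us-east-1d", "state": "running"},
    {"id": "node-2", "zone": "us-east-1d", "state": "running"},
    {"id": "unrelated", "zone": "us-east-1c", "state": "running"},
]
print(candidate_zones(instances))
```

Because the unrelated instance contributes us-east-1c to the candidate set, some volumes can land in a zone where the cluster has no nodes.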

Comment 6 Jianwei Hou 2016-08-16 10:33:50 UTC
Current workaround:
Add the tag "Name=KubernetesCluster,Value=<clusterid>" to all instances of the same cluster. Removed the 'testblocker' keyword since the workaround works for us.
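
One possible way to apply this tag programmatically is sketched below in Python. It assumes boto3 is installed and AWS credentials are configured; the instance ID and cluster ID are hypothetical placeholders, and the helper names are illustrative. The same tag can also be added through the AWS console or CLI.

```python
def build_cluster_tag_request(instance_ids, cluster_id):
    """Build keyword arguments for EC2 CreateTags with the KubernetesCluster tag."""
    if not instance_ids:
        raise ValueError("at least one instance ID is required")
    return {
        "Resources": list(instance_ids),
        "Tags": [{"Key": "KubernetesCluster", "Value": cluster_id}],
    }


def tag_cluster_instances(instance_ids, cluster_id, region):
    """Apply the tag through boto3 (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the payload helper works without it

    ec2 = boto3.client("ec2", region_name=region)
    ec2.create_tags(**build_cluster_tag_request(instance_ids, cluster_id))


# Hypothetical instance ID and cluster ID, for illustration only.
request = build_cluster_tag_request(["i-0123456789abcdef0"], "mycluster")
print(request)
```

Every instance of the cluster, masters and nodes alike, should carry the same value.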

Comment 7 Eric Paris 2016-08-16 13:09:12 UTC
I think the upstream issue is saying that the tagging is not a 'work around' but is the 'design'. I think this is 'working as expected'. I do not believe there is anything left to fix in this BZ.

Comment 8 Bradley Childs 2016-08-16 20:19:38 UTC
Closing as working as designed per upstream's comment (use the tagging to influence the PV zone).

Comment 9 Jianwei Hou 2016-08-17 02:28:33 UTC
We need to document this in case customers run into the same problem. Tracked in https://bugzilla.redhat.com/show_bug.cgi?id=1367617

Comment 11 Scott Dodson 2017-07-06 13:22:31 UTC
openshift-ansible doesn't currently perform any instance manipulation, but we will start that work as part of 3.7. Is there another suggested short-term fix that you'd like?

Comment 12 Scott Dodson 2017-07-11 20:56:51 UTC
*** Bug 1468756 has been marked as a duplicate of this bug. ***

Comment 13 Scott Dodson 2017-07-11 20:58:59 UTC
https://github.com/openshift/openshift-ansible/pull/4726 makes it mandatory to specify a cluster id when you're using the AWS provider or explicitly state that you're only running one cluster per account.

Comment 14 Scott Dodson 2017-07-13 19:48:56 UTC
Based on the following from Hemant Kumar I'm moving this to be a Docs bug. While the ansible installer could update aws.conf this seems like a bad idea because it's yet another item that needs to be kept in sync.

"Also, there is no need to update aws.conf file, because if KubernetesClusterTag is not present in aws.conf then the tag value is picked from master instance tag."

In fact, the docs already mention this, but in a section specific to AWS dynamic volumes. It should probably be moved to a more prominent location, and it needs to be updated to reflect the new label. Here's a PR that does the latter.

https://github.com/openshift/openshift-docs/pull/4783
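
The fallback that Hemant Kumar describes in the quote above can be illustrated with a short Python sketch. It is a model of the described precedence only, not the cloud provider's code. It assumes aws.conf is INI-like and that KubernetesClusterTag sits in a [Global] section; the sample file contents and tag values are made up.

```python
import configparser


def resolve_cluster_tag(aws_conf_text, master_instance_tags):
    """Prefer KubernetesClusterTag from aws.conf, else the master's tag."""
    parser = configparser.ConfigParser()
    parser.read_string(aws_conf_text)
    # Assumption: the option lives in a [Global] section of aws.conf.
    configured = parser.get("Global", "KubernetesClusterTag", fallback=None)
    if configured:
        return configured
    return master_instance_tags.get("KubernetesCluster")


conf_without_tag = "[Global]\nZone = us-east-1d\n"
conf_with_tag = "[Global]\nZone = us-east-1d\nKubernetesClusterTag = from-conf\n"
master_tags = {"KubernetesCluster": "from-master"}

print(resolve_cluster_tag(conf_without_tag, master_tags))
print(resolve_cluster_tag(conf_with_tag, master_tags))
```

Under this model, tagging the master instance is enough, which is why the installer does not need to keep aws.conf in sync.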

