Bug 1306520 - [DOCS] [3.2] Document persistent volume storage labeling
Status: CLOSED CURRENTRELEASE
Product: OpenShift Container Platform
Classification: Red Hat
Component: Documentation
Version: 3.1.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: medium
Assigned To: Thien-Thi Nguyen
QA Contact: Jianwei Hou
Docs Contact: Vikram Goyal
Depends On:
Blocks:
Reported: 2016-02-11 02:21 EST by Vikram Goyal
Modified: 2017-03-08 13 EST
CC List: 7 users

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-05-16 12:17:35 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments: None
Description Vikram Goyal 2016-02-11 02:21:25 EST
In OSE 3.2, you can place labels on persistent volume storage so operators know where the volumes came from, and the EBS and PD tags record the claim and project name within those systems.

It helps system admins delete orphaned storage assets.

This is implemented in AWS and Cinder but not GCE.

Likely structure:
 -- why do I need labels on pv storage
 -- how is it implemented
 -- an example implementation
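For reference, the cloud-side tags might look like the following sketch (the key names are those from the dynamic-provisioning implementation discussed in this BZ; the values are hypothetical placeholders):

```yaml
# Hypothetical tag set placed on a dynamically provisioned EBS volume;
# the values below are placeholders, not real resource names.
kubernetes.io/created-for/pv/name: pvc-1a2b3c4d
kubernetes.io/created-for/pvc/namespace: myproject
kubernetes.io/created-for/pvc/name: myclaim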

Dev for this feature is:
Jan Safranek

QE for this feature is:
Jianwei Hou

Trello Card is:
https://trello.com/c/cvHQcgch/10-3-add-pv-name-to-each-dynamically-provisioned-volume-ops-rfe-origin

Likely Guide is:
https://docs.openshift.com/enterprise/3.1/architecture/additional_concepts/storage.html
Comment 2 Thien-Thi Nguyen 2016-03-13 16:33:10 EDT
Hi Jan,

The wip for this is:
https://github.com/tnguyen-rh/openshift-docs/blob/bz1306520/architecture/additional_concepts/storage.adoc

From commit:
https://github.com/tnguyen-rh/openshift-docs/commit/4728aa1b1bd1666be1ac1629c0214bf4ad83868e

I think to satisfy this BZ, we should also give an example of using the labels on PVs, PVCs, specifically in a "cull orphaned storage assets" scenario.  I am thinking of adding another example in the Removing Volumes section of the Developer Guide:
https://github.com/tnguyen-rh/openshift-docs/blob/bz1306520/dev_guide/volumes.adoc#removing-volumes

Something along the lines of:
The following example shows two commands.
The first command uses a label selector to list those PVCs with label value `scratch`.
The second command deletes them.

Does that sound reasonable?
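For concreteness, the pair of commands could look like this (the `volume-type=scratch` key/value is a hypothetical example label, not a convention):

```shell
# List PVCs carrying the hypothetical label volume-type=scratch
oc get pvc -l volume-type=scratch

# Delete every PVC matching that label selector
oc delete pvc -l volume-type=scratch
```

Both commands require a running cluster and a logged-in `oc` session.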
Comment 4 Jan Safranek 2016-03-14 05:45:54 EDT
I think the whole storage story is much more complicated if we are speaking about OSE 3.2.

We're introducing dynamic provisioning in this release. I am not sure how well supported it is, but the admin does not need to provision persistent volumes manually. Instead, users create claims with a specific marker that triggers dynamic provisioning, and Kubernetes itself creates a volume in the external infrastructure (AWS, GCE, OpenStack) and an appropriate PV for it.

See https://github.com/kubernetes/kubernetes/tree/master/examples/experimental/persistent-volume-provisioning
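Such a claim would look roughly like this sketch (the alpha annotation key is taken from the linked experimental docs; the name and size are placeholders):

```yaml
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: claim1                                # placeholder name
  annotations:
    # Presence of this (alpha) annotation is the marker that triggers
    # dynamic provisioning; its value is ignored by the alpha code.
    volume.alpha.kubernetes.io/storage-class: foo
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 3Gi                            # placeholder size
```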

In addition, there is a new reclamation policy "Delete":
Retain: Manual reclamation
Recycle: Basic scrub (e.g., rm -rf /<volume>/*)
Delete: Delete the volume in external infrastructure (GCE, AWS, OpenStack).
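The policy is a per-PV field; a minimal sketch in the v1 API (other required PV fields omitted):

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv0001                           # hypothetical name
spec:
  persistentVolumeReclaimPolicy: Delete  # or Retain / Recycle
```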

So, no admin intervention is needed for these dynamically created volumes; they are created and deleted on demand (by creating and deleting a claim). However, there are no policies that specify *which* users can create/delete such claims, and there are no quotas on the number and size of the volumes. This may be quite dangerous if the users are not properly trained. This will be better in future releases.


Anyway, now the labels: 
The trello card you refer to in comment #0 is about labels/tags on AWS/GCE/OpenStack volumes, not Kubernetes PersistentVolume objects! If Kubernetes creates a new AWS EBS volume, it adds tags to it so a *cloud* admin can see who created the volume (i.e. which project and claim name). When something bad happens to Kubernetes, the admin can still see which AWS EBS volume belongs to which user. The same applies to GCE and OpenStack.


OSE 3.2 does not add any new features regarding labels on Kubernetes claims and PVs. Admins and users may use them to mark PVs and claims with metadata, but Kubernetes ignores them for now. In the future, Kubernetes may use these labels to select the right volume for a claim; e.g., a claim could in theory specify that it wants to bind only to PVs with label "speed=fast".
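Illustratively (as noted, Kubernetes currently ignores such labels; this sketch only shows where they would live, with a hypothetical name):

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-fast-0001   # hypothetical name
  labels:
    speed: fast        # plain metadata today; no selector-based binding yet
```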
Comment 5 Thien-Thi Nguyen 2016-03-15 08:49:05 EDT
(In reply to Jan Safranek from comment #4)
> [overall OSE 3.2 changes related to storage]
> We're introducing dynamic provisioning in this release. [...]
> The trello card you refer to in comment #0 is about labels/tags
> on AWS/GCE/OpenStack volumes, not Kubernetes PersistentVolume
> objects! If Kubernetes creates a new AWS EBS volume, it adds
> tags to it so a *cloud* admin can see who created the volume
> (i.e. which project and claim name).  When something bad happens
> to Kubernetes, the admin can still see which AWS EBS volume
> belongs to which user. The same applies to GCE and OpenStack.
>
> OSE 3.2 does not add any new features regarding to labels on
> Kubernetes claims and PVs. Admins and users may use it to mark
> PVs and claims with metadata, but Kubernetes ignores them for
> now. In future, Kubernetes may use these labels to select the
> right volume for a claim, e.g. claim could in theory specify
> that it wants to bind only to PVs with label "speed=fast".

Hi Jan,

Thank you very much for the quick response and for the meeting this morning to further clarify things.  Here is a summary of what i learned:
- OpenShift supports three storage backends: AWS EBS, OpenStack Cinder, GCE PD.
- Volumes from these backends can be dynamically provisioned.
- The "set of key/value pairs" concept is the same for each backend, and the same for OpenShift objects.
- The name of the feature and how the data is accessed, however, differs:
  - OpenShift objects have "labels".
  - AWS EBS volumes have "tags".
  - OpenStack Cinder volumes have "metadata".
  - GCE PD volumes have an unstructured text field "description" in which a JSON map is stored.
- There is no connection between OpenShift labels and the others.
- There are three keys to be documented in this BZ:
  - kubernetes.io/created-for/pv/name
  - kubernetes.io/created-for/pvc/namespace
  - kubernetes.io/created-for/pvc/name
- OpenShift sets the keys at provisioning time (dynamically); static volumes don't have these keys set.
- The keys and values are normally not visible to the OpenShift user.
- The intended use is for an admin to be able to recognize volumes associated w/ OpenShift.
- Common scenarios:
  - OpenShift dies; user asks admin to find the adrift data.
  - User does not explicitly delete volumes when done; admin uses keys to identify volumes for deletion.
- Docs exist: https://github.com/openshift/openshift-docs/blob/master/install_config/persistent_storage/dynamically_provisioning_pvs.adoc
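As a sketch of the admin-side lookup (assuming the AWS CLI; the filter uses the standard `tag:<key>` syntax, and `myproject` is a placeholder project name):

```shell
# Find EBS volumes that were dynamically provisioned for claims
# in the OpenShift project "myproject"
aws ec2 describe-volumes \
  --filters Name=tag:kubernetes.io/created-for/pvc/namespace,Values=myproject
```

This requires configured AWS credentials; Cinder metadata and GCE PD descriptions would be queried with the respective cloud CLIs.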

Corrections/additions welcome!

The plan is to undo the current (misguided) changes, and then add to the current docs these learnings as:
- a coherent description of the concept, including (non)relation to OpenShift labels
- a brief explanation of the backend-specific terminology
- a description of what OpenShift does and when
- the common scenarios (above) as a rationale
Additionally, i will add to each persistent_storage_BACKEND.adoc an xref back to the general doc.  Perhaps a mention in the Architecture doc would be good, too (still deciding).
Comment 6 Thien-Thi Nguyen 2016-03-15 09:57:32 EDT
Hi Vikram,

The scope of this BZ is now expanded to three storage backends, not just AWS EBS and OpenStack Cinder.  I believe this trello card is now also relevant:

 https://trello.com/c/HW6fyd5g/81-add-pv-name-to-each-dynamically-provisioned-volume-for-gce-ops-rfe32

If you find other GCE PD-specific trello cards and/or BZs, feel free to list them here and point them to this BZ.
Comment 7 Jan Safranek 2016-03-15 10:10:07 EDT
Thien-Thi, almost perfect!

> - OpenShift supports three storage backends: AWS EBS, OpenStack Cinder, GCE PD.

OpenShift can use many backends (NFS, iSCSI, FC, etc.). Only AWS, OpenStack, and GCE support dynamic provisioning.

> - Common scenarios:
>   - User does not explicitly delete volumes when done; admin uses keys to identify volumes for deletion.

I don't see it as a useful use case... A user can delete their claim and OpenShift will automatically delete the volume. Anyway, I think you got the point: the admin can easily match AWS/GCE/OSP volumes with Kubernetes claims and PVs and do whatever is necessary with them.
Comment 8 Thien-Thi Nguyen 2016-03-15 12:09:28 EDT
(In reply to Jan Safranek from comment #7)
> > - OpenShift supports three storage backends [...]
>
> Openshift can use many backends - NFS, iSCSI, FC etc. Only AWS,
> OpenStack and GCE support dynamic provisioning.

Right.  Thanks for the clarification.

> >   - User does not explicitly delete volumes when done;
> >     admin uses keys to identify volumes for deletion.
>
> I don't see it as a useful use case... A user can delete their
> claim and OpenShift will automatically delete the volume. Anyway,
> I think you got the point: the admin can easily match AWS/GCE/OSP
> volumes with Kubernetes claims and PVs and do whatever is
> necessary with them.

OK.  I have documented the not-so-useful case, anyway, since the
user may fail to delete the claim (for whatever reason), and i
suspect this is common enough...

Here is the current WIP:
https://github.com/tnguyen-rh/openshift-docs/blob/bz1306520/install_config/persistent_storage/dynamically_provisioning_pvs.adoc#storage-provisioner-labels

From commit:
https://github.com/tnguyen-rh/openshift-docs/commit/dedc44897a582fd11a6d75663d45bf581f742eca

WDYT?  Am i headed in the right direction?  If so, i'll continue
w/ the xrefs and create a true PR for your review.  If not, what's
wrong?
Comment 9 Thien-Thi Nguyen 2016-03-15 12:47:42 EDT
PR: https://github.com/openshift/openshift-docs/pull/1744
Comment 12 Vikram Goyal 2016-03-16 01:38:04 EDT
(In reply to Thien-Thi Nguyen from comment #6)
> Hi Vikram,
> 
> The scope of this BZ is now expanded to three storage backends, not just AWS
> EBS and OpenStack Cinder.  I believe this trello card is now also relevant:
> 
>  https://trello.com/c/HW6fyd5g/81-add-pv-name-to-each-dynamically-
> provisioned-volume-for-gce-ops-rfe32

That is the same Trello card that was linked to in comment 0.

> 
> If you find other GCE PD-specific trello cards and/or BZs, feel free to list
> them here and point them to this BZ.

I will do. Clearing the NEEDINFO and setting it to Jan for your last comment. Let me know if there is anything I should comment on?
Comment 13 Thien-Thi Nguyen 2016-03-16 04:01:11 EDT
(In reply to Vikram Goyal from comment #12)
> >  https://trello.com/c/HW6fyd5g/81-add-pv-name-to-each-dynamically-
> > provisioned-volume-for-gce-ops-rfe32
>
> That is the same Trello card that was linked to in comment 0.

I beg to differ.  The one from comment 0:

 https://trello.com/c/cvHQcgch/10-3-add-pv-name-to-each-dynamically-provisioned-volume-ops-rfe-origin

deals with AWS EBS and OpenStack Cinder, only.  The one i cited
deals w/ GCE PD.  (I believe it was split out because the design
and implementation of these storage-provisioner labels for GCE PD
is radically different from that for AWS EBS and OpenStack
Cinder.)  It's true that their (human-readable) titles begin w/
the same wording, though, so i can see how that could be
confusing.
Comment 14 Jan Safranek 2016-03-16 04:57:16 EDT
I'm reviewing the linked GitHub PR; it looks fine to me.
Comment 15 Vikram Goyal 2016-03-16 08:13:10 EDT
(In reply to Thien-Thi Nguyen from comment #13)
> (In reply to Vikram Goyal from comment #12)
> > >  https://trello.com/c/HW6fyd5g/81-add-pv-name-to-each-dynamically-
> > > provisioned-volume-for-gce-ops-rfe32
> >
> > That is the same Trello card that was linked to in comment 0.
> 
> I beg to differ.  The one from comment 0:
> 
>  https://trello.com/c/cvHQcgch/10-3-add-pv-name-to-each-dynamically-
> provisioned-volume-ops-rfe-origin
> 
> deals with AWS EBS and OpenStack Cinder, only.  The one i cited
> deals w/ GCE PD.  (I believe it was split out because the design
> and implementation of these storage-provisioner labels for GCE PD
> is radically different from that for AWS EBS and OpenStack
> Cinder.)  It's true that their (human-readable) titles begin w/
> the same wording, though, so i can see how that could be
> confusing.

Whoops, I apologize, Thien-Thi!
Comment 18 openshift-github-bot 2016-03-18 08:34:43 EDT
Commit pushed to master at https://github.com/openshift/openshift-docs

https://github.com/openshift/openshift-docs/commit/e5e8654386f234c998a1928590183587c1fa712e
Merge pull request #1744 from tnguyen-rh/bz1306520

Bug 1306520 Document persistent volume storage labeling
Comment 19 Thien-Thi Nguyen 2016-03-18 08:39:00 EDT
Hi Jianwei, WDYT?
Comment 21 Jianwei Hou 2016-03-27 22:50:37 EDT
We have reviewed the docs and they are very well documented, thank you!
