Bug 1404383 - System needs to indicate/allow admins to set a PVC request will never be fulfilled/timeout
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Storage
Version: 3.3.1
Hardware: x86_64
OS: Linux
Priority: medium
Severity: low
Target Milestone: ---
Target Release: ---
Assignee: Jan Safranek
QA Contact: Wenqi He
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2016-12-13 17:17 UTC by Boris Kurktchiev
Modified: 2017-07-24 14:11 UTC (History)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-04-12 19:18:22 UTC
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2017:0884 0 normal SHIPPED_LIVE Red Hat OpenShift Container Platform 3.5 RPM Release Advisory 2017-04-12 22:50:07 UTC

Description Boris Kurktchiev 2016-12-13 17:17:03 UTC
Description of problem:
Right now, if I request a PV of a size that will never be available on my cluster, the PVC just sits in the Pending state indefinitely. As an admin, I should be able to configure a setting that forces some sort of timeout on these requests.

Version-Release number of selected component (if applicable):
3.3.1.5

How reproducible:
All the time

Steps to Reproduce:
1. Request a PV of a size that is not available (local or dynamic provisioning).
2. Watch it stay in the Pending state forever.

Actual results:


Expected results:
Timeout the request after a certain time period

Additional info:

Comment 1 Clayton Coleman 2016-12-13 17:19:05 UTC
At a minimum, the UI should help report to end users *why* something was not satisfied, and the dynamic provisioner / binder should potentially write information to the PVC status (condition NoPossibleBindings / Unsatisfiable) and possibly send events (if they don't today).  The UI would then use these to indicate whether the resource was unlikely to ever bind.

I think conditions on PVC binding status are a natural evolution and match the rest of the project (deployments and RCs have recently added these for this exact reason).
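
To illustrate the idea, here is a minimal sketch of how a UI might consume such a condition. It is not the Kubernetes API: the `Condition` and `ClaimStatus` classes and the `unlikely_to_bind` helper are hypothetical, and the condition type `Unsatisfiable` is simply the name floated above.

```python
from dataclasses import dataclass, field


@dataclass
class Condition:
    type: str     # hypothetical condition type, e.g. "Unsatisfiable"
    status: str   # "True", "False", or "Unknown"
    reason: str   # short machine-readable reason
    message: str  # human-readable explanation for the UI


@dataclass
class ClaimStatus:
    phase: str = "Pending"
    conditions: list = field(default_factory=list)


def unlikely_to_bind(status):
    """Return True if a pending claim carries a true 'Unsatisfiable' condition."""
    return status.phase == "Pending" and any(
        c.type == "Unsatisfiable" and c.status == "True"
        for c in status.conditions
    )
```

The claim stays Pending either way; the condition only gives the UI something concrete to show the user.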

Comment 2 Jessica Forrester 2016-12-15 20:19:03 UTC
Re-assigning to Storage, because we need the PVC conditions before we can do anything in the UI.

Comment 3 Bradley Childs 2017-01-23 08:03:31 UTC
This RFE is similar to https://trello.com/c/lbxSAqUb/406-list-allowed-selectors-in-storageclass-list-allowed-keys-13# and I'll add it to the card. The card's current requirement is validating/enumerating selectors against the PVC match criteria that a provisioner permits. We need provisioning constraints validated or enumerated over multiple PVC fields.

I don't think we want to "fail" the PVC, only inform the user that nothing is currently available to satisfy the request.  In a dynamic framework a provisioner could (eventually) be available that satisfies the PVC.

Comment 4 Erin Boyd 2017-01-26 17:34:16 UTC
We can add some checks to the UI that compare the request against the available capacity, based on quota and consumption, as a 'pre-check' to eliminate claims that would never be fulfilled:
https://docs.google.com/a/redhat.com/document/d/12RDDJDoN9sT8JjIhJyC45j536RlP9VOkz1MQlp8kTgw/edit?usp=sharing

This RFE also extends to ensuring that storage class quotas are respected in the dynamically provisioned case: first check that users are allowed to ask for such storage at all, then follow the path above to make sure the capacity exists to satisfy the request.
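
A rough sketch of what such a pre-check could look like, assuming the UI already knows the project's storage quota and current consumption. The function name `precheck_claim` and its parameters are hypothetical, and all sizes are in Gi for simplicity.

```python
def precheck_claim(request_gi, quota_gi, used_gi):
    """Return (ok, message) for a storage request checked against remaining quota."""
    remaining = quota_gi - used_gi
    if request_gi > remaining:
        return False, (
            f"request of {request_gi}Gi exceeds remaining quota of {remaining}Gi"
        )
    return True, "request fits within remaining quota"
```

For example, `precheck_claim(10, 20, 15)` rejects the request because only 5Gi of quota remains.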

Comment 5 Eric Paris 2017-01-28 18:30:25 UTC
I don't actually find this similar to Brad's request at all. Part of this request we just won't do. We will not 'fail' or 'timeout' a PVC. It will stay pending even if it cannot be satisfied.

However, I would like to see events on that PVC giving as much detail as possible about why it cannot be satisfied. Jan, can you please check as many 'unsatisfiable' cases as you can and see if there are meaningful events as to why it was unsatisfiable? Offhand I can think of

`SC not defined`
`Quota over quota`
`PVC has labels unable to be satisfied by SC`
`Non-dynamic provisioned and no PV available, contact storage admin`

What else?

Comment 6 Jan Safranek 2017-02-01 16:09:04 UTC
> `SC not defined`

'SC' is just an annotation value; it does not need to refer to a valid SC object, hence it can't be invalid.

> `Quota over quota`
> `Non-dynamic provisioned and no PV available, contact storage admin`

Already reported

> `PVC has labels unable to be satisfied by SC`

Not implemented. Basically, any PVC that does not match an existing PV is kept pending. I will fix it by mimicking the pod scheduler's behavior: it sends a "failed to fit in any node" event to a pod and keeps the pod Pending forever.

So, in the end, it will be a simple event like "no suitable PV found". I can post a PR upstream tomorrow; however, I am not sure it will get merged soon enough to make OpenShift 3.5.
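
The intended behavior can be sketched roughly as follows. This is a simplified model, not the actual controller code: the `Volume` and `Claim` classes, the matching rules (same storage class, enough capacity, smallest fit wins), and the function names are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Volume:
    name: str
    capacity_gi: int
    storage_class: str = ""
    claimed: bool = False


@dataclass
class Claim:
    name: str
    request_gi: int
    storage_class: str = ""
    phase: str = "Pending"
    events: list = field(default_factory=list)


def find_matching_volume(claim, volumes):
    """Return the smallest unclaimed volume that satisfies the claim, or None."""
    candidates = [
        v for v in volumes
        if not v.claimed
        and v.storage_class == claim.storage_class
        and v.capacity_gi >= claim.request_gi
    ]
    return min(candidates, key=lambda v: v.capacity_gi, default=None)


def sync_claim(claim, volumes):
    """Bind the claim if possible; otherwise record an event and leave it Pending."""
    volume = find_matching_volume(claim, volumes)
    if volume is None:
        # No timeout and no failure: the claim stays Pending, but the user
        # gets an event explaining why nothing was bound.
        claim.events.append(("FailedBinding", "no suitable PV found"))
        return None
    volume.claimed = True
    claim.phase = "Bound"
    return volume
```

As with the pod scheduler, the claim is never failed; a later sync can still bind it if a suitable volume appears.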

Comment 7 Eric Paris 2017-02-01 16:31:13 UTC
`SC not defined`:

If I set the SC on my PV to `glod` instead of `gold` it would be nice if there was something that said 'no storage class named `glod`'.

Comment 8 Jan Safranek 2017-02-02 12:43:44 UTC
Filed upstream PR: https://github.com/kubernetes/kubernetes/pull/40859

> `SC not defined`:
> 
> If I set the SC on my PV to `glod` instead of `gold` it would be nice if
> there was something that said 'no storage class named `glod`'.

It should behave the same as when you create a PVC whose Selector has typos in label names, set PVC.Spec.VolumeName to a misspelled name, or make a typo in a Service selector so that it does not match any Pod: all of this is left to the user to discover and fix.

Comment 9 Jan Safranek 2017-02-03 12:31:54 UTC
Downstream patch: https://github.com/openshift/origin/pull/12796

Comment 10 Jan Safranek 2017-02-06 09:04:44 UTC
Merged today.

Comment 11 Troy Dawson 2017-02-06 19:28:04 UTC
This has been merged into ocp and is in OCP v3.5.0.17 or newer.

Comment 13 Wenqi He 2017-02-07 07:37:03 UTC
Tested on the version below:
openshift v3.5.0.17+c55cf2b
kubernetes v1.5.2+43a9be4

I got the expected info, as shown below:

[root@wehe-master ~]# oc get pvc
NAME      STATUS    VOLUME    CAPACITY   ACCESSMODES   AGE
nfsc      Pending                                      2s
[root@wehe-master ~]# oc describe pvc nfsc
Name:		nfsc
Namespace:	wehe
StorageClass:	
Status:		Pending
Volume:		
Labels:		<none>
Capacity:	
Access Modes:	
Events:
  FirstSeen	LastSeen	Count	From				SubObjectPath	Type		Reason		Message
  ---------	--------	-----	----				-------------	--------	------		-------
  9s		5s		2	{persistentvolume-controller }			Normal		FailedBinding	no persistent volumes available for this claim and no storage class is set
[root@wehe-master ~]# oc version

This bug is fixed, changing status to verified. Thanks

Comment 15 errata-xmlrpc 2017-04-12 19:18:22 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:0884

