Bug 1580273 - Dynamic provisioning PVC using csi volume driver is stuck in Pending status
Summary: Dynamic provisioning PVC using csi volume driver is stuck in Pending status
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Storage
Version: 3.10.0
Hardware: Unspecified
OS: Unspecified
high
high
Target Milestone: ---
: 3.10.0
Assignee: Jan Safranek
QA Contact: Jianwei Hou
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2018-05-21 06:47 UTC by Qin Ping
Modified: 2018-05-23 11:38 UTC (History)
4 users (show)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-05-23 11:38:06 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)

Description Qin Ping 2018-05-21 06:47:20 UTC
Description of problem:
When an openshift user create dynamic provisioning PVC using csi volume driver, the PVC will be stuck in Pending status.

Version-Release number of selected component (if applicable):
oc v3.10.0-0.47.0
openshift v3.10.0-0.47.0
kubernetes v1.10.0+b81c8f8
csi-provisioner-0.2.0-1.el7.x86_64
csi-attacher-0.2.0-3.git27299be.el7.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Admin deploy csi per https://github.com/openshift/openshift-docs/pull/8783/files
2. Switch user to piqin
3. Create a new project "mytest"
4. Create a dynamic provisioning PVC
5. Check PVC status

Actual results:
$ oc get pvc
NAME      STATUS    VOLUME    CAPACITY   ACCESS MODES   STORAGECLASS   AGE
pvc1      Pending                                       cinder         17m

Events:
  Type    Reason                Age                 From                         Message
  ----    ------                ----                ----                         -------
  Normal  ExternalProvisioning  5m (x124 over 10m)  persistentvolume-controller  waiting for a volume to be created, either by external provisioner "csi-cinderplugin" or manually created by system administrator


Expected results:
PVC goes into Bound status.

Master Log:

Node Log (of failed PODs):

PV Dump:

PVC Dump:

StorageClass Dump (if StorageClass used by PV/PVC):

Additional info:
# oc logs cinder-csi-controller-668465b74d-9dh4m -c csi-provisioner
E0521 06:27:43.074809       1 controller.go:769] Error watching for provisioning success, can't provision for claim "mytest/pvc1": events is forbidden: User "system:serviceaccount:csi:cinder-csi" cannot list events in the namespace "mytest": User "system:serviceaccount:csi:cinder-csi" cannot list events in project "mytest"
E0521 06:27:43.104468       1 event.go:200] Server rejected event '&v1.Event{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"pvc1.153094f73226c713", GenerateName:"", Namespace:"mytest", SelfLink:"", UID:"", ResourceVersion:"", Generation:0, CreationTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, DeletionTimestamp:(*v1.Time)(nil), DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Initializers:(*v1.Initializers)(nil), Finalizers:[]string(nil), ClusterName:""}, InvolvedObject:v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"mytest", Name:"pvc1", UID:"116b300a-5cc0-11e8-90c0-fa163eea5a21", APIVersion:"v1", ResourceVersion:"31050", FieldPath:""}, Reason:"Provisioning", Message:"External provisioner is provisioning volume for claim \"mytest/pvc1\"", Source:v1.EventSource{Component:"csi-cinderplugin f9ef1b00-5cb9-11e8-8a0d-0a580a80002d", Host:""}, FirstTimestamp:v1.Time{Time:time.Time{wall:0xbeb8b797c5d4d113, ext:2617507880146, loc:(*time.Location)(0x1b2aa40)}}, LastTimestamp:v1.Time{Time:time.Time{wall:0xbeb8b797c5d4d113, ext:2617507880146, loc:(*time.Location)(0x1b2aa40)}}, Count:1, Type:"Normal", EventTime:v1.MicroTime{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, Series:(*v1.EventSeries)(nil), Action:"", Related:(*v1.ObjectReference)(nil), ReportingController:"", ReportingInstance:""}': 'events is forbidden: User "system:serviceaccount:csi:cinder-csi" cannot create events in the namespace "mytest": User "system:serviceaccount:csi:cinder-csi" cannot create events in project "mytest"' (will not retry!)
E0521 06:27:53.104694       1 controller.go:895] Failed to provision volume for claim "mytest/pvc1" with StorageClass "cinder": rpc error: code = DeadlineExceeded desc = context deadline exceeded
E0521 06:27:53.104773       1 goroutinemap.go:165] Operation for "provision-mytest/pvc1[116b300a-5cc0-11e8-90c0-fa163eea5a21]" failed. No retries permitted until 2018-05-21 06:27:53.604748318 +0000 UTC m=+2628.014795229 (durationBeforeRetry 500ms). Error: "rpc error: code = DeadlineExceeded desc = context deadline exceeded"
E0521 06:27:53.106788       1 event.go:200] Server rejected event '&v1.Event{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"pvc1.153094f9869bfcb8", GenerateName:"", Namespace:"mytest", SelfLink:"", UID:"", ResourceVersion:"", Generation:0, CreationTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, DeletionTimestamp:(*v1.Time)(nil), DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Initializers:(*v1.Initializers)(nil), Finalizers:[]string(nil), ClusterName:""}, InvolvedObject:v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"mytest", Name:"pvc1", UID:"116b300a-5cc0-11e8-90c0-fa163eea5a21", APIVersion:"v1", ResourceVersion:"31050", FieldPath:""}, Reason:"ProvisioningFailed", Message:"Failed to provision volume with StorageClass \"cinder\": rpc error: code = DeadlineExceeded desc = context deadline exceeded", Source:v1.EventSource{Component:"csi-cinderplugin f9ef1b00-5cb9-11e8-8a0d-0a580a80002d", Host:""}, FirstTimestamp:v1.Time{Time:time.Time{wall:0xbeb8b79a463e22b8, ext:2627514782329, loc:(*time.Location)(0x1b2aa40)}}, LastTimestamp:v1.Time{Time:time.Time{wall:0xbeb8b79a463e22b8, ext:2627514782329, loc:(*time.Location)(0x1b2aa40)}}, Count:1, Type:"Warning", EventTime:v1.MicroTime{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, Series:(*v1.EventSeries)(nil), Action:"", Related:(*v1.ObjectReference)(nil), ReportingController:"", ReportingInstance:""}': 'events is forbidden: User "system:serviceaccount:csi:cinder-csi" cannot create events in the namespace "mytest": User "system:serviceaccount:csi:cinder-csi" cannot create events in project "mytest"' (will not retry!)
E0521 06:27:59.968310       1 event.go:200] Server rejected event '&v1.Event{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"pvc1.153094f73226c713", GenerateName:"", Namespace:"mytest", SelfLink:"", UID:"", ResourceVersion:"", Generation:0, CreationTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, DeletionTimestamp:(*v1.Time)(nil), DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Initializers:(*v1.Initializers)(nil), Finalizers:[]string(nil), ClusterName:""}, InvolvedObject:v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"mytest", Name:"pvc1", UID:"116b300a-5cc0-11e8-90c0-fa163eea5a21", APIVersion:"v1", ResourceVersion:"31103", FieldPath:""}, Reason:"Provisioning", Message:"External provisioner is provisioning volume for claim \"mytest/pvc1\"", Source:v1.EventSource{Component:"csi-cinderplugin f9ef1b00-5cb9-11e8-8a0d-0a580a80002d", Host:""}, FirstTimestamp:v1.Time{Time:time.Time{wall:0xbeb8b797c5d4d113, ext:2617507880146, loc:(*time.Location)(0x1b2aa40)}}, LastTimestamp:v1.Time{Time:time.Time{wall:0xbeb8b79bf98539c6, ext:2634375079313, loc:(*time.Location)(0x1b2aa40)}}, Count:2, Type:"Normal", EventTime:v1.MicroTime{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, Series:(*v1.EventSeries)(nil), Action:"", Related:(*v1.ObjectReference)(nil), ReportingController:"", ReportingInstance:""}': 'events "pvc1.153094f73226c713" is forbidden: User "system:serviceaccount:csi:cinder-csi" cannot patch events in the namespace "mytest": User "system:serviceaccount:csi:cinder-csi" cannot "patch" "events" with name "pvc1.153094f73226c713" in project "mytest"' (will not retry!)
E0521 06:28:09.977214       1 controller.go:895] Failed to provision volume for claim "mytest/pvc1" with StorageClass "cinder": rpc error: code = DeadlineExceeded desc = context deadline exceeded
E0521 06:28:09.977301       1 goroutinemap.go:165] Operation for "provision-mytest/pvc1[116b300a-5cc0-11e8-90c0-fa163eea5a21]" failed. No retries permitted until 2018-05-21 06:28:10.977255559 +0000 UTC m=+2645.387302483 (durationBeforeRetry 1s). Error: "rpc error: code = DeadlineExceeded desc = context deadline exceeded"

Comment 1 Jan Safranek 2018-05-21 12:19:51 UTC
This is the interesting part: 

> Failed to provision volume for claim "mytest/pvc1" with StorageClass "cinder": rpc error: code = DeadlineExceeded desc = context deadline exceeded

The provisioner called cinder driver to provision a volume but it did not get a response in 10 seconds. There is PR that re-tries the call couple of times with exponential backoff: https://github.com/kubernetes-csi/external-provisioner/pull/66. It gives the driver approx. 4 minutes to provision a volume.

On the other hand, it would be great to find out why the call did not succeed. 10 seconds sounds enough to me. Can you please attach full logs of csi-provisioner and cinder-driver containers?

Comment 2 Qin Ping 2018-05-22 01:48:40 UTC
Hi Jan,

It had a configuration error when I deployed csi. I forgot to modify the secret.data.cloud.conf value.

cinder-driver report:
E0521 09:37:28.445215       1 utils.go:100] GRPC error: Post https://example.com:13000/v2.0/tokens: dial tcp 93.184.216.34:13000: i/o timeout
E0521 09:39:00.931896       1 utils.go:100] GRPC error: Post https://example.com:13000/v2.0/tokens: dial tcp 93.184.216.34:13000: i/o timeout
E0521 09:40:36.133618       1 utils.go:100] GRPC error: Post https://example.com:13000/v2.0/tokens: dial tcp 93.184.216.34:13000: i/o timeout
E0521 09:42:52.346540       1 utils.go:100] GRPC error: Post https://example.com:13000/v2.0/tokens: dial tcp 93.184.216.34:13000: i/o timeout
E0521 09:45:24.007954       1 utils.go:100] GRPC error: Post https://example.com:13000/v2.0/tokens: dial tcp 93.184.216.34:13000: i/o timeout
E0521 09:47:37.353565       1 utils.go:100] GRPC error: Post https://example.com:13000/v2.0/tokens: dial tcp 93.184.216.34:13000: i/o timeout
E0521 09:50:16.584325       1 utils.go:100] GRPC error: Post https://example.com:13000/v2.0/tokens: dial tcp 93.184.216.34:13000: i/o timeout
E0521 09:52:37.374416       1 utils.go:100] GRPC error: Post https://example.com:13000/v2.0/tokens: dial tcp 93.184.216.34:13000: i/o timeout
E0521 09:55:09.408493       1 utils.go:100] GRPC error: Post https://example.com:13000/v2.0/tokens: dial tcp 93.184.216.34:13000: i/o timeout

Comment 3 Jan Safranek 2018-05-22 09:52:32 UTC
Qin, with enough permissions to post events, did the provisioner show reasonable error in oc describe PVC? It should at least show some timeout there. If so, I think there is no bug to fix.

Comment 4 Qin Ping 2018-05-23 02:03:20 UTC
PVC events:

Events:
  Type     Reason                Age                From                                                   Message
  ----     ------                ----               ----                                                   -------
  Normal   Provisioning          46s (x4 over 1m)   csi-cinderplugin 9ce7f585-5e2c-11e8-a5c0-0a580a80003f  External provisioner is provisioning volume for claim "mytest/pvc1"
  Warning  ProvisioningFailed    36s (x4 over 1m)   csi-cinderplugin 9ce7f585-5e2c-11e8-a5c0-0a580a80003f  Failed to provision volume with StorageClass "cinder": rpc error: code = DeadlineExceeded desc = context deadline exceeded
  Normal   ExternalProvisioning  34s (x25 over 1m)  persistentvolume-controller                            waiting for a volume to be created, either by external provisioner "csi-cinderplugin" or manually created by system administrator
  Warning  ProvisioningFailed    6s (x2 over 28s)   csi-cinderplugin 9cb75da1-5e2c-11e8-9a51-0a580a810024  Failed to provision volume with StorageClass "cinder": rpc error: code = DeadlineExceeded desc = context deadline exceeded
  Normal   Provisioning          1s (x3 over 38s)   csi-cinderplugin 9cb75da1-5e2c-11e8-9a51-0a580a810024  External provisioner is provisioning volume for claim "mytest/pvc1"

Comment 6 Jan Safranek 2018-05-23 11:38:06 UTC
I am closing this bug, it was caused by misconfiguration and with the policy fix it shows at least something in PVC events. Double PVCs is most probably dup of #1581113


Note You need to log in before you can comment on or make changes to this bug.