Bug 1337179 - Persistent volume claims taking very long time to be granted.
Summary: Persistent volume claims taking very long time to be granted.
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: OpenShift Online
Classification: Red Hat
Component: Storage
Version: 3.x
Hardware: Unspecified
OS: Unspecified
medium
medium
Target Milestone: ---
: ---
Assignee: Dan Mace
QA Contact: Jianwei Hou
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2016-05-18 12:31 UTC by Graham Dumpleton
Modified: 2016-06-23 17:32 UTC (History)
4 users (show)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-06-23 17:32:22 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)

Description Graham Dumpleton 2016-05-18 12:31:55 UTC
Description of problem:

Persistent volume claims taking over 5 minutes to be granted.

Version-Release number of selected component (if applicable):

$ oc version
oc v1.1.6
kubernetes v1.2.0-36-g4a3f9c5

This is with the online preview environment. I don't know any more information about cluster versions than that.

How reproducible:

Seen occurring multiple times, but doesn't occur every time.

Was using:

oc new-app kallithea-scm-single-postgresql

where the template was loaded using:

oc create -f https://raw.githubusercontent.com/GrahamDumpleton/openshift3-kallithea/master/template.json

Actual results:

$ oc get pvc --watch
NAME            STATUS    VOLUME    CAPACITY   ACCESSMODES   AGE
kallithea-pvc   Pending                                      20s
NAME            STATUS    VOLUME         CAPACITY   ACCESSMODES   AGE
kallithea-pvc   Pending   pv-aws-px4pq   0                        7m
kallithea-pvc   Bound     pv-aws-px4pq   1Gi       RWO       7m

$ oc get events --watch
2016-05-18 22:16:46 +1000 AEST   2016-05-18 22:16:46 +1000 AEST   1         kallithea-1-build              Pod                                                             Normal    Scheduled          {default-scheduler }                     Successfully assigned kallithea-1-build to ip-172-31-4-147.ec2.internal
2016-05-18 22:16:48 +1000 AEST   2016-05-18 22:16:48 +1000 AEST   1         kallithea-1-build              Pod                     spec.containers{sti-build}              Normal    Pulled             {kubelet ip-172-31-4-147.ec2.internal}   Container image "openshift3/ose-sti-builder:v3.2.0.40" already present on machine
2016-05-18 22:16:49 +1000 AEST   2016-05-18 22:16:49 +1000 AEST   1         kallithea-1-build              Pod                     spec.containers{sti-build}              Normal    Created            {kubelet ip-172-31-4-147.ec2.internal}   Created container with docker id 1b90b8614e7f
2016-05-18 22:16:50 +1000 AEST   2016-05-18 22:16:50 +1000 AEST   1         kallithea-1-build              Pod                     spec.containers{sti-build}              Normal    Started            {kubelet ip-172-31-4-147.ec2.internal}   Started container with docker id 1b90b8614e7f
2016-05-18 22:19:05 +1000 AEST   2016-05-18 22:19:05 +1000 AEST   1         kallithea   DeploymentConfig               Normal    DeploymentCreated   {deploymentconfig-controller }   Created new deployment "kallithea-1" for version 1
2016-05-18 22:19:05 +1000 AEST   2016-05-18 22:19:05 +1000 AEST   1         kallithea   DeploymentConfig             Warning   FailedUpdate   {deployment-controller }   Cannot update deployment snakes/kallithea-1 status to Pending: replicationcontrollers "kallithea-1" cannot be updated: the object has been modified; please apply your changes to the latest version and try again
2016-05-18 22:19:05 +1000 AEST   2016-05-18 22:19:05 +1000 AEST   1         kallithea-1-deploy   Pod                 Normal    Scheduled   {default-scheduler }   Successfully assigned kallithea-1-deploy to ip-172-31-14-23.ec2.internal
2016-05-18 22:19:08 +1000 AEST   2016-05-18 22:19:08 +1000 AEST   1         kallithea-1-deploy   Pod       spec.containers{deployment}   Normal    Pulled    {kubelet ip-172-31-14-23.ec2.internal}   Container image "openshift3/ose-deployer:v3.2.0.40" already present on machine
2016-05-18 22:19:14 +1000 AEST   2016-05-18 22:19:14 +1000 AEST   1         kallithea-1-deploy   Pod       spec.containers{deployment}   Normal    Created   {kubelet ip-172-31-14-23.ec2.internal}   Created container with docker id bf75ec7775ac
2016-05-18 22:19:14 +1000 AEST   2016-05-18 22:19:14 +1000 AEST   1         kallithea-1-deploy   Pod       spec.containers{deployment}   Normal    Started   {kubelet ip-172-31-14-23.ec2.internal}   Started container with docker id bf75ec7775ac
2016-05-18 22:19:16 +1000 AEST   2016-05-18 22:19:16 +1000 AEST   1         kallithea-1   ReplicationController             Normal    SuccessfulCreate   {replication-controller }   Created pod: kallithea-1-e3rev
2016-05-18 22:19:16 +1000 AEST   2016-05-18 22:19:16 +1000 AEST   1         kallithea-1-e3rev   Pod                 Warning   FailedScheduling   {default-scheduler }   PersistentVolumeClaim is not bound: "kallithea-pvc"
2016-05-18 22:19:16 +1000 AEST   2016-05-18 22:19:17 +1000 AEST   2         kallithea-1-e3rev   Pod                 Warning   FailedScheduling   {default-scheduler }   PersistentVolumeClaim is not bound: "kallithea-pvc"
2016-05-18 22:19:16 +1000 AEST   2016-05-18 22:19:19 +1000 AEST   3         kallithea-1-e3rev   Pod                 Warning   FailedScheduling   {default-scheduler }   PersistentVolumeClaim is not bound: "kallithea-pvc"
2016-05-18 22:19:16 +1000 AEST   2016-05-18 22:19:23 +1000 AEST   4         kallithea-1-e3rev   Pod                 Warning   FailedScheduling   {default-scheduler }   PersistentVolumeClaim is not bound: "kallithea-pvc"
2016-05-18 22:19:16 +1000 AEST   2016-05-18 22:19:31 +1000 AEST   5         kallithea-1-e3rev   Pod                 Warning   FailedScheduling   {default-scheduler }   PersistentVolumeClaim is not bound: "kallithea-pvc"
2016-05-18 22:19:16 +1000 AEST   2016-05-18 22:19:47 +1000 AEST   6         kallithea-1-e3rev   Pod                 Warning   FailedScheduling   {default-scheduler }   PersistentVolumeClaim is not bound: "kallithea-pvc"
2016-05-18 22:19:16 +1000 AEST   2016-05-18 22:20:19 +1000 AEST   7         kallithea-1-e3rev   Pod                 Warning   FailedScheduling   {default-scheduler }   PersistentVolumeClaim is not bound: "kallithea-pvc"
2016-05-18 22:19:16 +1000 AEST   2016-05-18 22:21:19 +1000 AEST   8         kallithea-1-e3rev   Pod                 Warning   FailedScheduling   {default-scheduler }   PersistentVolumeClaim is not bound: "kallithea-pvc"
2016-05-18 22:19:16 +1000 AEST   2016-05-18 22:21:20 +1000 AEST   9         kallithea-1-e3rev   Pod                 Warning   FailedScheduling   {default-scheduler }   PersistentVolumeClaim is not bound: "kallithea-pvc"
2016-05-18 22:19:16 +1000 AEST   2016-05-18 22:21:22 +1000 AEST   10        kallithea-1-e3rev   Pod                 Warning   FailedScheduling   {default-scheduler }   PersistentVolumeClaim is not bound: "kallithea-pvc"
2016-05-18 22:19:16 +1000 AEST   2016-05-18 22:21:26 +1000 AEST   11        kallithea-1-e3rev   Pod                 Warning   FailedScheduling   {default-scheduler }   PersistentVolumeClaim is not bound: "kallithea-pvc"
2016-05-18 22:19:16 +1000 AEST   2016-05-18 22:21:34 +1000 AEST   12        kallithea-1-e3rev   Pod                 Warning   FailedScheduling   {default-scheduler }   PersistentVolumeClaim is not bound: "kallithea-pvc"
2016-05-18 22:19:16 +1000 AEST   2016-05-18 22:21:50 +1000 AEST   13        kallithea-1-e3rev   Pod                 Warning   FailedScheduling   {default-scheduler }   PersistentVolumeClaim is not bound: "kallithea-pvc"
2016-05-18 22:19:16 +1000 AEST   2016-05-18 22:22:22 +1000 AEST   14        kallithea-1-e3rev   Pod                 Warning   FailedScheduling   {default-scheduler }   PersistentVolumeClaim is not bound: "kallithea-pvc"
2016-05-18 22:19:16 +1000 AEST   2016-05-18 22:23:22 +1000 AEST   15        kallithea-1-e3rev   Pod                 Warning   FailedScheduling   {default-scheduler }   PersistentVolumeClaim is not bound: "kallithea-pvc"
2016-05-18 22:19:16 +1000 AEST   2016-05-18 22:23:23 +1000 AEST   16        kallithea-1-e3rev   Pod                 Warning   FailedScheduling   {default-scheduler }   PersistentVolumeClaim is not bound: "kallithea-pvc"
2016-05-18 22:19:16 +1000 AEST   2016-05-18 22:23:25 +1000 AEST   17        kallithea-1-e3rev   Pod                 Warning   FailedScheduling   {default-scheduler }   PersistentVolumeClaim is not bound: "kallithea-pvc"
2016-05-18 22:19:16 +1000 AEST   2016-05-18 22:23:29 +1000 AEST   18        kallithea-1-e3rev   Pod                 Warning   FailedScheduling   {default-scheduler }   PersistentVolumeClaim is not bound: "kallithea-pvc"
2016-05-18 22:19:16 +1000 AEST   2016-05-18 22:23:37 +1000 AEST   19        kallithea-1-e3rev   Pod                 Warning   FailedScheduling   {default-scheduler }   PersistentVolumeClaim is not bound: "kallithea-pvc"
2016-05-18 22:19:16 +1000 AEST   2016-05-18 22:23:53 +1000 AEST   20        kallithea-1-e3rev   Pod                 Warning   FailedScheduling   {default-scheduler }   PersistentVolumeClaim is not bound: "kallithea-pvc"
2016-05-18 22:24:25 +1000 AEST   2016-05-18 22:24:25 +1000 AEST   1         kallithea-1-e3rev   Pod                 Normal    Scheduled   {default-scheduler }   Successfully assigned kallithea-1-e3rev to ip-172-31-4-148.ec2.internal
2016-05-18 22:24:32 +1000 AEST   2016-05-18 22:24:32 +1000 AEST   1         kallithea-1-e3rev   Pod       spec.containers{kallithea}   Normal    Pulling   {kubelet ip-172-31-4-148.ec2.internal}   pulling image "172.30.94.234:5000/snakes/kallithea@sha256:ed99e92656ee3860fafe7ee4031434f2c19136452f9454d6255d1232c346cb01"
2016-05-18 22:24:41 +1000 AEST   2016-05-18 22:24:41 +1000 AEST   1         kallithea-1-e3rev   Pod       spec.containers{kallithea}   Normal    Pulled    {kubelet ip-172-31-4-148.ec2.internal}   Successfully pulled image "172.30.94.234:5000/snakes/kallithea@sha256:ed99e92656ee3860fafe7ee4031434f2c19136452f9454d6255d1232c346cb01"
2016-05-18 22:24:42 +1000 AEST   2016-05-18 22:24:42 +1000 AEST   1         kallithea-1-e3rev   Pod       spec.containers{kallithea}   Normal    Created   {kubelet ip-172-31-4-148.ec2.internal}   Created container with docker id bcc1b832cf46
2016-05-18 22:24:42 +1000 AEST   2016-05-18 22:24:42 +1000 AEST   1         kallithea-1-e3rev   Pod       spec.containers{kallithea}   Normal    Started   {kubelet ip-172-31-4-148.ec2.internal}   Started container with docker id bcc1b832cf46
2016-05-18 22:24:43 +1000 AEST   2016-05-18 22:24:43 +1000 AEST   1         kallithea-1-e3rev   Pod       spec.containers{kallithea-db}   Normal    Pulling   {kubelet ip-172-31-4-148.ec2.internal}   pulling image "registry.access.redhat.com/rhscl/postgresql-94-rhel7:latest"
2016-05-18 22:24:53 +1000 AEST   2016-05-18 22:24:53 +1000 AEST   1         kallithea-1-e3rev   Pod       spec.containers{kallithea-db}   Normal    Pulled    {kubelet ip-172-31-4-148.ec2.internal}   Successfully pulled image "registry.access.redhat.com/rhscl/postgresql-94-rhel7:latest"
2016-05-18 22:24:53 +1000 AEST   2016-05-18 22:24:53 +1000 AEST   1         kallithea-1-e3rev   Pod       spec.containers{kallithea-db}   Normal    Created   {kubelet ip-172-31-4-148.ec2.internal}   Created container with docker id dda402e5b0e2
2016-05-18 22:24:54 +1000 AEST   2016-05-18 22:24:54 +1000 AEST   1         kallithea-1-e3rev   Pod       spec.containers{kallithea-db}   Normal    Started   {kubelet ip-172-31-4-148.ec2.internal}   Started container with docker id dda402e5b0e2
2016-05-18 22:24:55 +1000 AEST   2016-05-18 22:24:55 +1000 AEST   1         kallithea-1-e3rev   Pod       spec.containers{kallithea}   Warning   Unhealthy   {kubelet ip-172-31-4-148.ec2.internal}   Readiness probe failed: Get http://10.1.2.34:8080/: dial tcp 10.1.2.34:8080: connection refused
2016-05-18 22:24:55 +1000 AEST   2016-05-18 22:25:05 +1000 AEST   2         kallithea-1-e3rev   Pod       spec.containers{kallithea}   Warning   Unhealthy   {kubelet ip-172-31-4-148.ec2.internal}   Readiness probe failed: Get http://10.1.2.34:8080/: dial tcp 10.1.2.34:8080: connection refused
2016-05-18 22:25:05 +1000 AEST   2016-05-18 22:25:05 +1000 AEST   1         kallithea-1-e3rev   Pod       spec.containers{kallithea-db}   Warning   Unhealthy   {kubelet ip-172-31-4-148.ec2.internal}   Readiness probe failed: sh: cannot set terminal process group (-1): Inappropriate ioctl for device
sh: no job control in this shell
psql: could not connect to server: Connection refused
          Is the server running on host "127.0.0.1" and accepting
          TCP/IP connections on port 5432?

Expected results:

It not take 5+ minutes to be granted, but to be granted straight away.

Additional info:

Have only seen this with online preview environment over the last couple of days. Hadn't seen it before that.

Comment 1 Dan Mace 2016-05-18 13:08:21 UTC
Assigning to Mark. I noticed this yesterday as well.

Comment 2 Dan Mace 2016-05-20 14:20:09 UTC
So, here's the sequence of events:

1. PVC is created (quickly)
2. PV is created (quickly)
3. PV is provisioned (quickly)
4. PVC is finished provisioning (quickly)
5. PV is Bound to PVC (quickly)
6. PVC remains Pending for minutes, eventually becoming Bound (this probably coincides with a sync interval)

The problem is with step 6, and that responsibility lies in the upstream PV infrastructure, not in the dynamic provisioner.

Comment 3 Dan Mace 2016-05-20 14:57:05 UTC
Workaround in https://github.com/openshift/online/issues/152

Comment 4 Dan Mace 2016-05-24 12:31:01 UTC
Workaround deployed to dev preview INT/STG/PROD.

Comment 5 Wenqi He 2016-05-27 02:47:17 UTC
I have verified this both on INT and STG
All of pvc can be bound less than 1 mins

On INT:
NAME            STATUS    VOLUME         CAPACITY   ACCESSMODES   AGE
kallithea-pvc   Bound     pv-aws-9w0vb   1Gi        RWO           6s

on SGT:
NAME            STATUS    VOLUME         CAPACITY   ACCESSMODES   AGE
kallithea-pvc   Bound     pv-aws-bi41x   1Gi        RWO           33s

The openshift version of STG is:
openshift v3.2.0.44
kubernetes v1.2.0-36-g4a3f9c5
etcd 2.2.5

Thanks


Note You need to log in before you can comment on or make changes to this bug.