Description of problem:
When deploying a pod using CNS to dynamically fill PVCs, the pods sometimes hang at ContainerCreating, then time out and go to Error.

Version-Release number of selected component (if applicable):
3.6

How reproducible:
Believe 100% for specific situation

Steps to Reproduce:
1. Deploy pod with PVC filled by dynamic GlusterFS (CNS) PV

Actual results:
Pod hangs at ContainerCreating, then goes to Error

Expected results:
Successful creation

Additional info:
Believe there is a consistent reproducer using Coolstore MSA (https://github.com/jbossdemocentral/coolstore-microservice) with provision-demo.sh and inventory-postgresql

# oc get pvc inventory-postgresql-pv
NAME                      STATUS    VOLUME                                     CAPACITY   ACCESSMODES   STORAGECLASS        AGE
inventory-postgresql-pv   Bound     pvc-4b651208-9fcd-11e7-bcc6-001a4a160152   1Gi        RWO           gluster-container   3d

# oc get pv pvc-4b651208-9fcd-11e7-bcc6-001a4a160152
NAME                                       CAPACITY   ACCESSMODES   RECLAIMPOLICY   STATUS    CLAIM                                           STORAGECLASS        REASON    AGE
pvc-4b651208-9fcd-11e7-bcc6-001a4a160152   1Gi        RWO           Delete          Bound     coolstore-prod-ocuser/inventory-postgresql-pv   gluster-container             3d

# oc describe pv pvc-4b651208-9fcd-11e7-bcc6-001a4a160152
Name:            pvc-4b651208-9fcd-11e7-bcc6-001a4a160152
Labels:          <none>
Annotations:     pv.beta.kubernetes.io/gid=2006
                 pv.kubernetes.io/bound-by-controller=yes
                 pv.kubernetes.io/provisioned-by=kubernetes.io/glusterfs
                 volume.beta.kubernetes.io/mount-options=auto_unmount
StorageClass:    gluster-container
Status:          Bound
Claim:           coolstore-prod-ocuser/inventory-postgresql-pv
Reclaim Policy:  Delete
Access Modes:    RWO
Capacity:        1Gi
Message:
Source:
    Type:           Glusterfs (a Glusterfs mount on the host that shares a pod's lifetime)
    EndpointsName:  glusterfs-dynamic-inventory-postgresql-pv
    Path:           vol_ec2c28819fd3e3d5050b9727da2dc494
    ReadOnly:       false
Events:          <none>

# oc get ep glusterfs-dynamic-inventory-postgresql-pv
Error from server (NotFound): endpoints "glusterfs-dynamic-inventory-postgresql-pv" not found
I don't think this has been seen since reporting, and it's not critical for the release. Moving this to 3.11. However, Thom, are you still able to reproduce this?
I'm seeing this in 3.6-z but I don't have a consistent reproducer.
One cause seems to be deleting a PVC and then immediately creating a PVC with the same name. In that case the glusterfs-dynamic-<pvc_name> endpoints object either doesn't exist or shows endpoints = <none>.
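For anyone trying to confirm which of the two states they are in, a minimal sketch of a check follows. It assumes the endpoints object is named glusterfs-dynamic-<pvc_name>, as in the describe output earlier in this report, and that it lives in the same namespace as the PVC. The function name check_gluster_endpoints and its output strings are hypothetical, made up for illustration.

```shell
# Hypothetical helper (not part of oc): report whether the endpoints object
# backing a dynamically provisioned GlusterFS PVC exists and has addresses.
# Usage: check_gluster_endpoints <pvc_name> <namespace>
check_gluster_endpoints() {
    pvc_name="$1"
    namespace="$2"
    # Naming convention assumed from the EndpointsName shown in this report.
    ep_name="glusterfs-dynamic-${pvc_name}"

    # A non-zero exit from oc is treated as "object not found".
    if ! ep_ips=$(oc get endpoints "$ep_name" -n "$namespace" \
            -o jsonpath='{.subsets[*].addresses[*].ip}' 2>/dev/null); then
        echo "missing: $ep_name"
        return 1
    fi

    # The object exists but lists no addresses (shown as <none> by oc get ep).
    if [ -z "$ep_ips" ]; then
        echo "empty: $ep_name"
        return 2
    fi

    echo "ok: $ep_name -> $ep_ips"
    return 0
}
```

For the PVC in this report the call would be: check_gluster_endpoints inventory-postgresql-pv coolstore-prod-ocuser. A return code of 1 corresponds to the NotFound case above, and 2 to the endpoints = <none> case.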
Since my cluster is currently offline with another issue, please reach out to CNS QE.