Description of problem:
When a pod uses a PersistentVolumeClaim, this can happen:

Version-Release number of selected component (if applicable):
origin-3.1.0

How reproducible:
~30%-50%

Steps to Reproduce:
1. create a PV and a claim (I use Cinder volumes, but I saw it on AWS and GCE too)
2. create a pod that uses the claim
3. In a loop:
3.1 create the pod
3.2 wait until it's running
3.3 run 'kubectl describe pods'
3.4 delete it
3.5 wait until the volume is unmounted and detached from the node (this is important!)

Actual results:
At step 3.3, you can see it was necessary to restart the pod container(s); "kubectl describe pods" shows something like:
Error syncing pod, skipping: not all containers have started: 0 != 1

Expected results:
The containers start on the first try.

Additional info:
The pod is started eventually, it's just slower (~1 minute in my OpenStack setup; attaching a volume is slow...).

Fix: https://github.com/kubernetes/kubernetes/pull/19600
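For reference, a minimal sketch of the reproduction loop described in the steps above. The manifest file names (pv.yaml, pvc.yaml, pod.yaml), the pod name "mypod", and the loop count are placeholders, not from the original report; the detach check in step 3.5 is cloud-specific, so only a rough stand-in is shown.

#!/bin/bash
set -e

# 1. create a PV and a claim bound to it (Cinder/AWS/GCE volume)
kubectl create -f pv.yaml
kubectl create -f pvc.yaml

# 2./3. create a pod using the claim, in a loop
for i in $(seq 1 20); do
    # 3.1 create the pod
    kubectl create -f pod.yaml

    # 3.2 wait until it's running
    until kubectl get pod mypod -o jsonpath='{.status.phase}' | grep -q Running; do
        sleep 5
    done

    # 3.3 look for container restarts / "Error syncing pod" events
    kubectl describe pods mypod

    # 3.4 delete it
    kubectl delete -f pod.yaml

    # 3.5 wait until the volume is unmounted and detached from the node;
    # checking this properly is cloud-specific (e.g. inspecting the volume
    # status in OpenStack), a plain sleep is only an approximation here.
    sleep 60
done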
In a conversation with Jan, we determined this is not a blocker: kubelet and the volume eventually reach the correct state. It might take a few minutes to reconcile, which is bad UX, but not a blocker.
Upstream PR is merged. Awaiting rebase into Origin.
In case there is no rebase, I filed an Origin PR: https://github.com/openshift/origin/pull/7107
Origin PR merged
Verified with:
openshift v1.1.2-260-gf556adc
kubernetes v1.2.0-origin
etcd 2.2.2+git

The issue described here is not reproducible; moving to verified.