Created attachment 1272989 [details]
kubelet logs grepping for 'tphnk'

We found a pod which is 'running' but which the kubelet complains it cannot start because it times out waiting for the volume. I'm going to attach a number of logs. The main log of interest is the kubelet log, which shows the pod 'failing'; the attached pod definition, however, shows that it is in fact running.
Created attachment 1272990 [details]
pod, pvc, and pv yaml

Created attachment 1272991 [details]
node logs

Created attachment 1272992 [details]
master logs
Comment on attachment 1272992 [details]
master logs

`oc get events` shows these two events; the repeated updates are likely just the counts incrementing over and over:

1m   4h   137   jenkins-1-tphnk   Pod   Warning   FailedMount   {kubelet ip-172-31-59-87.ec2.internal}   Unable to mount volumes for pod "jenkins-1-tphnk_dakinitest-20170412-8(12fad6f0-2469-11e7-a4d0-0a0ddbb52ad4)": timeout expired waiting for volumes to attach/mount for pod "dakinitest-20170412-8"/"jenkins-1-tphnk". list of unattached/unmounted volumes=[jenkins-data]

1m   4h   137   jenkins-1-tphnk   Pod   Warning   FailedSync   {kubelet ip-172-31-59-87.ec2.internal}   Error syncing pod, skipping: timeout expired waiting for volumes to attach/mount for pod "dakinitest-20170412-8"/"jenkins-1-tphnk". list of unattached/unmounted volumes=[jenkins-data]
Created attachment 1273000 [details]
node yaml and relevant /proc/mounts
From a first look at https://github.com/openshift/origin/blob/master/vendor/k8s.io/kubernetes/pkg/kubelet/volumemanager/volume_manager.go#L347, it feels like this would only be possible if the 'actualStateOfWorld' were out of sync and not getting updated...
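For context, here is a rough paraphrase of the check around that line (simplified; helper and variable names are approximated from the linked file rather than copied verbatim): the volume manager polls its actual state of world for every volume named in the pod spec and builds the exact error text seen in the events from whatever is still missing when the timeout expires.

    // Simplified paraphrase of volumeManager.WaitForAttachAndMount (names approximate).
    func (vm *volumeManager) WaitForAttachAndMount(pod *v1.Pod) error {
        // Expected names come straight from pod.Spec.Volumes ("jenkins-data" here).
        expectedVolumes := getExpectedVolumes(pod)
        if len(expectedVolumes) == 0 {
            return nil // nothing to wait for
        }
        uniquePodName := volumehelper.GetUniquePodName(pod)

        // Poll until every expected volume is reported as mounted by the
        // actual state of world, or the timeout expires.
        err := wait.Poll(
            podAttachAndMountRetryInterval,
            podAttachAndMountTimeout,
            vm.verifyVolumesMountedFunc(uniquePodName, expectedVolumes))
        if err != nil {
            // Anything the actual state of world does not list under its
            // expected name ends up in the event/log message.
            unmountedVolumes := vm.getUnmountedVolumes(uniquePodName, expectedVolumes)
            if len(unmountedVolumes) == 0 {
                return nil
            }
            return fmt.Errorf(
                "timeout expired waiting for volumes to attach/mount for pod %q/%q. list of unattached/unmounted volumes=%v",
                pod.Namespace, pod.Name, unmountedVolumes)
        }
        return nil
    }

Since the pod is running and the volume is clearly mounted on the node, the only way this keeps firing is if the lookup into the actual state of world keeps missing.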
*** Bug 1444131 has been marked as a duplicate of this bug. ***
How long did the messages go on for this pod? They should stop after approximately 3 minutes, because that's when the actual state of world gets force synced based on the pod directories. I can't think of a way the actual state of world could get out of sync without either a node restart (which didn't happen) or a failed 'MarkVolumeAsMounted' immediately after a successful mount. But if MarkVolumeAsMounted had failed, then we would see "operationExecutor.MountVolume failed...for pod jenkins-1-tphnk_dakinitest-20170412-8..." somewhere in the logs.
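The failure mode being ruled out here would look roughly like this (a paraphrased sketch, not the actual operation executor code; argument names are approximate):

    // Paraphrased mount sequence (names approximate). The failure mode being
    // ruled out is the second step failing right after the first succeeds:
    // the volume would be mounted on disk but never recorded in the actual
    // state of world, and the reconciler would keep retrying and logging the
    // "operationExecutor.MountVolume failed ..." line mentioned above.
    if err := volumeMounter.SetUp(fsGroup); err != nil {
        return fmt.Errorf("MountVolume.SetUp failed for volume %q: %v", volumeToMount.VolumeName, err)
    }
    if err := actualStateOfWorld.MarkVolumeAsMounted(
        podName, podUID, volumeName, volumeMounter, outerVolumeSpecName, volumeGidValue); err != nil {
        return fmt.Errorf("MountVolume.MarkVolumeAsMounted failed for volume %q: %v", volumeToMount.VolumeName, err)
    }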
It was in this state for at least 24 hours: pod running, PV mounted, and these errors kept coming. (It appears to be happening for a number of pods, maybe about 10, although I only collected data for this one.)
OK, in that case, did the volumes successfully get unmounted when any of those pods stopped? I believe they can only be unmounted if they are in the actual state of world, i.e. the volume manager is aware that they exist. If they unmount fine, then the issue is the check itself, which I'm digging into. I think there's something wrong with the way it gets the names of 'expectedVolumes'. That doesn't explain why this is only happening on *this* cluster now, but it's the best idea I have at the moment.
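For reference, the 'expectedVolumes' side of that check is gathered straight from the pod spec, roughly like this (paraphrased sketch, names approximate):

    // Paraphrased sketch: the expected names are just the volume names as they
    // appear in pod.Spec.Volumes, e.g. "jenkins-data". Whatever the actual state
    // of world stores has to be keyed by the same name for the check to pass.
    func getExpectedVolumes(pod *v1.Pod) []string {
        expectedVolumes := []string{}
        for _, podVolume := range pod.Spec.Volumes {
            expectedVolumes = append(expectedVolumes, podVolume.Name)
        }
        return expectedVolumes
    }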
This pod was still running when we observed it and had not been unmounted.
OK: then I guess we'd have to stop a pod ourselves, or find a pod that was stopped, to see if its unmounting failed. But I think I found a bug: the reconstructor in the 3-minute 'force' sync inserts a different volumeName into the actual state of world than the ordinary 'non-force' sync does. It uses <plugin_name>/<volume_name> rather than just <volume_name>. The checker asks for <volume_name>, so it won't find <plugin_name>/<volume_name> if that's all that exists in the actual state of world. Still deciding whether or not that is the cause of this.
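A tiny self-contained illustration of the suspected mismatch (hypothetical names and plugin prefix, not kubelet code):

    package main

    import "fmt"

    func main() {
        // Hypothetical illustration only: after the 3-minute force sync, the
        // actual state of world knows the volume under a plugin-qualified key,
        // while the checker looks it up by the plain pod-spec name.
        actualStateOfWorld := map[string]bool{
            "kubernetes.io/aws-ebs/jenkins-data": true, // key written by the reconstructor
        }
        expected := "jenkins-data" // key the checker asks for

        if !actualStateOfWorld[expected] {
            // The volume really is mounted, but the lookup misses, so the
            // FailedMount/FailedSync events keep firing on every sync.
            fmt.Printf("timeout expired ... list of unattached/unmounted volumes=[%s]\n", expected)
        }
    }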
So, thankfully, I found there's no issue with the actualStateOfWorld being out of sync: unmounting works fine. Apparently it's a known limitation of the reconstruction that it can't always recover the proper "outerVolumeSpecName" (not "volumeName" as I said before): https://github.com/openshift/origin/blob/release-1.5/vendor/k8s.io/kubernetes/pkg/kubelet/volumemanager/reconciler/reconciler.go#L568. This causes any check the volume manager makes against its actual state of world that relies on "outerVolumeSpecName" to fail. As far as I can tell the only bad symptom is the misleading log/event spam; I'll work on a fix tomorrow.

Repro steps:
1. Create a PV, PVC, and pod.
2. Restart the kubelet to trigger the volume manager's actual-state-of-world reconstruction.
3. Wait 2 minutes for the misleading log/event spam to show up; it will continue every 2 minutes forever.
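Rough sketch of the limitation at the linked line (field names approximate, not a verbatim copy): when the actual state of world is rebuilt from the pod directories on disk, the proper outer spec name from the pod spec isn't available, so the reconstructed entry gets filled in with the name recovered from the volume directory instead, and every check keyed by "outerVolumeSpecName" misses from then on.

    // Paraphrased from the reconstruction path (field names approximate).
    reconstructedVolume := &reconstructedVolume{
        volumeName: uniqueVolumeName,
        podName:    volume.podName,
        volumeSpec: volumeSpec,
        // volume.volumeSpecName is recovered from the pod's volume directory
        // on disk; the real outerVolumeSpecName from the pod spec cannot be
        // reconstructed here, so later lookups by that name fail.
        outerVolumeSpecName: volume.volumeSpecName,
    }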
Upstream PR: https://github.com/kubernetes/kubernetes/pull/44781 (just final code quality reviews now)
PR to origin master: https://github.com/openshift/origin/pull/14144
I'd like to wait a little longer to check the log with a failing pod; I will test it tomorrow. Thanks.
Tested on the version below:
openshift v3.6.86
kubernetes v1.6.1+5115d708d7

1. Create a dynamic PVC and pod.
2. Restart atomic-openshift-node.service.
3. Check the event log in the node log messages.
4. No event spam shows up every two minutes.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2017:1716