Bug 1444096 - Pod running but logs say volume not attached
Summary: Pod running but logs say volume not attached
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Storage
Version: 3.5.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: ---
Assignee: Matthew Wong
QA Contact: Wenqi He
URL:
Whiteboard:
Duplicates: 1444131 (view as bug list)
Depends On:
Blocks: 1502789
 
Reported: 2017-04-20 14:42 UTC by Eric Paris
Modified: 2023-09-07 18:52 UTC
CC List: 5 users

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Clones: 1502789 (view as bug list)
Environment:
Last Closed: 2017-08-10 05:20:02 UTC
Target Upstream Version:
Embargoed:


Attachments
kubelet logs grepping for 'tphnk' (10.83 KB, text/plain)
2017-04-20 14:42 UTC, Eric Paris
pod, pvc, and pv yaml (6.25 KB, text/plain)
2017-04-20 14:46 UTC, Eric Paris
node logs (17.87 KB, text/plain)
2017-04-20 14:48 UTC, Eric Paris
master logs (48.19 KB, text/plain)
2017-04-20 14:58 UTC, Eric Paris
node yaml and relevant /proc/mounts (18.33 KB, text/plain)
2017-04-20 15:07 UTC, Eric Paris


Links
Red Hat Knowledge Base (Solution) 3200662: last updated 2018-01-18 05:18:25 UTC
Red Hat Product Errata RHEA-2017:1716 (SHIPPED_LIVE): Red Hat OpenShift Container Platform 3.6 RPM Release Advisory, last updated 2017-08-10 09:02:50 UTC

Description Eric Paris 2017-04-20 14:42:53 UTC
Created attachment 1272989 [details]
kubelet logs grepping for 'tphnk'

We found a pod that is 'running' but that the kubelet complains it cannot start because it times out waiting for the volume. I'm going to attach a number of logs. The main log of interest is the kubelet log, which shows the pod 'failing'; the pod definition, however, shows that it is in fact running.

Comment 1 Eric Paris 2017-04-20 14:46:25 UTC
Created attachment 1272990 [details]
pod, pvc, and pv yaml

Comment 2 Eric Paris 2017-04-20 14:48:19 UTC
Created attachment 1272991 [details]
node logs

Comment 3 Eric Paris 2017-04-20 14:58:24 UTC
Created attachment 1272992 [details]
master logs

Comment 4 Eric Paris 2017-04-20 15:00:10 UTC
Comment on attachment 1272992 [details]
master logs

`oc get events`
shows these 2 events; the repeated updates are most likely just the counts incrementing over and over

1m         4h          137       jenkins-1-tphnk   Pod                   Warning   FailedMount   {kubelet ip-172-31-59-87.ec2.internal}   Unable to mount volumes for pod "jenkins-1-tphnk_dakinitest-20170412-8(12fad6f0-2469-11e7-a4d0-0a0ddbb52ad4)": timeout expired waiting for volumes to attach/mount for pod "dakinitest-20170412-8"/"jenkins-1-tphnk". list of unattached/unmounted volumes=[jenkins-data]
1m         4h          137       jenkins-1-tphnk   Pod                   Warning   FailedSync    {kubelet ip-172-31-59-87.ec2.internal}   Error syncing pod, skipping: timeout expired waiting for volumes to attach/mount for pod "dakinitest-20170412-8"/"jenkins-1-tphnk". list of unattached/unmounted volumes=[jenkins-data]

Comment 5 Eric Paris 2017-04-20 15:07:26 UTC
Created attachment 1273000 [details]
node yaml and relevant /proc/mounts

Comment 6 Eric Paris 2017-04-20 15:22:08 UTC
From starting to look at https://github.com/openshift/origin/blob/master/vendor/k8s.io/kubernetes/pkg/kubelet/volumemanager/volume_manager.go#L347, it feels like this would only be possible if 'actualStateOfTheWorld' were out of sync and not getting updated...
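Roughly, the check at that line has the following shape (a simplified, self-contained Go sketch, not the actual kubelet code; the actualStateOfTheWorld map and the getUnmountedVolumes helper below are illustrative stand-ins):

package main

import "fmt"

// Simplified stand-in for the volume manager's actualStateOfTheWorld:
// the set of volume names it believes are mounted for each pod.
var actualStateOfTheWorld = map[string]map[string]bool{
    "dakinitest-20170412-8/jenkins-1-tphnk": {"jenkins-data": true},
}

// getUnmountedVolumes treats every volume the pod spec expects that the
// actual state of the world does not report as mounted as unattached/unmounted.
func getUnmountedVolumes(podKey string, expectedVolumes []string) []string {
    unmounted := []string{}
    for _, v := range expectedVolumes {
        if !actualStateOfTheWorld[podKey][v] {
            unmounted = append(unmounted, v)
        }
    }
    return unmounted
}

func main() {
    unmounted := getUnmountedVolumes("dakinitest-20170412-8/jenkins-1-tphnk", []string{"jenkins-data"})
    if len(unmounted) > 0 {
        // After the wait times out, this is what surfaces as the FailedMount event:
        // "timeout expired waiting for volumes to attach/mount ... list of
        // unattached/unmounted volumes=[...]"
        fmt.Printf("unattached/unmounted volumes: %v\n", unmounted)
    }
    // If actualStateOfTheWorld were missing "jenkins-data" even though the
    // volume is really mounted, this check would keep failing indefinitely.
}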

Comment 7 Hemant Kumar 2017-04-20 17:06:51 UTC
*** Bug 1444131 has been marked as a duplicate of this bug. ***

Comment 8 Matthew Wong 2017-04-20 18:27:35 UTC
How long did the messages go on for this pod? They should stop after approximately 3 minutes, because that's when the actual state of world gets force synced based on pod directories.

I can't think of a way the actual state of world could get out of sync without there being a node restart (which didn't happen) or a failed 'MarkVolumeAsMounted' immediately after a successful mount. But if MarkVolumeAsMounted had failed, then we would see "operationExecutor.MountVolume failed...for pod jenkins-1-tphnk_dakinitest-20170412-8..." somewhere in the logs.
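For illustration, here is the idea behind that force sync as a minimal, self-contained Go sketch (not the real reconciler code; reconstructFromPodDirs and the maps below are made-up stand-ins): the actual state of world is rebuilt from whatever is actually mounted under the per-pod volume directories.

package main

import "fmt"

// reconstructFromPodDirs stands in for the periodic sync that rebuilds the
// actual state of the world from the mounts found under the per-pod volume
// directories; in the kubelet this happens roughly every 3 minutes.
func reconstructFromPodDirs(podDirs map[string][]string, asw map[string]map[string]bool) {
    for pod, volumes := range podDirs {
        if asw[pod] == nil {
            asw[pod] = map[string]bool{}
        }
        for _, v := range volumes {
            asw[pod][v] = true
        }
    }
}

func main() {
    // Mounts found on disk, e.g. under /var/lib/kubelet/pods/<uid>/volumes/...
    podDirs := map[string][]string{"jenkins-1-tphnk": {"jenkins-data"}}
    // Actual state of the world, empty as if the record had somehow been lost.
    asw := map[string]map[string]bool{}

    reconstructFromPodDirs(podDirs, asw)
    fmt.Println("after force sync:", asw)
    // So even if the record were dropped, the mount should be re-registered
    // within a few minutes and the messages should stop.
}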

Comment 9 Eric Paris 2017-04-20 18:36:32 UTC
It was in this state for at least 24 hours. Pod running, PV mounted, and we keep getting these errors. (It appears to be happening for a number of pods, maybe about 10, although I only collected data for this one.)

Comment 10 Matthew Wong 2017-04-20 19:07:21 UTC
OK, in that case, did the volumes successfully get unmounted when one of those pods stopped? I believe they can only be unmounted if they are in the actual state of world, i.e. the volume manager is aware that they exist.

If they unmount fine, then the issue is the check itself, which I'm digging into. I think there's something wrong with the way it gets the names of 'expectedVolumes'. That doesn't explain why this is only happening on *this* cluster now, but it's the best idea I have at the moment.

Comment 11 Hemant Kumar 2017-04-20 19:27:42 UTC
This pod was still running when we observed it, and the volume wasn't unmounted.

Comment 12 Matthew Wong 2017-04-20 19:36:30 UTC
OK, then I guess we'd have to stop a pod ourselves or find a pod that was stopped to see if its unmounting failed.

But I think I found a bug: the reconstructor in the 3-minute 'force' sync inserts a different volumeName into the actual state of world than the ordinary 'non-force' sync does. It uses <plugin_name>/<volume_name> vs. just <volume_name>. The checker asks for <volume_name>, so it won't find <plugin_name>/<volume_name> if that's all that exists in the actual state of world.

Still deciding if that is the cause of this or not.
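To make the mismatch concrete, a minimal sketch (simplified Go, not the real reconciler or checker code; the plugin and volume names below are illustrative):

package main

import "fmt"

func main() {
    // What the 'force' sync reconstructor would insert into the actual state
    // of world: a key of the form <plugin_name>/<volume_name>.
    asw := map[string]bool{
        "kubernetes.io/aws-ebs/jenkins-data": true,
    }

    // What the checker asks for: just <volume_name>.
    if !asw["jenkins-data"] {
        // The lookup misses even though the volume is recorded (and mounted),
        // so the check would report it as unattached/unmounted.
        fmt.Println("volume 'jenkins-data' not found in actual state of world")
    }
}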

Comment 13 Matthew Wong 2017-04-20 21:30:45 UTC
So, thankfully, I found there's no issue with the actualStateOfWorld being out of sync; unmounting works fine.

Apparently it's a known limitation of the reconstruction that it can't always recover the proper "outerVolumeSpecName" (not "volumeName" as I said before): https://github.com/openshift/origin/blob/release-1.5/vendor/k8s.io/kubernetes/pkg/kubelet/volumemanager/reconciler/reconciler.go#L568. This causes any checks the volume manager makes against its actual state of world that rely on "outerVolumeSpecName" to fail. As far as I can tell the only bad symptom is the misleading log/event spam; I'll work on a fix tomorrow.
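As a minimal sketch of that failure mode (simplified, self-contained Go, not the real volume manager code; the mountedVolume struct, mountedVolumeNamesForPod helper, and the reconstructed name are illustrative only):

package main

import "fmt"

// mountedVolume is a simplified stand-in for a record in the actual state of
// the world. outerVolumeSpecName is the name the pod spec uses for the volume
// (here "jenkins-data"); reconstruction only has the on-disk layout to go by,
// so it may fill this field with a different, inner name instead.
type mountedVolume struct {
    outerVolumeSpecName string
}

// mountedVolumeNamesForPod mimics a check that is keyed by outerVolumeSpecName.
func mountedVolumeNamesForPod(mounted []mountedVolume) map[string]bool {
    names := map[string]bool{}
    for _, v := range mounted {
        names[v.outerVolumeSpecName] = true
    }
    return names
}

func main() {
    // After reconstruction the record carries a reconstructed name rather than
    // the pod-spec name "jenkins-data" (the value below is purely illustrative).
    reconstructed := []mountedVolume{{outerVolumeSpecName: "pvc-12fad6f0-2469-11e7"}}

    mounted := mountedVolumeNamesForPod(reconstructed)
    if !mounted["jenkins-data"] { // what the pod spec expects
        // The volume really is mounted, but the lookup by pod-spec name misses,
        // so the kubelet keeps emitting the FailedMount/FailedSync spam even
        // though the pod is running fine.
        fmt.Println("unattached/unmounted volumes=[jenkins-data] (misleading)")
    }
}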

repro steps:
1. Create PV, PVC, pod
2. Restart kubelet to trigger volumemanager asw reconstruction
3. Wait 2 minutes for the misleading log/event spam to show up; it will continue every 2 minutes forever

Comment 14 Matthew Wong 2017-05-04 14:38:13 UTC
upstream PR https://github.com/kubernetes/kubernetes/pull/44781 (just final code quality reviews now)

Comment 15 Matthew Wong 2017-05-11 16:21:33 UTC
PR to origin master here https://github.com/openshift/origin/pull/14144

Comment 17 Wenqi He 2017-05-31 10:49:12 UTC
I'd like to wait a little longer to check the log with a failing pod; I will test it tomorrow. Thanks

Comment 18 Wenqi He 2017-06-01 06:33:46 UTC
Tested on the versions below:
openshift v3.6.86
kubernetes v1.6.1+5115d708d7

1. Create dynamic pvc and pod
2. Restart atomic-openshift-node.service
3. Check the events in the node log messages
4. No event spam shows up every two minutes

Comment 20 errata-xmlrpc 2017-08-10 05:20:02 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:1716

