Bug 2014083

Summary: When a pod is force-deleted, the volume may not be unmounted from the node; this causes mkfs to fail while the PV is still set to Available
Product: OpenShift Container Platform
Reporter: Mario Vázquez <mavazque>
Component: Node
Assignee: Peter Hunt <pehunt>
Node sub component: CRI-O
QA Contact: Sunil Choudhary <schoudha>
Status: CLOSED DUPLICATE
Docs Contact:
Severity: urgent
Priority: high
CC: alosadag, aos-bugs, augol, bzhai, cback, colleen.o.malley, ealcaniz, ehashman, harpatil, igreen, jbrassow, jfindysz, jsafrane, keyoung, kir, kkarampo, krizza, mcornea, nagrawal, obulatov, openshift-bugs-escalate, peasters, pehunt, rphillips, rsandu, vlaad
Version: 4.8
Flags: ehashman: needinfo-
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2022-01-17 17:59:43 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Mario Vázquez 2021-10-14 12:57:33 UTC
Description of problem:

If for some reason you need to force-delete a pod, the volume may remain mounted on the node. The diskmaker then cannot clean it up and fails with messages like:

Deleting PV block volume "local-pv-96cf56b9" device hostpath "/mnt/local-storage/general/dm-name-autopart-lv_9", mountpath "/mnt/local-storage/general/dm-name-autopart-lv_9"
Cleanup pv "local-pv-96cf56b9": StdoutBuf - "Calling mkfs"
Cleanup pv "local-pv-96cf56b9": StdoutBuf - "mke2fs 1.45.6 (20-Mar-2020)"
Cleanup pv "local-pv-96cf56b9": StdoutBuf - "/mnt/local-storage/general/dm-name-autopart-lv_9 is apparently in use by the system; will not make a filesystem here"

On the node we can see the following:

$ lsblk

<output_omitted>
autopart-lv_9 235:8 0 1G 0 lvm /var/lib/kubelet/pods/8dfac272-50dd-445a-9267-ed4c0f527e44/volumes/kubernetes.io~local-volume/local-pv-96cf56b9
<output_omitted>

And if we look at the PV state:

$ oc get pv local-pv-96cf56b9

NAME                CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM   STORAGECLASS          REASON   AGE
local-pv-96cf56b9   1Gi        RWO            Delete           Available           general            114m


That means a PVC can bind to this PV, and the pod that uses it will fail to work since the LV is still mounted and was not cleaned up.
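
As a quick way to confirm the stale mount from the node itself (a sketch using the PV name from the output above; paths and names will differ in other environments):

# On the affected node, e.g. via "oc debug node/<node>" followed by "chroot /host"
$ findmnt | grep local-pv-96cf56b9       # shows the kubelet pod mountpoint for the local volume
$ grep local-pv-96cf56b9 /proc/mounts    # same information straight from the kernel mount table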


Version-Release number of selected component (if applicable):

4.8.13

How reproducible:

Under certain conditions

Steps to Reproduce:
1. Create a pod that mounts a volume exposed by LSO
2. Force delete the pod
3. Check whether the volume is still mounted on the node (see the command sketch after this list)
4. Check pv state
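
A rough equivalent in commands (a sketch only; the manifest, pod, node, and PV names here are illustrative placeholders):

$ oc apply -f pod-with-lso-pvc.yaml                                   # 1. pod mounting a PVC backed by an LSO PV
$ oc delete pod <pod> --force --grace-period=0                        # 2. force delete the pod
$ oc debug node/<node> -- chroot /host sh -c 'findmnt | grep <pv>'    # 3. check whether the volume is still mounted
$ oc get pv <pv>                                                      # 4. check the PV state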

Actual results:

The PV is reported as Available even though the volume is still mounted on the node.

Expected results:

The PV is not made Available until the volume has been unmounted and cleaned up.

Comment 6 Alberto Losada 2021-11-02 12:34:04 UTC
@kir any news on this? If the fix was included in 4.6, shouldn't it already be included in 4.8?

Comment 9 Kir Kolyshkin 2021-11-04 21:45:59 UTC
It looks like the runc fix mentioned earlier is unrelated.

From my perspective, what happens here is that some processes are still left in the cgroup; that is why it can't be removed.

Volumes are out of runc's scope, but the volume might be left mounted for the same reason the cgroup can't be removed -- some processes are still using it.

I'd use lsof or similar tools to investigate further.
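
For example (a sketch run on the affected node, using the mount path from the description; fuser is an alternative if lsof is not installed):

$ lsof +f -- /var/lib/kubelet/pods/8dfac272-50dd-445a-9267-ed4c0f527e44/volumes/kubernetes.io~local-volume/local-pv-96cf56b9
$ fuser -vm /var/lib/kubelet/pods/8dfac272-50dd-445a-9267-ed4c0f527e44/volumes/kubernetes.io~local-volume/local-pv-96cf56b9

Both commands list the processes that still have files open on that mount, which should point to whatever is blocking the unmount.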

Comment 39 Jonathan Earl Brassow 2022-01-03 17:02:28 UTC
If a file system cannot be unmounted due to a process using it, there isn't much that can be done by the file system or below - it would have to be resolved by the process releasing the FS first, then unmounting, etc.

It is possible that it is storage related if:
1) the storage is blocking/hung and the process that is preventing the unmount cannot proceed.  This could happen, for example, if a device-mapper device was suspended (I don't immediately see an indication of this; a quick check is sketched at the end of this comment).
2) A FS or storage tool /is/ the process preventing unmount.  It could be hung due to locking or other software bug.  I find this very unlikely, as most often these tools do not utilize the storage they are administering.

Comment 9 seems like the best suggestion - identify the running process(es) that are preventing the unmount.  There are nastier hacks that could be performed (like remapping the PV to 'error'), but that would be a worst case, I think.
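
Regarding point 1, a quick way to rule out a suspended device-mapper device (a sketch; "autopart-lv_9" is taken from the lsblk output earlier and may differ on other nodes):

$ dmsetup info autopart-lv_9 | grep State    # should report ACTIVE; SUSPENDED would indicate a stuck dm device
$ dmsetup info -c                            # summary view across all dm devices on the node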

Comment 41 Peter Hunt 2022-01-07 16:43:00 UTC
I now suspect this is a dup of https://bugzilla.redhat.com/show_bug.cgi?id=2003206. According to https://bugzilla.redhat.com/show_bug.cgi?id=2014083#c18 we're hitting a deadlock with stop. I could verify this if we could gather the goroutine stacks of a running instance: https://github.com/cri-o/cri-o/blob/main/tutorials/debugging.md#printing-go-routines

Comment 43 Peter Hunt 2022-01-11 14:47:48 UTC
Setting needinfo for the information requested in https://bugzilla.redhat.com/show_bug.cgi?id=2014083#c41

Comment 44 Peter Hunt 2022-01-11 14:49:10 UTC
A clarification comment: if we get into this situation where a pod can't be cleaned up, I'll need someone to ssh to the node, run the commands described in https://github.com/cri-o/cri-o/blob/main/tutorials/debugging.md#printing-go-routines, and post the file written to /tmp. That will help me determine whether the issue we're seeing is the same as the suspected duplicate.
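
In practice the linked tutorial boils down to roughly the following (a sketch; the exact name of the dump file written under /tmp may differ, so treat the tutorial as authoritative):

# On the affected node, as root
$ kill -USR1 $(pidof crio)    # asks CRI-O to dump its goroutine stacks to a file under /tmp
$ ls -lt /tmp | head          # the newest file should be the goroutine stack dump; attach it to this bug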

Comment 45 Christoffer Back 2022-01-11 14:56:05 UTC
(In reply to Peter Hunt from comment #44)
> A clarification comment: if we get into this situation where a pod can't be
> cleaned up, I'll need someone to ssh to the node, run the commands described in
> https://github.com/cri-o/cri-o/blob/main/tutorials/debugging.md#printing-go-routines,
> and post the file written to /tmp. That will help me determine whether the
> issue we're seeing is the same as the suspected duplicate.
Hi Peter, 

I will contact the customer and get you that stack trace ASAP. The case linked to this BZ was put on hold due to the possible duplicate with the namespace termination issue mentioned.

br, 
Chris

Comment 49 Red Hat Bugzilla 2023-09-15 01:16:09 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days