This bug has been migrated to another issue tracking site. It has been closed here and may no longer be monitored.

If you would like to get updates for this issue, or to participate in it, you may do so at the Red Hat Issue Tracker.
Bug 1845666 - Application pods using RBD/RWO PVC are stuck waiting for PV lock release when the node it was running on is going down
Summary: Application pods using RBD/RWO PVC are stuck waiting for PV lock release when...
Keywords:
Status: CLOSED MIGRATED
Alias: None
Product: Red Hat OpenShift Data Foundation
Classification: Red Hat Storage
Component: csi-driver
Version: 4.3
Hardware: All
OS: Unspecified
Importance: medium high
Target Milestone: ---
Assignee: Rakshith
QA Contact: Elad
URL:
Whiteboard:
Duplicates: 1841611 (view as bug list)
Depends On: 1795372
Blocks: 1948728
Reported: 2020-06-09 19:12 UTC by svolkov
Modified: 2023-08-09 16:37 UTC
CC List: 9 users

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-04-05 09:54:12 UTC
Embargoed:


Attachments:


Links
System                 ID           Private  Priority  Status  Summary  Last Updated
Red Hat Bugzilla       1795372      1        None      None    None     2024-06-27 07:14:10 UTC
Red Hat Issue Tracker  RHSTOR-2500  0        None      None    None     2023-04-05 09:54:11 UTC

Internal Links: 1948728

Description svolkov 2020-06-09 19:12:00 UTC
Description of problem (please be as detailed as possible and provide log snippets):
The scenario is simple: an application pod is using our PVC (RBD/RWO). If the node the application pod is running on goes down (is shut down), it takes time until Kubernetes understands that the node is down (I think 20s), and then, while the pod is being rescheduled to another node, the replacement application pod hangs waiting for Kubernetes to release the lock on the PVC. IIRC Kubernetes will never release the lock and requires manual user intervention.
This of course basically breaks any SLA the customer might have for the application (think of a PostgreSQL pod on a failed node moving to another node and never completing its startup process, since it cannot acquire the lock on the PVC the previous pod used).

This BZ is here so that we can track this issue from the application perspective.
BZ 1795372 relates to the solution (I guess).
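
For illustration only (not part of the original report): a minimal sketch, assuming the `kubernetes` Python client is installed and a kubeconfig is available, that lists VolumeAttachment objects; while the replacement pod hangs, the attachment for the RBD-backed PV can typically still be seen bound to the failed node.

```python
# Sketch (assumption: the `kubernetes` Python client and cluster access are
# available). Lists VolumeAttachments so you can see which node still holds
# the attachment for an RBD-backed PV while the new pod waits.
from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config()
storage = client.StorageV1Api()

for va in storage.list_volume_attachment().items:
    print(
        f"attacher={va.spec.attacher} "
        f"pv={va.spec.source.persistent_volume_name} "
        f"node={va.spec.node_name} "
        f"attached={va.status.attached if va.status else None}"
    )
```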
Version of all relevant components (if applicable):
Any OCS 4 version.

Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?
The expected behavior is that the application pod will migrate to the new node and that the PVC will move with it, thus not blocking I/O for the application, at least not for an indefinite period of time.

Is there any workaround available to the best of your knowledge?
Force-kill the old application pod.
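
For illustration only (not from the original report): a minimal sketch of that workaround, assuming the `kubernetes` Python client and a reachable cluster; the pod name and namespace are placeholders.

```python
# Sketch of the manual workaround described above: force-delete the stale pod
# so the RWO attachment can be released. Pod name and namespace are placeholders.
from kubernetes import client, config

config.load_kube_config()
core = client.CoreV1Api()

core.delete_namespaced_pod(
    name="my-app-pod",         # hypothetical pod stuck on the failed node
    namespace="my-namespace",  # hypothetical namespace
    grace_period_seconds=0,    # do not wait for graceful termination
)
```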

Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?
1

Can this issue reproducible?
yes

Can this issue reproduce from the UI?
yes

If this is a regression, please provide more details to justify this:


Steps to Reproduce:
1. Create a pod that uses a PVC from OCS of type block/RBD.
2. Kill or shut down the *node* the pod is currently running on.
3. Monitor the creation of the new pod (see the monitoring sketch below).
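
For illustration only (not part of the original report): a minimal monitoring sketch for step 3, assuming the `kubernetes` Python client; the namespace and label selector are placeholders.

```python
# Sketch for step 3: watch the replacement pod after the node goes down.
# Assumes the `kubernetes` Python client; namespace/label selector are placeholders.
from kubernetes import client, config, watch

config.load_kube_config()
core = client.CoreV1Api()

w = watch.Watch()
for event in w.stream(core.list_namespaced_pod,
                      namespace="my-namespace",
                      label_selector="app=my-app"):
    pod = event["object"]
    print(event["type"], pod.metadata.name, pod.status.phase)
    # With the bug described here, the new pod stays Pending/ContainerCreating,
    # waiting for the RWO volume to detach from the failed node.
```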


Actual results:


Expected results:


Additional info:
From a competitive perspective, Portworx has a solution for this (Annette checked it): when you fail a node, the application pod migrates to a new node and gets the PVC attached to it in a matter of seconds.

Comment 2 Travis Nielsen 2020-06-09 19:24:29 UTC
This may be a duplicate of bug 1795372, but I'd recommend we leave it open to track the application impact of this issue and raise visibility.
More discussion is here: https://github.com/ceph/ceph-csi/issues/578#issuecomment-583501921

Comment 3 Yaniv Kaul 2020-06-10 12:15:00 UTC
This has nothing to do with OCS specifically, though; it happens with other providers as well?

Comment 4 svolkov 2020-06-10 15:00:50 UTC
As I wrote, this doesn't happen with Portworx; they figured out a way around this problem.
I've also checked with someone I know at MayaData, and they also have a way around this, making the lock disappear in a matter of seconds.

Regardless, waiting on k8s to solve this (which might be never) is (IMHO) the wrong approach, as OCS (and OCP) customers are facing this problem today. It will delay the move of their stateful applications to k8s, or the customer will choose an application-level method for HA (meaning an application-specific solution like Crunchy, Percona, or Mongo) and bypass the need to buy OCS.

Comment 7 Michael Adam 2020-06-26 13:03:26 UTC
Making this depend on the "solution bug" #1795372.
That one is targeted to 4.6 ==> moving this one to 4.6 as well

Comment 8 Humble Chirammal 2020-06-29 11:11:11 UTC
(In reply to Michael Adam from comment #7)
> Making this depend on the "solution bug" #1795372.
> That one is targeted to 4.6 ==> moving this one to 4.6 as well

Sure, we have to reconsider the possible workarounds/solutions, though. Until then, let's keep it on the OCS 4.6 target.

Comment 9 Mudit Agarwal 2020-07-12 13:09:56 UTC
*** Bug 1841611 has been marked as a duplicate of this bug. ***

Comment 10 Mudit Agarwal 2020-09-22 11:33:33 UTC
https://bugzilla.redhat.com/show_bug.cgi?id=1795372 is targeted for 4.7

Comment 11 Niels de Vos 2021-01-07 15:41:29 UTC
Similar to bug 1795372 (should this really not be closed as a duplicate?), moving out of ocs-4.7.

Comment 12 Humble Chirammal 2021-05-05 06:48:51 UTC
A few of the CSI spec changes being proposed could help us, I believe:
 https://github.com/container-storage-interface/spec/pull/477

Comment 13 Mudit Agarwal 2021-05-25 11:33:18 UTC
Depends on https://bugzilla.redhat.com/show_bug.cgi?id=1795372

Comment 21 Humble Chirammal 2023-01-23 12:27:03 UTC
The "non-graceful node shutdown" feature is beta in Kubernetes v1.26 upstream, so OCP 4.13 will most likely enable this feature. Detecting the node's not-ready state and triggering the failover is still a process outside of this feature, though.
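
For illustration only (not from the original comment): with that upstream feature enabled, failover is triggered by tainting the shut-down node with `node.kubernetes.io/out-of-service`. A minimal sketch using the `kubernetes` Python client, with the node name as a placeholder:

```python
# Sketch (assumptions: the `kubernetes` Python client is available and the
# non-graceful node shutdown feature is enabled in the cluster; "worker-1" is
# a placeholder). Applying the out-of-service taint to a node that is known to
# be shut down lets the control plane force-detach its volumes and reschedule
# its pods.
from kubernetes import client, config

config.load_kube_config()
core = client.CoreV1Api()

node_name = "worker-1"  # hypothetical: the node that was shut down
node = core.read_node(node_name)

# Preserve any existing taints, then append the out-of-service taint.
taints = []
for t in node.spec.taints or []:
    entry = {"key": t.key, "effect": t.effect}
    if t.value is not None:
        entry["value"] = t.value
    taints.append(entry)

taints.append({
    "key": "node.kubernetes.io/out-of-service",
    "value": "nodeshutdown",
    "effect": "NoExecute",
})

core.patch_node(node_name, {"spec": {"taints": taints}})
```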

