I don't see anything obviously wrong with CSI in the cluster. What I did notice is that nodes in the cluster are drained while the e2e tests run. This is quite dangerous: the e2e tests install CSI drivers as plain Pods, not as a DaemonSet, so a CSI driver pod may be evicted before the pods that use the driver, leaving volumes that cannot be unmounted and pods that cannot be deleted. As far as I remember, none of the e2e tests expect nodes to be drained underneath them.
Maybe I closed it too early... Is there any magic that would make pod A (the "application") drain before pod B (the CSI driver)? We could add labels, annotations, or a priority class if needed.
I checked with the node team: we can't make Pods drain in a specific order. To sum up: do not drain nodes while the tests are running!
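For anyone who wants a guard in their test job, here is a minimal sketch (not something the e2e suite does today, as far as I know). It relies on the fact that `kubectl drain` cordons a node first, which sets `spec.unschedulable` to true on the Node object. The helper name `cordoned_nodes` is hypothetical, and the input is assumed to be the output of `kubectl get nodes -o json`.

```python
import json


def cordoned_nodes(node_list_json: str) -> list[str]:
    """Return the names of nodes marked unschedulable (cordoned).

    Assumption: node_list_json is the JSON printed by
    `kubectl get nodes -o json`, i.e. a List object with an "items" array.
    """
    node_list = json.loads(node_list_json)
    return [
        item["metadata"]["name"]
        for item in node_list.get("items", [])
        # spec.unschedulable is absent on schedulable nodes, so default to False
        if item.get("spec", {}).get("unschedulable", False)
    ]
```

A job could run this before starting the tests and bail out if the list is non-empty. It only detects a drain that has already started; it does nothing to stop one that begins mid-run, so the advice above still applies.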
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days