Bug 1930220 - Cinder CSI driver is not able to mount volumes under heavier load
Summary: Cinder CSI driver is not able to mount volumes under heavier load
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Storage
Version: 4.7
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Target Release: 4.8.0
Assignee: Jan Safranek
QA Contact: Wei Duan
Depends On:
Blocks: 1933659
Reported: 2021-02-18 14:20 UTC by Jan Safranek
Modified: 2021-07-27 22:47 UTC (History)
2 users

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Last Closed: 2021-07-27 22:45:14 UTC
Target Upstream Version:

Attachments

System ID Private Priority Status Summary Last Updated
Github openshift cloud-provider-openstack pull 45 0 None open Bug 1930220: Add udev to the driver image 2021-02-19 08:26:18 UTC
Red Hat Product Errata RHSA-2021:2438 0 None None None 2021-07-27 22:47:27 UTC

Description Jan Safranek 2021-02-18 14:20:24 UTC
Description of problem:
I created 18 Pods on a single node, each of them using its own Cinder volume (=18 volumes attached to the node). Randomly, some of these pods can't start:

Warning  FailedMount             19s (x16 over 25m)   kubelet                  MountVolume.MountDevice failed for volume "pvc-9a5f1a70-2266-4876-8e64-d0fea7ef20da" : rpc error: code = Internal desc = Unable to find Device path for volume

The reason seems to be udev - it did not create /dev/disk/by-id symlinks for the attached volume.

$ udevadm info /dev/vdi
P: /devices/pci0000:00/0000:00:0e.0/virtio11/block/vdi
N: vdi
S: disk/by-path/pci-0000:00:0e.0
S: disk/by-path/virtio-pci-0000:00:0e.0
E: DEVLINKS=/dev/disk/by-path/virtio-pci-0000:00:0e.0 /dev/disk/by-path/pci-0000:00:0e.0
E: DEVNAME=/dev/vdi
E: DEVPATH=/devices/pci0000:00/0000:00:0e.0/virtio11/block/vdi
E: ID_PATH=pci-0000:00:0e.0
E: ID_PATH_TAG=pci-0000_00_0e_0
E: MAJOR=252
E: MINOR=128
E: TAGS=:systemd:

Another volume that was mounted correctly has more DEVLINKS:
E: DEVLINKS=/dev/disk/by-id/virtio-396d709b-f498-439a-a /dev/disk/by-path/pci-0000:00:0f.0 /dev/disk/by-uuid/dcfaa60a-7896-4c61-bce1-f1841d9acbe0 /dev/disk/by-path/virtio-pci-0000:00:0f.0
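The difference between the two volumes is easy to check programmatically. A minimal sketch (the `has_by_id_link` helper is hypothetical, not driver code); the sample strings are the two DEVLINKS lines captured above, and on a live node you would feed it the output of `udevadm info /dev/vdX`:

```shell
# Return success if a device's udev info contains a /dev/disk/by-id symlink.
has_by_id_link() {
  printf '%s\n' "$1" | grep -q 'disk/by-id/'
}

# DEVLINKS line of the broken volume (/dev/vdi) - no by-id entry:
broken='E: DEVLINKS=/dev/disk/by-path/virtio-pci-0000:00:0e.0 /dev/disk/by-path/pci-0000:00:0e.0'
# DEVLINKS line of the correctly mounted volume:
healthy='E: DEVLINKS=/dev/disk/by-id/virtio-396d709b-f498-439a-a /dev/disk/by-path/pci-0000:00:0f.0'

has_by_id_link "$broken"  || echo "by-id link missing"   # prints: by-id link missing
has_by_id_link "$healthy" && echo "by-id link present"   # prints: by-id link present
```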

The CSI driver tries to mitigate this by calling "udevadm trigger" on each NodeStage [1]; however, udevadm is not installed in the CSI driver container:

1: https://github.com/kubernetes/cloud-provider-openstack/blob/7b5efc481ea6b151300928c0976c336abee3b7e3/pkg/util/mount/mount.go#L110

From CSI driver logs:

I0218 14:00:56.664894       1 mount.go:178] Failed to find device for the volumeID: "2b0b000e-a660-48c4-a3e4-9f5153d9722b" by serial ID
I0218 14:00:57.015228       1 mount.go:113] error running udevadm trigger executable file not found in $PATH
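The "by serial ID" lookup in the log refers to the by-id symlink derived from the volume ID. Judging from the healthy link above (virtio-396d709b-f498-439a-a), the virtio serial is truncated to the first 20 characters of the volume ID; a sketch under that assumption (the truncation length is inferred from the observed link, not quoted from driver code):

```shell
# Build the by-id path the driver is expected to look for, assuming the
# virtio serial is the first 20 characters of the Cinder volume ID.
expected_by_id_link() {
  printf '/dev/disk/by-id/virtio-%.20s\n' "$1"
}

# Volume ID taken from the driver log above:
expected_by_id_link "2b0b000e-a660-48c4-a3e4-9f5153d9722b"
# prints: /dev/disk/by-id/virtio-2b0b000e-a660-48c4-a
```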

How reproducible:

Steps to Reproduce:
1. Create ~20 volumes + 20 pods that use them *on the same node*
2. Delete and repeat until it reproduces
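The steps above can be scripted. A sketch that generates the PVC/Pod manifests (the node name, storage class, and container image are placeholders for illustration; pipe the output to `oc apply -f -`):

```shell
# Emit N PersistentVolumeClaim + Pod pairs, all pinned to one node.
gen_manifests() {
  n="$1"; node="$2"
  for i in $(seq 1 "$n"); do
    cat <<EOF
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: cinder-pvc-$i
spec:
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 1Gi
  storageClassName: standard-csi   # assumed Cinder CSI storage class name
---
apiVersion: v1
kind: Pod
metadata:
  name: cinder-pod-$i
spec:
  nodeName: $node                  # pin every pod to the same node
  containers:
  - name: app
    image: registry.access.redhat.com/ubi8/ubi-minimal   # example image
    command: ["sleep", "infinity"]
    volumeMounts:
    - name: data
      mountPath: /data
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: cinder-pvc-$i
EOF
  done
}

gen_manifests 20 worker-0
# Then delete everything and re-apply until a pod sticks in ContainerCreating
# with the "Unable to find Device path for volume" event.
```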

Actual results:
Some pods are stuck in ContainerCreating for a long time.

Expected results:
All pods Running.

Comment 4 Wei Duan 2021-02-24 05:18:09 UTC
Should be the same on UpShift...
Yes, I checked the CSI driver logs on the node: all mounts are successful, with no such error.
Changed status to Verified.

Comment 7 errata-xmlrpc 2021-07-27 22:45:14 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

