Bug 1937299 - pod.spec.volumes.awsElasticBlockStore.partition is not respected on NVMe volumes
Summary: pod.spec.volumes.awsElasticBlockStore.partition is not respected on NVMe volumes
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Storage
Version: 4.8
Hardware: Unspecified
OS: Unspecified
Severity: unspecified
Priority: medium
Target Milestone: ---
Target Release: 4.8.0
Assignee: Jan Safranek
QA Contact: Qin Ping
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2021-03-10 10:22 UTC by Qin Ping
Modified: 2021-07-27 22:53 UTC
CC: 2 users

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-07-27 22:52:37 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Github openshift kubernetes pull 661 0 None open Bug 1937299: Fix mounting partitions on NVMe devices 2021-04-13 13:01:31 UTC
Red Hat Product Errata RHSA-2021:2438 0 None None None 2021-07-27 22:53:05 UTC

Description Qin Ping 2021-03-10 10:22:21 UTC
Description of problem:
pod.spec.volumes.awsElasticBlockStore.partition is not respected

Version-Release number of selected component (if applicable):
4.8.0-0.nightly-2021-03-08-184701

How reproducible:
Always

Steps to Reproduce:
1. Create an AWS EBS volume on the cloud side
2. Attach this volume to a worker node (/dev/xvdp)
3. Create 2 partitions on this disk
sh-4.4# lsblk -f
NAME        FSTYPE LABEL      UUID                                 MOUNTPOINT
nvme0n1
|-nvme0n1p1
|-nvme0n1p2 vfat   EFI-SYSTEM BD40-9FC9
|-nvme0n1p3 ext4   boot       d922290a-34c9-4365-8f58-87634a722883 /boot
`-nvme0n1p4 xfs    root       c3b3309d-db37-4e3c-baa0-e8331575a654 /sysroot
nvme1n1
|-nvme1n1p1
`-nvme1n1p2
4. Create an inline volume pod consuming partition 1

Actual results:
The pod cannot run successfully and reports the following error:

Warning  FailedMount  43s  kubelet  Unable to attach or mount volumes: unmounted volumes=[inline], unattached volumes=[inline default-token-c65mn]: timed out waiting for the condition
Warning  FailedMount  13s  kubelet  (combined from similar events): MountVolume.MountDevice failed for volume "inline" : mount failed: exit status 32
Mounting command: systemd-run
Mounting arguments: --description=Kubernetes transient mount for /var/lib/kubelet/plugins/kubernetes.io/aws-ebs/mounts/vol-0140cff45bc3ecdbc --scope -- mount -t xfs -o defaults /dev/nvme1n1 /var/lib/kubelet/plugins/kubernetes.io/aws-ebs/mounts/vol-0140cff45bc3ecdbc
Output: Running scope as unit: run-r2e3e07eb3da648abb45fcd8c4151a9a9.scope
mount: /var/lib/kubelet/plugins/kubernetes.io/aws-ebs/mounts/vol-0140cff45bc3ecdbc: wrong fs type, bad option, bad superblock on /dev/nvme1n1, missing codepage or helper program, or other error.



Expected results:
The pod runs successfully.

Master Log:

Node Log (of failed PODs):

PV Dump:

PVC Dump:

StorageClass Dump (if StorageClass used by PV/PVC):

Additional info:
$ cat pod-inlinevlume.yaml 
{
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {
        "name": "inline"
    },
    "spec": {
        "nodeSelector": {
                "topology.kubernetes.io/zone": "us-east-2a"
        },
        "containers": [
            {
                "name": "inline",
                "image": "quay.io/openshifttest/storage@sha256:a05b96d373be86f46e76817487027a7f5b8b5f87c0ac18a246b018df11529b40",
                "securityContext": {
                    "privileged": true
                },
                "imagePullPolicy": "IfNotPresent",
                "volumeMounts": [
                    {
                        "mountPath": "/mnt/storage",
                        "name": "inline"
                    }
                ]
            }
        ],
        "volumes": [
            {
                "name": "inline",
                "awsElasticBlockStore": {
                    "fsType": "xfs",
                    "volumeID": "vol-0140cff45bc3ecdbc",
                    "partition": 1,
                    "readOnly": false
                }
            }
        ]
    }
}

Comment 1 Jan Safranek 2021-03-23 17:47:57 UTC
I did not reproduce this on "standard" AWS machines (where volumes are /dev/xvdac and partitions /dev/xvdac1). I reproduced it only on Nitro machines, where EBS volumes are represented as NVMe devices (/dev/nvme1n1) and partitions as /dev/nvme1n1p1.
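The naming difference described above is the heart of the bug: on classic Xen devices the partition number is appended directly (/dev/xvdp1), while NVMe devices need a "p" separator (/dev/nvme1n1p1), so code that blindly appends the number ends up mounting the whole disk. A minimal Go sketch of the required logic follows; the function name and structure are illustrative, not the actual Kubernetes code:

```go
package main

import (
	"fmt"
	"strings"
)

// partitionPath builds the device path for a partition of an EBS volume.
// NVMe devices (Nitro instances) use a "p" separator before the partition
// number, e.g. /dev/nvme1n1p1; Xen virtual devices append it directly,
// e.g. /dev/xvdp1. An empty partition means the whole disk.
func partitionPath(device, partition string) string {
	if partition == "" {
		return device
	}
	if strings.HasPrefix(device, "/dev/nvme") {
		return device + "p" + partition
	}
	return device + partition
}

func main() {
	fmt.Println(partitionPath("/dev/nvme1n1", "1"))
	fmt.Println(partitionPath("/dev/xvdp", "1"))
	fmt.Println(partitionPath("/dev/nvme1n1", ""))
}
```

Without the NVMe branch, the mount in the error above targets /dev/nvme1n1 instead of /dev/nvme1n1p1, which is why mount fails with "wrong fs type, bad superblock".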

Comment 2 Jan Safranek 2021-03-23 18:46:48 UTC
Posted PR upstream: https://github.com/kubernetes/kubernetes/pull/100500

Comment 4 Qin Ping 2021-04-25 03:31:10 UTC
Verified with: 4.8.0-0.nightly-2021-04-24-234710

Comment 7 errata-xmlrpc 2021-07-27 22:52:37 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438
