Bug 2097436
| Summary: | Online disk expansion ignores filesystem overhead change | | |
|---|---|---|---|
| Product: | Container Native Virtualization (CNV) | Reporter: | Kevin Alon Goldblatt <kgoldbla> |
| Component: | Storage | Assignee: | Álvaro Romero <alromero> |
| Status: | CLOSED ERRATA | QA Contact: | Kevin Alon Goldblatt <kgoldbla> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | | |
| Version: | 4.11.0 | CC: | mrashish, yadu |
| Target Milestone: | --- | | |
| Target Release: | 4.12.0 | | |
| Hardware: | Unspecified | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | CNV v4.12.0-260 | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2023-01-24 13:36:39 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Tested on the following version:
[cloud-user@ocp-psi-executor ~]$ oc version
Client Version: 4.12.0-ec.1
Kustomize Version: v4.5.4
Server Version: 4.12.0-ec.1
Kubernetes Version: v1.24.0+a9d6306
[cloud-user@ocp-psi-executor ~]$ oc get csv -n openshift-cnv
NAME DISPLAY VERSION REPLACES PHASE
kubevirt-hyperconverged-operator.v4.12.0 OpenShift Virtualization 4.12.0 kubevirt-hyperconverged-operator.v4.10.5 Succeeded
Steps to Reproduce:
1. Edit the HCO cr to change the filesystem overhead to 20%:
[cloud-user@ocp-psi-executor ~]$ oc describe hco -n openshift-cnv
...
Spec:
  Filesystem Overhead:
    Storage Class:
      Nfs: 0.2<<<<<<<<<<
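For reference, the same change can be applied non-interactively instead of via 'oc edit'; a minimal sketch, assuming the HyperConverged CR uses the default name kubevirt-hyperconverged:
$ oc patch hco kubevirt-hyperconverged -n openshift-cnv --type=merge \
    -p '{"spec":{"filesystemOverhead":{"storageClass":{"nfs":"0.2"}}}}'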
2. Create a VM requesting a volume size of 2G:
[cloud-user@ocp-psi-executor ~]$ oc get vm
NAME AGE STATUS READY
vm-cirros-datavolume1 37m Running True
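The VM itself was created from the manifest shown under Additional info; a minimal sketch of the commands, assuming the manifest is saved as vm-cirros-datavolume.yaml (hypothetical filename) and noting that it sets running: false, so the VM is started explicitly:
$ oc create -f vm-cirros-datavolume.yaml
$ virtctl start vm-cirros-datavolume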
3. Check the storage request:
[cloud-user@ocp-psi-executor ~]$ oc get pvc cirros-dv -oyaml
...
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 100M
  storageClassName: nfs
  volumeMode: Filesystem
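The requested size can also be read directly with a jsonpath query instead of scanning the full yaml; a minimal sketch against the same PVC:
$ oc get pvc cirros-dv -o jsonpath='{.spec.resources.requests.storage}'
100M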
4. Check the online expansion requested by the vmi:
[cloud-user@ocp-psi-executor ~]$ oc get vmi vm-cirros-datavolume1 -oyaml
...
volumeStatus:
- name: datavolumevolume
  persistentVolumeClaimInfo:
    accessModes:
    - ReadWriteOnce
    capacity:
      storage: 5Gi
    filesystemOverhead: "0.2"<<<<<<<<<<<<<<<<
    requests:
      storage: 100M
    volumeMode: Filesystem
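Similarly, the overhead value used for the expansion can be pulled straight from the VMI status; a minimal sketch, assuming the DataVolume is the first (and only) entry in volumeStatus:
$ oc get vmi vm-cirros-datavolume1 -o jsonpath='{.status.volumeStatus[0].persistentVolumeClaimInfo.filesystemOverhead}'
0.2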
So we get the expected result that the updated filesystem overhead of 20% has been used in the online expansion.
The version is 4.12.0-425. Verified with the following code:
-----------------------------------------
oc version
Client Version: 4.8.0-fc.2
Server Version: 4.12.0-ec.1
Kubernetes Version: v1.24.0+a9d6306
oc get csv -n openshift-cnv
NAME DISPLAY VERSION REPLACES PHASE
kubevirt-hyperconverged-operator.v4.12.0 OpenShift Virtualization 4.12.0 kubevirt-hyperconverged-operator.v4.10.5 Succeeded
volsync-product.v0.5.0 VolSync 0.5.0 Succeeded
Deployed: OCP-4.12.0-ec.1
Deployed: CNV-v4.12.0-450
Verified with the following scenario:
-----------------------------------------
1. Edit the hco cr 'oc edit hco -n openshift-cnv' and change the filesystem overhead to 20%:
filesystemOverhead:
  storageClass:
    nfs: "0.2"
2. See that the nfs filesystem overhead was updated to 0.2:
oc get cdiconfig -o jsonpath='{.items..status.filesystemOverhead}'
{"global":"0.055","storageClass":{"csi-manila-ceph":"0.055","hostpath-csi-basic":"0.055","hostpath-csi-pvc-block":"0.055","local-block-hpp":"0.055","local-block-ocs":"0.055","nfs":"0.2","ocs-storagecluster-ceph-rbd":"0.055","ocs-storagecluster-ceph-rgw":"0.055","standard-csi":"0.055"}}
3. Create a vm with the yaml below requesting a volume size of 2G
4. Check the storage request with 'oc get pvc cirros-dv4 -oyaml':
resources:
  requests:
    storage: "2684354560"
storageClassName: nfs
The PVC size was correctly created to include the filesystem overhead; the arithmetic is sketched below.
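That number matches the expected overhead arithmetic: the usable size is divided by (1 - overhead), so 2Gi (2147483648 bytes) becomes 2147483648 / (1 - 0.2) = 2684354560 bytes. A quick check of the arithmetic (a sketch only; the exact rounding CDI applies is not shown here):
$ python3 -c 'print(int((2 * 1024**3) / (1 - 0.2)))'
2684354560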
5. Check the online expansion requested by the vmi 'oc get vmi vm-cirros-datavolume4 -oyaml':
volumeStatus:
- name: datavolumedisk1
  persistentVolumeClaimInfo:
    accessModes:
    - ReadWriteOnce
    capacity:
      storage: 5Gi
    filesystemOverhead: "0.2" >>>>>> THE CORRECT UPDATED EXPANSION WAS USED
    requests:
      storage: "2684354560"
    volumeMode: Filesystem
  target: vda
Actual results:
The updated filesystem overhead of 0.2 (20%) was requested.
Expected results:
The correct updated nfs filesystem overhead was used.
6. Accessed the vm and verified the correct requested size is displayed using lsblk:
$ lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
vda 252:0 0 2G 0 disk
|-vda1 252:1 0 2G 0 part /
`-vda15 252:15 0 8M 0 part
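For reference, the guest was reached over the serial console; a minimal sketch (the Cirros default credentials, not shown here, are used to log in before running lsblk):
$ virtctl console vm-cirros-datavolume4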
Moving this to VERIFIED!
Additional info:
vm-yaml:
apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  labels:
    kubevirt.io/vm: vm-cirros-datavolume
  name: vm-cirros-datavolume
spec:
  dataVolumeTemplates:
  - metadata:
      creationTimestamp: null
      name: cirros-dv
    spec:
      storage:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 2Gi
        storageClassName: nfs
      source:
        http:
          url: http://xxx.xxx.xxx.com/files/cnv-tests/cirros-images/cirros-0.5.1-x86_64-disk.img
  running: false
  template:
    metadata:
      labels:
        kubevirt.io/vm: vm-cirros-datavolume
    spec:
      domain:
        devices:
          disks:
          - disk:
              bus: virtio
            name: datavolumedisk1
        resources:
          requests:
            memory: 128Mi
      terminationGracePeriodSeconds: 0
      volumes:
      - dataVolume:
          name: cirros-dv
        name: datavolumedisk1
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: OpenShift Virtualization 4.12.0 Images security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHSA-2023:0408
Description of problem:
After tuning the filesystem overhead in the HCO cr to 20%, the PVC created as part of a VM with a template clone is created correctly, including the overhead. However, the online disk expansion ignores the change and uses the default filesystem overhead of 5.5%.

Version-Release number of selected component (if applicable):
The error occurred using the following code:
--------------------------------------------------------
oc version
Client Version: 4.11.0-202206090038.p0.g194e99e.assembly.stream-194e99e
Kustomize Version: v4.5.4
Server Version: 4.11.0-fc.0
Kubernetes Version: v1.24.0+beaaed6
[cnv-qe-jenkins@stg10-kevin-6v8qf-executor ~]$ oc get csv -n openshift-cnv
NAME                                       DISPLAY                    VERSION   REPLACES                                   PHASE
kubevirt-hyperconverged-operator.v4.11.0   OpenShift Virtualization   4.11.0    kubevirt-hyperconverged-operator.v4.10.1   Succeeded

How reproducible:
100%

Steps to Reproduce:
1. Edit the hco cr 'oc edit hco -n openshift-cnv' and change the filesystem overhead to 20%:
filesystemOverhead:
  storageClass:
    nfs: "0.2"
2. Create a vm with the yaml below requesting a volume size of 2G
3. Check the storage request with 'oc get pvc cirros-dv4 -oyaml':
resources:
  requests:
    storage: "2684354560"
storageClassName: nfs
The pvc size was correctly created to include the filesystem overhead
4. Check the online expansion requested by the vmi 'oc get vmi vm-cirros-datavolume4 -oyaml':
filesystemOverhead: "0.055" >>>>>>> THE DEFAULT FILESYSTEM OVERHEAD WAS REQUESTED
requests:
  storage: "2684354560"
volumeMode: Filesystem

Actual results:
The default filesystem overhead of 0.055 (5.5%) was requested, ignoring the updated value of 20%.

Expected results:
The updated filesystem overhead of 20% should have been used in the online expansion.

Additional info:
HCO----------------------
oc edit hco -n openshift-cnv
filesystemOverhead:
  storageClass:
    nfs: "0.2"
PVC---------------------
oc get pvc cirros-dv4 -oyaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  annotations:
    cdi.kubevirt.io/storage.condition.running: "false"
    cdi.kubevirt.io/storage.condition.running.message: Import Complete
    cdi.kubevirt.io/storage.condition.running.reason: Completed
    cdi.kubevirt.io/storage.contentType: kubevirt
    cdi.kubevirt.io/storage.import.endpoint: http://cnv-qe-server.rhevdev.lab.eng.rdu2.redhat.com/files/cnv-tests/cirros-images/cirros-0.5.1-x86_64-disk.img
    cdi.kubevirt.io/storage.import.importPodName: importer-cirros-dv4
    cdi.kubevirt.io/storage.import.source: http
    cdi.kubevirt.io/storage.pod.phase: Succeeded
    cdi.kubevirt.io/storage.pod.restarts: "0"
    cdi.kubevirt.io/storage.preallocation.requested: "false"
    pv.kubernetes.io/bind-completed: "yes"
    pv.kubernetes.io/bound-by-controller: "yes"
  creationTimestamp: "2022-06-15T15:45:00Z"
  finalizers:
  - kubernetes.io/pvc-protection
  labels:
    alerts.k8s.io/KubePersistentVolumeFillingUp: disabled
    app: containerized-data-importer
    app.kubernetes.io/component: storage
    app.kubernetes.io/managed-by: cdi-controller
    app.kubernetes.io/part-of: hyperconverged-cluster
    app.kubernetes.io/version: 4.11.0
  name: cirros-dv4
  namespace: default
  ownerReferences:
  - apiVersion: cdi.kubevirt.io/v1beta1
    blockOwnerDeletion: true
    controller: true
    kind: DataVolume
    name: cirros-dv4
    uid: c2f5ea59-ff9d-426b-a271-b49a007168d9
  resourceVersion: "4188763"
  uid: 644a9c0b-616d-4450-ad5b-7c399c9bd37c
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: "2684354560"
  storageClassName: nfs
  volumeMode: Filesystem
  volumeName: nfs-pv-08
status:
  accessModes:
  - ReadWriteMany
  - ReadWriteOnce
  capacity:
    storage: 5Gi
  phase: Bound
VMI--------------------------------
oc get vmi vm-cirros-datavolume4 -oyaml
apiVersion: kubevirt.io/v1
kind: VirtualMachineInstance
metadata:
  annotations:
    kubevirt.io/latest-observed-api-version: v1
    kubevirt.io/storage-observed-api-version: v1alpha3
  creationTimestamp: "2022-06-15T15:56:49Z"
  finalizers:
  - kubevirt.io/virtualMachineControllerFinalize
  - foregroundDeleteVirtualMachine
  generation: 8
  labels:
    kubevirt.io/nodeName: stg10-kevin-6v8qf-worker-0-dkg2q
    kubevirt.io/vm: vm-cirros-datavolume4
  name: vm-cirros-datavolume4
  namespace: default
  ownerReferences:
  - apiVersion: kubevirt.io/v1
    blockOwnerDeletion: true
    controller: true
    kind: VirtualMachine
    name: vm-cirros-datavolume4
    uid: c7f5f992-590e-42bd-b234-0aaa95e414f8
  resourceVersion: "4203188"
  uid: 2ea9cb38-fff6-4ec4-ac95-4b17e2301319
spec:
  domain:
    cpu:
      cores: 1
      model: host-model
      sockets: 1
      threads: 1
    devices:
      disks:
      - disk:
          bus: virtio
        name: datavolumedisk4
      interfaces:
      - masquerade: {}
        name: default
    features:
      acpi:
        enabled: true
    firmware:
      uuid: 5e29c93e-7ab4-5c76-b86a-7a8b58f279af
    machine:
      type: pc-q35-rhel8.4.0
    resources:
      requests:
        memory: 128Mi
  networks:
  - name: default
    pod: {}
  terminationGracePeriodSeconds: 0
  volumes:
  - dataVolume:
      name: cirros-dv4
    name: datavolumedisk4
status:
  activePods:
    a82f1f4c-f0d3-4722-a7af-8f42a0b0b534: stg10-kevin-6v8qf-worker-0-dkg2q
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2022-06-15T15:56:55Z"
    status: "True"
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: null
    message: 'cannot migrate VMI: PVC cirros-dv4 is not shared, live migration requires that all PVCs must be shared (using ReadWriteMany access mode)'
    reason: DisksNotLiveMigratable
    status: "False"
    type: LiveMigratable
  guestOSInfo: {}
  interfaces:
  - infoSource: domain
    ipAddress: 10.128.2.62
    ipAddresses:
    - 10.128.2.62
    mac: 52:54:00:38:d1:3f
    name: default
  launcherContainerImageVersion: registry.redhat.io/container-native-virtualization/virt-launcher@sha256:a2e887eb37fc7573a4aaba855f1d6ba64aa6c14f8a2c01b1e8bfd51526c51e99
  migrationMethod: BlockMigration
  migrationTransport: Unix
  nodeName: stg10-kevin-6v8qf-worker-0-dkg2q
  phase: Running
  phaseTransitionTimestamps:
  - phase: Pending
    phaseTransitionTimestamp: "2022-06-15T15:56:49Z"
  - phase: Scheduling
    phaseTransitionTimestamp: "2022-06-15T15:56:50Z"
  - phase: Scheduled
    phaseTransitionTimestamp: "2022-06-15T15:56:56Z"
  - phase: Running
    phaseTransitionTimestamp: "2022-06-15T15:57:03Z"
  qosClass: Burstable
  runtimeUser: 107
  virtualMachineRevisionName: revision-start-vm-c7f5f992-590e-42bd-b234-0aaa95e414f8-2
  volumeStatus:
  - name: datavolumedisk4
    persistentVolumeClaimInfo:
      accessModes:
      - ReadWriteOnce
      capacity:
        storage: 5Gi
      filesystemOverhead: "0.055"
      requests:
        storage: "2684354560"
      volumeMode: Filesystem
    target: vda