Bug 2048275
| Summary: | HPP mounter deployment crashes on parsing lsblk output | | |
|---|---|---|---|
| Product: | Container Native Virtualization (CNV) | Reporter: | Alex Kalenyuk <akalenyu> |
| Component: | Storage | Assignee: | Alex Kalenyuk <akalenyu> |
| Status: | CLOSED ERRATA | QA Contact: | Jenia Peimer <jpeimer> |
| Severity: | medium | Docs Contact: | |
| Priority: | high | | |
| Version: | 4.10.0 | CC: | cnv-qe-bugs, jpeimer, pelauter |
| Target Milestone: | --- | | |
| Target Release: | 4.10.0 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | hostpath-provisioner-rhel8-operator v4.10.0-61, CNV v4.10.0-643 | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2022-03-16 16:06:49 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Peter, this affects our ability to use HPP downstream because of the different lsblk versions available. Please approve.

Verified on CNV v4.10.0-643, hostpath-provisioner-operator v4.10.0-61.

Installed an HPP CR that uses a backing pvcTemplate pool with volumeMode: Block (YAMLs below):
$ oc get pods -n openshift-cnv | grep hpp
hpp-pool-1ce22a1b-6698bdfd89-6fmw9 1/1 Running 0 38m
hpp-pool-1de0d017-5bd8588866-kvnsq 1/1 Running 0 38m
hpp-pool-c36842f7-7f4b595997-8xmj8 1/1 Running 0 38m
$ oc get deployments -n openshift-cnv | grep hpp
hpp-pool-1ce22a1b 1/1 1 1 39m
hpp-pool-1de0d017 1/1 1 1 39m
hpp-pool-c36842f7 1/1 1 1 39m
$ oc get pvc -n openshift-cnv | grep hpp-pool
hpp-pool-1ce22a1b Bound pvc-453c0278-b9b5-4239-9cb6-6c9d1130dc09 40Gi RWO ocs-storagecluster-ceph-rbd 39m
hpp-pool-1de0d017 Bound pvc-2e772566-a7f2-439a-9c02-90498931f8ec 40Gi RWO ocs-storagecluster-ceph-rbd 39m
hpp-pool-c36842f7 Bound pvc-21df2206-ac0a-4ec1-9d5a-b24f16d839e5 40Gi RWO ocs-storagecluster-ceph-rbd 39m
$ oc get pods -n openshift-cnv | grep hostpath
hostpath-provisioner-csi-bvsrz 4/4 Running 0 57m
hostpath-provisioner-csi-m6psb 4/4 Running 0 57m
hostpath-provisioner-csi-sr77g 4/4 Running 0 57m
hostpath-provisioner-operator-5869d68856-mxf2c 1/1 Running 1 (54m ago) 130m
Created a VM:
$ oc get vmi -A
NAMESPACE NAME AGE PHASE IP NODENAME READY
default vm-cirros 67s Running *********** c01-jp410-fr5-7zthb-worker-0-p9nr8 True
Checked that disk.img can be found under the path given in the CR's YAML:
$ oc debug node/c01-jp410-fr5-7zthb-worker-0-p9nr8
sh-4.4# chroot /host
sh-4.4# ls var/hpp-csi-pvc-template-ocs-block/csi/
pvc-8cd5768f-179d-49c3-ae1b-3cbabdedf12c
sh-4.4#
sh-4.4# ls pvc-8cd5768f-179d-49c3-ae1b-3cbabdedf12c/
disk.img
sh-4.4#
YAMLs used:
$ cat hpp-ocs-block-cr.yaml
apiVersion: hostpathprovisioner.kubevirt.io/v1beta1
kind: HostPathProvisioner
metadata:
  name: hostpath-provisioner
spec:
  imagePullPolicy: IfNotPresent
  storagePools:
    - name: hpp-csi-pvc-template-ocs-block
      pvcTemplate:
        volumeMode: Block  # If omitted - FS is the default
        storageClassName: ocs-storagecluster-ceph-rbd
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 40Gi
      path: "/var/hpp-csi-pvc-template-ocs-block"
  workload:
    nodeSelector:
      kubernetes.io/os: linux
$ cat sc-hpp-ocs-block.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: hostpath-csi-pvc-template-ocs-block
provisioner: kubevirt.io.hostpath-provisioner
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
parameters:
  storagePool: hpp-csi-pvc-template-ocs-block
$ cat vm.yaml
apiVersion: kubevirt.io/v1alpha3
kind: VirtualMachine
metadata:
  creationTimestamp: null
  labels:
    kubevirt.io/vm: vm-cirros
  name: vm-cirros
spec:
  dataVolumeTemplates:
    - metadata:
        creationTimestamp: null
        name: cirros-dv
      spec:
        pvc:
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 1Gi
          storageClassName: hostpath-csi-pvc-template-ocs-block
        source:
          http:
            url: http://.../cirros-images/cirros-0.4.0-x86_64-disk.qcow2
  running: true
  template:
    metadata:
      labels:
        kubevirt.io/vm: vm-cirros
    spec:
      domain:
        devices:
          disks:
            - disk:
                bus: virtio
              name: datavolume
        machine:
          type: ""
        resources:
          requests:
            memory: 100M
      terminationGracePeriodSeconds: 0
      volumes:
        - dataVolume:
            name: cirros-dv
          name: datavolume
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Virtualization 4.10.0 Images security and bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:0947
Description of problem:
HPP mounter deployment crashes on parsing lsblk output (depending on the lsblk version).

Version-Release number of selected component (if applicable):
CNV 4.10.0

How reproducible:
100%

Steps to Reproduce:
1. Install HPP, using a backing pvcTemplate pool of volumeMode: Block in the HPP CR

Actual results:
Mounter deployment failing

Expected results:
Success

Additional info:
In newer lsblk versions, the boolean fields (rm, ro) are emitted as true/false in --json output:

# lsblk /dev/data -J
{
   "blockdevices": [
      {"name":"rbd0", "maj:min":"251:0", "rm":false, "size":"40G", "ro":false, "type":"disk", "mountpoint":"/host/var/hpvolumes/csi"}
   ]
}
bash-5.1# lsblk --version
lsblk from util-linux 2.36.2

In older lsblk versions this is not the case:

# lsblk /dev/data -J
{
   "blockdevices": [
      {"name": "vdb", "maj:min": "252:16", "rm": "0", "size": "120G", "ro": "0", "type": "disk", "mountpoint": null}
   ]
}
[root@hpp-pool-sno-test-infra-cluster-621da615-master-0-7d9ccbfctx8mn /]# lsblk --version
lsblk from util-linux 2.32.1

[cnv-qe-jenkins@psi-hitchhiker-w8k9d-executor ~]$ oc logs -n openshift-cnv -f hpp-pool-ceph-backed-pool-psi-hitchhiker-w8k9d-worker-0-8fqsc57
{"level":"info","ts":1643546741.2130804,"logger":"mounter","msg":"Go Version: go1.16.6"}
{"level":"info","ts":1643546741.2131374,"logger":"mounter","msg":"Go OS/Arch: linux/amd64"}
panic: json: cannot unmarshal string into Go struct field DeviceInfo.blockdevices.rm of type bool

goroutine 1 [running]:
main.mountBlockVolume(0x7ffcb710a1f3, 0x9, 0x7ffcb710a209, 0x1a, 0x7ffcb710a22f, 0x5)
        /remote-source/app/cmd/mounter/main.go:226 +0x6ad
main.main()
        /remote-source/app/cmd/mounter/main.go:163 +0x434

This started occurring when we switched to a UBI-built downstream image for the mounter deployment, and thus a different lsblk version.

[cnv-qe-jenkins@psi-hitchhiker-w8k9d-executor debug-tier1-cdi]$ oc get hostpathprovisioner hostpath-provisioner -o yaml
apiVersion: hostpathprovisioner.kubevirt.io/v1beta1
kind: HostPathProvisioner
metadata:
  creationTimestamp: "2022-01-30T12:31:27Z"
  finalizers:
  - finalizer.delete.hostpath-provisioner
  generation: 78
  name: hostpath-provisioner
  resourceVersion: "417784"
  uid: 8b324ef7-f00a-4d44-ae45-83628aa7de0e
spec:
  imagePullPolicy: IfNotPresent
  storagePools:
  - name: ceph-backed-pool
    path: /var/hppcephbackedpool
    pvcTemplate:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 5Gi
      storageClassName: ocs-storagecluster-ceph-rbd
      volumeMode: Block
  workload: {}
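The panic above happens because the mounter unmarshals lsblk's JSON into a struct whose rm field is a plain Go bool, while older util-linux emits rm/ro as "0"/"1" strings. A minimal sketch of one way to tolerate both output shapes; the type names (flexBool, blockDevice, deviceInfo) are illustrative, not the actual hostpath-provisioner code:

package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// flexBool accepts JSON booleans as well as the quoted "0"/"1"
// values that older lsblk versions emit for rm/ro.
type flexBool bool

func (b *flexBool) UnmarshalJSON(data []byte) error {
	s := string(bytes.Trim(data, `"`)) // strip surrounding quotes, if any
	switch s {
	case "true", "1":
		*b = true
	case "false", "0", "null":
		*b = false
	default:
		return fmt.Errorf("unexpected boolean value %q", s)
	}
	return nil
}

// Illustrative stand-ins for the mounter's DeviceInfo struct.
type blockDevice struct {
	Name       string   `json:"name"`
	RM         flexBool `json:"rm"`
	RO         flexBool `json:"ro"`
	Type       string   `json:"type"`
	MountPoint string   `json:"mountpoint"`
}

type deviceInfo struct {
	BlockDevices []blockDevice `json:"blockdevices"`
}

func main() {
	// The two output shapes quoted in this bug:
	// util-linux 2.36.2 (booleans) and util-linux 2.32.1 ("0"/"1" strings).
	samples := []string{
		`{"blockdevices":[{"name":"rbd0","rm":false,"ro":false,"type":"disk","mountpoint":"/host/var/hpvolumes/csi"}]}`,
		`{"blockdevices":[{"name":"vdb","rm":"0","ro":"0","type":"disk","mountpoint":null}]}`,
	}
	for _, out := range samples {
		var info deviceInfo
		if err := json.Unmarshal([]byte(out), &info); err != nil {
			fmt.Println("parse error:", err)
			continue
		}
		fmt.Printf("%+v\n", info.BlockDevices[0])
	}
}

Another option, instead of a custom UnmarshalJSON, is to ask lsblk only for string-typed columns (for example lsblk -J -o NAME,TYPE,MOUNTPOINT) so the bool-versus-string difference never reaches the parser. The actual downstream fix may differ from this sketch.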