Bug 1738620
| Summary: | Failing to launch VM: SyncFailed "Could not open '/var/run/kubevirt-private/vmi-disks/rootdisk/disk.img': Permission denied'" | ||
|---|---|---|---|
| Product: | Container Native Virtualization (CNV) | Reporter: | Alexander Chuzhoy <sasha> |
| Component: | Storage | Assignee: | Adam Litke <alitke> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Kevin Alon Goldblatt <kgoldbla> |
| Severity: | high | Docs Contact: | |
| Priority: | high | ||
| Version: | 2.0 | CC: | alitke, ashoshan, awels, cnv-qe-bugs, fsimonce, ncredi, sgott, ycui, yprokule |
| Target Milestone: | --- | ||
| Target Release: | 2.1.0 | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2019-11-04 15:04:56 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Assigning this to the storage team, as the disk is a DataVolume and CDI sets the permissions for that. Could this bug be related to that?
*** Bug 1743250 has been marked as a duplicate of this bug. ***
Happening with 2.1 as well.
*** Bug 1743248 has been marked as a duplicate of this bug. ***
Can I get access to the system that is giving you this error? I have access to a different system, where neither the ceph storage configuration nor the local storage configuration is working, which is causing the permission denied error on that system. But it appears that CDI completed successfully on the system where this is being reported. I would like to see the status of the data volume (complete YAML), as well as the status of the related PVC and PV.
I did some more investigation, and I believe I have found the culprit. The version of CDI that is running appears to be setting the securityContext runAsUser:
securityContext:
  runAsNonRoot: true
  runAsUser: 1001
This is causing the permission denied error when trying to write on certain provisioners, such as the ceph provisioner. We fixed this in https://github.com/kubevirt/containerized-data-importer/pull/875 and backported it to the 1.9 branch in https://github.com/kubevirt/containerized-data-importer/pull/880
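To check whether a given pod carries this setting, its JSON (for example from `oc get pod <name> -o json`) can be scanned for `runAsUser`. The following is only a minimal sketch: the function name, the container name `importer`, and the sample manifest are illustrative assumptions, and this bug does not state whether CDI sets the value at pod or container level, so both are checked.

```python
import json


def find_run_as_user(pod):
    """Return (scope, uid) pairs for every runAsUser found in a pod manifest dict."""
    found = []
    spec = pod.get("spec", {})
    # A pod-level securityContext applies to all containers unless overridden.
    pod_ctx = spec.get("securityContext") or {}
    if "runAsUser" in pod_ctx:
        found.append(("pod", pod_ctx["runAsUser"]))
    # A container-level securityContext takes precedence over the pod-level one.
    for key in ("initContainers", "containers"):
        for container in spec.get(key) or []:
            ctx = container.get("securityContext") or {}
            if "runAsUser" in ctx:
                found.append((container.get("name", "?"), ctx["runAsUser"]))
    return found


# Hypothetical manifest fragment mirroring the securityContext quoted above.
sample = json.loads("""
{"spec": {"containers": [{"name": "importer",
  "securityContext": {"runAsNonRoot": true, "runAsUser": 1001}}]}}
""")
print(find_run_as_user(sample))
```

A non-empty result for a CDI worker pod would be consistent with the diagnosis above, although it does not by itself prove that the provisioner rejects writes from that UID.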
According to the log, the fix has been in since CDI 1.9.4, so the deployed version must be older than that.
After looking through the PRs again, it turns out that the PR I thought fixed the issue did fix it for cloning and upload, but not for import, which is what we are seeing here. As a result, 1.10.0 did not include the fix; 1.10.1 does include the fix for the importer.
Tested with HCO 4.2 CNV 2.1.0-47
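The version reasoning above can be written down as a small check. This is a sketch under assumptions: it only handles plain numeric `x.y.z` version strings, the function names are made up for illustration, and it returns `None` for 1.9.4 and later 1.9.x releases because this bug does not say whether the importer fix was backported to that branch. Treating releases older than 1.9.4 as unfixed is an inference from the thread, not a statement in it.

```python
def parse_version(text):
    """Turn 'v1.10.1' or '1.10.1' into the tuple (1, 10, 1)."""
    return tuple(int(part) for part in text.lstrip("v").split("."))


# Versions named in this bug report.
CLONE_UPLOAD_FIX = (1, 9, 4)  # clone/upload fix, per the 1.9 backport
IMPORT_FIX = (1, 10, 1)       # first release stated to contain the importer fix


def has_import_fix(version):
    """Return True/False when the bug report settles it, None when it does not."""
    v = parse_version(version)
    if v >= IMPORT_FIX:
        return True
    if v == (1, 10, 0) or v < CLONE_UPLOAD_FIX:
        return False
    return None  # 1.9.4+ on the 1.9 branch: not stated in this bug


for candidate in ("1.9.3", "1.9.4", "1.10.0", "1.10.1"):
    print(candidate, has_import_fix(candidate))
```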
Reproduced on rook-ceph-block with volumeMode Block as follows:
1. Created the template and only added the volumeMode as Block
2. Created a datavolume
3. Created the VM, added the volumeMode as Block, and changed the running param to false
4. Started the VM with virtctl start vm1
5. Accessed the VM with virtctl console vm1 >>>>>>> VM is running and accessible
Moving to VERIFIED!
Yamls I used can be seen below:
Datavolume:
-----------------------------------------
apiVersion: cdi.kubevirt.io/v1alpha1
kind: DataVolume
metadata:
  name: first-rootdisk
spec:
  source:
    http:
      url: "http://cnv-qe-server.rhevdev.lab.eng.rdu2.redhat.com/files/rhel-images/rhel-8/rhel-8.qcow2"
  pvc:
    storageClassName: rook-ceph-block
    volumeMode: Block
    accessModes:
      - ReadWriteMany
    resources:
      requests:
        storage: 12Gi
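For repeated verification runs, the same DataVolume can be generated programmatically instead of edited by hand. The sketch below builds the manifest as a Python dict and prints it as JSON, which `oc create -f -` accepts in place of YAML. The helper name `build_datavolume` and its parameters are hypothetical; the field names and values are the ones used in the manifest above.

```python
import json


def build_datavolume(name, url, storage_class, size, volume_mode="Block"):
    """Build a DataVolume manifest that imports a disk image over HTTP."""
    return {
        "apiVersion": "cdi.kubevirt.io/v1alpha1",
        "kind": "DataVolume",
        "metadata": {"name": name},
        "spec": {
            "source": {"http": {"url": url}},
            "pvc": {
                "storageClassName": storage_class,
                "volumeMode": volume_mode,
                "accessModes": ["ReadWriteMany"],
                "resources": {"requests": {"storage": size}},
            },
        },
    }


manifest = build_datavolume(
    name="first-rootdisk",
    url="http://cnv-qe-server.rhevdev.lab.eng.rdu2.redhat.com/files/rhel-images/rhel-8/rhel-8.qcow2",
    storage_class="rook-ceph-block",
    size="12Gi",
)
print(json.dumps(manifest, indent=2))
```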
VM:
--------------------------------------------
apiVersion: kubevirt.io/v1alpha3
kind: VirtualMachine
metadata:
  annotations:
    name.os.template.kubevirt.io/rhel7.6: Red Hat Enterprise Linux 7.6
  selfLink: /apis/kubevirt.io/v1alpha3/namespaces/sky/virtualmachines/vm1
  resourceVersion: '311256'
  name: vm1
  uid: 33ce2a4a-b926-11e9-a08c-98039b6185e8
  creationTimestamp: '2019-08-07T15:15:34Z'
  generation: 7
  namespace: sky
  labels:
    app: vm1
    flavor.template.kubevirt.io/large: 'true'
    os.template.kubevirt.io/rhel7.6: 'true'
    vm.kubevirt.io/template: first
    vm.kubevirt.io/template-namespace: sky
    workload.template.kubevirt.io/desktop: 'true'
spec:
  dataVolumeTemplates:
    - metadata:
        creationTimestamp: null
        name: vm1-first-rootdisk-clone
      spec:
        pvc:
          accessModes:
            - ReadWriteMany
          dataSource: null
          resources:
            requests:
              storage: 12Gi
          volumeMode: Block
          storageClassName: rook-ceph-block
        source:
          pvc:
            name: first-rootdisk
            namespace: sky
      status: {}
  running: false
  template:
    metadata:
      creationTimestamp: null
      labels:
        kubevirt.io/domain: vm1
        kubevirt.io/size: large
        vm.kubevirt.io/name: vm1
    spec:
      domain:
        cpu:
          cores: 1
          sockets: 2
          threads: 1
        devices:
          disks:
            - bootOrder: 1
              disk:
                bus: virtio
              name: rootdisk
          inputs:
            - bus: virtio
              name: tablet
              type: tablet
          interfaces:
            - bootOrder: 2
              masquerade: {}
              name: nic0
          rng: {}
        machine:
          type: ''
        resources:
          requests:
            memory: 2G
      evictionStrategy: LiveMigrate
      hostname: vm1
      networks:
        - name: nic0
          pod: {}
      terminationGracePeriodSeconds: 0
      volumes:
        - dataVolume:
            name: vm1-first-rootdisk-clone
          name: rootdisk
status:
  created: true
  ready: true
Template:
--------------------------------------------
kind: Template
apiVersion: template.openshift.io/v1
metadata:
  name: first
  namespace: sky
  selfLink: /apis/template.openshift.io/v1/namespaces/sky/templates/first
  uid: 27c1c366-b926-11e9-b865-0a580a81002e
  resourceVersion: '304547'
  creationTimestamp: '2019-08-07T15:15:13Z'
  labels:
    flavor.template.kubevirt.io/large: 'true'
    os.template.kubevirt.io/rhel7.6: 'true'
    template.kubevirt.io/type: vm
    vm.kubevirt.io/template: rhel7-desktop-large
    vm.kubevirt.io/template-namespace: openshift
    workload.template.kubevirt.io/desktop: 'true'
  annotations:
    name.os.template.kubevirt.io/rhel7.6: Red Hat Enterprise Linux 7.6
objects:
  - apiVersion: kubevirt.io/v1alpha3
    kind: VirtualMachine
    metadata:
      labels:
        app: '${NAME}'
        vm.kubevirt.io/template: rhel7-desktop-large
      name: '${NAME}'
    spec:
      template:
        metadata:
          labels:
            kubevirt.io/domain: '${NAME}'
            kubevirt.io/size: large
        spec:
          domain:
            cpu:
              cores: 1
              sockets: 2
              threads: 1
            devices:
              inputs:
                - bus: virtio
                  name: tablet
                  type: tablet
              rng: {}
              interfaces:
                - name: nic0
                  bootOrder: 2
                  masquerade: {}
              disks:
                - disk:
                    bus: virtio
                  name: rootdisk
                  bootOrder: 1
            resources:
              requests:
                memory: 2G
          evictionStrategy: LiveMigrate
          terminationGracePeriodSeconds: 0
          networks:
            - name: nic0
              pod: {}
          volumes:
            - name: rootdisk
              dataVolume:
                name: first-rootdisk
          hostname: '${NAME}'
      dataVolumeTemplates: []
parameters:
  - name: NAME
    description: Name for the new VM
Environment:
kubevirt-hyperconverged-operator.v2.0.0
Steps to reproduce:
Attempted to launch an instance via the UI using the wizard.
Created a template with:
kind: Template
apiVersion: template.openshift.io/v1
metadata:
  name: first
  namespace: sky
  selfLink: /apis/template.openshift.io/v1/namespaces/sky/templates/first
  uid: 27c1c366-b926-11e9-b865-0a580a81002e
  resourceVersion: '304547'
  creationTimestamp: '2019-08-07T15:15:13Z'
  labels:
    flavor.template.kubevirt.io/large: 'true'
    os.template.kubevirt.io/rhel7.6: 'true'
    template.kubevirt.io/type: vm
    vm.kubevirt.io/template: rhel7-desktop-large
    vm.kubevirt.io/template-namespace: openshift
    workload.template.kubevirt.io/desktop: 'true'
  annotations:
    name.os.template.kubevirt.io/rhel7.6: Red Hat Enterprise Linux 7.6
objects:
  - apiVersion: kubevirt.io/v1alpha3
    kind: VirtualMachine
    metadata:
      labels:
        app: '${NAME}'
        vm.kubevirt.io/template: rhel7-desktop-large
      name: '${NAME}'
    spec:
      template:
        metadata:
          labels:
            kubevirt.io/domain: '${NAME}'
            kubevirt.io/size: large
        spec:
          domain:
            cpu:
              cores: 1
              sockets: 2
              threads: 1
            devices:
              inputs:
                - bus: virtio
                  name: tablet
                  type: tablet
              rng: {}
              interfaces:
                - name: nic0
                  bootOrder: 2
                  masquerade: {}
              disks:
                - disk:
                    bus: virtio
                  name: rootdisk
                  bootOrder: 1
            resources:
              requests:
                memory: 6G
          evictionStrategy: LiveMigrate
          terminationGracePeriodSeconds: 0
          networks:
            - name: nic0
              pod: {}
          volumes:
            - name: rootdisk
              dataVolume:
                name: first-rootdisk
          hostname: '${NAME}'
      dataVolumeTemplates: []
parameters:
  - name: NAME
    description: Name for the new VM
And then tried to create a VM using:
apiVersion: kubevirt.io/v1alpha3
kind: VirtualMachine
metadata:
  annotations:
    name.os.template.kubevirt.io/rhel7.6: Red Hat Enterprise Linux 7.6
  selfLink: /apis/kubevirt.io/v1alpha3/namespaces/sky/virtualmachines/vm1
  resourceVersion: '311256'
  name: vm1
  uid: 33ce2a4a-b926-11e9-a08c-98039b6185e8
  creationTimestamp: '2019-08-07T15:15:34Z'
  generation: 7
  namespace: sky
  labels:
    app: vm1
    flavor.template.kubevirt.io/large: 'true'
    os.template.kubevirt.io/rhel7.6: 'true'
    vm.kubevirt.io/template: first
    vm.kubevirt.io/template-namespace: sky
    workload.template.kubevirt.io/desktop: 'true'
spec:
  dataVolumeTemplates:
    - metadata:
        creationTimestamp: null
        name: vm1-first-rootdisk-clone
      spec:
        pvc:
          accessModes:
            - ReadWriteMany
          dataSource: null
          resources:
            requests:
              storage: 30Gi
          storageClassName: rook-ceph-block
        source:
          pvc:
            name: first-rootdisk
            namespace: sky
      status: {}
  running: true
  template:
    metadata:
      creationTimestamp: null
      labels:
        kubevirt.io/domain: vm1
        kubevirt.io/size: large
        vm.kubevirt.io/name: vm1
    spec:
      domain:
        cpu:
          cores: 1
          sockets: 2
          threads: 1
        devices:
          disks:
            - bootOrder: 1
              disk:
                bus: virtio
              name: rootdisk
          inputs:
            - bus: virtio
              name: tablet
              type: tablet
          interfaces:
            - bootOrder: 2
              masquerade: {}
              name: nic0
          rng: {}
        machine:
          type: ''
        resources:
          requests:
            memory: 6G
      evictionStrategy: LiveMigrate
      hostname: vm1
      networks:
        - name: nic0
          pod: {}
      terminationGracePeriodSeconds: 0
      volumes:
        - dataVolume:
            name: vm1-first-rootdisk-clone
          name: rootdisk
status:
  created: true
  ready: true
result:
[cloud-user@r640-u01 ~]$ oc get pod
NAME READY STATUS RESTARTS AGE
virt-launcher-vm1-j8jzn 1/1 Running 0 3m25s
[cloud-user@r640-u01 ~]$ oc logs virt-launcher-vm1-j8jzn|tail
{"component":"virt-launcher","kind":"","level":"info","msg":"Synced vmi","name":"vm1","namespace":"sky","pos":"server.go:166","timestamp":"2019-08-07T15:26:00.671590Z","uid":"264e2328-b927-11e9-9f3c-98039b617c80"}
{"component":"virt-launcher","level":"info","msg":"kubevirt domain status: Shutoff(5):Failed(6)","pos":"client.go:179","timestamp":"2019-08-07T15:26:00.672195Z"}
{"component":"virt-launcher","level":"error","msg":"Failed to upate agent poller domain info","pos":"agent_poller.go:138","timestamp":"2019-08-07T15:26:00.672900Z"}
{"component":"virt-launcher","level":"info","msg":"processed event","pos":"client.go:235","timestamp":"2019-08-07T15:26:00.672932Z"}
{"component":"virt-launcher","level":"info","msg":"Still missing PID for 133bf63e-9459-5126-9b21-b56e9b3d17b3, Process 133bf63e-9459-5126-9b21-b56e9b3d17b3 not found in /proc","pos":"monitor.go:207","timestamp":"2019-08-07T15:26:00.782287Z"}
{"component":"virt-launcher","level":"info","msg":"DomainLifecycle event 0 with reason 1 received","pos":"client.go:248","timestamp":"2019-08-07T15:26:00.867423Z"}
{"component":"virt-launcher","kind":"","level":"info","msg":"Synced vmi","name":"vm1","namespace":"sky","pos":"server.go:166","timestamp":"2019-08-07T15:26:00.868135Z","uid":"264e2328-b927-11e9-9f3c-98039b617c80"}
{"component":"virt-launcher","level":"info","msg":"kubevirt domain status: Shutoff(5):Failed(6)","pos":"client.go:179","timestamp":"2019-08-07T15:26:00.868530Z"}
{"component":"virt-launcher","level":"error","msg":"Failed to upate agent poller domain info","pos":"agent_poller.go:138","timestamp":"2019-08-07T15:26:00.869228Z"}
{"component":"virt-launcher","level":"info","msg":"processed event","pos":"client.go:235","timestamp":"2019-08-07T15:26:00.869261Z"}
[cloud-user@r640-u01 ~]$ oc get events --sort-by='.lastTimestamp'|tail
5m47s Normal SuccessfulDelete virtualmachineinstance/vm1 Deleted virtual machine pod virt-launcher-vm1-x8wzh
5m47s Normal SuccessfulDelete virtualmachineinstance/vm1 Deleted PodDisruptionBudget kubevirt-disruption-budget-qjjsp
5m47s Normal SuccessfulDelete virtualmachine/vm1 Stopped the virtual machine by deleting the virtual machine instance 47495b7d-b926-11e9-9f3c-98039b617c80
5m44s Normal Started pod/virt-launcher-vm1-j8jzn Started container compute
5m44s Normal Created pod/virt-launcher-vm1-j8jzn Created container compute
5m44s Normal Pulled pod/virt-launcher-vm1-j8jzn Container image "registry.redhat.io/container-native-virtualization/virt-launcher:v2.0.0-39" already present on machine
5m39s Warning Unhealthy pod/virt-launcher-vm1-j8jzn Readiness probe failed: cat: /var/run/kubevirt-infra/healthy: No such file or directory
5m37s Warning SyncFailed virtualmachineinstance/vm1 server error. command SyncVMI failed: "LibvirtError(Code=1, Domain=10, Message='internal error: qemu unexpectedly closed the monitor: 2019-08-07T15:22:30.802547Z qemu-kvm: -drive file=/var/run/kubevirt-private/vmi-disks/rootdisk/disk.img,format=raw,if=none,id=drive-ua-rootdisk,cache=none: Could not open '/var/run/kubevirt-private/vmi-disks/rootdisk/disk.img': Permission denied')"
5m37s Normal Started virtualmachineinstance/vm1 VirtualMachineInstance started.
37s Normal Created virtualmachineinstance/vm1 VirtualMachineInstance defined.
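When triaging output like this, it can help to pull out only the error-level virt-launcher entries and the path that qemu could not open. The sketch below is one possible way to do that with the Python standard library. The helper names are hypothetical, and the sample strings are shortened copies of the log and event text quoted in this report. Because pasted logs often lose their line breaks, the JSON objects are decoded one after another rather than line by line.

```python
import json
import re


def iter_json_objects(text):
    """Yield each JSON object from text in which objects are separated by whitespace."""
    decoder = json.JSONDecoder()
    pos, end = 0, len(text)
    while True:
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < end and text[pos].isspace():
            pos += 1
        if pos == end:
            return
        obj, pos = decoder.raw_decode(text, pos)
        yield obj


def error_messages(log_text):
    """Return the msg field of every entry whose level is 'error'."""
    return [o["msg"] for o in iter_json_objects(log_text) if o.get("level") == "error"]


DENIED = re.compile(r"Could not open '([^']+)': Permission denied")


def denied_paths(event_text):
    """Return every path reported as 'Permission denied' in an event message."""
    return DENIED.findall(event_text)


LOGS = (
    '{"component":"virt-launcher","level":"info","msg":"kubevirt domain status: '
    'Shutoff(5):Failed(6)","pos":"client.go:179","timestamp":"2019-08-07T15:26:00.672195Z"} '
    '{"component":"virt-launcher","level":"error","msg":"Failed to upate agent poller '
    'domain info","pos":"agent_poller.go:138","timestamp":"2019-08-07T15:26:00.672900Z"}'
)
EVENT = (
    "qemu-kvm: -drive file=/var/run/kubevirt-private/vmi-disks/rootdisk/disk.img,"
    "format=raw,if=none,id=drive-ua-rootdisk,cache=none: Could not open "
    "'/var/run/kubevirt-private/vmi-disks/rootdisk/disk.img': Permission denied"
)

print(error_messages(LOGS))
print(denied_paths(EVENT))
```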