Bug 1738620 - Failing to launch VM: SyncFailed "Could not open '/var/run/kubevirt-private/vmi-disks/rootdisk/disk.img': Permission denied'"
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Container Native Virtualization (CNV)
Classification: Red Hat
Component: Storage
Version: 2.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 2.1.0
Assignee: Adam Litke
QA Contact: Kevin Alon Goldblatt
URL:
Whiteboard:
Duplicates: 1743248
Depends On:
Blocks:
 
Reported: 2019-08-07 15:28 UTC by Alexander Chuzhoy
Modified: 2019-11-04 15:04 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-11-04 15:04:56 UTC
Target Upstream Version:
Embargoed:



Description Alexander Chuzhoy 2019-08-07 15:28:57 UTC
Environment:
 kubevirt-hyperconverged-operator.v2.0.0

Steps to reproduce:

Attempted to launch an instance via the UI using the wizard:

Created a template with:
kind: Template
apiVersion: template.openshift.io/v1
metadata:
  name: first
  namespace: sky
  selfLink: /apis/template.openshift.io/v1/namespaces/sky/templates/first
  uid: 27c1c366-b926-11e9-b865-0a580a81002e
  resourceVersion: '304547'
  creationTimestamp: '2019-08-07T15:15:13Z'
  labels:
    flavor.template.kubevirt.io/large: 'true'
    os.template.kubevirt.io/rhel7.6: 'true'
    template.kubevirt.io/type: vm
    vm.kubevirt.io/template: rhel7-desktop-large
    vm.kubevirt.io/template-namespace: openshift
    workload.template.kubevirt.io/desktop: 'true'
  annotations:
    name.os.template.kubevirt.io/rhel7.6: Red Hat Enterprise Linux 7.6
objects:
  - apiVersion: kubevirt.io/v1alpha3
    kind: VirtualMachine
    metadata:
      labels:
        app: '${NAME}'
        vm.kubevirt.io/template: rhel7-desktop-large
      name: '${NAME}'
    spec:
      template:
        metadata:
          labels:
            kubevirt.io/domain: '${NAME}'
            kubevirt.io/size: large
        spec:
          domain:
            cpu:
              cores: 1
              sockets: 2
              threads: 1
            devices:
              inputs:
                - bus: virtio
                  name: tablet
                  type: tablet
              rng: {}
              interfaces:
                - name: nic0
                  bootOrder: 2
                  masquerade: {}
              disks:
                - disk:
                    bus: virtio
                  name: rootdisk
                  bootOrder: 1
            resources:
              requests:
                memory: 6G
          evictionStrategy: LiveMigrate
          terminationGracePeriodSeconds: 0
          networks:
            - name: nic0
              pod: {}
          volumes:
            - name: rootdisk
              dataVolume:
                name: first-rootdisk
          hostname: '${NAME}'
      dataVolumeTemplates: []
parameters:
  - name: NAME
    description: Name for the new VM



And then tried to create a VM using:
apiVersion: kubevirt.io/v1alpha3
kind: VirtualMachine
metadata:
  annotations:
    name.os.template.kubevirt.io/rhel7.6: Red Hat Enterprise Linux 7.6
  selfLink: /apis/kubevirt.io/v1alpha3/namespaces/sky/virtualmachines/vm1
  resourceVersion: '311256'
  name: vm1
  uid: 33ce2a4a-b926-11e9-a08c-98039b6185e8
  creationTimestamp: '2019-08-07T15:15:34Z'
  generation: 7
  namespace: sky
  labels:
    app: vm1
    flavor.template.kubevirt.io/large: 'true'
    os.template.kubevirt.io/rhel7.6: 'true'
    vm.kubevirt.io/template: first
    vm.kubevirt.io/template-namespace: sky
    workload.template.kubevirt.io/desktop: 'true'
spec:
  dataVolumeTemplates:
    - metadata:
        creationTimestamp: null
        name: vm1-first-rootdisk-clone
      spec:
        pvc:
          accessModes:
            - ReadWriteMany
          dataSource: null
          resources:
            requests:
              storage: 30Gi
          storageClassName: rook-ceph-block
        source:
          pvc:
            name: first-rootdisk
            namespace: sky
      status: {}
  running: true
  template:
    metadata:
      creationTimestamp: null
      labels:
        kubevirt.io/domain: vm1
        kubevirt.io/size: large
        vm.kubevirt.io/name: vm1
    spec:
      domain:
        cpu:
          cores: 1
          sockets: 2
          threads: 1
        devices:
          disks:
            - bootOrder: 1
              disk:
                bus: virtio
              name: rootdisk
          inputs:
            - bus: virtio
              name: tablet
              type: tablet
          interfaces:
            - bootOrder: 2
              masquerade: {}
              name: nic0
          rng: {}
        machine:
          type: ''
        resources:
          requests:
            memory: 6G
      evictionStrategy: LiveMigrate
      hostname: vm1
      networks:
        - name: nic0
          pod: {}
      terminationGracePeriodSeconds: 0
      volumes:
        - dataVolume:
            name: vm1-first-rootdisk-clone
          name: rootdisk
status:
  created: true
  ready: true


result:



[cloud-user@r640-u01 ~]$ oc get pod
NAME                      READY   STATUS    RESTARTS   AGE
virt-launcher-vm1-j8jzn   1/1     Running   0          3m25s

[cloud-user@r640-u01 ~]$ oc logs virt-launcher-vm1-j8jzn|tail
{"component":"virt-launcher","kind":"","level":"info","msg":"Synced vmi","name":"vm1","namespace":"sky","pos":"server.go:166","timestamp":"2019-08-07T15:26:00.671590Z","uid":"264e2328-b927-11e9-9f3c-98039b617c80"}
{"component":"virt-launcher","level":"info","msg":"kubevirt domain status: Shutoff(5):Failed(6)","pos":"client.go:179","timestamp":"2019-08-07T15:26:00.672195Z"}
{"component":"virt-launcher","level":"error","msg":"Failed to upate agent poller domain info","pos":"agent_poller.go:138","timestamp":"2019-08-07T15:26:00.672900Z"}
{"component":"virt-launcher","level":"info","msg":"processed event","pos":"client.go:235","timestamp":"2019-08-07T15:26:00.672932Z"}
{"component":"virt-launcher","level":"info","msg":"Still missing PID for 133bf63e-9459-5126-9b21-b56e9b3d17b3, Process 133bf63e-9459-5126-9b21-b56e9b3d17b3 not found in /proc","pos":"monitor.go:207","timestamp":"2019-08-07T15:26:00.782287Z"}
{"component":"virt-launcher","level":"info","msg":"DomainLifecycle event 0 with reason 1 received","pos":"client.go:248","timestamp":"2019-08-07T15:26:00.867423Z"}
{"component":"virt-launcher","kind":"","level":"info","msg":"Synced vmi","name":"vm1","namespace":"sky","pos":"server.go:166","timestamp":"2019-08-07T15:26:00.868135Z","uid":"264e2328-b927-11e9-9f3c-98039b617c80"}
{"component":"virt-launcher","level":"info","msg":"kubevirt domain status: Shutoff(5):Failed(6)","pos":"client.go:179","timestamp":"2019-08-07T15:26:00.868530Z"}
{"component":"virt-launcher","level":"error","msg":"Failed to upate agent poller domain info","pos":"agent_poller.go:138","timestamp":"2019-08-07T15:26:00.869228Z"}
{"component":"virt-launcher","level":"info","msg":"processed event","pos":"client.go:235","timestamp":"2019-08-07T15:26:00.869261Z"}




[cloud-user@r640-u01 ~]$ oc get events --sort-by='.lastTimestamp'|tail
5m47s       Normal    SuccessfulDelete             virtualmachineinstance/vm1                             Deleted virtual machine pod virt-launcher-vm1-x8wzh
5m47s       Normal    SuccessfulDelete             virtualmachineinstance/vm1                             Deleted PodDisruptionBudget kubevirt-disruption-budget-qjjsp
5m47s       Normal    SuccessfulDelete             virtualmachine/vm1                                     Stopped the virtual machine by deleting the virtual machine instance 47495b7d-b926-11e9-9f3c-98039b617c80
5m44s       Normal    Started                      pod/virt-launcher-vm1-j8jzn                            Started container compute
5m44s       Normal    Created                      pod/virt-launcher-vm1-j8jzn                            Created container compute
5m44s       Normal    Pulled                       pod/virt-launcher-vm1-j8jzn                            Container image "registry.redhat.io/container-native-virtualization/virt-launcher:v2.0.0-39" already present on machine
5m39s       Warning   Unhealthy                    pod/virt-launcher-vm1-j8jzn                            Readiness probe failed: cat: /var/run/kubevirt-infra/healthy: No such file or directory
5m37s       Warning   SyncFailed                   virtualmachineinstance/vm1                             server error. command SyncVMI failed: "LibvirtError(Code=1, Domain=10, Message='internal error: qemu unexpectedly closed the monitor: 2019-08-07T15:22:30.802547Z qemu-kvm: -drive file=/var/run/kubevirt-private/vmi-disks/rootdisk/disk.img,format=raw,if=none,id=drive-ua-rootdisk,cache=none: Could not open '/var/run/kubevirt-private/vmi-disks/rootdisk/disk.img': Permission denied')"
5m37s       Normal    Started                      virtualmachineinstance/vm1                             VirtualMachineInstance started.
37s         Normal    Created                      virtualmachineinstance/vm1                             VirtualMachineInstance defined.

Comment 1 sgott 2019-08-07 20:11:11 UTC
Assigning this to storage team as the disk is a dataVolume, and CDI sets the permissions for that.

Could it be that this bug is related to that?

Comment 2 Asher Shoshan 2019-08-19 13:19:04 UTC
*** Bug 1743250 has been marked as a duplicate of this bug. ***

Comment 3 Asher Shoshan 2019-08-19 13:20:58 UTC
This is happening with 2.1 as well.

Comment 4 Asher Shoshan 2019-08-19 13:31:07 UTC
*** Bug 1743248 has been marked as a duplicate of this bug. ***

Comment 5 Alexander Wels 2019-08-19 14:34:50 UTC
Can I get access to the system that is showing this? I have access to a different system, but neither the Ceph storage configuration nor the local storage configuration works there, which is what causes the permission denied error on that system. However, it appears that CDI completed successfully on the system where this is being reported. I would like to see the status of the data volume (complete YAML) and the status of the related PVC and PV as well.

Comment 6 Alexander Wels 2019-08-19 16:41:30 UTC
I did some more investigation, and I believe I have found the culprit. The version of CDI that is running appears to be setting the securityContext runAsUser:

    securityContext:
      runAsNonRoot: true
      runAsUser: 1001

This is causing the permission denied error when trying to write on certain provisioners, such as the Ceph provisioner. We fixed this in https://github.com/kubevirt/containerized-data-importer/pull/875 and backported it to the 1.9 branch in https://github.com/kubevirt/containerized-data-importer/pull/880

According to the log, the fix has been in since CDI 1.9.4, so the deployed version must be older than that.
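
For anyone checking their own cluster, here is a minimal sketch of how the securityContext setting quoted above could be spotted in a pod manifest. It assumes the manifest has already been loaded into a Python dict (for example from the JSON output of `oc get pod`). The function name `find_run_as_user` and the sample pod are hypothetical and not part of CDI, and this comment does not say whether the setting sits at pod level or container level, so the sketch looks at both.

```python
import json


def find_run_as_user(pod):
    """Return (location, uid) pairs for every runAsUser found in a pod manifest.

    Checks the pod-level securityContext and the securityContext of each
    init container and regular container.
    """
    spec = pod.get("spec") or {}
    found = []

    pod_ctx = spec.get("securityContext") or {}
    if "runAsUser" in pod_ctx:
        found.append(("pod", pod_ctx["runAsUser"]))

    for key in ("initContainers", "containers"):
        for container in spec.get(key) or []:
            ctx = container.get("securityContext") or {}
            if "runAsUser" in ctx:
                name = container.get("name", "?")
                found.append((f"{key}/{name}", ctx["runAsUser"]))

    return found


if __name__ == "__main__":
    # Hypothetical manifest, trimmed to the fields that matter here.
    sample = json.loads("""
    {
      "spec": {
        "securityContext": {"runAsNonRoot": true, "runAsUser": 1001},
        "containers": [{"name": "importer"}]
      }
    }
    """)
    for location, uid in find_run_as_user(sample):
        print(f"runAsUser={uid} set at {location}")
```

An empty result means no runAsUser is set anywhere in the manifest, which is what one would expect from a CDI build that contains the fix.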

Comment 9 Alexander Wels 2019-08-20 17:23:01 UTC
After looking through the PRs again, it turns out that the PR I thought fixed the issue did fix it for cloning and upload, but not for import, which is what we are seeing here. As a result, 1.10.0 did not include the fix. However, 1.10.1 did include the fix for the importer.
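
A rough sketch of checking a deployed CDI version against the release named above. It assumes plain `MAJOR.MINOR.PATCH` version strings, optionally prefixed with `v`, and that releases after 1.10.1 keep the importer fix. The names `parse_version` and `has_importer_fix` are made up for illustration. Versions on the 1.9 branch are reported as not fixed, because this thread only confirms the cloning and upload fix there.

```python
# First release stated in this bug to contain the importer fix.
IMPORTER_FIX = (1, 10, 1)


def parse_version(text):
    """Turn '1.10.1' or 'v1.10.1' into the tuple (1, 10, 1)."""
    return tuple(int(part) for part in text.strip().lstrip("v").split("."))


def has_importer_fix(version):
    """True if the version is 1.10.1 or later (assumption: later releases keep the fix)."""
    return parse_version(version) >= IMPORTER_FIX


if __name__ == "__main__":
    for candidate in ("1.9.4", "1.10.0", "1.10.1", "v1.11.0"):
        print(candidate, has_importer_fix(candidate))
```

The versions are compared as integer tuples rather than as strings, because a plain string comparison would rank "1.9.4" above "1.10.1".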

Comment 13 Kevin Alon Goldblatt 2019-09-23 13:07:46 UTC
Tested with HCO 4.2 CNV 2.1.0-47

Reproduced on rook-ceph-block with volumeMode Block as follows:

1. Created the template and only added the volumeMode as Block
2. Created a DataVolume
3. Created the VM, added the volumeMode as Block, and changed the running param to false
4. Started the VM with virtctl start vm1
5. Accessed the VM with virtctl console vm1 >>>>>>> VM is running and accessible

Moving to VERIFIED!

Yamls I used can be seen below:

Datavolume:
-----------------------------------------
apiVersion: cdi.kubevirt.io/v1alpha1
kind: DataVolume
metadata:
  name: first-rootdisk
spec:
  source:
    http:
      url: "http://cnv-qe-server.rhevdev.lab.eng.rdu2.redhat.com/files/rhel-images/rhel-8/rhel-8.qcow2"
  pvc:
    storageClassName: rook-ceph-block
    volumeMode: Block
    accessModes:
      - ReadWriteMany
    resources:
      requests:
        storage: 12Gi


VM:
--------------------------------------------
apiVersion: kubevirt.io/v1alpha3
kind: VirtualMachine
metadata:
  annotations:
    name.os.template.kubevirt.io/rhel7.6: Red Hat Enterprise Linux 7.6
  selfLink: /apis/kubevirt.io/v1alpha3/namespaces/sky/virtualmachines/vm1
  resourceVersion: '311256'
  name: vm1
  uid: 33ce2a4a-b926-11e9-a08c-98039b6185e8
  creationTimestamp: '2019-08-07T15:15:34Z'
  generation: 7
  namespace: sky
  labels:
    app: vm1
    flavor.template.kubevirt.io/large: 'true'
    os.template.kubevirt.io/rhel7.6: 'true'
    vm.kubevirt.io/template: first
    vm.kubevirt.io/template-namespace: sky
    workload.template.kubevirt.io/desktop: 'true'
spec:
  dataVolumeTemplates:
    - metadata:
        creationTimestamp: null
        name: vm1-first-rootdisk-clone
      spec:
        pvc:
          accessModes:
            - ReadWriteMany
          dataSource: null
          resources:
            requests:
              storage: 12Gi
          volumeMode: Block
          storageClassName: rook-ceph-block
        source:
          pvc:
            name: first-rootdisk
            namespace: sky
      status: {}
  running: false
  template:
    metadata:
      creationTimestamp: null
      labels:
        kubevirt.io/domain: vm1
        kubevirt.io/size: large
        vm.kubevirt.io/name: vm1
    spec:
      domain:
        cpu:
          cores: 1
          sockets: 2
          threads: 1
        devices:
          disks:
            - bootOrder: 1
              disk:
                bus: virtio
              name: rootdisk
          inputs:
            - bus: virtio
              name: tablet
              type: tablet
          interfaces:
            - bootOrder: 2
              masquerade: {}
              name: nic0
          rng: {}
        machine:
          type: ''
        resources:
          requests:
            memory: 2G
      evictionStrategy: LiveMigrate
      hostname: vm1
      networks:
        - name: nic0
          pod: {}
      terminationGracePeriodSeconds: 0
      volumes:
        - dataVolume:
            name: vm1-first-rootdisk-clone
          name: rootdisk
status:
  created: true
  ready: true






Template:
--------------------------------------------
kind: Template
apiVersion: template.openshift.io/v1
metadata:
  name: first
  namespace: sky
  selfLink: /apis/template.openshift.io/v1/namespaces/sky/templates/first
  uid: 27c1c366-b926-11e9-b865-0a580a81002e
  resourceVersion: '304547'
  creationTimestamp: '2019-08-07T15:15:13Z'
  labels:
    flavor.template.kubevirt.io/large: 'true'
    os.template.kubevirt.io/rhel7.6: 'true'
    template.kubevirt.io/type: vm
    vm.kubevirt.io/template: rhel7-desktop-large
    vm.kubevirt.io/template-namespace: openshift
    workload.template.kubevirt.io/desktop: 'true'
  annotations:
    name.os.template.kubevirt.io/rhel7.6: Red Hat Enterprise Linux 7.6
objects:
  - apiVersion: kubevirt.io/v1alpha3
    kind: VirtualMachine
    metadata:
      labels:
        app: '${NAME}'
        vm.kubevirt.io/template: rhel7-desktop-large
      name: '${NAME}'
    spec:
      template:
        metadata:
          labels:
            kubevirt.io/domain: '${NAME}'
            kubevirt.io/size: large
        spec:
          domain:
            cpu:
              cores: 1
              sockets: 2
              threads: 1
            devices:
              inputs:
                - bus: virtio
                  name: tablet
                  type: tablet
              rng: {}
              interfaces:
                - name: nic0
                  bootOrder: 2
                  masquerade: {}
              disks:
                - disk:
                    bus: virtio
                  name: rootdisk
                  bootOrder: 1
            resources:
              requests:
                memory: 2G
          evictionStrategy: LiveMigrate
          terminationGracePeriodSeconds: 0
          networks:
            - name: nic0
              pod: {}
          volumes:
            - name: rootdisk
              dataVolume:
                name: first-rootdisk
          hostname: '${NAME}'
      dataVolumeTemplates: []
parameters:
  - name: NAME
    description: Name for the new VM

