Bug 1884536 - VMs with VCPU=1 created from 2.4 templates will stop working in 2.5
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Container Native Virtualization (CNV)
Classification: Red Hat
Component: Networking
Version: 2.5.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 2.5.0
Assignee: Miguel Duarte Barroso
QA Contact: Ofir Nash
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-10-02 08:48 UTC by Karel Šimon
Modified: 2020-11-17 13:25 UTC (History)
CC List: 9 users

Fixed In Version: virt-launcher-container-v2.5.0-78
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-11-17 13:24:55 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | kubevirt kubevirt pull 4300 | 0 | None | closed | tap-device, multi-queue: enforce single-queue tap | 2020-11-16 16:13:44 UTC
Github | kubevirt kubevirt pull 4327 | 0 | None | closed | [release-0.34] tap-device, multi-queue: enforce single-queue tap | 2020-11-16 16:13:44 UTC
Red Hat Product Errata | RHEA-2020:5127 | 0 | None | None | None | 2020-11-17 13:25:06 UTC

Description Karel Šimon 2020-10-02 08:48:26 UTC
Description of problem:
VMs created from 2.4 common-templates stop working in CNV 2.5 with the following error: server error. command SyncVMI failed: "LibvirtError(Code=38, Domain=0, Message='Unable to create tap device tap0: Invalid argument')"
All information and discussion can be found here: https://github.com/kubevirt/kubevirt/issues/4227

Version-Release number of selected component (if applicable):
2.5.0

Comment 5 Petr Horáček 2020-10-08 10:45:51 UTC
The issue is fixed on the master branch. The backport is awaiting approval: https://github.com/kubevirt/kubevirt/pull/4327

Comment 6 Yossi Segev 2020-10-13 13:33:05 UTC
Verified on CNV 2.5.0 with the following scenario:
1. Create a DV:
$ cat << EOF| oc apply -f -
> apiVersion: cdi.kubevirt.io/v1alpha1
> kind: DataVolume
> metadata:
>   name: fedora-dv
> spec:
>   source:
>       http:
>          url: "http://cnv-qe-server.rhevdev.lab.eng.rdu2.redhat.com/files/cnv-tests/fedora-images/Fedora-Cloud-Base-32-1.6.x86_64.qcow2"
>   pvc:
>     storageClassName: ocs-storagecluster-ceph-rbd
>     volumeMode: Block
>     accessModes:
>      - ReadWriteMany
>     resources:
>       requests:
>         storage: 15Gi
> EOF
datavolume.cdi.kubevirt.io/fedora-dv created

2. Change the PVC ownerReferences from DataVolume to PersistentVolumeClaim (a workaround for https://bugzilla.redhat.com/show_bug.cgi?id=1881658)
$ oc edit pvc fedora-dv

    ownerReferences:
    - apiVersion: cdi.kubevirt.io/v1beta1
      blockOwnerDeletion: true
      controller: true
      kind: DataVolume -> PersistentVolumeClaim
      name: fedora-dv
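
The same change can also be made without an interactive editor. This is a minimal sketch, assuming the DataVolume is the first (and only) entry in ownerReferences; `owner_kind_patch` is a hypothetical helper name introduced here for illustration, and it only builds the JSON Patch document that `oc patch` would receive.

```shell
#!/bin/sh
# Hypothetical helper: prints a JSON Patch (RFC 6902) document that sets the
# "kind" of the first ownerReferences entry to the value passed as $1.
owner_kind_patch() {
    printf '[{"op":"replace","path":"/metadata/ownerReferences/0/kind","value":"%s"}]\n' "$1"
}

# Usage (assumption: index 0 is the DataVolume owner reference):
#   oc patch pvc fedora-dv --type=json -p "$(owner_kind_patch PersistentVolumeClaim)"
```

If the PVC has more than one owner reference, the index in the path would need to be adjusted by hand.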

3. Create a template and a VM out of it.
Note that the template used in this command (fedora-server-tiny-v0.11.3) can be found by running "oc -n openshift get templates".
$ oc process -n openshift fedora-server-tiny-v0.11.3 -p NAME="vm-from-template" -p PVCNAME="fedora-dv" -p CLOUD_USER_PASSWORD="123456" | oc create -n yoss-ns -f -
virtualmachine.kubevirt.io/vm-from-template created

4. Start the VM.
$ virtctl start vm-from-template
VM vm-from-template was scheduled to start

5. Verify that the error from the bug description (LibvirtError(Code=38, Domain=0, Message='Unable to create tap device tap0: Invalid argument')) does not appear in the "oc describe vmi" output, the virt-launcher log, or the virt-handler log:
 a. $ oc describe vmi vm-from-template | grep LibvirtError
 b. $ oc logs virt-launcher-vm-from-template-6xcpt -c compute | grep Unable
 c. $ oc logs virt-handler-h7vl6 -n openshift-cnv | grep Unable
   (virt-handler-h7vl6 is the virt-handler pod running on the node where the VMI was scheduled).
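
The three greps above can be folded into a single check. This is a minimal sketch; `has_tap_error` is a hypothetical helper name (not part of oc or virtctl), and the pod names in the usage comments are the ones from this verification run and will differ elsewhere.

```shell
#!/bin/sh
# Hypothetical helper: reads text on stdin and exits 0 if it contains the
# tap-device error from the bug description, non-zero otherwise.
has_tap_error() {
    grep -q "Unable to create tap device"
}

# Usage against the resources from this run:
#   oc describe vmi vm-from-template | has_tap_error && echo "bug reproduced"
#   oc logs virt-launcher-vm-from-template-6xcpt -c compute | has_tap_error && echo "bug reproduced"
#   oc logs virt-handler-h7vl6 -n openshift-cnv | has_tap_error && echo "bug reproduced"
```

On a fixed build, none of the three commands should print "bug reproduced".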

6. Verify that a tap device exists on each of the VM's interfaces (only eth0 in this case), via the VMI's domxml:
 a. Find the ID of the VMI domain:
 $ oc exec -it virt-launcher-vm-from-template-6xcpt -- virsh list
 Id   Name                       State
------------------------------------------
 1    yoss-ns_vm-from-template   running

 b. Dump the domxml for this domain (which is "1" in this example)
 $ oc exec -it virt-launcher-vm-from-template-6xcpt -- virsh dumpxml 1 | less

 c. Search for the ethernet entries; each should have a tap device defined in it:
    <interface type='ethernet'>
      <mac address='02:00:00:dd:e9:16'/>
      <target dev='tap0' managed='no'/>
      <model type='virtio'/>
      <driver name='vhost'/>
      <mtu size='1400'/>
      <alias name='ua-default'/>
      <rom enabled='no'/>
      <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
    </interface>
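
The manual search through the domxml can be approximated with a small filter. This is a sketch under stated assumptions: `all_ethernet_have_tap` is a hypothetical helper name, and the awk script is line-oriented, so it relies on the one-element-per-line layout that virsh dumpxml prints rather than on real XML parsing.

```shell
#!/bin/sh
# Hypothetical helper: reads libvirt domain XML on stdin and exits 0 only if
# at least one <interface type='ethernet'> element is present and every such
# element contains a <target dev='tapN' .../> line.
all_ethernet_have_tap() {
    awk '
        # "." stands in for the quote character around attribute values.
        /<interface type=.ethernet.>/        { in_if = 1; has_tap = 0; seen++ }
        in_if && /<target dev=.tap[0-9]+./   { has_tap = 1 }
        in_if && /<\/interface>/             { if (!has_tap) missing++; in_if = 0 }
        END { exit !(seen > 0 && missing == 0) }
    '
}

# Usage (pod name and domain ID are the ones from this run):
#   oc exec virt-launcher-vm-from-template-6xcpt -- virsh dumpxml 1 | all_ethernet_have_tap && echo "tap devices present"
```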


Thank you Ruty for your assistance in reproducing and verifying this.

Comment 7 Yossi Segev 2020-10-13 13:33:33 UTC
Verified by CNV Network QE.

Comment 10 errata-xmlrpc 2020-11-17 13:24:55 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Virtualization 2.5.0 Images), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2020:5127

