Description of problem:
RFE https://bugzilla.redhat.com/1723367 introduced the ability to connect an <interface> to a pre-created host tap/macvtap device. However, when the device is passed to libvirt together with an MTU, libvirt both sets the MTU on the host device and passes it to the guest. Since the device is pre-created, it already has the correct MTU, so libvirt shouldn't touch it.

Version-Release number of selected component (if applicable):

How reproducible:
100%

Steps to Reproduce:
1. Pre-create a tap device with MTU 2000.
2. Start a VM with the device. The device passed to libvirt should be "unmanaged" and contain "mtu". For example:

<interface type='ethernet'>
  <mac address='02:00:00:d0:03:54'/>
  <target dev='tap0' managed='no'/>
  <model type='virtio'/>
  <mtu size='1440'/>
  <alias name='ua-default'/>
  <rom enabled='no'/>
  <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
</interface>

Actual results:
Libvirt sets the MTU on the tap device and passes it to the guest. (The MTU of the tap was changed to 1440, and the MTU on the guest NIC is 1440.)

Expected results:
Libvirt shouldn't set the MTU on the tap/macvtap device. It should just pass the MTU to the guest. (The MTU of the tap device should be untouched and stay 2000, and the MTU on the guest NIC should be 1440.)

Additional info:
The bug is a blocker for CNV, since we run the VM in a pod with no net-admin capability. We have a pre-created tap device that already has the correct MTU, but when passing the device to libvirt we get the following error:

virError(Code=38, Domain=0, Message='Cannot set interface MTU on 'tap0': Operation not permitted')

We cannot omit the MTU option from the XML, since we need libvirt to pass the MTU to the guest.
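To make the expected behavior concrete, here is a minimal Python sketch of the decision being requested. It is not libvirt's code: the function name `plan_mtu_handling` and the returned dictionary are hypothetical, and treating a missing `managed` attribute as "managed" is an assumption about libvirt's default.

```python
import xml.etree.ElementTree as ET

# Interface definition copied from the report above.
EXAMPLE_XML = """
<interface type='ethernet'>
  <mac address='02:00:00:d0:03:54'/>
  <target dev='tap0' managed='no'/>
  <model type='virtio'/>
  <mtu size='1440'/>
  <alias name='ua-default'/>
  <rom enabled='no'/>
  <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
</interface>
"""


def plan_mtu_handling(interface_xml: str) -> dict:
    """Describe what the report asks libvirt to do with <mtu> (hypothetical helper)."""
    iface = ET.fromstring(interface_xml.strip())
    target = iface.find("target")
    mtu = iface.find("mtu")

    # Assumption: a device is managed unless the XML says managed='no'.
    managed = target is None or target.get("managed", "yes") != "no"
    guest_mtu = int(mtu.get("size")) if mtu is not None else None

    return {
        "dev": target.get("dev") if target is not None else None,
        # The MTU is always advertised to the guest when <mtu> is present.
        "guest_mtu": guest_mtu,
        # The host device is only touched when libvirt manages it.
        "set_host_mtu": managed and guest_mtu is not None,
    }


if __name__ == "__main__":
    print(plan_mtu_handling(EXAMPLE_XML))
```

For the XML in the report, the sketch keeps the guest MTU at 1440 and leaves the host tap device alone, which is the "Expected results" case.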
This sounds like a reasonable request. I'll look into making a patch for it. (I thought I recalled that if the tap device was already set to the correct MTU, then this error wouldn't occur. Is there a specific reason why you want the device to have a larger MTU than the NIC in the guest? I'm not sure about this specific situation, but mismatched MTUs can sometimes lead to "bad things" happening (in particular, one end thinks the MTU is larger, so it tries to send larger packets w/o fragmenting, but those packets are dropped at the other end because they're too big))
We have the same MTU on the NIC and the tap device. The example in the description was just to illustrate the issue (that libvirt shouldn't touch the MTU of the unmanaged device).

In CNV we run the VM (and libvirt) inside a pod with no net_admin capability. We have a pre-created tap device that already has the correct MTU (the same MTU we pass in the XML of the tap device to libvirt). But when passing the device to libvirt we get the following error:

virError(Code=38, Domain=0, Message='Cannot set interface MTU on 'tap0': Operation not permitted')

We cannot omit the MTU option from the XML, since we need libvirt to pass the MTU to the guest. So for an unmanaged tap/macvtap device we expect libvirt not to touch the MTU of the tap device, but just to pass the MTU to the guest.
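The setup described above (the tap device already carries the MTU that is written into the XML) can be checked without any privileges by reading the device's MTU from sysfs. The sketch below is only an illustration: `host_mtu` and `mtu_matches` are hypothetical helper names, and the `sysfs_root` parameter exists only so the functions can be pointed at a test directory.

```python
from pathlib import Path


def host_mtu(dev: str, sysfs_root: str = "/sys/class/net") -> int:
    """Read the current MTU of a host network device from sysfs (read-only)."""
    return int((Path(sysfs_root) / dev / "mtu").read_text().strip())


def mtu_matches(dev: str, wanted: int, sysfs_root: str = "/sys/class/net") -> bool:
    """Return True if the pre-created device already has the MTU used in the XML."""
    return host_mtu(dev, sysfs_root) == wanted


if __name__ == "__main__":
    # "tap0" is the device name from the report; it must exist on the host.
    print(mtu_matches("tap0", 1440))
```

Reading the value needs no net_admin capability. Only changing it does, which is why the pod fails when libvirt tries to set it.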
Any news about this? Can I help in any way? I fully understand the holidays were in between, but we (CNV) are very keen on a fix for this.
Yeah, sorry. This was on my list to get to after the holidays, but I hadn't gotten there yet. It's actually a very simple fix - I just sent a patch upstream for it: https://www.redhat.com/archives/libvir-list/2021-January/msg00634.html
Thanks a lot, Laine.

Laine, is passing the MTU from libvirt to the *guest* OS (as you wrote in your patch, to tell the emulated device what MTU to set on the other end of the tap) supported for all the interface drivers (e.g. virtio, e1000) and all OSes? (For example, I see it doesn't work for a CirrOS guest but works for Fedora.)
(In reply to Alona Kaplan from comment #5)
> Is passing the MTU from libvirt to the *guest* OS (as you wrote in your
> patch, to tell the emulated device what MTU to set on the other end of the
> tap) supported for all the interface drivers (e.g. virtio, e1000) and all
> OSes? (For example, I see it doesn't work for a CirrOS guest but works for
> Fedora.)

It only works for virtio-net (no e1000 or anything else), and only on guest OSes with a virtio-net driver new enough to understand the data passed into the guest via a PCI config register in the emulated device. AFAIK, this has only been supported on Linux for the last year or two, and not by any other OS.
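One way to see from inside a Linux guest whether its virtio-net driver picked up the MTU is to look at the negotiated feature bits. The following is a rough sketch under stated assumptions, not something taken from the patch: it assumes the guest exposes the virtio device's `features` attribute in sysfs as a string of '0'/'1' characters with bit 0 first (for example under /sys/class/net/<iface>/device/features), and it relies on the virtio specification's VIRTIO_NET_F_MTU feature bit, number 3. The function name is hypothetical.

```python
VIRTIO_NET_F_MTU = 3  # feature bit number defined by the virtio specification


def negotiated_mtu_feature(features: str) -> bool:
    """Check a sysfs-style feature bit string for VIRTIO_NET_F_MTU.

    Assumption: the string lists bits starting with bit 0, one character
    ('0' or '1') per feature bit.
    """
    bits = features.strip()
    return len(bits) > VIRTIO_NET_F_MTU and bits[VIRTIO_NET_F_MTU] == "1"


if __name__ == "__main__":
    # Hypothetical path; "enp1s0" is the guest interface name seen in this bug.
    with open("/sys/class/net/enp1s0/device/features") as f:
        print(negotiated_mtu_feature(f.read()))
```

If the bit is not set, the guest driver did not negotiate the feature, and the MTU in the domain XML would not be expected to show up on the guest NIC.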
This is now upstream, but unfortunately not in the 7.0.0 release:

commit 3bb87556b8ab010e5b808ac6775af7c10ea3d05d
Author: Laine Stump <laine>
Date:   Tue Jan 12 14:10:05 2021 -0500

    qemu: don't set interface MTU when managed='no'
Reproduce the issue on libvirt-6.10.0-1.module+el8.4.0+8898+a84e86e1.x86_64:

1. Create a tap device and set its MTU to '2000':

# ip tuntap add mode tap name mytap0
# ip link set dev mytap0 mtu 2000
# ip l show mytap0
33: mytap0: <BROADCAST,MULTICAST> mtu 2000 qdisc noqueue state DOWN mode DEFAULT group default qlen 1000
    link/ether 42:d4:c3:77:3b:09 brd ff:ff:ff:ff:ff:ff

2. Start a VM with the tap device above:

# virsh dumpxml rhel | grep /interface -B6
<interface type='ethernet'>
  <mac address='52:54:00:93:0f:bc'/>
  <target dev='mytap0' managed='no'/>
  <model type='virtio'/>
  <mtu size='1400'/>
  <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
</interface>

# virsh start rhel
Domain rhel started

3. Check the MTU of the tap device on the host and of the interface in the guest; both have changed to '1400':

# ip l show mytap0
33: mytap0: <BROADCAST,MULTICAST> mtu 1400 qdisc noqueue state DOWN mode DEFAULT group default qlen 1000
    link/ether 42:d4:c3:77:3b:09 brd ff:ff:ff:ff:ff:ff

[root@new_guest ~]# ip l show | grep -v lo
2: enp1s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1400 qdisc fq_codel state UP mode DEFAULT group default qlen 1000
    link/ether 52:54:00:93:0f:bc brd ff:ff:ff:ff:ff:ff

For an unprivileged user this cannot happen, because the VM cannot start with the MTU setting:

# ip tuntap add mode tap user test group test name mytap0
# ip link set dev mytap0 mtu 2000
# ip l show mytap0

$ virsh dumpxml vm1 | grep /interface -B6
<interface type='ethernet'>
  <mac address='52:54:00:93:0f:bc'/>
  <target dev='mytap0' managed='no'/>
  <model type='virtio'/>
  <mtu size='1440'/>
  <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
</interface>

$ virsh start vm1
error: Failed to start domain vm1
error: Cannot set interface MTU on 'mytap0': Operation not permitted
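The before/after comparison in the steps above comes down to reading the `mtu` field from `ip l show` output. A small Python sketch of that parsing follows. The helper name `parse_mtu` is hypothetical, and the sample lines are copied from this comment.

```python
import re

# Output lines copied from the reproduction steps above.
BEFORE = ("33: mytap0: <BROADCAST,MULTICAST> mtu 2000 qdisc noqueue state DOWN "
          "mode DEFAULT group default qlen 1000")
AFTER = ("33: mytap0: <BROADCAST,MULTICAST> mtu 1400 qdisc noqueue state DOWN "
         "mode DEFAULT group default qlen 1000")


def parse_mtu(ip_link_output: str) -> int:
    """Extract the first 'mtu <n>' value from `ip link show` style output."""
    match = re.search(r"\bmtu (\d+)\b", ip_link_output)
    if match is None:
        raise ValueError("no mtu field found in ip link output")
    return int(match.group(1))


def host_mtu_changed(before: str, after: str) -> bool:
    """True if the host device MTU differs between the two outputs."""
    return parse_mtu(before) != parse_mtu(after)


if __name__ == "__main__":
    print(parse_mtu(BEFORE), parse_mtu(AFTER), host_mtu_changed(BEFORE, AFTER))
```

On the unfixed build the two outputs differ, which is the bug. On a fixed build the same check should report no change for an unmanaged device.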
Test on libvirt-7.0.0-2.module+el8.4.0+9520+ef609c5f.x86_64

1. Prepare the tap device with MTU '2000':

# ip l show mytap0
1732: mytap0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 2000 qdisc fq_codel master br0 state UP mode DEFAULT group default qlen 1000
    link/ether ae:14:31:88:b3:ab brd ff:ff:ff:ff:ff:ff

2. Start the VM with the MTU set to '1400':

# virsh dumpxml test | grep /interface -B6
<interface type='network'>
  <mac address='52:54:00:58:ff:ab'/>
  <source network='net'/>
  <model type='virtio'/>
  <mtu size='1400'/>
  <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
</interface>

# virsh start test
Domain 'test' started

3. Check the MTU of the tap device; it is not changed by libvirt and stays at '2000':

# ip l show mytap0
1732: mytap0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 2000 qdisc fq_codel master br0 state UP mode DEFAULT group default qlen 1000
    link/ether ae:14:31:88:b3:ab brd ff:ff:ff:ff:ff:ff

4. Check the MTU in the VM; it is 1400, as set in the XML:

# ip l | grep -v lo
2: enp1s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1400 qdisc fq_codel state UP mode DEFAULT group default qlen 1000
    link/ether 52:54:00:58:ff:ab brd ff:ff:ff:ff:ff:ff

Scenario 2: set the MTU in the interface larger than that of the tap device

# ip l show mytap0
1732: mytap0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 2000 qdisc fq_codel master br0 state DOWN mode DEFAULT group default qlen 1000
    link/ether ae:14:31:88:b3:ab brd ff:ff:ff:ff:ff:ff

# virsh dumpxml test | grep /interface -B6
<interface type='network'>
  <mac address='52:54:00:58:ff:ab'/>
  <source network='net'/>
  <model type='virtio'/>
  <mtu size='3000'/>
  <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
</interface>

Start the VM and check the tap device again:

# ip l show mytap0
1732: mytap0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 2000 qdisc fq_codel master br0 state DOWN mode DEFAULT group default qlen 1000
    link/ether ae:14:31:88:b3:ab brd ff:ff:ff:ff:ff:ff

Check on the guest:

# ip l show enp1s0
2: enp1s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 3000 qdisc fq_codel state UP mode DEFAULT group default qlen 1000
    link/ether 52:54:00:58:ff:ab brd ff:ff:ff:ff:ff:ff

The result is as expected; setting the bug to verified.
Can we have this fix backported to RHEL-AV-8.3.0 (and the bug cloned)? CNV's next release will unfortunately use that instead (correct me if I'm wrong, @phoracek), and we'd like to have the fix asap.
Yes, the next release of OpenShift Virtualization will be released on 8.3. We'd appreciate having the fix in 8.3, so we could verify whether it really covers our use-case completely. If we find out there are more issues blocking us on the libvirt side, we'd have enough time to open more RFEs.
(In reply to Petr Horáček from comment #20)
> Yes, the next release of OpenShift Virtualization will be released on 8.3.
>
> We'd appreciate having the fix in 8.3, so we could verify whether it really
> covers our use-case completely. If we find out there are more issues
> blocking us on the libvirt side, we'd have enough time to open more RFEs.

@laine is this doable?
Backporting the patch is trivial, it will take 5 minutes, and the chance of regression is essentially 0. I can never remember whether we're supposed to add the "ZStream" keyword, or set the zstream=? flag in order to request z-stream; I will ask on IRC and make the necessary change. When you say "8.3", do you mean 8.3.1.z, or 8.3.0.z? AFAIU if you need 8.3.0.z then 8.3.1.z will need to be done first.
(In reply to Laine Stump from comment #22)
> Backporting the patch is trivial, it will take 5 minutes, and the chance of
> regression is essentially 0. I can never remember whether we're supposed to
> add the "ZStream" keyword, or set the zstream=? flag in order to request
> z-stream; I will ask on IRC and make the necessary change.
>
> When you say "8.3", do you mean 8.3.1.z, or 8.3.0.z? AFAIU if you need
> 8.3.0.z then 8.3.1.z will need to be done first.

Yes, they need 8.3.0.z, but as you point out, we need to backport into 8.3.1.z too.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (virt:av bug fix and enhancement update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:2098