1294677 – Reboot with SR-IOV devices confuses IOMMU

Status:	CLOSED DEFERRED
Alias:	None
Product:	Virtualization Tools
Classification:	Community
Component:	libvirt
Version:	unspecified
Hardware:	x86_64
OS:	Linux
Priority:	unspecified
Severity:	unspecified
Target Milestone:	---
Assignee:	Libvirt Maintainers

Reported:	2015-12-29 15:35 UTC by jakub.kicinski
Modified:	2020-11-03 17:13 UTC
Last Closed:	2020-11-03 17:13:04 UTC

Description jakub.kicinski 2015-12-29 15:35:22 UTC

Description of problem:

We hit an issue with PCI passthrough: after the VM is rebooted, the IOMMU mappings are incorrect and devices access invalid memory.

The issue is quite easy to reproduce. We were using NICs (Intel 40 Gigabit Ethernet NICs hit the issue reliably), but you could probably pass through any PCI device that does a bit of DMA.


How reproducible:

100%


Steps to Reproduce:

With NICs we do the following to reproduce:
 - configure the NIC for SR-IOV passthrough [1];
 - create two standard VMs;
 - configure VMs with 4GB current allocation and 15GB maximum allocation of memory (my machines have 32 or 64GB total);
 - pass a VF to each machine;

Note1: the current/maximum memory allocation seems to play a role here.  I'm not 100% sure, however, whether it causes the bug or just makes it more likely to trigger.
Note2: we leave <on_reboot>restart</on_reboot> so that VMs can reboot.
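
For illustration only, the setup above could be expressed as a partial libvirt domain definition. The sketch below is not taken from the affected machines: the helper name build_domain_fragment, the use of <interface type='hostdev'> for the VF, and the PCI address are all assumptions, and a real domain needs many more elements (name, os, disks and so on).

```python
import xml.etree.ElementTree as ET


def build_domain_fragment(vf_bus: int, vf_slot: int, vf_function: int) -> ET.Element:
    """Build a partial libvirt <domain> element mirroring the setup above.

    Hypothetical helper: the VF's host PCI address is supplied by the caller.
    """
    domain = ET.Element("domain", {"type": "kvm"})

    # 15 GiB maximum allocation, 4 GiB current allocation.
    ET.SubElement(domain, "memory", {"unit": "GiB"}).text = "15"
    ET.SubElement(domain, "currentMemory", {"unit": "GiB"}).text = "4"

    # Leave reboot handling at "restart" so the guest can reboot in place.
    ET.SubElement(domain, "on_reboot").text = "restart"

    # Pass one VF to the guest (assumed form; the address is a placeholder).
    devices = ET.SubElement(domain, "devices")
    iface = ET.SubElement(devices, "interface", {"type": "hostdev", "managed": "yes"})
    source = ET.SubElement(iface, "source")
    ET.SubElement(source, "address", {
        "type": "pci",
        "domain": "0x0000",
        "bus": f"0x{vf_bus:02x}",
        "slot": f"0x{vf_slot:02x}",
        "function": f"0x{vf_function:x}",
    })
    return domain


if __name__ == "__main__":
    fragment = build_domain_fragment(vf_bus=0x03, vf_slot=0x10, vf_function=0x0)
    print(ET.tostring(fragment, encoding="unicode"))
```

The gap between currentMemory and memory is the part Note1 suspects of mattering, so it is worth keeping both values explicit when trying to reproduce.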

I was able to reproduce easily on 3 distinct machines (dual CPU Haswell E, single CPU Haswell E, single Sandy Bridge EP).

With the VMs created above do the following:
 (1) boot;
 (2) configure VF interfaces;
 (3) run ping -c30 to confirm they can communicate;
 (4) run iperf -P4 -t30 between the machines;
 (5) reboot;
 (6) goto 2;
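
Steps (2) to (4) could be scripted inside each guest roughly as follows. This is only a sketch of one iteration, not the script we actually ran: the interface name, the addresses and the helper names are hypothetical, and it assumes the peer VM is already running "iperf -s".

```python
import subprocess
from typing import Callable, List


def cycle_commands(iface: str, local_cidr: str, peer_ip: str) -> List[List[str]]:
    """Return the commands for steps (2)-(4) of one boot cycle."""
    return [
        # (2) configure the VF interface (placeholder name and address).
        ["ip", "addr", "add", local_cidr, "dev", iface],
        ["ip", "link", "set", "dev", iface, "up"],
        # (3) confirm basic connectivity.
        ["ping", "-c30", peer_ip],
        # (4) push traffic with four parallel streams for 30 seconds.
        ["iperf", "-c", peer_ip, "-P4", "-t30"],
    ]


def run_cycle(commands: List[List[str]], runner: Callable = subprocess.run) -> bool:
    """Run each command in order; return False at the first failure."""
    for argv in commands:
        if runner(argv).returncode != 0:
            return False
    return True


if __name__ == "__main__":
    ok = run_cycle(cycle_commands("eth1", "192.0.2.1/24", "192.0.2.2"))
    print("cycle passed" if ok else "cycle failed")
```

After a passing cycle the guest is rebooted (step 5) and the same cycle is run again.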


Results:

The first time (fresh after boot), ping and iperf should work fine.  After the first reboot, there should already be communication problems.  Traffic inspection with tcpdump suggests that the VFs receive zeroed packets.  Only some of the packets are zeroed, so depending on your luck the communication may work for a while.  It usually breaks down when an ARP packet or an important TCP segment is placed in an area that the device reads as zero.  A reboot does not fix this condition; a shutdown followed by a start does.
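
One way to quantify the zeroed packets, sketched here as an assumption rather than the method used for this report, is to extract packet payloads from a capture and count those consisting entirely of zero bytes. How the payloads are obtained (for example from a pcap file written by tcpdump) is left out, and the function names are hypothetical.

```python
from typing import Iterable


def is_zeroed(payload: bytes) -> bool:
    """Return True if the payload is non-empty and every byte is zero."""
    return len(payload) > 0 and payload.count(0) == len(payload)


def zeroed_fraction(payloads: Iterable[bytes]) -> float:
    """Return the fraction of payloads that are entirely zero."""
    items = list(payloads)
    if not items:
        return 0.0
    return sum(is_zeroed(p) for p in items) / len(items)


if __name__ == "__main__":
    # Synthetic payloads for illustration only, not captured data.
    sample = [bytes(64), b"\x08\x00\x45\x00", bytes(128), b"hello"]
    print(f"zeroed fraction: {zeroed_fraction(sample):.2f}")
```

A fraction above zero after a reboot and zero after a fresh start would match the behaviour described above.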


System info:

Intel NIC i40e (configured for SR-IOV)
Machine with any of the following CPU configurations: dual CPU Haswell E, single CPU Haswell E, single Sandy Bridge EP
32-64GB RAM
Linux Ubuntu 14.04 LTS
Linux kernel linux-next.git 4ef76753
Qemu git 38a762fe
(libvirt 1.2.2)

Comment 1 jakub.kicinski 2016-01-04 21:05:36 UTC

Reproduced easily on a fully up-to-date CentOS 7 with ixgbe (Intel's 10 Gigabit cards):

CentOS Linux release 7.2.1511 (Core)

# rpm -q libvirt qemu-kvm kernel
libvirt-1.2.17-13.el7_2.2.x86_64
qemu-kvm-1.5.3-105.el7_2.1.x86_64
kernel-3.10.0-327.el7.x86_64
kernel-3.10.0-327.3.1.el7.x86_64

Ivy Bridge CPU

Comment 2 Laine Stump 2016-01-04 22:32:59 UTC

Postings to libvir-list that may (or may not) contain more information:

 https://www.redhat.com/archives/libvir-list/2015-December/msg00917.html
 https://www.redhat.com/archives/libvir-list/2016-January/msg00040.html

Comment 3 Daniel Berrangé 2020-11-03 17:13:04 UTC

Thank you for reporting this issue to the libvirt project. Unfortunately we have been unable to resolve this issue due to insufficient maintainer capacity and it will now be closed. This is not a reflection on the possible validity of the issue, merely the lack of resources to investigate and address it, for which we apologise. If you nonetheless feel the issue is still important, you may choose to report it again at the new project issue tracker (https://gitlab.com/libvirt/libvirt/-/issues). The project also welcomes contributions from anyone who believes they can provide a solution.
