Bug 959191

Summary: On start(restart) libvirt stops existing domains with PCI passthrough devices
Product: [Community] Virtualization Tools
Reporter: Vlastimil Holer <vlastimil.holer>
Component: libvirt
Assignee: Laine Stump <laine>
Status: CLOSED NEXTRELEASE
Severity: high
Priority: unspecified
Version: unspecified
CC: dallan, dyasny, eblake, jdenemar, vlastimil.holer
Hardware: Unspecified
OS: Unspecified
Doc Type: Bug Fix
Last Closed: 2013-05-31 19:26:55 UTC
Type: Bug
Attachments:
Domain with PCI passthrough dev. (flags: none)
Libvirt (start only) and QEMU debug logs (flags: none)

Description Vlastimil Holer 2013-05-03 10:19:44 UTC
Created attachment 743145 [details]
Domain with PCI passthrough dev.

Description of problem:
If I have a running domain (either persistent or transient) with a PCI passthrough device, libvirt cannot be restarted safely. On the next libvirt start, it immediately terminates these running domains. If a domain doesn't contain a PCI passthrough device, everything works fine and the domain is left untouched.

Version-Release number of selected component (if applicable):
libvirt 1.0.5 (from Debian experimental)
QEMU 1.4.1 (from Debian experimental)
kernel 3.2.41

Steps to Reproduce:
1. start domain with PCI passthrough device
2. stop libvirt
3. start libvirt
  
Actual results:
Domain with PCI passthrough device is stopped on libvirt start.

Expected results:
Domain remains running.
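
The check behind the steps and expected result above can be sketched as a small script. This is only an illustration, not something from the original report: it assumes `virsh list --name` is available in the installed virsh, and the helper names (`parse_domain_names`, `running_domains`, `killed_by_restart`) are made up here.

```python
import subprocess


def parse_domain_names(virsh_output: str) -> set:
    """Turn the output of `virsh list --name` into a set of domain names."""
    return {line.strip() for line in virsh_output.splitlines() if line.strip()}


def running_domains() -> set:
    """Ask virsh for the currently running domains (assumes virsh is on PATH)."""
    result = subprocess.run(
        ["virsh", "list", "--name"],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_domain_names(result.stdout)


def killed_by_restart(before: set, after: set) -> list:
    """Domains that were running before the libvirt restart but not after it."""
    return sorted(before - after)
```

Call `running_domains()` once before stopping libvirt and once after starting it again; with the bug present, `killed_by_restart(before, after)` would be expected to list the domains that have a PCI passthrough device.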

Additional info:
I'm attaching the domain XML definition for the failing domain, and libvirt daemon debug logs taken with running domains: 1) with a PCI device, 2) without a PCI device.

Comment 1 Vlastimil Holer 2013-05-03 10:32:37 UTC
Created attachment 743150 [details]
Libvirt (start only) and QEMU debug logs

Comment 2 Vlastimil Holer 2013-05-03 10:41:33 UTC
Tested with two different PCI devices; the domain is always terminated.

<device>
  <name>pci_0000_03_00_0</name>
  <parent>pci_0000_00_02_0</parent>
  <capability type='pci'>
    <domain>0</domain>
    <bus>3</bus>
    <slot>0</slot>
    <function>0</function>
    <product id='0x1003'>MT27500 Family [ConnectX-3]</product>
    <vendor id='0x15b3'>Mellanox Technologies</vendor>
  </capability>
</device>

<device>
  <name>pci_0000_00_1f_2</name>
  <parent>computer</parent>
  <driver>
    <name>ata_piix</name>
  </driver>
  <capability type='pci'>
    <domain>0</domain>
    <bus>0</bus>
    <slot>31</slot>
    <function>2</function>
    <product id='0x1d00'>C600/X79 series chipset 4-Port SATA IDE Controller</product>
    <vendor id='0x8086'>Intel Corporation</vendor>
  </capability>
</device>
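
For reference, the decimal domain/bus/slot/function values in node device XML like the above map onto the hexadecimal `<address>` used in a domain's `<hostdev>` element. The following is a minimal sketch of that conversion using only the Python standard library; the function name `hostdev_from_nodedev` is hypothetical, and the attached domain XML may differ in details such as the `managed` attribute.

```python
import xml.etree.ElementTree as ET

# Second device from this comment, trimmed to the fields the sketch needs.
SATA_CONTROLLER = """
<device>
  <name>pci_0000_00_1f_2</name>
  <capability type='pci'>
    <domain>0</domain>
    <bus>0</bus>
    <slot>31</slot>
    <function>2</function>
  </capability>
</device>
"""


def hostdev_from_nodedev(nodedev_xml: str, managed: bool = True) -> str:
    """Build a <hostdev> snippet from node device XML (illustrative helper)."""
    root = ET.fromstring(nodedev_xml)
    cap = root.find("capability[@type='pci']")
    if cap is None:
        raise ValueError("node device has no PCI capability")

    # Node device XML reports these values in decimal.
    domain = int(cap.findtext("domain"))
    bus = int(cap.findtext("bus"))
    slot = int(cap.findtext("slot"))
    function = int(cap.findtext("function"))

    managed_attr = "yes" if managed else "no"
    return (
        f"<hostdev mode='subsystem' type='pci' managed='{managed_attr}'>\n"
        f"  <source>\n"
        f"    <address domain='0x{domain:04x}' bus='0x{bus:02x}' "
        f"slot='0x{slot:02x}' function='0x{function:x}'/>\n"
        f"  </source>\n"
        f"</hostdev>"
    )


print(hostdev_from_nodedev(SATA_CONTROLLER))
```

The hexadecimal fields line up with the device name: slot 31 becomes `0x1f` and function 2 becomes `0x2`, matching `pci_0000_00_1f_2`.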

Comment 3 Jiri Denemark 2013-05-23 20:57:28 UTC
Logs suggest it might be related to nwfilter as that is the last thing done before the domain gets killed. And the result of nwfilter commands differs between the two logs. What happens if you remove

<filterref filter="clean-traffic">
  <parameter name="IP" value="147.251.254.121"/>
</filterref>

from the domain definition? Is it still terminated when you restart libvirtd?

Comment 4 Vlastimil Holer 2013-05-30 12:10:36 UTC
Sorry for the delay. No, that doesn't help either (v1.0.5). A domain with a PCI device and without nwfilter is killed on libvirt restart as well.

Comment 6 Laine Stump 2013-05-31 19:26:34 UTC
I believe the problem is fixed by this patch:

http://libvirt.org/git/?p=libvirt.git;a=commit;h=2ea45647bcde23cff5da48f725561ff5ba3fba39

That patch has been pushed upstream (so it will be in libvirt-1.0.6) and also to the maintenance branches for 1.0.3, 1.0.4, and 1.0.5 (which are the only releases affected by the bug). Distro maintainers should pull from the -maint branch and do a new build as soon as possible, since this misbehavior is just not at all nice.