Bug 959191 - On start(restart) libvirt stops existing domains with PCI passthrough devices
Summary: On start(restart) libvirt stops existing domains with PCI passthrough devices
Keywords:
Status: CLOSED NEXTRELEASE
Alias: None
Product: Virtualization Tools
Classification: Community
Component: libvirt
Version: unspecified
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Laine Stump
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2013-05-03 10:19 UTC by Vlastimil Holer
Modified: 2013-05-31 19:26 UTC

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2013-05-31 19:26:55 UTC
Embargoed:


Attachments
Domain with PCI passthrough dev. (2.55 KB, text/xml)
2013-05-03 10:19 UTC, Vlastimil Holer
Libvirt (start only) and QEMU debug logs (183.71 KB, application/x-bzip)
2013-05-03 10:32 UTC, Vlastimil Holer

Description Vlastimil Holer 2013-05-03 10:19:44 UTC
Created attachment 743145
Domain with PCI passthrough dev.

Description of problem:
If I have a running domain (either persistent or transient) with a PCI passthrough device, libvirt must not be restarted. On its next start, libvirt immediately terminates these running domains. If a domain has no PCI passthrough device, everything works fine and the domain is left untouched.

Version-Release number of selected component (if applicable):
libvirt 1.0.5 (from Debian experimental)
QEMU 1.4.1 (from Debian experimental)
kernel 3.2.41

Steps to Reproduce:
1. Start a domain with a PCI passthrough device.
2. Stop libvirt.
3. Start libvirt.
  
Actual results:
The domain with the PCI passthrough device is stopped when libvirt starts.

Expected results:
The domain remains running.

Additional info:
I'm attaching the domain XML definition for the failing domain, and libvirt daemon debug logs taken with running domains: 1) with a PCI device, 2) without a PCI device.

Comment 1 Vlastimil Holer 2013-05-03 10:32:37 UTC
Created attachment 743150
Libvirt (start only) and QEMU debug logs

Comment 2 Vlastimil Holer 2013-05-03 10:41:33 UTC
Tested with two different PCI devices; the domain is terminated in both cases.

<device>
  <name>pci_0000_03_00_0</name>
  <parent>pci_0000_00_02_0</parent>
  <capability type='pci'>
    <domain>0</domain>
    <bus>3</bus>
    <slot>0</slot>
    <function>0</function>
    <product id='0x1003'>MT27500 Family [ConnectX-3]</product>
    <vendor id='0x15b3'>Mellanox Technologies</vendor>
  </capability>
</device>

<device>
  <name>pci_0000_00_1f_2</name>
  <parent>computer</parent>
  <driver>
    <name>ata_piix</name>
  </driver>
  <capability type='pci'>
    <domain>0</domain>
    <bus>0</bus>
    <slot>31</slot>
    <function>2</function>
    <product id='0x1d00'>C600/X79 series chipset 4-Port SATA IDE Controller</product>
    <vendor id='0x8086'>Intel Corporation</vendor>
  </capability>
</device>
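
For readers who want to reproduce this with their own hardware, the node device XML above can be turned into the `<hostdev>` element that a domain definition uses for PCI passthrough. The following is only a minimal sketch: the helper name `hostdev_from_nodedev` is hypothetical, and `managed='yes'` is an assumption, since the attached domain XML is not quoted in this report. It relies on the node device XML giving the address fields in decimal (slot 31 for `pci_0000_00_1f_2`), while the domain XML conventionally uses hexadecimal.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the second node device from this comment.
NODEDEV_XML = """
<device>
  <name>pci_0000_00_1f_2</name>
  <capability type='pci'>
    <domain>0</domain>
    <bus>0</bus>
    <slot>31</slot>
    <function>2</function>
  </capability>
</device>
"""


def hostdev_from_nodedev(nodedev_xml, managed=True):
    """Build a <hostdev> element string from node device XML.

    Hypothetical helper; managed=True is an assumption, not taken
    from the reporter's configuration.
    """
    dev = ET.fromstring(nodedev_xml)
    cap = dev.find("capability[@type='pci']")
    if cap is None:
        raise ValueError("not a PCI node device")

    # Node device XML stores the address fields as decimal integers.
    domain = int(cap.findtext("domain"))
    bus = int(cap.findtext("bus"))
    slot = int(cap.findtext("slot"))
    function = int(cap.findtext("function"))

    hostdev = ET.Element(
        "hostdev",
        mode="subsystem",
        type="pci",
        managed="yes" if managed else "no",
    )
    source = ET.SubElement(hostdev, "source")
    # Domain XML addresses are written in hexadecimal.
    ET.SubElement(
        source,
        "address",
        domain=f"0x{domain:04x}",
        bus=f"0x{bus:02x}",
        slot=f"0x{slot:02x}",
        function=f"0x{function:x}",
    )
    return ET.tostring(hostdev, encoding="unicode")


if __name__ == "__main__":
    print(hostdev_from_nodedev(NODEDEV_XML))
```

The resulting element would go inside `<devices>` of the domain definition used in step 1 of the reproduction steps.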

Comment 3 Jiri Denemark 2013-05-23 20:57:28 UTC
Logs suggest it might be related to nwfilter as that is the last thing done before the domain gets killed. And the result of nwfilter commands differs between the two logs. What happens if you remove

<filterref filter="clean-traffic">
  <parameter name="IP" value="147.251.254.121"/>
</filterref>

from the domain definition? Is it still terminated when you restart libvirtd?
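
One way to try this without editing the XML by hand is to strip every `<filterref>` element from a dumped domain definition. This is a rough sketch only: the function name `strip_filterrefs` and the example domain (name, bridge) are made up for illustration, and only the `<filterref>` fragment itself comes from this report.

```python
import xml.etree.ElementTree as ET

# Hypothetical domain definition; only the <filterref> block is from the report.
EXAMPLE_DOMAIN_XML = """
<domain type='kvm'>
  <name>example</name>
  <devices>
    <interface type='bridge'>
      <source bridge='br0'/>
      <filterref filter='clean-traffic'>
        <parameter name='IP' value='147.251.254.121'/>
      </filterref>
    </interface>
  </devices>
</domain>
"""


def strip_filterrefs(domain_xml):
    """Return (cleaned_xml, removed_count) with all <filterref> elements removed."""
    root = ET.fromstring(domain_xml)
    removed = 0
    # Snapshot the element list first so removal does not disturb iteration.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag == "filterref":
                parent.remove(child)
                removed += 1
    return ET.tostring(root, encoding="unicode"), removed


if __name__ == "__main__":
    cleaned, count = strip_filterrefs(EXAMPLE_DOMAIN_XML)
    print(f"removed {count} filterref element(s)")
    print(cleaned)
```

The cleaned XML could then be saved to a file and loaded with `virsh define` before repeating the restart test.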

Comment 4 Vlastimil Holer 2013-05-30 12:10:36 UTC
Sorry for the delay. No, that doesn't help either (v1.0.5). A domain with a PCI device and without nwfilter is killed on libvirt restart as well.

Comment 6 Laine Stump 2013-05-31 19:26:34 UTC
I believe this is the cause of the problem:

http://libvirt.org/git/?p=libvirt.git;a=commit;h=2ea45647bcde23cff5da48f725561ff5ba3fba39

That patch has been pushed upstream (so it will be in libvirt-1.0.6) and also to the maintenance branches for 1.0.3, 1.0.4, and 1.0.5 (which are the only releases affected by the bug). Distro maintainers should pull from the -maint branch and do a new build as soon as possible, since this misbehavior is just not at all nice.

