Bug 1717290 - [RHOS 15][deployment] Hot-plugging more than a single network interface with 'q35' machine type fails with "libvirt.libvirtError: internal error: No more available PCI slots"
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-heat-templates
Version: 15.0 (Stein)
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: 15.0 (Stein)
Assignee: Martin Schuppert
QA Contact: Joe H. Rahme
URL:
Whiteboard:
Depends On: 1716356
Blocks:
 
Reported: 2019-06-05 06:44 UTC by Martin Schuppert
Modified: 2020-12-21 19:35 UTC
CC: 11 users

Fixed In Version: puppet-nova-14.4.1-0.20190605170411.17663a5.el8ost openstack-tripleo-heat-templates-10.5.1-0.20190606030408.d62ad34.el8ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1716356
Environment:
Last Closed: 2019-09-21 11:22:59 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Launchpad 1831701 0 None None None 2019-06-05 06:58:07 UTC
OpenStack gerrit 663228 0 None MERGED Add parameter for `libvirt/num_pcie_ports` 2019-11-20 07:30:04 UTC
OpenStack gerrit 663500 0 None MERGED Add new role parameter NovaLibvirtNumPciePorts 2019-11-20 07:30:04 UTC
Red Hat Product Errata RHEA-2019:2811 0 None None None 2019-09-21 11:23:10 UTC

Description Martin Schuppert 2019-06-05 06:44:35 UTC
+++ This bug was initially created as a clone of Bug #1716356 +++

tempest.api.compute.servers.test_attach_interfaces.AttachInterfacesTestJSON.test_create_list_show_delete_interfaces_by_network_port fails with:
tempest.lib.exceptions.ServerFault: Got server fault
Details: Failed to attach network adapter device to 483cfa3d-2af5-4a4e-9296-e4204b59fbd7

Seen in a 1-controller/1-compute ML2/OVS deployment with VXLAN tunnels.

Digging into the nova logs shows:
2019-05-31 21:18:22.391 7 ERROR nova.virt.libvirt.driver [req-d8c533c7-81ea-4fcd-8713-8c97b8e6738d 806027625a2d4a54b47f8c6e522aa6bd aa381084f7e14421abef5e4378404b36 - default default] [instance: 483cfa3d-2af5-4a4e-9296-e4204b59fbd7] attaching network adapter failed.: libvirt.libvirtError: internal error: No more available PCI slots
2019-05-31 21:18:22.391 7 ERROR nova.virt.libvirt.driver [instance: 483cfa3d-2af5-4a4e-9296-e4204b59fbd7] Traceback (most recent call last):
2019-05-31 21:18:22.391 7 ERROR nova.virt.libvirt.driver [instance: 483cfa3d-2af5-4a4e-9296-e4204b59fbd7]   File "/usr/lib/python3.6/site-packages/nova/virt/libvirt/driver.py", line 1761, in attach_interface
2019-05-31 21:18:22.391 7 ERROR nova.virt.libvirt.driver [instance: 483cfa3d-2af5-4a4e-9296-e4204b59fbd7]     guest.attach_device(cfg, persistent=True, live=live)
2019-05-31 21:18:22.391 7 ERROR nova.virt.libvirt.driver [instance: 483cfa3d-2af5-4a4e-9296-e4204b59fbd7]   File "/usr/lib/python3.6/site-packages/nova/virt/libvirt/guest.py", line 306, in attach_device
2019-05-31 21:18:22.391 7 ERROR nova.virt.libvirt.driver [instance: 483cfa3d-2af5-4a4e-9296-e4204b59fbd7]     self._domain.attachDeviceFlags(device_xml, flags=flags)
2019-05-31 21:18:22.391 7 ERROR nova.virt.libvirt.driver [instance: 483cfa3d-2af5-4a4e-9296-e4204b59fbd7]   File "/usr/lib/python3.6/site-packages/eventlet/tpool.py", line 190, in doit
2019-05-31 21:18:22.391 7 ERROR nova.virt.libvirt.driver [instance: 483cfa3d-2af5-4a4e-9296-e4204b59fbd7]     result = proxy_call(self._autowrap, f, *args, **kwargs)
2019-05-31 21:18:22.391 7 ERROR nova.virt.libvirt.driver [instance: 483cfa3d-2af5-4a4e-9296-e4204b59fbd7]   File "/usr/lib/python3.6/site-packages/eventlet/tpool.py", line 148, in proxy_call
2019-05-31 21:18:22.391 7 ERROR nova.virt.libvirt.driver [instance: 483cfa3d-2af5-4a4e-9296-e4204b59fbd7]     rv = execute(f, *args, **kwargs)
2019-05-31 21:18:22.391 7 ERROR nova.virt.libvirt.driver [instance: 483cfa3d-2af5-4a4e-9296-e4204b59fbd7]   File "/usr/lib/python3.6/site-packages/eventlet/tpool.py", line 129, in execute
2019-05-31 21:18:22.391 7 ERROR nova.virt.libvirt.driver [instance: 483cfa3d-2af5-4a4e-9296-e4204b59fbd7]     six.reraise(c, e, tb)
2019-05-31 21:18:22.391 7 ERROR nova.virt.libvirt.driver [instance: 483cfa3d-2af5-4a4e-9296-e4204b59fbd7]   File "/usr/lib/python3.6/site-packages/six.py", line 693, in reraise
2019-05-31 21:18:22.391 7 ERROR nova.virt.libvirt.driver [instance: 483cfa3d-2af5-4a4e-9296-e4204b59fbd7]     raise value
2019-05-31 21:18:22.391 7 ERROR nova.virt.libvirt.driver [instance: 483cfa3d-2af5-4a4e-9296-e4204b59fbd7]   File "/usr/lib/python3.6/site-packages/eventlet/tpool.py", line 83, in tworker
2019-05-31 21:18:22.391 7 ERROR nova.virt.libvirt.driver [instance: 483cfa3d-2af5-4a4e-9296-e4204b59fbd7]     rv = meth(*args, **kwargs)
2019-05-31 21:18:22.391 7 ERROR nova.virt.libvirt.driver [instance: 483cfa3d-2af5-4a4e-9296-e4204b59fbd7]   File "/usr/lib64/python3.6/site-packages/libvirt.py", line 605, in attachDeviceFlags
2019-05-31 21:18:22.391 7 ERROR nova.virt.libvirt.driver [instance: 483cfa3d-2af5-4a4e-9296-e4204b59fbd7]     if ret == -1: raise libvirtError ('virDomainAttachDeviceFlags() failed', dom=self)
2019-05-31 21:18:22.391 7 ERROR nova.virt.libvirt.driver [instance: 483cfa3d-2af5-4a4e-9296-e4204b59fbd7] libvirt.libvirtError: internal error: No more available PCI slots

--- Additional comment from Bernard Cafarelli on 2019-06-03 10:15:49 UTC ---

Job failure:
https://rhos-qe-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/view/ReleaseDelivery/view/OSP15/job/phase2-15_director-rhel-8.0-virthost-1cont_1comp-ipv4-vxlan-lvm/44/testReport/junit/tempest.api.compute.servers.test_attach_interfaces/AttachInterfacesTestJSON/test_create_list_show_delete_interfaces_by_network_port_id_73fe8f02_590d_4bf1_b184_e9ca81065051_network_/

--- Additional comment from Matthew Booth on 2019-06-04 13:36:40 UTC ---

2019-05-31 21:18:09.638 [controller-0/N-API] 20 DEBUG nova.api.openstack.wsgi [req-d8c533c7-81ea-4fcd-8713-8c97b8e6738d 806027625a2d4a54b47f8c6e522aa6bd aa381084f7e14421abef5e4378404b36 - default default] Action: 'create', calling method: <bound method InterfaceAttachmentController.create of <nova.api.openstack.compute.attach_interfaces.InterfaceAttachmentController object at 0x7f286c940c18>>, body: {"interfaceAttachment": {"net_id": "5605723a-fa70-401b-b4b8-5724bc6a350c"}} _process_stack /usr/lib/python3.6/site-packages/nova/api/openstack/wsgi.py:520

2019-05-31 21:18:09.640 [controller-0/N-API] 20 DEBUG nova.compute.api [req-d8c533c7-81ea-4fcd-8713-8c97b8e6738d 806027625a2d4a54b47f8c6e522aa6bd aa381084f7e14421abef5e4378404b36 - default default] [instance: 483cfa3d-2af5-4a4e-9296-e4204b59fbd7] Fetching instance by UUID get /usr/lib/python3.6/site-packages/nova/compute/api.py:2633

2019-05-31 21:18:22.368 [compute-0/N-CPU] 7 DEBUG nova.virt.libvirt.guest [req-d8c533c7-81ea-4fcd-8713-8c97b8e6738d 806027625a2d4a54b47f8c6e522aa6bd aa381084f7e14421abef5e4378404b36 - default default] attach device xml: <interface type="bridge">
                                          <mac address="fa:16:3e:54:10:e5"/>
                                          <model type="virtio"/>
                                          <driver name="vhost" rx_queue_size="512"/>
                                          <source bridge="qbre4872d18-3f"/>
                                          <mtu size="1450"/>
                                          <target dev="tape4872d18-3f"/>
                                        </interface>
                                         attach_device /usr/lib/python3.6/site-packages/nova/virt/libvirt/guest.py:305

2019-05-31 21:18:22.391 [compute-0/N-CPU] 7 ERROR nova.virt.libvirt.driver [req-d8c533c7-81ea-4fcd-8713-8c97b8e6738d 806027625a2d4a54b47f8c6e522aa6bd aa381084f7e14421abef5e4378404b36 - default default] [instance: 483cfa3d-2af5-4a4e-9296-e4204b59fbd7] attaching network adapter failed.: libvirt.libvirtError: internal error: No more available PCI slots
2019-05-31 21:18:22.391 [compute-0/N-CPU] 7 ERROR nova.virt.libvirt.driver [instance: 483cfa3d-2af5-4a4e-9296-e4204b59fbd7] Traceback (most recent call last):
2019-05-31 21:18:22.391 [compute-0/N-CPU] 7 ERROR nova.virt.libvirt.driver [instance: 483cfa3d-2af5-4a4e-9296-e4204b59fbd7]   File "/usr/lib/python3.6/site-packages/nova/virt/libvirt/driver.py", line 1761, in attach_interface
2019-05-31 21:18:22.391 [compute-0/N-CPU] 7 ERROR nova.virt.libvirt.driver [instance: 483cfa3d-2af5-4a4e-9296-e4204b59fbd7]     guest.attach_device(cfg, persistent=True, live=live)
2019-05-31 21:18:22.391 [compute-0/N-CPU] 7 ERROR nova.virt.libvirt.driver [instance: 483cfa3d-2af5-4a4e-9296-e4204b59fbd7]   File "/usr/lib/python3.6/site-packages/nova/virt/libvirt/guest.py", line 306, in attach_device
2019-05-31 21:18:22.391 [compute-0/N-CPU] 7 ERROR nova.virt.libvirt.driver [instance: 483cfa3d-2af5-4a4e-9296-e4204b59fbd7]     self._domain.attachDeviceFlags(device_xml, flags=flags)
2019-05-31 21:18:22.391 [compute-0/N-CPU] 7 ERROR nova.virt.libvirt.driver [instance: 483cfa3d-2af5-4a4e-9296-e4204b59fbd7]   File "/usr/lib/python3.6/site-packages/eventlet/tpool.py", line 190, in doit
2019-05-31 21:18:22.391 [compute-0/N-CPU] 7 ERROR nova.virt.libvirt.driver [instance: 483cfa3d-2af5-4a4e-9296-e4204b59fbd7]     result = proxy_call(self._autowrap, f, *args, **kwargs)
2019-05-31 21:18:22.391 [compute-0/N-CPU] 7 ERROR nova.virt.libvirt.driver [instance: 483cfa3d-2af5-4a4e-9296-e4204b59fbd7]   File "/usr/lib/python3.6/site-packages/eventlet/tpool.py", line 148, in proxy_call
2019-05-31 21:18:22.391 [compute-0/N-CPU] 7 ERROR nova.virt.libvirt.driver [instance: 483cfa3d-2af5-4a4e-9296-e4204b59fbd7]     rv = execute(f, *args, **kwargs)
2019-05-31 21:18:22.391 [compute-0/N-CPU] 7 ERROR nova.virt.libvirt.driver [instance: 483cfa3d-2af5-4a4e-9296-e4204b59fbd7]   File "/usr/lib/python3.6/site-packages/eventlet/tpool.py", line 129, in execute
2019-05-31 21:18:22.391 [compute-0/N-CPU] 7 ERROR nova.virt.libvirt.driver [instance: 483cfa3d-2af5-4a4e-9296-e4204b59fbd7]     six.reraise(c, e, tb)
2019-05-31 21:18:22.391 [compute-0/N-CPU] 7 ERROR nova.virt.libvirt.driver [instance: 483cfa3d-2af5-4a4e-9296-e4204b59fbd7]   File "/usr/lib/python3.6/site-packages/six.py", line 693, in reraise
2019-05-31 21:18:22.391 [compute-0/N-CPU] 7 ERROR nova.virt.libvirt.driver [instance: 483cfa3d-2af5-4a4e-9296-e4204b59fbd7]     raise value
2019-05-31 21:18:22.391 [compute-0/N-CPU] 7 ERROR nova.virt.libvirt.driver [instance: 483cfa3d-2af5-4a4e-9296-e4204b59fbd7]   File "/usr/lib/python3.6/site-packages/eventlet/tpool.py", line 83, in tworker
2019-05-31 21:18:22.391 [compute-0/N-CPU] 7 ERROR nova.virt.libvirt.driver [instance: 483cfa3d-2af5-4a4e-9296-e4204b59fbd7]     rv = meth(*args, **kwargs)
2019-05-31 21:18:22.391 [compute-0/N-CPU] 7 ERROR nova.virt.libvirt.driver [instance: 483cfa3d-2af5-4a4e-9296-e4204b59fbd7]   File "/usr/lib64/python3.6/site-packages/libvirt.py", line 605, in attachDeviceFlags
2019-05-31 21:18:22.391 [compute-0/N-CPU] 7 ERROR nova.virt.libvirt.driver [instance: 483cfa3d-2af5-4a4e-9296-e4204b59fbd7]     if ret == -1: raise libvirtError ('virDomainAttachDeviceFlags() failed', dom=self)
2019-05-31 21:18:22.391 [compute-0/N-CPU] 7 ERROR nova.virt.libvirt.driver [instance: 483cfa3d-2af5-4a4e-9296-e4204b59fbd7] libvirt.libvirtError: internal error: No more available PCI slots

--- Additional comment from Matthew Booth on 2019-06-04 13:40:31 UTC ---

2019-05-31 21:17:30.807 [compute-0/N-CPU] 7 DEBUG nova.virt.libvirt.driver [req-dc72abc6-052a-49d0-9811-18950532d2a3 806027625a2d4a54b47f8c6e522aa6bd aa381084f7e14421abef5e4378404b36 - default default] [instance: 483cfa3d-2af5-4a4e-9296-e4204b59fbd7] End _get_guest_xml xml=<domain type="kvm">
                                          <uuid>483cfa3d-2af5-4a4e-9296-e4204b59fbd7</uuid>
                                          <name>instance-0000002a</name>
                                          <memory>65536</memory>
                                          <vcpu>1</vcpu>
                                          <metadata>
                                            <nova:instance xmlns:nova="http://openstack.org/xmlns/libvirt/nova/1.0">
                                              <nova:package version="19.0.1-0.20190528131506.498608c.el8ost"/>
                                              <nova:name>tempest-AttachInterfacesTestJSON-server-1268307078</nova:name>
                                              <nova:creationTime>2019-05-31 21:17:30</nova:creationTime>
                                              <nova:flavor name="m1.nano">
                                                <nova:memory>64</nova:memory>
                                                <nova:disk>1</nova:disk>
                                                <nova:swap>0</nova:swap>
                                                <nova:ephemeral>0</nova:ephemeral>
                                                <nova:vcpus>1</nova:vcpus>
                                              </nova:flavor>
                                              <nova:owner>
                                                <nova:user uuid="806027625a2d4a54b47f8c6e522aa6bd">tempest-AttachInterfacesTestJSON-848335770</nova:user>
                                                <nova:project uuid="aa381084f7e14421abef5e4378404b36">tempest-AttachInterfacesTestJSON-848335770</nova:project>
                                              </nova:owner>
                                              <nova:root type="image" uuid="10d693f8-c441-45c4-ae89-46b5b7922a41"/>
                                            </nova:instance>
                                          </metadata>
                                          <sysinfo type="smbios">
                                            <system>
                                              <entry name="manufacturer">Red Hat</entry>
                                              <entry name="product">OpenStack Compute</entry>
                                              <entry name="version">19.0.1-0.20190528131506.498608c.el8ost</entry>
                                              <entry name="serial">483cfa3d-2af5-4a4e-9296-e4204b59fbd7</entry>
                                              <entry name="uuid">483cfa3d-2af5-4a4e-9296-e4204b59fbd7</entry>
                                              <entry name="family">Virtual Machine</entry>
                                            </system>
                                          </sysinfo>
                                          <os>
                                            <type machine="pc-q35-rhel8.0.0">hvm</type>
                                            <boot dev="hd"/>
                                            <smbios mode="sysinfo"/>
                                          </os>
                                          <features>
                                            <acpi/>
                                            <apic/>
                                          </features>
                                          <cputune>
                                            <shares>1024</shares>
                                          </cputune>
                                          <clock offset="utc">
                                            <timer name="pit" tickpolicy="delay"/>
                                            <timer name="rtc" tickpolicy="catchup"/>
                                            <timer name="hpet" present="no"/>
                                          </clock>
                                          <cpu mode="host-model" match="exact">
                                            <topology sockets="1" cores="1" threads="1"/>
                                          </cpu>
                                          <devices>
                                            <disk type="file" device="disk">
                                              <driver name="qemu" type="qcow2" cache="none"/>
                                              <source file="/var/lib/nova/instances/483cfa3d-2af5-4a4e-9296-e4204b59fbd7/disk"/>
                                              <target bus="virtio" dev="vda"/>
                                            </disk>
                                            <interface type="bridge">
                                              <mac address="fa:16:3e:ac:9b:ac"/>
                                              <model type="virtio"/>
                                              <driver name="vhost" rx_queue_size="512"/>
                                              <source bridge="qbrd170f8c3-c0"/>
                                              <mtu size="1450"/>
                                              <target dev="tapd170f8c3-c0"/>
                                            </interface>
                                            <serial type="pty">
                                              <log file="/var/lib/nova/instances/483cfa3d-2af5-4a4e-9296-e4204b59fbd7/console.log" append="off"/>
                                            </serial>
                                            <input type="tablet" bus="usb"/>
                                            <graphics type="vnc" autoport="yes" listen="172.17.1.51"/>
                                            <video>
                                              <model type="cirrus"/>
                                            </video>
                                            <rng model="virtio">
                                              <backend model="random">/dev/urandom</backend>
                                            </rng>
                                            <memballoon model="virtio">
                                              <stats period="10"/>
                                            </memballoon>
                                          </devices>
                                        </domain>
                                         _get_guest_xml /usr/lib/python3.6/site-packages/nova/virt/libvirt/driver.py:5516

--- Additional comment from Matthew Booth on 2019-06-04 13:49:39 UTC ---

Guessing this is another q35 bug. Perhaps we need to manually create a pcie-root-port controller? However, it reads to me like libvirt should be creating one for us: https://libvirt.org/pci-hotplug.html#x86_64-q35

--- Additional comment from Kashyap Chamarthy on 2019-06-04 15:21:35 UTC ---

tl;dr — The immediate "fix" is to make TripleO configure
        'num_pcie_ports' to 12 (or 16), because the 'q35' machine type
        by default allows hotplugging only _one_ PCIe device.
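
A minimal sketch of the corresponding nova.conf change on the compute
host (a hedged illustration: 'num_pcie_ports' is Nova's option in the
[libvirt] section; the value 16 is the one suggested here):

    [libvirt]
    # Default is 0, which leaves only libvirt's single guaranteed free
    # pcie-root-port; reserving 16 ports lets several PCIe devices be
    # hot-plugged into a q35 guest.
    num_pcie_ports = 16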

Long
----

(*) Firstly, the Tempest test[1],
    test_create_list_show_delete_interfaces_by_network_port(), is trying
    to hot-plug *three* network interfaces:

        [...]
        try:
            iface = self._test_create_interface(server)
        [...]
        iface = self._test_create_interface_by_network_id(server, ifs)
        ifs.append(iface)

        iface = self._test_create_interface_by_port_id(server, ifs)
        ifs.append(iface)
        [...]

(*) We're using the 'q35' machine type here, which by default allows
    only a *single* PCIe device to be hotplugged.  Nova currently sets
    'num_pcie_ports' to "0" (which means it defaults to libvirt's "1"),
    but as the previous point showed, the test hot-plugs _3_
    interfaces.

    And as the libvirt document[2] states: "If you plan to hotplug more
    than a single PCI Express device, you should add a suitable number
    of pcie-root-port controllers when defining the guest" (see the
    domain XML sketch after the references below).

(*) But the next question is: "Why does the test work with 'pc'
    machine type, then?"  It works because, with 'pc' (or 'i440fx'),
    "each of the 31 slots (from 0x01 to 0x1f) on the pci-root controller
    is hotplug capable and can accept a legacy PCI device"[3].

[1] https://github.com/openstack/tempest/blob/25f5d28f3c2c79d7d0abfaa48db5d53a41f5e40d/tempest/api/compute/servers/test_attach_interfaces.py#L219
[2] https://libvirt.org/pci-hotplug.html#x86_64-q35
[3] https://libvirt.org/pci-hotplug.html#x86_64-i440fx
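
For illustration, a hand-written sketch (not output from this
deployment) of the controllers a q35 guest carries once extra
hotplug-capable ports are reserved; libvirt assigns the index and PCI
address attributes automatically:

    <controller type="pci" model="pcie-root"/>
    <!-- one free pcie-root-port is consumed per hot-plugged device -->
    <controller type="pci" model="pcie-root-port"/>
    <controller type="pci" model="pcie-root-port"/>
    <controller type="pci" model="pcie-root-port"/>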

Next Steps
----------

- Immediately, make TripleO set 'num_pcie_ports' to 16 (see the
  environment file sketch below).

- Long-term, write up a spec-less Blueprint to allow setting this via a
  flavor or image metadata property (e.g. "hw_num_pcie_ports").
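
A sketch of the TripleO side (the parameter name comes from the gerrit
review "Add new role parameter NovaLibvirtNumPciePorts" linked above;
the environment file layout is an assumption):

    parameter_defaults:
      # Maps to [libvirt]/num_pcie_ports in nova.conf on Compute roles
      NovaLibvirtNumPciePorts: 16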

--- Additional comment from Artom Lifshitz on 2019-06-05 01:59:03 UTC ---

This has been reproduced upstream[1] with my DNM q35 job (once the IDE
CDROM business was at least partially sorted out by Lee's patch below).

[1] http://logs.openstack.org/87/662887/6/check/tempest-full-py3/65e3798/controller/logs/screen-n-cpu.txt.gz?level=ERROR#_Jun_04_23_59_40_494675

Comment 7 errata-xmlrpc 2019-09-21 11:22:59 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2019:2811

