+++ This bug was initially created as a clone of Bug #1408813 +++
+++ This bug was initially created as a clone of Bug #1390346 +++
QEMU reserves only a 32-bit memory range for hotplug. That range can be rather limited and not enough to hot-plug devices with large BARs.
Add a parameter to QEMU to reserve a 64-bit range in the CRS,
starting after the memory range reserved for memory hotplug.
Check that the whole range is addressable by the VM's CPU.
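For illustration only (not taken from this bug): on a q35 machine QEMU already exposes a host-bridge property with similar semantics, settable via -global; the property name and sizes below are an assumption used as a sketch of what the command line could look like.

```shell
# Sketch of the implied command line. 'q35-pcihost.pci-hole64-size' is an
# existing host-bridge property; the 2G value is purely illustrative.
cmdline="qemu-system-x86_64 -machine q35 -m 4G \
  -global q35-pcihost.pci-hole64-size=2G"
echo "$cmdline"
```

The guest firmware then places 64-bit BARs inside that hole, so a device hot-plugged later with a large BAR has room above 4G.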
Add libvirt support for the new command line parameters.
This should work the same way as memory hotplug.
Whereas for memory hotplug we need to specify the slots and the maximum size on the command line, for PCI hotplug above 4G we only need to specify the size.
By default the allocated size will be 0, because reserving the space adds a constraint on the VM's CPU addressable bits. Reserving a bigger chunk may create problems when we want to migrate the VM to a host with fewer addressable bits.
Because of the above, libvirt can't decide by itself, and the upper layers need to handle this trade-off: more hot-pluggable space vs. potential migration limitations.
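As a sketch of what the libvirt side could look like: libvirt already models the 64-bit PCI hole with a `<pcihole64>` element on the root PCI controller in the domain XML. The value below (1048576 KiB = 1 GiB) is illustrative only.

```xml
<controller type='pci' index='0' model='pcie-root'>
  <!-- Size of the 64-bit PCI hole; 0 (the proposed default) reserves nothing. -->
  <pcihole64 unit='KiB'>1048576</pcihole64>
</controller>
```

A value of 0 matches the default described above, leaving the trade-off decision to the upper layers.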
I take it this bug is dependent on the relevant QEMU bug that you're assigned to.
If you get time, I'd also appreciate some examples of this bug-fix / behavior with a practical QEMU command line.
Also, a small clarification: I'm not clear on what you mean by "over 4G for PCI hotplug"; what precisely do you mean?
Dave Gilbert was guessing on IRC: I suspect it's PCI bus addressing; every device on a PCI bus has a few address ranges, and something somewhere has to pick them.
Leaving un-flagged; there is no clear feature definition to present for Pike.
(In reply to Kashyap Chamarthy from comment #2)
> I take it this bug is dependent on the relevant QEMU bug that you're
> assigned to:
> If you get time, I'd also appreciate if you have some examples of this
> bug-fix / behavior with practical QEMU command-line.
> Also, a small clarification: I'm not clear on what you mean by "over 4G for
> PCI hotplug"; what precisely do you mean?
> Dave Gilbert was guessing on IRC: I suspect it's PCI bus addressing; every
> device on a PCI bus has a few address ranges, and something somewhere has to
> pick them.
In order to be able to hot-plug PCI devices, QEMU needs to reserve some address space to be mapped to the PCI devices' registers.
Today QEMU reserves space only in the 32-bit (<4G) memory area. The problem is that the reserved "window" is not always enough, especially if the VM has a lot of devices (they all draw from the same pool).
By contrast, the >4G memory space is huge and mostly unused; the only limit is the CPU's addressable bits.
The question is how much space to reserve. Reserving too much can restrict migration to hosts whose CPUs support at least the same number of addressable bits.
Who can "guess" how much reservation we "need"? I suppose libvirt/nova: they can query the physical PCI devices that may be attached in the future, the hosts' CPU limitations, and so on.
However, this is too low-level, and we want to come up with a solution that does not involve libvirt/nova; at least we will try.
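The addressable-bits constraint can be made concrete with a back-of-the-envelope check (all numbers below are hypothetical, not taken from this bug):

```shell
# Hypothetical check: with 40 host physical address bits, everything mapped
# above 4G (guest RAM, memory-hotplug area, 64-bit PCI hole) must end below 2^40.
phys_bits=40
limit=$(( 1 << phys_bits ))      # 2^40 = 1 TiB of addressable space
ram=$(( 4 << 30 ))               # 4 GiB of guest RAM
hole64=$(( 512 << 30 ))          # requested 64-bit PCI window
if [ $(( ram + hole64 )) -le "$limit" ]; then
    echo "fits below 2^$phys_bits"
else
    echo "too large: migration target needs more address bits"
fi
```

The same 512 GiB request would fail this check on a 39-bit host, which is exactly the migration concern: the reservation must fit on every host the VM may land on.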
There's no clear explanation of what use cases this feature will resolve. Until such time as this is provided, I'm going to mark this as closed.