+++ This bug was initially created as a clone of Bug #1408813 +++
+++ This bug was initially created as a clone of Bug #1390346 +++
QEMU reserves only a 32-bit memory range for hotplug. That range can be rather limited and not enough to hot-plug devices with large BARs.
Add a parameter to QEMU to reserve a 64-bit range in the CRS,
starting after the memory range reserved for memory hotplug.
Check that the whole range is addressable by the VM's CPU.
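For illustration only (not taken from this bug): on a q35 machine QEMU already exposes a host-bridge property with similar semantics, settable via -global; the property name and sizes below are an assumption used as a sketch of what the command line could look like.

```shell
# Sketch of the implied command line. 'q35-pcihost.pci-hole64-size' is an
# existing host-bridge property; the 2G value is purely illustrative.
cmdline="qemu-system-x86_64 -machine q35 -m 4G \
  -global q35-pcihost.pci-hole64-size=2G"
echo "$cmdline"
```

The guest firmware then places 64-bit BARs inside that hole, so a device hot-plugged later with a large BAR has room above 4G.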
Add libvirt support for the new command line parameters.
This should work the same way as memory hotplug.
Whereas for memory hotplug we need to specify the slots and the maximum size on the command line, for PCI hotplug above 4G we only need to specify the size.
By default the allocated size will be 0, because reserving the space adds a constraint on the VM's CPU addressable bits. Reserving a bigger chunk may create problems when we want to migrate the VM to a host with fewer addressable bits.
Because of the above, libvirt can't decide by itself, and the upper layers need to handle this trade-off: more hot-pluggable space vs. potential migration limitations.
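As a sketch of what the libvirt side could look like: libvirt already models the 64-bit PCI hole with a `<pcihole64>` element on the root PCI controller in the domain XML. The value below (1048576 KiB = 1 GiB) is illustrative only.

```xml
<controller type='pci' index='0' model='pcie-root'>
  <!-- Size of the 64-bit PCI hole; 0 (the proposed default) reserves nothing. -->
  <pcihole64 unit='KiB'>1048576</pcihole64>
</controller>
```

A value of 0 matches the default described above, leaving the trade-off decision to the upper layers.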
I take it this bug is dependent on the relevant QEMU bug that you're assigned to.
If you get time, I'd also appreciate some examples of this bug-fix / behavior with a practical QEMU command line.
Also, a small clarification: I'm not clear on what you mean by "over 4G for PCI hotplug"; what precisely do you mean?
Dave Gilbert was guessing on IRC: I suspect it's PCI bus addressing; every device on a PCI bus has a few address ranges, and something somewhere has to pick them.
Leaving un-flagged; there is no clear feature definition to present for Pike.
(In reply to Kashyap Chamarthy from comment #2)
> I take it this bug is dependent on the relevant QEMU bug that you're
> assigned to:
> If you get time, I'd also appreciate if you have some examples of this
> bug-fix / behavior with practical QEMU command-line.
> Also, a small clarification: I'm not clear on what you mean by "over 4G for
> PCI hotplug"; what precisely do you mean?
> Dave Gilbert was guessing on IRC: I suspect it's PCI bus addressing; every
> device on a PCI bus has a few address ranges, and something somewhere has to
> pick them.
In order to be able to hot-plug PCI devices, QEMU needs to reserve some address space to be mapped to the PCI devices' registers.
Today QEMU reserves space only in the 32-bit (<4G) memory area. The problem is that the reserved "window" is not always enough, especially if the VM has a lot of devices (they all draw from the same pool).
By contrast, the >4G memory space is huge and mostly unused; the only limit is the CPU's addressable bits.
The question is how much space to reserve. Reserving too much can restrict migration to hosts whose CPUs support at least the same number of addressable bits.
Who can "guess" how much reservation we "need"? I suppose libvirt/nova: they can query the physical PCI devices that may be attached in the future, the hosts' CPU limitations, and so on.
However, this is too low-level, and we want to come up with a solution that does not involve libvirt/nova; at least we will try.
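The addressable-bits constraint can be made concrete with a back-of-the-envelope check (all numbers below are hypothetical, not taken from this bug):

```shell
# Hypothetical check: with 40 host physical address bits, everything mapped
# above 4G (guest RAM, memory-hotplug area, 64-bit PCI hole) must end below 2^40.
phys_bits=40
limit=$(( 1 << phys_bits ))      # 2^40 = 1 TiB of addressable space
ram=$(( 4 << 30 ))               # 4 GiB of guest RAM
hole64=$(( 512 << 30 ))          # requested 64-bit PCI window
if [ $(( ram + hole64 )) -le "$limit" ]; then
    echo "fits below 2^$phys_bits"
else
    echo "too large: migration target needs more address bits"
fi
```

The same 512 GiB request would fail this check on a 39-bit host, which is exactly the migration concern: the reservation must fit on every host the VM may land on.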
There's no clear explanation of what use cases this feature will resolve. Until such time as this is provided, I'm going to mark this as closed.