Bug 1228543 - [RFE] hot-unplug memory
Summary: [RFE] hot-unplug memory
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: RFEs
Version: ---
Hardware: All
OS: Linux
Priority: high
Severity: high
Target Milestone: ovirt-4.2.0
Target Release: 4.2.0
Assignee: Milan Zamazal
QA Contact: Israel Pinto
URL: http://www.ovirt.org/Features/Memory_...
Whiteboard:
Duplicates: 1228546
Depends On: 515840 822996 1224886 1245892 1265880 1314306 1320447 1320534 1323417 1325121 1402880 1482042 1482076 1482474 1563532
Blocks: 515839 962053 1502671
Reported: 2015-06-05 07:21 UTC by Michal Skrivanek
Modified: 2019-04-28 11:12 UTC (History)

Fixed In Version:
Clone Of: 1224886
Environment:
Last Closed: 2018-02-12 10:11:07 UTC
oVirt Team: Virt
Embargoed:
rule-engine: ovirt-4.2+
rule-engine: exception+
ipinto: testing_plan_complete+
mtessun: planning_ack+
michal.skrivanek: devel_ack+
mavital: testing_ack+



Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 68585 0 'None' MERGED virt: Support for memory hotunplug 2020-10-21 13:38:31 UTC
oVirt gerrit 69764 0 'None' MERGED virt: Update VM memory in Vm, not vmdevices 2020-10-21 13:38:31 UTC
oVirt gerrit 71785 0 'None' MERGED virt: Fix of NoSuchVM misspelling in memory hotunplug 2020-10-21 13:38:44 UTC
oVirt gerrit 71877 0 'None' MERGED virt: Make Memory device XML more complete 2020-10-21 13:38:31 UTC
oVirt gerrit 72078 0 'None' MERGED virt: Update memory devices correctly 2020-10-21 13:38:31 UTC
oVirt gerrit 72079 0 'None' MERGED virt: Update VM memory in Vm, not vmdevices 2020-10-21 13:38:32 UTC
oVirt gerrit 72080 0 'None' MERGED virt: Support for memory hotunplug 2020-10-21 13:38:45 UTC
oVirt gerrit 72081 0 'None' MERGED virt: Fix of NoSuchVM misspelling in memory hotunplug 2020-10-21 13:38:32 UTC
oVirt gerrit 72082 0 'None' MERGED virt: Make Memory device XML more complete 2020-10-21 13:38:45 UTC
oVirt gerrit 72637 0 'None' MERGED webadmin: Memory hot unplug UI 2020-10-21 13:38:33 UTC
oVirt gerrit 72638 0 'None' MERGED webadmin: Unplugging state of hot mem unplug button 2020-10-21 13:38:33 UTC
oVirt gerrit 73081 0 'None' MERGED core: HotUnplugMemoryVDSCommand added 2020-10-21 13:38:33 UTC
oVirt gerrit 73082 0 'None' MERGED core, webadmin: HotUnplugMemoryCommand added 2020-10-21 13:38:34 UTC
oVirt gerrit 73411 0 'None' ABANDONED core: VdcReturnValueBase#valid auto updated 2020-10-21 13:38:34 UTC
oVirt gerrit 73640 0 'None' MERGED core: Memory hot plug always starts with min. block 2020-10-21 13:38:34 UTC
oVirt gerrit 73692 0 'None' MERGED core, webadmin: Memory hot unplug enabled 2020-10-21 13:38:34 UTC

Description Michal Skrivanek 2015-06-05 07:21:06 UTC
+++ This bug was initially created as a clone of Bug #1224886 +++

add support for dynamically plugging and unplugging of memory

--- Additional comment from Michal Skrivanek on 2015-06-05 09:19:41 CEST ---

splitting as unplug is delayed

Comment 2 Tomas Jelinek 2015-12-16 09:09:54 UTC
*** Bug 1228546 has been marked as a duplicate of this bug. ***

Comment 3 Red Hat Bugzilla Rules Engine 2015-12-16 21:35:14 UTC
This request has been proposed for two releases. This is invalid flag usage. The ovirt-future release flag has been cleared. If you wish to change the release flag, you must clear one release flag and then set the other release flag to ?.

Comment 4 Sven Kieske 2016-01-25 11:37:27 UTC
will this make it in ovirt 4 ?

Comment 5 Sven Kieske 2016-02-11 09:59:08 UTC
BZ 1245892 is restricted, I can not view it.

Would it be possible to open this BZ to the public?

Thank you in advance.

Comment 6 Michal Skrivanek 2016-02-12 09:57:19 UTC
(In reply to Sven Kieske from comment #5)
> BZ 1245892 is restricted, I can not view it.
> 
> Would it be possible to open this BZ to the public?

sure, seems like it was private by mistake. done

Comment 7 Michal Skrivanek 2016-02-17 16:29:41 UTC
still looking at feasibility as it seems unplug doesn't work that well in Linux in general. There might be some annoying constraints. Let's see...

Comment 8 Milan Zamazal 2016-03-18 15:25:24 UTC
Here is a summary of currently known issues:

- In order to make some reasonable guarantees about being able to remove previously hotplugged memory, the memory must be enabled as `online_movable' instead of just `online'.  However, that is possible only when the plugged memory blocks are onlined in a particular order, which doesn't work with the current udev rule for memory onlining.  See https://bugzilla.redhat.com/1314306; that kernel bug must be fixed to make memory hotunplug usable.

- Currently, the udev rule for memory hotplug (/usr/lib/udev/rules.d/40-redhat.rules) enables the hotplugged memory as `online' instead of `online_movable', see the bug above.  This often results in the inserted memory blocks being used for kernel (non-movable) memory, preventing hotunplug of memory devices containing any of those blocks.  That must be changed, but only after the kernel bug mentioned above is fixed.

- When I try to remove more memory than can be freed, it results in OOM kills, not in a failure of the hotunplug operation.

- I'm not sure whether memory onlined as `online_movable' is guaranteed to be removable and I doubt anybody knows for sure.  I once (and only once so far) met a situation when even `online_movable' memory wasn't removable.  I couldn't reproduce it later but we should probably be prepared for occasional hotunplug failures.

- As with other hotplug/hotunplug devices, libvirt doesn't report hotunplug failure and we have to rely on timeouts.  I was able to make memory hotunplug last for a few seconds even in my simple testing environment (when swapping out the used memory was involved).

- An additional issue is that kernel reports invalid information about used and free memory (e.g. claiming there's more available memory than physical memory) under some hotplug-hotunplug scenarios, see https://bugzilla.redhat.com/1265880.
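
For illustration only, here is a minimal Python sketch of the onlining step discussed in the first two items. It relies on the standard Linux sysfs layout, where each memory block has a `state` file under /sys/devices/system/memory/memoryN that accepts `online_movable`. The function names are hypothetical. The sketch does not encode the onlining order the kernel required, since that order is not spelled out in this comment, so the caller decides the order of the list.

```python
from pathlib import Path

# Standard sysfs location of memory block directories on Linux.
SYSFS_MEMORY = Path("/sys/devices/system/memory")


def offline_blocks(base=SYSFS_MEMORY):
    """Return memory block directories whose state file reads 'offline',
    sorted by numeric block index."""
    blocks = []
    for block in Path(base).glob("memory[0-9]*"):
        state_file = block / "state"
        if state_file.is_file() and state_file.read_text().strip() == "offline":
            blocks.append(block)
    return sorted(blocks, key=lambda p: int(p.name[len("memory"):]))


def online_movable(blocks):
    """Request `online_movable` for each block, in the order given.

    The order matters on affected kernels (see the first item above);
    choosing it is left to the caller because this sketch does not know
    which order a given kernel accepts.
    """
    for block in blocks:
        (block / "state").write_text("online_movable")
```

On a real guest this needs root privileges and the write can be rejected by the kernel; passing a different `base` directory allows trying the logic on a mock tree.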

Comment 9 Sven Kieske 2016-03-21 13:55:47 UTC
Hi,

Thanks for all those details.

https://bugzilla.redhat.com/show_bug.cgi?id=1314306 is also marked as private, would you mind opening it to the public?

Thank you!

Comment 10 Moran Goldboim 2016-03-24 10:48:59 UTC
postponing to 4.1 because the lower levels of the stack are not ready for production.

Comment 11 Milan Zamazal 2016-05-04 11:17:35 UTC
As explained in https://bugzilla.redhat.com/show_bug.cgi?id=1314306#c15, reliable memory hotunplug functionality in the kernel is a complicated matter and is unlikely to be available in the foreseeable future.  It is suggested to utilize the memory ballooning mechanism instead, which should be more reliable and flexible and should provide better performance.  So we discussed possible alternatives to DIMM-device-based memory hotunplug, and we are considering the proposal described below.

The VM memory sizes can be defined by the following values:

- Minimum guaranteed memory ("minimum").
- Maximum memory ("maximum").
- Maximum memory actually assigned to the guest in libvirt ("assigned").
- Absolute memory limit ("limit").
- Free memory as reported by the guest operating system ("free"), not including caches and buffers.

The values must satisfy the following constraint:

  minimum <= maximum <= assigned <= limit

"Limit" is a hardcoded value and total guest RAM may not exceed it.

"Maximum" and "minimum" are already present in Engine UI as Memory Size and Physical Memory Guaranteed respectively.  User can currently change the values, but the only supported operation in runtime is increasing Memory Size (i.e. performing memory hotplug).

"Assigned" is a newly introduced value.  Wrt. current situation there is no distinction between "assigned" and "maximum".  But we want to emulate memory hotunplug by decreasing "maximum" below "assigned".  This doesn't change the current memory size set in libvirt/QEMU, which we track as "assigned". "Assigned" can't be set by the user, it's only changed indirectly by actual memory hotplug.

With this concept, "maximum" and "minimum" could be modifiable in runtime in either direction.  The following rules would apply:

- "Assigned" is initially set to "maximum" as defined in the VM.
- When the user increases "maximum" above "assigned", the amount of memory above "assigned" must be hotplugged and "assigned" is set to "maximum" or some higher value (depending on hotplug granularity) on success.
- Otherwise when the user adjusts any of the "maximum" or "minimum" values, the balloon must be adjusted accordingly (with the help of MoM).
- "Minimum" is guaranteed memory and can't be taken by the balloon.
- The memory between "maximum" and "minimum" is not a guaranteed memory and may be taken by the balloon.
- The memory between "assigned" and "maximum" is always taken by the balloon.
- If the user decreases "maximum" value, the balloon driver in the guest is instructed to react accordingly.  If it doesn't fulfill the order then we may kill the guest as it refuses to cooperate.
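
The rules above can be sketched as a small Python model. This is only an illustration of the proposal, not Vdsm or Engine code: the class and method names are hypothetical, sizes are in MiB, and the 256 MiB hotplug granularity is an assumed value.

```python
from dataclasses import dataclass


@dataclass
class VmMemory:
    """Hypothetical model of the proposed memory values, all in MiB."""
    minimum: int   # guaranteed memory, never taken by the balloon
    maximum: int   # user-visible Memory Size
    assigned: int  # memory actually assigned in libvirt/QEMU
    limit: int     # hardcoded absolute limit

    def check(self):
        # Constraint from the proposal: minimum <= maximum <= assigned <= limit
        if not (self.minimum <= self.maximum <= self.assigned <= self.limit):
            raise ValueError("memory values violate the ordering constraint")

    @property
    def forced_balloon(self):
        """Memory between "assigned" and "maximum", always held by the balloon."""
        return self.assigned - self.maximum

    def set_maximum(self, new_maximum, block_mib=256):
        """Change "maximum"; return the amount of memory to hotplug (0 if none).

        block_mib is an assumed hotplug granularity, not a value from the proposal.
        """
        if not (self.minimum <= new_maximum <= self.limit):
            raise ValueError("new maximum outside [minimum, limit]")
        hotplug = 0
        if new_maximum > self.assigned:
            missing = new_maximum - self.assigned
            # Round up to whole blocks, so "assigned" may end above "maximum".
            hotplug = -(-missing // block_mib) * block_mib
            if self.assigned + hotplug > self.limit:
                raise ValueError("hotplug would exceed the limit")
            self.assigned += hotplug
        self.maximum = new_maximum
        self.check()
        return hotplug
```

For example, starting from minimum 1024, maximum 4096, assigned 4096, limit 16384, lowering "maximum" to 2048 hotplugs nothing and leaves 2048 MiB for the balloon to take; raising it afterwards to 4500 hotplugs two 256 MiB blocks, so "assigned" becomes 4608 and the balloon still holds 108 MiB.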

We must be careful about MoM.  Vdsm should report "minimum", "maximum" and "free" values to MoM.  But the actual balloon value as set by MoM must be increased by "assigned" minus "maximum" in Vdsm.  We must check whether this trick is safe wrt. MoM operation and its interaction with the host.
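
As a worked example of that adjustment, here is a minimal Python sketch. It assumes that "balloon value" means the amount of memory held by the balloon (not the guest memory target), which is an interpretation of the paragraph above; the function name is hypothetical and sizes are in MiB.

```python
def effective_balloon_mib(mom_balloon_mib, assigned_mib, maximum_mib):
    """Balloon size to apply to the guest.

    MoM only knows about "maximum", so the part between "assigned" and
    "maximum" (emulated hotunplug) is added on top of what MoM requested.
    Assumption: the balloon value is the amount of memory taken by the balloon.
    """
    if mom_balloon_mib < 0:
        raise ValueError("balloon size cannot be negative")
    if maximum_mib > assigned_mib:
        raise ValueError('"maximum" must not exceed "assigned"')
    return mom_balloon_mib + (assigned_mib - maximum_mib)
```

With "assigned" at 8192 and "maximum" at 6144, a MoM request of 512 would translate to an actual balloon of 2560.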

To know "free", oVirt guest agent must be running within the guest.  This is an extra requirement on the VMs so it would be better if we could obtain the value even without oVirt guest agent (could libvirt provide it?).

Comment 12 Sven Kieske 2016-05-09 13:56:50 UTC
(In reply to Milan Zamazal from comment #11)
> As explained in https://bugzilla.redhat.com/show_bug.cgi?id=1314306#c15,
> reliable memory hotunplug functionality in the kernel is a complicated
> matter and is unlikely to be available in the foreseeable future.

It seems that BZ 1314306 is not available to the public, could you maybe mark it as not private, if it's possible?

kind regards

Sven

Comment 15 Michal Skrivanek 2016-12-16 15:26:51 UTC
likely will be ready in 4.1.z

Comment 20 Milan Zamazal 2017-02-14 10:19:45 UTC
The Engine part is missing.

Comment 22 Israel Pinto 2017-08-17 10:59:34 UTC
Failed QA,
See BZ: https://bugzilla.redhat.com/show_bug.cgi?id=1482076 and
https://bugzilla.redhat.com/show_bug.cgi?id=1482042

Reasons:
1. Defined memory is not updated after each hot unplug of a memory device
2. After rebooting the VM, the memory is restored to the value it had before the memory unplug

Comment 23 Red Hat Bugzilla Rules Engine 2017-08-17 10:59:43 UTC
Target release should be placed once a package build is known to fix an issue. Since this bug is not modified, the target version has been reset. Please use target milestone to plan a fix for an oVirt release.

Comment 24 Michal Skrivanek 2017-09-07 12:33:28 UTC
related bugs are ON_QA, please next time either open those new bugs and close this one, or reopen this one and do not open new bugs

Comment 27 Israel Pinto 2017-10-01 06:09:48 UTC
Tested with:
4.2.0-0.0.master.20170917124606.gita804ef7.el7.centos
Memory hot unplug status:
I opened the following BZ:
1. BZ1496395
[Memory hot unplug] After commit snapshot with memory hot unplug failed since device not found
2. BZ1496366
[Memory hotplug] [UI] The Memory size in edit vm dialog is not updated after failing hotplug the 17th device
3. RFE for REST API BZ1496382
[RFE][REST API] Add support for VM devices under VM resource

Test run summary:
https://polarion.engineering.redhat.com/polarion/#/project/RHEVM3/testrun?id=42-123

Comment 34 Sandro Bonazzola 2018-02-12 10:11:07 UTC
This bugzilla is included in oVirt 4.2.0 release, published on Dec 20th 2017.

Since the problem described in this bug report should be resolved in oVirt 4.2.0 release, published on Dec 20th 2017, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.

