Description of problem:
When memory is hot-plugged, the kernel onlines only the last few memory
sections of the hotplugged range if udev is configured to online them as movable.
Version-Release number of selected component (if applicable):
kernel-3.10.0-356
systemd-219-19
How reproducible:
100%
Steps to Reproduce:
1. Run a RHEL7.2 guest and modify /usr/lib/udev/rules.d/40-redhat.rules as follows:
from:
SUBSYSTEM=="memory", ACTION=="add", ATTR{state}=="offline", ATTR{state}="online"
to:
SUBSYSTEM=="memory", ACTION=="add", ATTR{state}=="offline", ATTR{state}="online_movable"
Then install the latest kernel or rebuild the initrd image by running: dracut -f
Configure the kernel to use a serial console and shut down the guest.
On the host, install qemu-kvm-rhev-2.3.0-31.el7_2.8 or later.
2. Start the guest with the following CLI:
/usr/libexec/qemu-kvm -enable-kvm -m 2G,slots=16,maxmem=16G -smp 2 -numa node -object memory-backend-ram,id=m1,size=1G -device pc-dimm,id=d1,memdev=m1 -nographic rhel72-image-file
Actual results:
in guest:
#cat /sys/devices/system/memory/memory32/state
offline
and it is the same up to
#cat /sys/devices/system/memory/memory37/state
offline
Only sections 38 and 39 are onlined.
It's possible to online sections as movable manually only if it's done
in reverse order, like:
echo online_movable > /sys/devices/system/memory/memory37/state
echo online_movable > /sys/devices/system/memory/memory36/state
...
If onlining is done out of that order, it fails. However, a section can be
onlined successfully in any order if it is onlined as non-movable, i.e.:
echo online > /sys/devices/system/memory/memory32/state
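The reverse-order workaround above can be scripted. What follows is only a minimal sketch and is not part of the attached patch: the function name online_movable_reverse and its sysfs_root parameter are made up for illustration, and it assumes that every memory block shows up as a memoryN/state file under /sys/devices/system/memory.

```python
import os
import re


def online_movable_reverse(sysfs_root="/sys/devices/system/memory"):
    """Online offline memory blocks as movable, highest block number first.

    Returns the block numbers for which the write was accepted.
    (Hypothetical helper; sysfs_root is overridable for testing.)
    """
    blocks = []
    for name in os.listdir(sysfs_root):
        match = re.fullmatch(r"memory(\d+)", name)
        if match:
            blocks.append(int(match.group(1)))

    onlined = []
    for number in sorted(blocks, reverse=True):
        state_path = os.path.join(sysfs_root, f"memory{number}", "state")
        with open(state_path) as state_file:
            if state_file.read().strip() != "offline":
                continue  # already online, nothing to do
        try:
            with open(state_path, "w") as state_file:
                state_file.write("online_movable")
        except OSError:
            # The kernel rejected the request for this block;
            # skip it and continue with the next lower block.
            continue
        onlined.append(number)
    return onlined
```

On a real guest this has to run as root. Because it walks the blocks from the highest number down, each block it touches should be the last one left in ZONE_NORMAL at the time of the write.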
Expected results:
Memory sections should be onlined as movable successfully in any order,
i.e. the same behaviour as when 'online' is echoed into 'state'.
Additional info:
The upstream kernel is also broken in the same way.
The issue is not virt specific; bare metal should be affected as well. To
reproduce it there, one needs to physically hotplug a memory module (i.e. the
prerequisite for triggering the issue is that the memory module is not present at boot time).
Here is what happens:
echo online_movable > memoryXX/state
triggers the following code path:
memory_subsys_online -> memory_block_change_state -> memory_block_action ->
online_pages():
        if (online_type == MMOP_ONLINE_MOVABLE &&
            zone_idx(zone) == ZONE_MOVABLE - 1) {
                if (move_pfn_range_right(zone, zone + 1, pfn, pfn + nr_pages))
                        return -EINVAL;
        }
where move_pfn_range_right() fails the following check:
        /* the move out part mast at the right most of @z1 */
        if (zone_end_pfn(z1) > end_pfn)
                goto out_fail;
since we are trying to online as movable a section that is not the last one in
ZONE_NORMAL.
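To make the effect of that check concrete, here is a toy user-space model of it. It is only a sketch under assumptions: the Zone class, block_range() and try_move_right() are illustrative names, the 128 MiB block size is assumed, and only the single check quoted above is modelled, not the rest of move_pfn_range_right().

```python
# Assumption for illustration: 128 MiB memory blocks made of 4 KiB pages.
PAGES_PER_BLOCK = (128 * 1024 * 1024) // 4096


class Zone:
    """Toy zone spanning the page frame numbers [start_pfn, end_pfn)."""

    def __init__(self, start_pfn, end_pfn):
        self.start_pfn = start_pfn
        self.end_pfn = end_pfn


def block_range(block):
    """Return (start_pfn, end_pfn) for memory block number `block`."""
    return block * PAGES_PER_BLOCK, (block + 1) * PAGES_PER_BLOCK


def try_move_right(z1, start_pfn, end_pfn):
    """Mimic the quoted check: only the right-most part of z1 may move out."""
    if z1.end_pfn > end_pfn:
        return False          # corresponds to "goto out_fail"
    z1.end_pfn = start_pfn    # simplified: z1 shrinks, range joins the next zone
    return True


# Toy ZONE_NORMAL holding blocks 32..37, i.e. after blocks 38 and 39
# were already onlined as movable.
normal = Zone(block_range(32)[0], block_range(38)[0])
for block in (32, 37, 36):
    ok = try_move_right(normal, *block_range(block))
    print(f"memory{block}: {'moved' if ok else 'rejected'}")
```

In this model memory32 is rejected while memory37 and then memory36 are accepted, which matches the reverse-order behaviour seen in the guest.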
Here is what makes hotplugged memory end up in ZONE_NORMAL:
acpi_memory_enable_device() -> add_memory -> add_memory_resource ->
arch_add_memory() in arch/x86/mm/init_64.c:
/*
 * Memory is added always to NORMAL zone. This means you will never get
 * additional DMA/DMA32 memory.
 */
int arch_add_memory(int nid, u64 start, u64 size, bool for_device)
{
        struct pglist_data *pgdat = NODE_DATA(nid);
        struct zone *zone = pgdat->node_zones +
                zone_for_memory(nid, start, size, ZONE_NORMAL, for_device);
i.e. all hot-plugged memory modules always go to ZONE_NORMAL.
The issue here is not the current movable zone design, nor the fact that this design only allows moving a memory section that sits on the border between zones.
The issue is that if the movable zone doesn't exist at hotplug time, zone_for_memory()
defaults to ZONE_NORMAL. As a result, all hotplugged memory sections go to
ZONE_NORMAL, and only the last one has valid_zones == "NORMAL MOVABLE", i.e.
it is also in ZONE_NORMAL but can be moved to ZONE_MOVABLE.
The idea I've had to fix it is to allow the caller of add_memory() to specify the default zone. That way the ACPI hotplug path, acpi_memory_enable_device(), could pick
which zone the hotplugged memory goes to.
For example, if the corresponding ACPI node is removable (has an _EJ0 method), pass ZONE_MOVABLE to add_memory() as the default; if the hotplugged memory is not removable,
set the default to ZONE_NORMAL. That way removable memory goes to
ZONE_MOVABLE by default, and onlining with echo [online|online_movable] > memoryXX/state keeps it in ZONE_MOVABLE without any memory section failing to online.
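The idea can be modelled roughly as follows. This is a user-space sketch of the decision only, not the attached patch: ZoneIdx, default_zone_for_hotplug() and needs_zone_move() are hypothetical names, and the enum values are illustrative (all that matters is that MOVABLE == NORMAL + 1).

```python
from enum import IntEnum


class ZoneIdx(IntEnum):
    # Illustrative values only; the model just needs MOVABLE == NORMAL + 1.
    NORMAL = 0
    MOVABLE = 1


def default_zone_for_hotplug(removable):
    """Pick the default zone: removable (e.g. _EJ0-capable) memory is movable."""
    return ZoneIdx.MOVABLE if removable else ZoneIdx.NORMAL


def needs_zone_move(block_zone, online_type):
    """True when onlining has to go through the move_pfn_range_right() path."""
    return online_type == "online_movable" and block_zone == ZoneIdx.MOVABLE - 1


for removable in (True, False):
    zone = default_zone_for_hotplug(removable)
    for online_type in ("online", "online_movable"):
        print(removable, zone.name, online_type, needs_zone_move(zone, online_type))
```

With removable memory defaulting to ZONE_MOVABLE, neither online type reaches the zone move in this model, so the order-dependent border check is never hit.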
Since I've been preempted and won't have time to work on this bug in the near future, I'll attach a WIP patch that does the above and fixes this particular issue.
Whoever picks this up and posts something upstream, please CC me on it so I can help with review.
Created attachment 1146332
PATCH: add removable memory sections to ZONE_MOVABLE by default.
It's just a proof of concept (i.e. not ready to post upstream as is).