RHEL Engineering is moving the tracking of its product development work on RHEL 6 through RHEL 9 to Red Hat Jira (issues.redhat.com). If you're a Red Hat customer, please continue to file support cases via the Red Hat customer portal. If you're not, please head to the "RHEL project" in Red Hat Jira and file new tickets here. Individual Bugzilla bugs in the statuses "NEW", "ASSIGNED", and "POST" are being migrated throughout September 2023. Bugs of Red Hat partners with an assigned Engineering Partner Manager (EPM) are migrated in late September as per pre-agreed dates. Bugs against components "kernel", "kernel-rt", and "kpatch" are only migrated if still in "NEW" or "ASSIGNED". If you cannot log in to RH Jira, please consult article #7032570. That failing, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user management inquiry. The email creates a ServiceNow ticket with Red Hat. Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and set with "MigratedToJIRA" in "Keywords". The link to the successor Jira issue will be found under "Links", have a little "two-footprint" icon next to it, and direct you to the "RHEL project" in Red Hat Jira (issue links are of type "https://issues.redhat.com/browse/RHEL-XXXX", where "X" is a digit). This same link will be available in a blue banner at the top of the page informing you that that bug has been migrated.
Bug 1314306 - memory hotplug: kernel fails to online some memory sections as movable
Summary: memory hotplug: kernel fails to online some memory sections as movable
Keywords:
Status: CLOSED NEXTRELEASE
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: kernel
Version: 7.2
Hardware: Unspecified
OS: Unspecified
high
high
Target Milestone: rc
: ---
Assignee: Andrea Arcangeli
QA Contact: Yumei Huang
URL:
Whiteboard:
Depends On:
Blocks: 1228543 1320534
TreeView+ depends on / blocked
 
Reported: 2016-03-03 11:41 UTC by Igor Mammedov
Modified: 2018-09-07 22:13 UTC (History)
22 users (show)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-09-07 22:13:36 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
PATCH add removable memory sections to ZONE_MOVABLE by default. (4.25 KB, patch)
2016-04-12 10:11 UTC, Igor Mammedov
no flags Details | Diff

Description Igor Mammedov 2016-03-03 11:41:04 UTC
Description of problem:

when memory hot-plugged kernel onlines only some last memory sections
of hoplugged range if udev is configured to online them as movable.

Version-Release number of selected component (if applicable):
kernel-3.10.0-356
systemd-219-19

How reproducible:
100%

Steps to Reproduce:
1. run RHEL7.2 guest and modify /usr/lib/udev/rules.d/40-redhat.rule as follows:
   from:
SUBSYSTEM=="memory", ACTION=="add", ATTR{state}=="offline", ATTR{state}="online"
to:
SUBSYSTEM=="memory", ACTION=="add", ATTR{state}=="offline", ATTR{state}="online_movable"

  then install the latest kernel or update initrd image by running: dracut -f
  configure kernel to use serial as console and shutdown guest

On host, install qemu-kvm-rhev-2.3.0-31.el7_2.8 or later

2. start guest with following CLI:

/usr/libexec/qemu-kvm -enable-kvm -m 2G,slots=16,maxmem=16G -smp 2 -numa node -object memory-backend-ram,id=m1,size=1G -device pc-dimm,id=d1,memdev=m1 -nographic rhel72-image-file

Actual results:
in guest:
#cat /sys/devices/system/memory/memory32/state
offline

and it's the same till
#cat /sys/devices/system/memory/memory37/state
offline

and only sections 38, 39 are onlined

It's possible to online sections as movable manually only if it's done
in reverse order, like:
echo online_movable > /sys/devices/system/memory/memory37/state
echo online_movable > /sys/devices/system/memory/memory36/state
...

if onlining is done out of that order is fails, however section could be
onlined successfully in any order if it's onlined as not movable i.e.:

echo online > /sys/devices/system/memory/memory32/state

Expected results:

memory sections should be onlined as movable successfully in any order,
i.e. the same behaviour when 'online' is echoed into 'state'.

Additional info:

Upstream kernel also broken the same way.

Comment 2 Milan Zamazal 2016-03-07 16:11:08 UTC
FWIW, online_movable works in any order on memory blocks from memory0 to memory31. From memory32 up, the reported problem is exhibited.

Comment 7 Igor Mammedov 2016-03-30 14:35:36 UTC
Issue is not virt specific, baremetal also should be affected, to reproduce
one need physically hotplug a memory module. (i.e. prereq to trigger issue
is that memory module is not present at boot time).

Here is what happens:

  echo online_movable > memoryXX/online

triggers following code path:

memory_subsys_online -> memory_block_change_state -> memory_block_action ->
      online_pages():
        if (online_type == MMOP_ONLINE_MOVABLE &&
            zone_idx(zone) == ZONE_MOVABLE - 1) {
                if (move_pfn_range_right(zone, zone + 1, pfn, pfn + nr_pages))
                        return -EINVAL;
        }

where move_pfn_range_right() fails following check:

        /* the move out part mast at the right most of @z1 */
        if (zone_end_pfn(z1) >  end_pfn)
                goto out_fail;

since we are trying to online as movable not the last section in
ZONE_NORMAL.

Here is what makes hotplugged memory end up in ZONE_NORMAL:
 acpi_memory_enable_device() -> add_memory -> add_memory_resource ->
   -> arch/x86/mm/init_64.c  
     :w
/*
      * Memory is added always to NORMAL zone. This means you will never get
      * additional DMA/DMA32 memory.
      */

     int arch_add_memory(int nid, u64 start, u64 size, bool for_device)
     {
        struct pglist_data *pgdat = NODE_DATA(nid);
        struct zone *zone = pgdat->node_zones +
                zone_for_memory(nid, start, size, ZONE_NORMAL, for_device);

i.e. all hot-plugged memory modules always go to ZONE_NORMAL.

Comment 10 Igor Mammedov 2016-04-12 10:08:54 UTC
Issue here is not in current movable zone design and that it design allows to move only memory section on border between zones.

Issue is that if movable zone doesn't exists in hotplug time, zone_for_memory()
defaults to ZONE_NORMAL. As result all all hotplugged memory sections go to
ZONE_NORMAL except the last one for which valid_zones == "NORMAL MOVABLE". i.e.
it is also in ZONE_NORMAL but could be moved to ZONE_MOVABLE.

Idea I've have had to fix it is to allow caller of add_memory() to specify default zone. That way ACPI hotplug acpi_memory_enable_device() could pick
to which zone hotplugged memory goes in.
Like if corresponding ACPI node is removable /has _EJ0 method/, pass to ad_memory() as default ZONE_MOVABLE, if hotplugged memory is not removable
set default to ZONE_NORMAL. That way removable memory goes by default to
ZONE_MOVABLE and onlining with echo [online|online_movable] > memoryXX/online makes it go to ZONE_MOVABLE by default without failing onlining memory sections.

Since I've been preemted and don't have time to work on this bug in near future, I'll attach WIP patch that does above and fixes this particular issue.
Whoever picks this up post something upstream pls CC me on it so I could help with review.

Comment 11 Igor Mammedov 2016-04-12 10:11:36 UTC
Created attachment 1146332 [details]
PATCH add removable memory sections to ZONE_MOVABLE by default.

it's just a proof of concept [i.e. on t ready to post upstream as is]


Note You need to log in before you can comment on or make changes to this bug.