Bug 1360584 - Bind policy doesn't work well when numad is running [NEEDINFO]
Summary: Bind policy doesn't work well when numad is running
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: numad
Version: 7.3
Hardware: x86_64
OS: Linux
Target Milestone: rc
Assignee: Lukáš Nykrýn
QA Contact: qe-baseos-daemons
Docs Contact: Yehuda Zimmerman
Duplicates: 1361058 (view as bug list)
Depends On:
Reported: 2016-07-27 06:11 UTC by Yumei Huang
Modified: 2020-12-15 07:43 UTC (History)
CC: 8 users

Fixed In Version:
Doc Type: Known Issue
Doc Text:
numad changes QEMU memory bindings
Currently, the *numad* daemon cannot distinguish between memory bindings that *numad* sets and memory bindings set explicitly by the memory mappings of a process. As a consequence, *numad* changes QEMU memory bindings, even when the NUMA memory policy is specified in the QEMU command line. To work around this problem, if manual NUMA bindings are specified in the guest, disable *numad*. This ensures that manual bindings configured in virtual machines are not changed by *numad*.
Clone Of:
Last Closed: 2020-12-15 07:43:30 UTC
Target Upstream Version:
jsynacek: needinfo? (bgray)


Description Yumei Huang 2016-07-27 06:11:06 UTC
Description of problem:
Boot the guest with regular RAM for one NUMA node and hugepages for the other NUMA node, binding them to two different host nodes. According to numa_maps, the hugepages are bound to the wrong host node.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. prepare hugepages in host
# numactl -H
available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3 4 5 6 7 16 17 18 19 20 21 22 23
node 0 size: 16349 MB
node 0 free: 7775 MB
node 1 cpus: 8 9 10 11 12 13 14 15 24 25 26 27 28 29 30 31
node 1 size: 12287 MB
node 1 free: 6788 MB
node distances:
node   0   1 
  0:  10  20 
  1:  20  10 

# echo 3000 > /proc/sys/vm/nr_hugepages

# cat /proc/meminfo  | grep -i huge
AnonHugePages:     14336 kB
HugePages_Total:    3000
HugePages_Free:     3000
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB

# mount -t hugetlbfs none /mnt/kvm_hugepage/

# cat /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages 

# cat /sys/devices/system/node/node1/hugepages/hugepages-2048kB/nr_hugepages 
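As a cross-check for step 2 below, the 2G hugepage-backed memory object needs 2 GiB / 2 MiB = 1024 pages out of the 3000-page pool. A small sketch of that arithmetic, plus the per-node sysfs knob (shown as a comment, not run here) that would reserve the pool directly on host node 1 instead of using the global nr_hugepages counter:

```shell
# Number of 2048 kB hugepages needed to back the 2G memory object.
backend_kb=$((2 * 1024 * 1024))   # 2 GiB expressed in kB
page_kb=2048                      # Hugepagesize from /proc/meminfo
pages_needed=$((backend_kb / page_kb))
echo "$pages_needed"              # 1024

# Per-node alternative to the global nr_hugepages knob (not run here):
#   echo 1024 > /sys/devices/system/node/node1/hugepages/hugepages-2048kB/nr_hugepages
```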

2. Boot the guest with two NUMA nodes: 4G of regular RAM for node 0, bound to host node 0, and 2G of hugepages for node 1, bound to host node 1.

# /usr/libexec/qemu-kvm -name rhel73 -m 6G,slots=240,maxmem=20G -smp 4 \
-realtime mlock=off -no-user-config -nodefaults \
-drive file=/home/guest/RHEL-Server-7.3-64-virtio-scsi.qcow2,if=none,id=drive-disk,format=qcow2,cache=none -device virtio-scsi-pci,id=scsi0 -device scsi-hd,drive=drive-disk,bus=scsi0.0,id=scsi-hd0 \
-netdev tap,id=hostnet1 -device virtio-net-pci,mac=42:ce:a9:d2:4d:d8,id=idlbq7eA,netdev=hostnet1 -usb -device usb-tablet,id=input0 -vga qxl -spice port=5902,addr=,disable-ticketing,image-compression=off,seamless-migration=on -monitor stdio \
-object memory-backend-ram,id=mem0,size=4G,prealloc=yes,policy=bind,host-nodes=0 -numa node,nodeid=0,memdev=mem0 \
-object memory-backend-file,id=mem1,size=2G,mem-path=/mnt/kvm_hugepage,prealloc=yes,host-nodes=1,policy=bind -numa node,nodeid=1,memdev=mem1

3. Check the memory backends in HMP:
(qemu) info memdev

4. Check smaps and numa_maps under /proc.

Actual results:
HMP output matches the command line:
(qemu) info memdev 
memory backend: 0
  size:  2147483648
  merge: true
  dump: true
  prealloc: true
  policy: bind
  host nodes: 1
memory backend: 1
  size:  4294967296
  merge: true
  dump: true
  prealloc: true
  policy: bind
  host nodes: 0

numastat shows the guest uses host node 0's hugepages:
# numastat -p `pgrep qemu`

Per-node process memory usage (in MBs) for PID 38662 (qemu-kvm)
                           Node 0          Node 1           Total
                  --------------- --------------- ---------------
Huge                      2048.00            0.00         2048.00
Heap                       126.29            0.00          126.29
Stack                        2.12            0.00            2.12
Private                   4154.45            0.04         4154.48
----------------  --------------- --------------- ---------------
Total                     6330.86            0.04         6330.90

numa_maps shows the hugepages are bound to node 0:

# grep -2 `expr 2048 \* 1024`  /proc/`pgrep qemu`/smaps
VmFlags: rd wr mr mw me ac sd 
7f1593a00000-7f1613a00000 rw-p 00000000 00:29 184518                     /mnt/kvm_hugepage/qemu_back_mem._objects_mem1.3HDQU4 (deleted)
Size:            2097152 kB
Rss:                   0 kB
Pss:                   0 kB

# grep  7f1593a00000 /proc/`pgrep qemu`/numa_maps
7f1593a00000 bind:0 file=/mnt/kvm_hugepage/qemu_back_mem._objects_mem1.3HDQU4\040(deleted) huge anon=1024 dirty=1024 N0=1024 kernelpagesize_kB=2048
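The key fields of that numa_maps entry can be pulled out mechanically. A sketch over the sample line from this report (file path shortened): the policy field itself reads bind:0, meaning the requested bind:1 policy was rewritten rather than merely ignored, and the N*= counters show every page sitting on node 0.

```shell
# Sample numa_maps entry from this report (path shortened).
line='7f1593a00000 bind:0 file=/mnt/kvm_hugepage/... huge anon=1024 dirty=1024 N0=1024 kernelpagesize_kB=2048'

# Extract the effective memory policy and the per-node page counts.
policy=$(printf '%s\n' "$line" | grep -o 'bind:[0-9]*')
placed=$(printf '%s\n' "$line" | grep -o 'N[0-9]*=[0-9]*')

echo "$policy"   # bind:0  -> expected bind:1; the policy was rewritten
echo "$placed"   # N0=1024 -> all 1024 hugepages landed on host node 0
```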

Expected results:
Hugepages should be bound to host node 1. 

Additional info:
When booting the guest with hugepages only (no regular RAM backend), the bind policy works as expected.

Comment 2 Yumei Huang 2016-07-28 08:37:52 UTC
Hit the same issue when specifying regular RAM backends for both nodes with prealloc=yes.

With prealloc=false, the issue goes away, so I am changing the summary.

Comment 3 Yumei Huang 2016-08-08 05:49:38 UTC
QE retested; the issue seems to have nothing to do with prealloc.
When numad is inactive, the bind policy works well: both memory objects are bound to the right host nodes.
When numad is running, one of the two memory objects is bound to the wrong host node.
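Given this behavior, the workaround recorded in the Doc Text is to keep numad off whenever guests use manual NUMA bindings. A minimal sketch, assuming numad runs as the `numad` systemd service on RHEL 7:

```shell
# Stop numad now and keep it from starting on boot, so it cannot
# rewrite the bind policy of QEMU memory backends.
systemctl stop numad
systemctl disable numad

# Verify: should report "inactive" before starting guests that use
# manual host-nodes=/policy=bind settings.
systemctl is-active numad
```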

Comment 5 Eduardo Habkost 2016-08-08 14:29:49 UTC
Moving to numad component.

Comment 6 Yehuda Zimmerman 2016-08-18 13:52:49 UTC
Doc Text updated for Release Notes

Comment 8 Eduardo Habkost 2016-09-19 17:48:40 UTC
*** Bug 1361058 has been marked as a duplicate of this bug. ***

Comment 13 RHEL Program Management 2020-12-15 07:43:30 UTC
After evaluating this issue, there are no plans to address it further or fix it in an upcoming release.  Therefore, it is being closed.  If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened.
