Bug 1254402 - libvirt should improve the way it binds CPUs when a nodeset is specified in numatune
Status: CLOSED NOTABUG
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: libvirt
Version: 7.2
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: ---
Assigned To: Martin Kletzander
QA Contact: Virtualization Bugs
Depends On:
Blocks:
 
Reported: 2015-08-17 22:38 EDT by Luyao Huang
Modified: 2016-07-07 09:11 EDT
CC List: 5 users

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-07-07 09:11:26 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments: None
Description Luyao Huang 2015-08-17 22:38:08 EDT
Description of problem:
libvirt should improve the way it binds CPUs when a nodeset is specified in numatune; sometimes (depending on what numad returns) the current behaviour binds CPUs and memory to different nodes and wastes resources

Version-Release number of selected component (if applicable):
libvirt-1.2.17-5.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Prepare a guest like this on a NUMA machine:
# virsh dumpxml rhel7.0-rhel
...
  <vcpu placement='auto'>4</vcpu>
  <iothreads>2</iothreads>
  <iothreadids>
    <iothread id='1'/>
  </iothreadids>
  <numatune>
    <memory mode='strict' nodeset='0'/>
  </numatune>
...
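
If you need to set this configuration up from scratch, a minimal sketch (assuming the domain rhel7.0-rhel already exists; virsh edit opens the persistent XML in an editor):

# virsh edit rhel7.0-rhel
# virsh dumpxml --inactive rhel7.0-rhel | grep -E -A2 'vcpu placement|numatune'

The second command is just a quick check that the <vcpu placement='auto'> and <numatune> elements were saved as intended.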

2. Check the NUMA topology:

# numactl --hard
available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3 4 5 6 7 16 17 18 19 20 21 22 23
node 0 size: 65514 MB
node 0 free: 58344 MB
node 1 cpus: 8 9 10 11 12 13 14 15 24 25 26 27 28 29 30 31
node 1 size: 65536 MB
node 1 free: 57864 MB
node distances:
node   0   1 
  0:  10  11 
  1:  11  10 
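
For a quick cross-check of which CPUs belong to which node, lscpu shows the same information (output trimmed; exact labels may differ between versions):

# lscpu | grep -i 'numa'
NUMA node(s):          2
NUMA node0 CPU(s):     0-7,16-23
NUMA node1 CPU(s):     8-15,24-31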

3. Start the guest and recheck the CPU and memory bindings in the cgroup and with taskset (a cross-check sketch with virsh and taskset follows the cgget output below):

# cgget -g cpuset /machine.slice/machine-qemu\\x2drhel7.0\\x2drhel.scope/emulator
/machine.slice/machine-qemu\x2drhel7.0\x2drhel.scope/emulator:
cpuset.memory_spread_slab: 0
cpuset.memory_spread_page: 0
cpuset.memory_pressure: 0
cpuset.memory_migrate: 1
cpuset.sched_relax_domain_level: -1
cpuset.sched_load_balance: 1
cpuset.mem_hardwall: 0
cpuset.mem_exclusive: 0
cpuset.cpu_exclusive: 0
cpuset.mems: 0
cpuset.cpus: 8-15,24-31

# cgget -g cpuset /machine.slice/machine-qemu\\x2drhel7.0\\x2drhel.scope/vcpu1
/machine.slice/machine-qemu\x2drhel7.0\x2drhel.scope/vcpu1:
cpuset.memory_spread_slab: 0
cpuset.memory_spread_page: 0
cpuset.memory_pressure: 0
cpuset.memory_migrate: 1
cpuset.sched_relax_domain_level: -1
cpuset.sched_load_balance: 1
cpuset.mem_hardwall: 0
cpuset.mem_exclusive: 0
cpuset.cpu_exclusive: 0
cpuset.mems: 0                         <------ memory is bound to node 0
cpuset.cpus: 8-15,24-31                <------ but these are node 1's CPUs
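
The same mismatch can be cross-checked from libvirt's side and with taskset; a sketch (the pgrep pattern is only an assumption about how the QEMU command line looks on this host, adjust as needed):

# virsh numatune rhel7.0-rhel
# virsh vcpupin rhel7.0-rhel
# virsh emulatorpin rhel7.0-rhel
# taskset -pc $(pgrep -f 'qemu.*rhel7.0-rhel' | head -n1)

virsh numatune reports the memory mode and nodeset libvirt is enforcing, while vcpupin/emulatorpin and taskset show the CPU affinity actually applied to the vcpu and emulator threads.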

4. In libvirtd.log we can see that libvirt used numad's advice to decide where to bind the CPUs:

2015-08-18 02:07:11.033+0000: 16640: debug : virCommandRunAsync:2428 : About to run /bin/numad -w 4:19555
2015-08-18 02:07:11.035+0000: 16640: debug : virCommandRunAsync:2431 : Command result 0, with PID 16986
2015-08-18 02:07:13.042+0000: 16640: debug : virCommandRun:2279 : Result status 0, stdout: '1
' stderr: ''
2015-08-18 02:07:13.042+0000: 16640: debug : qemuProcessStart:4648 : Nodeset returned from numad: 1
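
These debug lines are only written when debug logging is enabled; one way to capture them (a sketch, assuming the default log file location; appending to the config like this is just for brevity, and libvirtd has to be restarted for the change to take effect):

# echo 'log_level = 1' >> /etc/libvirt/libvirtd.conf
# echo 'log_outputs = "1:file:/var/log/libvirt/libvirtd.log"' >> /etc/libvirt/libvirtd.conf
# systemctl restart libvirtd
# grep -E 'numad|Nodeset returned' /var/log/libvirt/libvirtd.log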


Actual results:

libvirt still calls numad to decide which nodeset to use even though the node is already specified in numatune, and then binds the emulator, vcpu and iothread threads to CPUs on a different node from the one the memory is bound to

Expected results:

Do not use numad to decide which node to use, as the node is already specified in numatune.
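
For reference, the mismatch can also be avoided in the domain XML itself; two sketches of consistent configurations (node and CPU numbers taken from the topology in step 2, not the only possible layouts). Either let numad place both CPUs and memory:

  <vcpu placement='auto'>4</vcpu>
  <numatune>
    <memory mode='strict' placement='auto'/>
  </numatune>

or pin both explicitly to node 0:

  <vcpu placement='static' cpuset='0-7,16-23'>4</vcpu>
  <numatune>
    <memory mode='strict' nodeset='0'/>
  </numatune>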

Additional info:
Comment 1 Martin Kletzander 2015-11-12 06:22:18 EST
What do you mean by that?  Do you mean we should run numad with only the number of CPUs and not the memory size?  That could make sense, but using automatic vcpu placement with static strict memory binding doesn't make sense anyway.  I don't get what the use case for this kind of configuration is.
Comment 2 Luyao Huang 2015-11-13 01:59:10 EST
(In reply to Martin Kletzander from comment #1)
> What do you mean by that?  Do you mean we should run numad with only the
> number of CPUs and not the memory size?  That could make sense, but using
> automatic vcpu placement with static strict memory binding doesn't make
> sense anyway.  I don't get what the use case for this kind of configuration
> is.

I think there is no need to call numad in this case: the user has already specified the memory binding policy, and numad only gives advice about which node is good to bind to. But if we bind the memory and the CPUs to different nodes, some resources are wasted. Shouldn't libvirt ignore numad's advice in this case, or forbid this configuration?
Comment 3 Martin Kletzander 2015-11-13 03:19:20 EST
(In reply to Luyao Huang from comment #2)
We need to call numad because the user specified vcpu placement='auto'.
Comment 4 Martin Kletzander 2016-06-22 11:48:10 EDT
The users are effectively shooting themselves in the foot by doing this, and we generally allow such behaviour as long as the specification is correct for us.  We could, however, provide a warning in the logs, so I'll add that.
Comment 5 Martin Kletzander 2016-06-22 12:38:00 EDT
Patch proposed upstream:

https://www.redhat.com/archives/libvir-list/2016-June/msg01537.html
Comment 6 Luyao Huang 2016-06-22 21:38:52 EDT
(In reply to Martin Kletzander from comment #4)
> The users are effectively shooting themselves in the foot by doing this, and
> we generally allow such behaviour as long as the specification is correct
> for us.  We could, however, provide a warning in the logs, so I'll add that.

Okay, that makes sense; a warning is good enough.
Comment 7 Martin Kletzander 2016-07-07 09:11:26 EDT
It looks like it is too much trouble just for a warning that is seen only in the logs.  Since this behaviour might be intentional (although very unlikely, mostly for testing purposes only), we shouldn't forbid it.  Hence closing as NOTABUG.

More info here:

https://www.redhat.com/archives/libvir-list/2016-July/msg00173.html
