Red Hat Bugzilla – Bug 1394402
[Doc] provide steps to achieve zero-loss networking with DPDK, openvswitch, vhostuser, and testpmd
Last modified: 2017-11-21 14:34:27 EST
Description of problem:

Achieving zero packet loss for NFV workloads on OpenStack with the current
guidelines may not always be possible. This BZ documents further steps to
achieve zero loss; it is not intended to track enhancements or fixes.

Version-Release number of selected component (if applicable):

OSP-10
openvswitch-2.5.0-*.git20160727
RHEL 7.3

Current guidelines make use of the cpu-partitioning tuned profile for an OSP
compute node and the VM running on that compute node, which does the following:

1) Uses the boot options nohz_full and rcu_nocbs, specifying a list of cpus
   which are reserved for openvswitch and the VM vcpus.
2) Uses systemd's CPUAffinity option to launch all user processes within a
   subset of the online cpus; this list is the inverse of the cpulist used
   for nohz_full and rcu_nocbs. This prevents user processes from running on
   the cpus reserved for openvswitch and the VM.
3) Uses the IRQBALANCE_BANNED_CPUS= option for irqbalance to avoid sending
   interrupts to the reserved cpus.

Both openvswitch and the VM's vcpus must be configured to use one thread per
cpu, each on a cpu from the reserved cpus list.

All of this is an effort to reduce, as much as possible, any interruptions to
these cpus while running the openvswitch, VM-vcpu, or inside-VM VNF threads.
However, CPUAffinity cannot remove or prevent all kernel threads from running
on these cpus. To keep most kernel threads off them, one must use the boot
option "isolcpus=<cpulist>", with the same cpulist as nohz_full and rcu_nocbs.
Isolcpus is engaged right at kernel boot, and thus can prevent many kernel
threads from being scheduled on those cpus.

To enable isolcpus, first find out which cpus are being used for rcu_nocbs and
nohz_full:

# cat /proc/cmdline

Look for the option, for example:

nohz_full=1,3,5,7,9,11,13

Take that cpulist and add the isolcpus option. One way to do this is with
the grubby utility:

grubby --update-kernel=`grubby --default-kernel` --args="isolcpus=<cpulist>"

This should be done on both the host and the VM. A reboot is required for the
option to take effect.

Note that not all kernel threads are removed from these cpus: kworker,
migration, and ksoftirqd will still be present on every cpu. However, the
kworker and migration threads should not have to run, and ksoftirqd typically
runs once per second for up to 20 microseconds, as long as only one user task
is running on that cpu.
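After the reboot, a quick way to confirm the isolation took effect (cpu 4 here
is just an example taken from the reserved list; the commands are standard
procps/coreutils):

grep -o 'isolcpus=[^ ]*' /proc/cmdline   # the option is now on the kernel command line
ps -eLo pid,psr,comm | awk '$2 == 4'     # threads on isolated cpu 4: expect only the
                                         # per-cpu kernel threads and your pinned task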
After this option is configured and openvswitch and the VM are running, you
can check which threads are scheduled on these cpus by looking at the contents
of /proc/sched_debug (on both the host and the VM).

For each cpu, there will be a section called "runnable tasks" like the
following:

runnable tasks:
            task   PID         tree-key  switches  prio     wait-time      sum-exec     sum-sleep
----------------------------------------------------------------------------------------------------------
    migration/47   244         0.000000       634     0      0.000000    112.216681      0.000000 1 /
    ksoftirqd/47   245     12404.353193        29   120      0.000000      0.241945      0.000000 1 /
    kworker/47:0   246      1072.586238        21   120      0.000000      0.105430      0.000000 1 /
   kworker/47:0H   247      7740.331124        14   100      0.000000      0.096559      0.000000 1 /
    kworker/47:1   625     13234.768505       302   120      0.000000      1.904559      0.000000 1 /
   kworker/47:1H 99955      7752.330393         2   100      0.000000      0.033163      0.000000 1 /

On the host, you may see an openvswitch or a qemu task listed here. On the VM,
you may see a thread for your VNF, such as testpmd. When that thread is
interrupted, its 'switches' count will increment. Monitoring this over time
and confirming that 'switches' does not increase verifies that no other thread
has interrupted it. Note that with non-RT kernels, IRQs can still interrupt
this thread without being reflected here, so it is also important to monitor
/proc/interrupts to check whether any are occurring on this cpu (a concrete
monitoring sketch follows at the end of this comment).

When using isolcpus, the scheduler will not attempt to load-balance tasks on
those cpus. This is usually fine, because there is only one task per cpu and
automatically moving tasks to or from these cpus is not desirable. One problem
can arise, however, when using OpenStack. When starting a VM, the Nova service
sets the list of allowed cpus for the Qemu emulator thread to the same cpulist
that is used for all of the vcpus. For example, if the vcpu pinning is:

vcpu:  host-cpu:
0      4
1      5
2      6
3      7

the Qemu emulator thread will only be able to run on host cpus 4-7. When using
isolcpus, this thread will initially be placed on the first cpu in the list,
in this case cpu4, and with isolcpus present it will never be load-balanced to
a different cpu. Therefore, the thread for vcpu0 and the emulator thread have
to share host cpu4. In some cases this causes significant performance
degradation, especially while booting the VM. To resolve this, the emulator
thread must be migrated to a different host cpu. You can achieve this with the
virsh command on the host:

# virsh emulatorpin <vm-name>

This shows the current range of cpus allowed for the emulator thread. To
change it, use:

# virsh emulatorpin <vm-name> --cpulist <cpulist>

Simply using the cpulist from the CPUAffinity option in systemd should work
here:

# grep CPUAffinity /etc/systemd/system.conf
CPUAffinity=0 1 2 3

Now run the virsh command to update the cpulist for the emulator thread,
making sure to use "," or "-" to describe the list of cpus:

# virsh emulatorpin <vm-name> --cpulist 0,1,2,3

This is not persistent, so any time a VM is started from Nova, the emulator
thread will need to be moved again. However, if you do not experience a
degradation (boot-up time is adequate and you don't use vcpu0 for packet
processing), this step is not necessary.
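As referenced above, a minimal monitoring sketch (testpmd is the example VNF
thread; the $(NF-6) field offset matches the RHEL 7.3 sched_debug layout shown
above and may differ on other kernels):

before=$(awk '/testpmd/ {print $(NF-6); exit}' /proc/sched_debug)  # 'switches' column
cat /proc/interrupts > /tmp/irq.before
sleep 60
after=$(awk '/testpmd/ {print $(NF-6); exit}' /proc/sched_debug)
echo "switches: $before -> $after"     # equal values: no preemption during the interval
diff /tmp/irq.before /proc/interrupts  # look for new counts in the isolated cpu's columns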
(In reply to Andrew Theurer from comment #0)
> [...]
> This is not persistent, so any time a VM is started from Nova, the emulator
> thread will need to be moved again. [...]

Manually changing CPU pinning behind Nova's back is *not* something that is
considered a supported scenario from the OpenStack product POV. It is
invisible to Nova and thus may cause the Nova scheduler to make incorrect
decisions for future guests, and it is liable to be reverted by Nova during
certain operations. IOW, if you do this and it subsequently breaks, you get to
keep both pieces. As such, we should *not* document this as a supported
configuration in the context of OpenStack.
(In reply to Daniel Berrange from comment #3)
> Manually changing CPU pinning behind Nova's back is *not* something that is
> considered a supported scenario from the OpenStack product POV. [...]

Will https://bugzilla.redhat.com/show_bug.cgi?id=1298079 permit configuring
this behavior in nova.conf?
(In reply to Andrew Theurer from comment #0)
> [...]
> Take that cpulist and add the isolcpus option. One way to do this is with
> the grubby utility:
>
> grubby --update-kernel=`grubby --default-kernel` --args="isolcpus=<cpulist>"

For the host, this is done already in the Director first-boot.yaml.

For the guest, a script such as the following (the awk input is /proc/cmdline,
and the inner grubby call must be enclosed in backticks):

isol_cpus=`awk '{ for (i = 1; i <= NF; i++) if ($i ~ /nohz/) print $i };' /proc/cmdline | cut -d"=" -f2`
if [ ! -z "$isol_cpus" ]; then
    grubby --update-kernel=`grubby --default-kernel` --args="isolcpus=$isol_cpus"
fi
> [...]
> Now run the virsh command to update the cpulist for the emulator thread,
> making sure to use "," or "-" to describe the list of cpus:
>
> # virsh emulatorpin <vm-name> --cpulist 0,1,2,3

A script to re-pin the emulator thread of every running VM to the CPUAffinity
cpulist (note the /g flag on the sed substitution, so that every space in the
CPUAffinity value becomes a comma):

#!/bin/bash
cpu_list=`grep -e "^CPUAffinity=.*" /etc/systemd/system.conf | sed -e 's/CPUAffinity=//' -e 's/ /,/g'`
if [ ! -z "$cpu_list" ]; then
    # the second column of 'virsh list' (after its two header lines) is the domain name
    virsh_list=`virsh list | sed -e '1,2d' -e 's/\s\+/ /g' | awk -F" " '{print $2}'`
    if [ ! -z "$virsh_list" ]; then
        for vm in $virsh_list; do
            virsh emulatorpin $vm --cpulist $cpu_list
        done
    fi
fi

> This is not persistent, so any time a VM is started from Nova, the emulator
> thread will need to be moved again. However, if you do not experience a
> degradation (boot-up time is adequate and you don't use vcpu0 for packet
> processing), this step is not necessary.
Regarding comment 3: we do not recommend that anyone re-pin the emulator
thread unless they experience specific performance problems. We realize this
is not desirable. I do believe
https://review.openstack.org/#/c/284094/10/specs/ocata/approved/libvirt-emulator-threads-policy.rst
will completely eliminate any desire to manually pin emulator threads.
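For reference, that spec proposes isolating the emulator thread via a Nova
flavor extra spec; assuming it lands as proposed, usage would look something
like this (the flavor name is a placeholder):

# reserve an additional dedicated host cpu for the emulator thread,
# separate from the pinned vcpus (proposed hw:emulator_threads_policy)
openstack flavor set nfv-flavor --property hw:emulator_threads_policy=isolate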