Bug 1011786

Summary: Virt Tuning/Optimization Guide: Feedback from Perf Team to Integrate
Product: Red Hat Enterprise Linux 7
Reporter: Dayle Parker <dayleparker>
Component: doc-Virtualization_Tuning_and_Optimization_Guide
Assignee: Dayle Parker <dayleparker>
Status: CLOSED CURRENTRELEASE
QA Contact: ecs-bugs
Severity: high
Priority: high
Version: 7.0
CC: bgray, jeder, lnovich, perfbz
Target Milestone: rc
Keywords: Documentation
Target Release: ---
Hardware: All
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 912431
Blocks: 1126115
Environment:
Last Closed: 2014-06-13 06:05:16 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Bug Depends On: 912431
Bug Blocks: 1126115
Deadline: 2014-04-04

Comment 4 Bill Gray 2014-04-03 18:38:28 UTC
2.3.1 Option: Available CPUs

[[ Some background info follows, but you should probably rewrite it if you decide to use any of it. ]]

CPUs are overcommitted when the total number of vCPUs assigned to all guests on the system exceeds the number of host CPUs. This can happen with a single large guest or with several smaller ones.

If the guests will not be busy at the same time, it can be fine to overcommit resources, since not all CPUs will be loaded simultaneously. If the guests will be under heavy or unpredictable load, overcommitting resources can lead to poor performance.
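A rough sketch of this check (the per-guest vCPU counts below are hypothetical placeholders; on a real host they would come from something like `virsh vcpucount --live <guest>`):

```shell
# Compare the total number of guest vCPUs against the host CPU count.
host_cpus=$(nproc)
guest_vcpus="4 4 8"   # hypothetical per-guest counts, e.g. from virsh vcpucount
total=0
for v in $guest_vcpus; do
    total=$((total + v))
done
if [ "$total" -gt "$host_cpus" ]; then
    echo "overcommitted: $total vCPUs on $host_cpus host CPUs"
else
    echo "not overcommitted: $total vCPUs on $host_cpus host CPUs"
fi
```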

If hyper-threading is active on the host, we should distinguish between logical CPUs and physical CPUs, since sibling hyper-threads share hardware resources. Although a hyper-thread (logical) CPU is counted as a real CPU, it has comparatively little dedicated hardware behind it. Depending on the application, the best performance might be achieved by limiting the number of vCPUs to the number of real, physical cores on the host.
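One way to see the logical/physical split is the `CPU,CORE` columns of `lscpu -p`. The sample data below is hypothetical (4 logical CPUs on 2 physical cores); on a real host, feed in the command output instead:

```shell
# Count logical CPUs vs. physical cores from `lscpu -p=CPU,CORE`-style output.
# Hypothetical sample: CPUs 0-3, where CPUs 0/2 and 1/3 are hyper-thread pairs.
sample="0,0
1,1
2,0
3,1"
logical=$(printf '%s\n' "$sample" | wc -l)
physical=$(printf '%s\n' "$sample" | cut -d, -f2 | sort -u | wc -l)
echo "$logical logical CPUs, $physical physical cores"
```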

If possible, best performance on NUMA systems is achieved by limiting guest size to the amount of resources on a single NUMA node. If small to medium guests are being used, try to size them so they pack evenly into NUMA nodes, and avoid splitting a guest across NUMA nodes unnecessarily.
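For example, a guest's memory can be bound to a single node with a `<numatune>` element in its libvirt domain XML (node 0 here is only an illustration; pick the node the guest's vCPUs run on):

```xml
<!-- Fragment of a libvirt domain XML: bind guest memory to NUMA node 0. -->
<numatune>
  <memory mode='strict' nodeset='0'/>
</numatune>
```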


2.3.4. Option: CPU Pinning

[[ It is not clear where NUMA mask info would be used here. ]]
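One place a CPU mask does appear is the `cpuset` attribute of libvirt's `<cputune>` pinning elements. A hypothetical fragment pinning two vCPUs to host CPUs on one node (the CPU numbers are illustrative):

```xml
<!-- Fragment of a libvirt domain XML: pin vCPUs 0 and 1 to host CPUs 0 and 1.
     Choose host CPUs from the NUMA node that holds the guest's memory. -->
<cputune>
  <vcpupin vcpu='0' cpuset='0'/>
  <vcpupin vcpu='1' cpuset='1'/>
</cputune>
```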


7.4: libvirt NUMA Tuning

[[ will want to use "# numastat -cm qemu"  -- still need to provide output ]]


7.5: in the Important box:

[[ "When KSM is used on a NUMA host..." should be clarified to "When KSM is merging across nodes on a NUMA host..."  This change will make it more obviously an explanation of the impact of the option in the previous paragraph.  ]]
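For reference, cross-node merging is controlled by the standard KSM sysfs knob; a sketch of checking and disabling it (root required):

```shell
# Check whether KSM merges pages across NUMA nodes (1 = yes, the default).
cat /sys/kernel/mm/ksm/merge_across_nodes
# Disable cross-node merging. The kernel only accepts this change when no
# pages are currently merged (write 2 to /sys/kernel/mm/ksm/run first).
echo 0 > /sys/kernel/mm/ksm/merge_across_nodes
```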

Comment 6 Bill Gray 2014-04-15 20:26:28 UTC
Numastat showing some virtual guests with mediocre memory alignment:

# numastat -c qemu-kvm

Per-node process memory usage (in MBs)
PID              Node 0 Node 1 Node 2 Node 3 Node 4 Node 5 Node 6 Node 7 Total
---------------  ------ ------ ------ ------ ------ ------ ------ ------ -----
51722 (qemu-kvm)     68     16    357   6936      2      3    147    598  8128
51747 (qemu-kvm)    245     11      5     18   5172   2532      1     92  8076
53736 (qemu-kvm)     62    432   1661    506   4851    136     22    445  8116
53773 (qemu-kvm)   1393      3      1      2     12      0      0   6702  8114
---------------  ------ ------ ------ ------ ------ ------ ------ ------ -----
Total              1769    463   2024   7462  10037   2672    169   7837 32434


and after running numad to align the guests' CPUs and memory resources:

# numastat -c qemu-kvm

Per-node process memory usage (in MBs)
PID              Node 0 Node 1 Node 2 Node 3 Node 4 Node 5 Node 6 Node 7 Total
---------------  ------ ------ ------ ------ ------ ------ ------ ------ -----
51747 (qemu-kvm)      0      0      7      0   8072      0      1      0  8080
53736 (qemu-kvm)      0      0      7      0      0      0   8113      0  8120
53773 (qemu-kvm)      0      0      7      0      0      0      1   8110  8118
59065 (qemu-kvm)      0      0   8050      0      0      0      0      0  8051
---------------  ------ ------ ------ ------ ------ ------ ------ ------ -----
Total                 0      0   8072      0   8072      0   8114   8110 32368


The "-c" option just makes the display somewhat more compact; see the numastat man page. The "-m" option adds system-wide memory information displayed on a per-node basis, for example:

# numastat -cm 

Per-node system memory usage (in MBs):
                 Node 0 Node 1 Node 2 Node 3 Node 4 Node 5 Node 6 Node 7   Total
                 ------ ------ ------ ------ ------ ------ ------ ------ -------
MemTotal         131061 131072 131072 131072 131072 131072 131072 131072 1048565
MemFree          126550 128178 128244 127849 128275 128209 128273 127545 1023123
MemUsed            4511   2894   2828   3223   2797   2863   2799   3527   25442
Active             1305     30     29    298      1     53      8    544    2269
Inactive              8      5      1     68      8      7      4     60     160
Active(anon)          3      7     19      9      0     21      4     15      79
Inactive(anon)        0      0      0      0      8      0      0      8      16
Active(file)       1302     23     10    288      1     32      5    529    2191
Inactive(file)        8      5      1     67      0      7      4     52     144
Unevictable           0      0      0      0      0      0      0      0       0
Mlocked               0      0      0      0      0      0      0      0       0
Dirty                 0      0      0      0      0      0      0      0       0
Writeback             0      0      0      0      0      0      0      0       0
FilePages          1311     28     11    356     10     39      9    588    2352
Mapped                0      0      4      0      8      1      3     16      32
AnonPages             3      7     19      9      0     21      3     15      78
Shmem                 0      0      0      0      8      0      0      8      17
KernelStack           5      1      1      1      1      1      1      1      10
PageTables            0      0      1      1      0      1      1      1       5
NFS_Unstable          0      0      0      0      0      0      0      0       0
Bounce                0      0      0      0      0      0      0      0       0
WritebackTmp          0      0      0      0      0      0      0      0       0
Slab                 82     78     23     76     22     34     19    127     460
SReclaimable         47     56      3     48      5      9      3     96     267
SUnreclaim           34     23     20     28     17     24     16     31     193
AnonHugePages         0      0      0      0      0      0      0      6       6
HugePages_Total       0      0      0      0      0      0      0      0       0
HugePages_Free        0      0      0      0      0      0      0      0       0
HugePages_Surp        0      0      0      0      0      0      0      0       0