Description of problem:
Steal time doesn't work on POWER.

Version-Release number of selected component (if applicable):
kernel-3.10.0-495.el7.ppc64le
qemu-kvm-rhev-2.6.0-22.el7.ppc64le
RHEL7.3 - with kernel-3.10.0-495.el7.ppc64le

How reproducible:
5/5

Steps to Reproduce:
Settings
# mount -t cgroup -o cpuset cpuset /cgroup
# cd /cgroup

1. Create cgroups
# mkdir cpuset1

2. Set cpus/mems
# echo 0 > cpuset1/cpuset.cpus   [0 means host CPU 0]
# echo 0 > cpuset1/cpuset.mems   [0 means host NUMA node 0]
or
# echo 120 > cpuset1/cpuset.cpus
# echo 1 > cpuset1/cpuset.mems

My host:
available: 2 nodes (0-1)
node 0 cpus: 0 8 16 24 32 40 48 56
node 0 size: 131072 MB
node 0 free: 121412 MB
node 1 cpus: 64 72 80 88 96 104 112 120
node 1 size: 131072 MB
node 1 free: 126452 MB
node distances:
node   0   1
  0:  10  40
  1:  40  10

3. Boot two guests
/usr/libexec/qemu-kvm -smp 1...

root 6252 41.2 0.6 2446592 1845056 pts/2 SLl+ 01:18 5:06 /usr/libexec/qemu-kvm -name virt-tests-vm1 -sandbox off -machine pseries -nodefaults -vga std -serial unix:/tmp/socket-mazhang,server,nowait -qmp tcp:0:2221,server,nowait -m 2G -smp 1 -device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pci.0,addr=06,disable-legacy=off,disable-modern=on -drive id=drive_disk1,if=none,snapshot=off,aio=threads,file=/home/RHEL2.qcow2 -device scsi-hd,id=disk1,drive=drive_disk1,bootindex=0 -vnc :0 -rtc base=utc,clock=host -boot menu=on -enable-kvm -monitor stdio -device virtio-mouse-pci,id=mouse0 -device virtio-keyboard-pci,id=kbd0 -chardev pty,id=pty0

root 6274 43.1 0.6 2406528 1842496 pts/0 SLl+ 01:18 5:13 /usr/libexec/qemu-kvm -name virt-tests-vm1 -sandbox off -machine pseries -nodefaults -vga std -serial unix:/tmp/socket-mazhang,server,nowait -qmp tcp:0:6661,server,nowait -m 2G -smp 1 -device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pci.0,addr=06,disable-legacy=off,disable-modern=on -drive id=drive_disk1,if=none,snapshot=off,aio=threads,file=/home/RHEL.qcow2 -device scsi-hd,id=disk1,drive=drive_disk1,bootindex=0 -vnc :1 -rtc base=utc,clock=host -boot menu=on -enable-kvm -monitor stdio -device virtio-mouse-pci,id=mouse0 -device virtio-keyboard-pci,id=kbd0 -chardev pty,id=pty0

4. Echo the PIDs of both guests into tasks (including their threads)
# echo xxx > cpuset1/tasks

5. Run stress in both guests, e.g.:
for((;;));do x=1;done

6. Check the steal time inside both guests
# top

Actual results:
st is 0% in both guests.

Expected results:
st is about 50% in each of the two guests.

Additional info:
I checked with the x86 folks; they cannot reproduce it on the x86 platform. If there are any issues, please let me know.
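For a number that is easier to compare across runs than the st column in top, the steal counter can be read directly from /proc/stat inside each guest. The following is only a sketch: steal_pct is a hypothetical helper name, and it assumes the standard Linux field order on the aggregate "cpu" line (user nice system idle iowait irq softirq steal guest guest_nice).

```shell
# steal_pct BEFORE AFTER
# BEFORE and AFTER are two samples of the aggregate "cpu ..." line from
# /proc/stat. Prints the integer percentage of the elapsed jiffies between
# the samples that were accounted as steal.
steal_pct() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        {
            # $1 is the "cpu" label; $2..$9 are user..steal.
            # guest/guest_nice ($10, $11) are already included in user/nice,
            # so they are left out of the total.
            total = 0
            for (i = 2; i <= 9; i++) total += $i
            if (NR == 1) { t0 = total; s0 = $9 }
            else         { t1 = total; s1 = $9 }
        }
        END {
            dt = t1 - t0
            if (dt <= 0) { print 0; exit }
            printf "%d\n", (s1 - s0) * 100 / dt
        }'
}

# Usage inside a guest while the stress loop is running:
#   before=$(grep '^cpu ' /proc/stat)
#   sleep 5
#   after=$(grep '^cpu ' /proc/stat)
#   steal_pct "$before" "$after"
```

With two single-vCPU guests contending for one host thread, the expectation stated in this report is a value near 50 in each guest.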
I've discussed this with Paul Mackerras at IBM, and I believe we've determined the cause.

This is a side-effect of the way that in normal operation a POWER host can run more guest threads than host threads - that's because hardware-level threads can be used in the guest, but not in the host, due to restrictions of the virtualization hardware.

More specifically, the dynamic multithreading code we include means that although both VMs are bound to the same host thread, they could actually run on different "subcores" of the host core. When this is the case, it won't get accounted as stolen time (the two VMs may still affect each other's performance, but how much depends on whether the workloads on each are using the same functional units in the CPU, so it can't be measured as an amount of time).

The test case will need to be adjusted for Power. There are two obvious ways to do this:

1) Disable dynamic multi-threading:
     echo 0 >/sys/module/kvm_hv/parameters/dynamic_mt_modes
   With this executed before performing the test, the stolen time results should be as expected.

2) Increase each VM to 8 threads, and run 8 stress threads on each VM.
   This ensures that each VM occupies a whole host core, so they can't be packed onto the same core at the same time.
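If the test harness should leave the host as it found it, option 1 can be wrapped so the previous setting is saved and put back afterwards. This is a sketch under assumptions, not part of the fix: the function names and the DYN_MT_PARAM override are hypothetical, and only the sysfs path comes from this comment. It must run as root on the host.

```shell
# Path from option 1 above. DYN_MT_PARAM is a hypothetical override so the
# functions can be tried against an ordinary file first.
DYN_MT_PARAM=${DYN_MT_PARAM:-/sys/module/kvm_hv/parameters/dynamic_mt_modes}

# Remember the current value, then disable dynamic multi-threading.
disable_dynamic_mt() {
    if [ ! -w "$DYN_MT_PARAM" ]; then
        echo "cannot write $DYN_MT_PARAM (kvm_hv not loaded, or not root?)" >&2
        return 1
    fi
    saved_dynamic_mt=$(cat "$DYN_MT_PARAM")
    echo 0 > "$DYN_MT_PARAM"
}

# Put back whatever value disable_dynamic_mt saved; no-op if nothing was saved.
restore_dynamic_mt() {
    [ -n "${saved_dynamic_mt:-}" ] || return 0
    echo "$saved_dynamic_mt" > "$DYN_MT_PARAM"
}

# Usage on the host:
#   disable_dynamic_mt || exit 1
#   ... run the steal time test ...
#   restore_dynamic_mt
```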