Red Hat Bugzilla – Bug 1068627
implement lazy save/restore of debug registers
Last modified: 2015-03-05 06:40:03 EST
Right now, KVM takes a vmexit for each debug register access. However, debug register accesses usually come in batches of 15-20 accesses, and at ~0.5 microseconds per access they quickly add up. We can batch debug accesses by setting a flag on the first access and synchronizing all accesses on the next vmexit (whatever the reason for that vmexit is).
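The batching idea can be sketched with a toy model. This is only an illustration under stated assumptions, not KVM's code: `ToyVcpu`, `guest_write_dr`, `vmexit` and `run_batch` are hypothetical names, and the 0.5 microsecond figure is the rough per-access cost quoted above. The first debug register access still exits, but it turns off interception and sets a dirty flag; the next vmexit of any kind copies the register values back and turns interception on again.

```python
from dataclasses import dataclass, field

NUM_DEBUG_REGS = 8    # DR0-DR7 on x86
EXIT_COST_US = 0.5    # rough per-exit cost quoted in the description


def _zeroed():
    return [0] * NUM_DEBUG_REGS


@dataclass
class ToyVcpu:
    """Toy model of a vCPU (hypothetical; not KVM's real data structures)."""
    lazy: bool
    hw_dr: list = field(default_factory=_zeroed)     # values live in "hardware"
    saved_dr: list = field(default_factory=_zeroed)  # hypervisor's saved copy
    intercept_dr: bool = True
    dirty: bool = False
    dr_exits: int = 0

    def guest_write_dr(self, index, value):
        if self.intercept_dr:
            self.dr_exits += 1            # this access caused a vmexit
            if self.lazy:
                # First access of a batch: stop intercepting, remember that
                # the hardware registers will diverge from the saved copy.
                self.intercept_dr = False
                self.dirty = True
            else:
                self.saved_dr[index] = value   # eager: emulate every access
        self.hw_dr[index] = value

    def vmexit(self):
        """Any later vmexit, whatever its reason."""
        if self.dirty:
            self.saved_dr = list(self.hw_dr)   # synchronize in one go
            self.dirty = False
            self.intercept_dr = True


def run_batch(lazy, accesses):
    vcpu = ToyVcpu(lazy=lazy)
    for i in range(accesses):
        vcpu.guest_write_dr(i % 4, 0x1000 + i)  # DR0-DR3 hold breakpoint addresses
    vcpu.vmexit()
    return vcpu


if __name__ == "__main__":
    for lazy in (False, True):
        vcpu = run_batch(lazy, accesses=20)
        print(f"lazy={lazy}: {vcpu.dr_exits} debug register exits, "
              f"~{vcpu.dr_exits * EXIT_COST_US} us")
```

In this model a batch of 20 accesses costs 20 exits when every access is intercepted and a single exit with the lazy scheme, while the saved copy ends up identical in both cases.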
The game Borderlands 2, running in a VM with an assigned Quadro GPU, is a good test for this problem. Instructions for enabling the console in the game can be found here: http://forums.gearboxsoftware.com/showpost.php?p=2763900&postcount=1 After enabling the console, start the game and turn on the FPS display by entering 'stat fps' in the game console. The current FPS and time per frame are shown in the upper right of the screen; note the FPS. If we avoid debug register access exits, the FPS will double. A dirty-tracking, lazy-restore implementation should give nearly the same results. The hv-time cpu option is also useful for tuning this application.
Should also include commits 8246bf52c75aa9b9b336a84f31ed2248754d0f71 and 73aaf249ee2287b4686ff079dcbdbbb658156e64 to bring debug register support on par with upstream. Regarding testing, there is a testcase in kvm-unit-tests' vmexit test (mov_dr) that should see a large improvement after the patches.
Patch(es) available on kernel-3.10.0-143.el7
I tried to test this with GPU passthrough, but hit Bug 1163757 - GPU passthrough with Quadro K5000 on HP Z620 host fails to work. Is there any other way to test this?
If you run the vmexit.flat test from kvm-unit-tests, the value for the "mov_dr" test will be higher in RHEL7.0 than RHEL7.1 (lower is better).
Results of all the vmexit.flat tests I ran; I am not sure which value is the key one. Each test was run as:

qemu-kvm -enable-kvm -device pc-testdev -device isa-debug-exit,iobase=0xf4,iosize=0x4 -display none -serial stdio -device pci-testdev -kernel x86/vmexit.flat -smp N -append TEST

with -smp 2 for ipi and ipi_halt, and -smp 1 for all other tests. Every run printed the same banner before the result ("enabling apic" appears twice with -smp 2):

enabling apic paging enabled cr0 = 80010011 cr3 = 7fff000 cr4 = 20 pci-testdev at 0x20 membar febf1000 iobar c000

and every run ended with "Return value from qemu: 1".

TEST              3.10.0-142.el7.x86_64   kernel-3.10.0-205.el7.x86_64
cpuid             3549                    3142
vmcall            3439                    2986
mov_from_cr8      11                      11
mov_to_cr8        15                      15
inl_from_pmtimer  20422                   19984
ipi               10661                   10015
ipi_halt          (no value printed)      (no value printed)
ple_round_robin   (no value printed)      (no value printed)
Can you try running it without -append? If you do not get mov_dr, you need an updated vmexit.flat.
3.10.0-205.el7.x86_64:
# qemu-kvm -enable-kvm -device pc-testdev -device isa-debug-exit,iobase=0xf4,iosize=0x4 -display none -serial stdio -device pci-testdev -kernel x86/vmexit.flat -smp 1
enabling apic paging enabled cr0 = 80010011 cr3 = 7fff000 cr4 = 20 pci-testdev at 0x20 membar febf1000 iobar c000
cpuid 4551
vmcall 4442
mov_from_cr8 11
mov_to_cr8 15
inl_from_pmtimer 39678
inl_from_qemu 39892
inl_from_kernel 15182
outl_to_kernel 6164
mov_dr 118
...

3.10.0-142.el7.x86_64:
# qemu-kvm -enable-kvm -device pc-testdev -device isa-debug-exit,iobase=0xf4,iosize=0x4 -display none -serial stdio -device pci-testdev -kernel x86/vmexit.flat -smp 1
enabling apic paging enabled cr0 = 80010011 cr3 = 7fff000 cr4 = 20 pci-testdev at 0x20 membar febf1000 iobar c000
cpuid 3351
vmcall 3251
mov_from_cr8 11
mov_to_cr8 15
inl_from_pmtimer 20103
inl_from_qemu 20082
inl_from_kernel 6421
outl_to_kernel 3846
mov_dr 3357
...

118 VS 3357, lower is better, verified.
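The mov_dr comparison can also be checked with a small script. This is a minimal sketch that assumes vmexit.flat prints one "name value" pair per line; `parse_vmexit_output` and `mov_dr_ratio` are hypothetical helper names, and the sample logs contain only a few of the values quoted in this comment.

```python
def parse_vmexit_output(text):
    """Collect 'name value' result lines, skipping banner lines and '...'."""
    results = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            results[parts[0]] = int(parts[1])
    return results


def mov_dr_ratio(old_log, new_log):
    """Return old/new for the mov_dr test (a ratio above 1 means improvement)."""
    old = parse_vmexit_output(old_log)
    new = parse_vmexit_output(new_log)
    if "mov_dr" not in old or "mov_dr" not in new:
        # Matches the advice earlier in the thread: no mov_dr line means
        # the vmexit.flat binary needs updating.
        raise KeyError("mov_dr missing; vmexit.flat may be too old")
    return old["mov_dr"] / new["mov_dr"]


# Excerpts of the two runs reported above.
LOG_142 = """\
cpuid 3351
vmcall 3251
mov_dr 3357
"""
LOG_205 = """\
cpuid 4551
vmcall 4442
mov_dr 118
"""

ratio = mov_dr_ratio(LOG_142, LOG_205)
print(f"mov_dr improved by about {ratio:.1f}x")
```

With the numbers from this comment, the mov_dr value drops by a factor of roughly 28 between 3.10.0-142.el7 and 3.10.0-205.el7.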
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHSA-2015-0290.html