Description of problem:
When running a large SEV VM, ksmd can be seen chugging away at high CPU usage with no hope of ever actually merging pages.

Version-Release number of selected component (if applicable):
qemu-kvm-common-4.1.0-14.module+el8.1.0+5346+c31201bb.1.x86_64

How reproducible:
100%?

Steps to Reproduce:
1. Start a SEV VM on a host with lots of RAM and give the guest lots of RAM (I used a 200GB guest in my case, but I doubt it needs to be that big).
2. Start 'top' on the host, while leaving the guest idle.

Actual results:
KSM is constantly using a considerable amount of CPU. It started off at about 20% for me, but rose to 70% (of a core) and stayed there for over half an hour.

Expected results:
Sane ksm usage

Additional info:
SEV encrypts pages, meaning that the host kernel never sees real page data, and the data looks random, so it can't really merge it. We should probably turn off 'mem-merge' on SEV VMs.
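To illustrate the suggestion of turning off 'mem-merge' for SEV guests, here is a minimal sketch that builds the value of QEMU's -machine option. The helper name machine_option and the object id "sev0" are made up for this example; mem-merge and memory-encryption are the -machine properties as they exist in the QEMU versions discussed here. This is not how QEMU or libvirt implement it, just a way to show the intended command-line shape.

```python
from typing import Optional


def machine_option(machine_type: str, sev_object_id: Optional[str]) -> str:
    """Build the value passed to QEMU's -machine option.

    machine_option and its parameters are hypothetical names for this sketch.
    When a memory-encryption object id is given, mem-merge is switched off,
    because the host cannot see the guest's plaintext pages and so KSM has
    nothing it could merge.
    """
    parts = [machine_type]
    if sev_object_id is not None:
        parts.append(f"memory-encryption={sev_object_id}")
        parts.append("mem-merge=off")
    return ",".join(parts)
```

For example, machine_option("q35", "sev0") yields a string that could follow -machine on the command line, while machine_option("q35", None) leaves the mem-merge default untouched.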
Posted upstream fix: [PATCH] machine/memory encryption: Disable mem merge
QEMU has been recently split into sub-components and as a one-time operation to avoid breakage of tools, we are setting the QEMU sub-component of this BZ to "General". Please review and change the sub-component if necessary the next time you review this BZ. Thanks
This is now merged upstream as 4ba59be1d6d8c57941841a505cb4656628d582d0.

Given that disabling ksm manually is an OK workaround, I don't intend to backport it unless someone requests it.

Moving to POST and marking fixed in 5.0.
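For anyone weighing the manual workaround, a rough way to tell whether KSM is burning CPU for nothing is to look at the kernel's KSM counters in sysfs. The sketch below assumes the standard /sys/kernel/mm/ksm files (run, pages_shared, pages_sharing, full_scans); the function names are invented for this example, and the check is only a heuristic, since other processes on the host may still have mergeable pages.

```python
from pathlib import Path

# Standard location of the KSM control and statistics files on Linux.
KSM_DIR = Path("/sys/kernel/mm/ksm")


def read_ksm_counters(ksm_dir=KSM_DIR):
    """Read a few KSM sysfs files and return them as integers."""
    names = ("run", "pages_shared", "pages_sharing", "full_scans")
    return {
        name: int((Path(ksm_dir) / name).read_text().strip())
        for name in names
    }


def scanning_without_merging(counters):
    """Heuristic: KSM is running, has completed at least one full scan,
    and still has no pages sharing a merged copy."""
    return (
        counters["run"] == 1
        and counters["full_scans"] > 0
        and counters["pages_sharing"] == 0
    )
```

If scanning_without_merging(read_ksm_counters()) is true on a host that only runs SEV guests, stopping KSM by hand costs nothing in memory savings.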
Reproduced the bug with qemu-kvm-4.2.0-29.module+el8.2.1+7297+a825794d.x86_64. As the test machine has 64GB of memory, I installed a VM with 50GB of RAM. After 40 mins, ksmd takes 13% CPU.

Version:
kernel-4.18.0-193.13.2.el8_2.x86_64
qemu-kvm-4.2.0-29.module+el8.2.1+7297+a825794d.x86_64

Steps:
1. Start a VM with 50GB RAM, and leave it idle.
2. systemctl start/enable ksm
3. systemctl status ksm, and check that its status is enabled.
4. Start top on the host.
5. Wait for 40 mins.

Results:
After Step 4, ksmd CPU usage is around 1.6%, rising to approximately 13%.

Verified the bug with qemu-kvm-5.0.0-0.module+el8.3.0+6620+5d5e1420.x86_64; top did not show any ksmd CPU usage.

Version:
kernel-4.18.0-224.el8.x86_64
qemu-kvm-5.0.0-2.module+el8.3.0+7379+0505d6ca.x86_64

Steps:
1. Start a VM with 50GB RAM, and leave it idle.
2. systemctl start/enable ksm
3. systemctl status ksm, and check that its status is enabled.
4. Start top on the host.
5. Wait for 40 mins.

Actual Result:
After Step 4 and Step 5, the ksm service status is active, and top shows no ksmd CPU usage, either at the beginning or after 40 mins.

● ksm.service - Kernel Samepage Merging
   Loaded: loaded (/usr/lib/systemd/system/ksm.service; enabled; vendor preset: enabled)
   Active: active (exited) since Tue 2020-07-21 08:22:25 EDT; 1min 18s ago
 Main PID: 36612 (code=exited, status=0/SUCCESS)
    Tasks: 0 (limit: 407449)
   Memory: 0B
   CGroup: /system.slice/ksm.service

Jul 21 08:22:25 dell-per7425-02.khw.lab.eng.bos.redhat.com systemd[1]: Starting Kernel Samepage Merging...
Jul 21 08:22:25 dell-per7425-02.khw.lab.eng.bos.redhat.com systemd[1]: Started Kernel Samepage Merging.
****************************************************************
From the beginning:

Tasks: 989 total,   2 running, 987 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem :  63865.6 total,   7913.6 free,  53241.7 used,   2710.3 buff/cache
MiB Swap:  32096.0 total,  32096.0 free,      0.0 used.   9981.1 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
34197 qemu      20   0   55.0g  50.1g  22052 S   1.3  80.3   1:56.46 qemu-kvm
36989 root      20   0   62684   5660   3784 R   1.0   0.0   0:01.74 top
34252 root      20   0       0      0      0 S   0.3   0.0   0:01.03 kvm-pit/34197
    1 root      20   0  247864  14756   9412 S   0.0   0.0   0:06.99 systemd
    2 root      20   0       0      0      0 S   0.0   0.0   0:00.06 kthreadd

****************************************************************
After 60 mins:

Tasks: 995 total,   1 running, 994 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem :  63865.6 total,   7879.5 free,  53266.7 used,   2719.4 buff/cache
MiB Swap:  32096.0 total,  32096.0 free,      0.0 used.   9954.3 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
34197 qemu      20   0   55.0g  50.1g  22052 S   2.3  80.3   2:44.95 qemu-kvm
38106 root      20   0   62684   5680   3796 R   0.7   0.0   0:14.98 top
 2304 root      20   0  125380   6028   4908 S   0.3   0.0   0:04.59 irqbalance
    1 root      20   0  247864  14756   9412 S   0.0   0.0   0:07.01 systemd
    2 root      20   0       0      0      0 S   0.0   0.0   0:00.07 kthreadd

Expected results:
Sane ksm usage

Needinfo: could you please check the test steps and the actual result is as expected, as ksmd is not monitored in the cpu usage, thank you.
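Checking top by eye works, but the same check can be scripted. The sketch below is one possible way to pull the %CPU value for a given command out of captured top output (for example from top -b -n 1); the function name cpu_percent_for is invented for this example, and it assumes the default column layout shown in this comment, where %CPU is the ninth column and COMMAND the twelfth.

```python
from typing import Optional


def cpu_percent_for(top_output: str, command: str) -> Optional[float]:
    """Return the %CPU of the first process row whose COMMAND matches.

    Assumes top's default columns:
    PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
    Returns None when no row for the command is present.
    """
    for line in top_output.splitlines():
        fields = line.split()
        # Process rows start with a numeric PID and have at least 12 columns;
        # this skips the summary lines and the column header.
        if len(fields) < 12 or not fields[0].isdigit():
            continue
        if fields[11] == command:
            return float(fields[8])
    return None
```

A result of None for "ksmd" only means the process was not listed in that snapshot, so it is worth sampling more than once before drawing conclusions.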
(In reply to zixchen from comment #8)
> Needinfo: could you please check the test steps and the actual result is as
> expected, as ksmd is not monitored in the cpu usage, thank you.

The test needs to be running the VM with SEV enabled - are you doing that?
(In reply to Dr. David Alan Gilbert from comment #9)
> The test needs to be running the VM with SEV enabled - are you doing that?

Yes, SEV is enabled in the VM.

Steps:
1. ssh login to the VM.
2. dmesg | grep sev

After Step 2:
[ 0.001000] AMD Secure Encrypted Virtualization (SEV) active
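The guest-side check above can also be automated in a test script. This is a small sketch that looks for the SEV banner in captured dmesg text; the function name sev_active_in_dmesg is invented for this example, and the banner string is taken from the output quoted in this comment, so other kernel versions may word it differently. The match is done case-insensitively.

```python
def sev_active_in_dmesg(dmesg_text: str) -> bool:
    """Return True if the guest kernel log reports SEV as active.

    The banner text comes from the dmesg line quoted in this bug;
    treat it as an assumption for other kernel versions.
    """
    banner = "amd secure encrypted virtualization (sev) active"
    return any(banner in line.lower() for line in dmesg_text.splitlines())
```

The text could be collected inside the guest with subprocess.run(["dmesg"], capture_output=True, text=True).stdout and passed to the function.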
OK, then great, that test is fine.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (virt:8.3 bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:5137