Description of problem:
Hi,
There's a KVM+THP corruption in 4.5 (and current 4.6) kernels that I've triggered on Fedora while testing the QEMU Postcopy feature. There's a chance it might also trigger with the more common Balloon feature, so it's probably worth fixing, since the symptom is random guest memory corruption.

Fixed by Andrea Arcangeli's patch:
[PATCH 1/1] mm: thp: kvm: fix memory corruption in KVM with THP enabled
currently being discussed on lkml/linux-mm/qemu-devel.

Version-Release number of selected component (if applicable):
4.5, 4.6 kernels

How reproducible:
Ah well, that's rather complicated; I can reproduce it 100% of the time in a nest, and in about 1 in 1000 runs of my test suite on a real host. It disappears if you turn THP off or include the patch noted.

Steps to Reproduce:
I've seen reports that migrating a busy guest using postcopy will hang on 4.5, but I'm about to post tests/postcopy-test for qemu. I run it repeatedly; it fails the first time on a VM but randomly on hardware.

Actual results:
A migrated VM whose contents aren't quite the same as the source.

Expected results:
A nice happy migrated VM

Additional info:
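Since the problem disappears with THP off, it may help to confirm the host's THP mode before running the test. The following is a minimal sketch, not part of the original report: it assumes the standard sysfs file /sys/kernel/mm/transparent_hugepage/enabled, where the active mode is shown in square brackets, and the helper names are hypothetical.

```python
from pathlib import Path

# Standard sysfs location of the THP setting on Linux hosts (assumed present).
THP_ENABLED = Path("/sys/kernel/mm/transparent_hugepage/enabled")


def parse_thp_mode(text):
    """Return the active mode from text such as 'always [madvise] never'."""
    for token in text.split():
        # The kernel marks the active mode by wrapping it in brackets.
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    raise ValueError(f"no active mode marked in {text!r}")


def current_thp_mode(path=THP_ENABLED):
    """Read the sysfs file and return the active THP mode."""
    return parse_thp_mode(path.read_text())


if __name__ == "__main__":
    try:
        print("THP mode:", current_thp_mode())
    except OSError as exc:
        print("could not read THP setting:", exc)
```

A result of "never" means THP is off. Writing "never" to the same file as root should turn it off until the next reboot.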
This patch has been added to rawhide and f24 kernels. It should make the next build.
kernel-4.5.3-300.fc24 has been submitted as an update to Fedora 24. https://bodhi.fedoraproject.org/updates/FEDORA-2016-4ce97823af
It's worth keeping an eye out for other THP fixes going in; I know there's at least one other being discussed affecting VFIO. Dave
kernel-4.5.3-300.fc24 has been pushed to the Fedora 24 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-4ce97823af
kernel-4.5.3-300.fc24 has been pushed to the Fedora 24 stable repository. If problems still persist, please make note of it in this bug report.
I can confirm 4.5.3-300.fc24 fixes it for me.