Bug 1331113 - kvm+THP/corruption in 4.5.x kernel
Summary: kvm+THP/corruption in 4.5.x kernel
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: kernel-aarch64
Version: 7.3
Hardware: aarch64
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assignee: Andrew Jones
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On: 1331092
Blocks: 1174832
Reported: 2016-04-27 18:45 UTC by Andrew Jones
Modified: 2016-06-13 17:35 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of: 1331092
Environment:
Last Closed: 2016-06-13 17:35:28 UTC
Target Upstream Version:

Description Andrew Jones 2016-04-27 18:45:29 UTC
+++ This bug was initially created as a clone of Bug #1331092 +++

Description of problem:
Hi,
  There's a KVM+THP corruption bug in 4.5 (and current 4.6) kernels that I've triggered on Fedora while testing the QEMU postcopy feature.  It may also trigger with the more commonly used balloon feature, so it's probably worth fixing, since the symptom is random guest memory corruption.

Fixed by Andrea Arcangeli's patch:
[PATCH 1/1] mm: thp: kvm: fix memory corruption in KVM with THP enabled
  currently being discussed on lkml/linux-mm/qemu-devel

Version-Release number of selected component (if applicable):
4.5, 4.6 kernels

How reproducible:
Ah well, that's rather complicated: I can reproduce it 100% of the time in a nested setup (running the test inside a VM), and in about 1 in 1000 runs of my test suite on a real host.  It disappears if you turn THP off or apply the patch noted above.
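
Since the failure depends on THP being on, it can help to record the host's active THP mode before each run. A minimal Python sketch, assuming the usual sysfs file /sys/kernel/mm/transparent_hugepage/enabled, which marks the active mode in square brackets. The helper names parse_thp_mode and current_thp_mode are illustrative and not part of any tool:

```python
import re
from pathlib import Path

# Usual sysfs location of the THP mode; the file is absent when the
# kernel was built without THP support.
THP_ENABLED = Path("/sys/kernel/mm/transparent_hugepage/enabled")


def parse_thp_mode(text):
    """Return the bracketed (active) mode, e.g. 'madvise' from 'always [madvise] never'."""
    match = re.search(r"\[(\w+)\]", text)
    return match.group(1) if match else None


def current_thp_mode(path=THP_ENABLED):
    """Return the active THP mode, or None if the file is missing or unreadable."""
    try:
        return parse_thp_mode(path.read_text())
    except OSError:
        return None
```

Calling current_thp_mode() on the host returns the active mode as a string, or None when the kernel exposes no THP control file.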

Steps to Reproduce:
I've seen reports that migrating a busy guest using postcopy will hang on 4.5,
but the reproducer I use is tests/postcopy-test for qemu, which I'm about to post.  I run it repeatedly; it fails on the first run inside a VM, but only randomly on hardware.
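
The "run it repeatedly" step could be scripted roughly as below. This is only a sketch: run_until_failure is a hypothetical helper, and it assumes the test command reports failure through a non-zero exit status.

```python
import subprocess
import sys


def run_until_failure(cmd, max_runs=1000):
    """Run cmd repeatedly; return the 1-based index of the first failing run, or None."""
    for attempt in range(1, max_runs + 1):
        result = subprocess.run(cmd)
        # Assumption: a non-zero exit status means the test failed.
        if result.returncode != 0:
            return attempt
    return None
```

For example, run_until_failure(["tests/postcopy-test"]) would loop the test named above; its actual path in a qemu build tree is an assumption here.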

Actual results:
A migrated VM whose contents aren't quite the same as the source.

Expected results:
A nice happy migrated VM

Additional info:

Comment 1 Andrew Jones 2016-06-13 17:35:28 UTC
I see we don't have THP enabled on the RHELSA kernel. Closing as won't fix.
