Bug 1331092 - kvm+THP/corruption in 4.5.x kernel
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 24
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks: 1331113
Reported: 2016-04-27 17:11 UTC by Dr. David Alan Gilbert
Modified: 2016-05-09 09:21 UTC (History)

Fixed In Version: kernel-4.5.3-300.fc24
Doc Type: Bug Fix
Doc Text:
Clone Of:
: 1331113 (view as bug list)
Environment:
Last Closed: 2016-05-08 10:29:38 UTC
Type: Bug



Description Dr. David Alan Gilbert 2016-04-27 17:11:51 UTC
Description of problem:
Hi,
  There's a KVM+THP corruption in 4.5 (and current 4.6) kernels that I've triggered on Fedora while testing the QEMU postcopy feature.  There's a chance it might also trigger with the more common balloon feature, so it's probably worth fixing, since the symptom is random guest memory corruption.

Fixed by Andrea Arcangeli's patch:
[PATCH 1/1] mm: thp: kvm: fix memory corruption in KVM with THP enabled
  currently being discussed on lkml/linux-mm/qemu-devel

Version-Release number of selected component (if applicable):
4.5, 4.6 kernels

How reproducible:
Ah well, that's rather complicated; I can reproduce it 100% of the time in a nested guest, and in about 1 in 1000 runs of my test suite on a real host.  It disappears if you turn THP off or apply the patch noted above.
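
Since the corruption disappears with THP turned off, it helps to know which THP mode a host is actually running before drawing conclusions from a test run. The following is a minimal sketch, not part of the original report, that reads the standard sysfs switch; the helper names `parse_thp_mode` and `current_thp_mode` are made up for illustration.

```python
from pathlib import Path

# Standard sysfs location of the THP mode switch on Linux.
THP_ENABLED = Path("/sys/kernel/mm/transparent_hugepage/enabled")


def parse_thp_mode(text):
    """Return the active mode from a string such as 'always madvise [never]'."""
    # The kernel marks the active mode by wrapping it in square brackets.
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    raise ValueError(f"no active mode marked in {text!r}")


def current_thp_mode(path=THP_ENABLED):
    """Read the sysfs file and return the active mode, or None if it is absent."""
    try:
        return parse_thp_mode(path.read_text())
    except FileNotFoundError:
        # Kernel built without THP support, or not a Linux host.
        return None


if __name__ == "__main__":
    print("THP mode:", current_thp_mode())
```

A result of `never` means THP is off; writing `never` to the same file as root is the usual way to turn it off for a comparison run.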

Steps to Reproduce:
I've seen reports that migrating a busy guest using postcopy will hang on 4.5,
but I'm about to post tests/postcopy-test for QEMU.  I run it repeatedly; it fails the first time in a VM, but only randomly on real hardware.
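
Because the failure is intermittent on real hardware, the test has to be run many times and the failures counted. A rough sketch of such a loop follows; it is not the harness used in this report, the function name `count_failures` is hypothetical, and it assumes the test binary exits non-zero when it detects a mismatch.

```python
import subprocess


def count_failures(cmd, runs):
    """Run cmd `runs` times and return how many runs exited non-zero."""
    failures = 0
    for _ in range(runs):
        # Discard the test's output; only the exit status matters here.
        result = subprocess.run(
            cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
        if result.returncode != 0:
            failures += 1
    return failures


# Example (path to the built test is an assumption):
#   count_failures(["./tests/postcopy-test"], 1000)
```

At a failure rate of roughly 1 in 1000, a single clean pass says very little, so the run count should be well above that before concluding a kernel is unaffected.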

Actual results:
A migrated VM whose contents aren't quite the same as the source.

Expected results:
A nice happy migrated VM

Additional info:

Comment 1 Justin M. Forbes 2016-04-28 19:28:07 UTC
This patch has been added to rawhide and f24 kernels. It should make the next build.

Comment 2 Fedora Update System 2016-05-05 12:15:47 UTC
kernel-4.5.3-300.fc24 has been submitted as an update to Fedora 24. https://bodhi.fedoraproject.org/updates/FEDORA-2016-4ce97823af

Comment 3 Dr. David Alan Gilbert 2016-05-05 12:31:19 UTC
It's worth keeping an eye out for other THP fixes going in; I know there's at least one other being discussed, affecting VFIO.

Dave

Comment 4 Fedora Update System 2016-05-06 11:28:27 UTC
kernel-4.5.3-300.fc24 has been pushed to the Fedora 24 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-4ce97823af

Comment 5 Fedora Update System 2016-05-08 10:28:48 UTC
kernel-4.5.3-300.fc24 has been pushed to the Fedora 24 stable repository. If problems still persist, please make note of it in this bug report.

Comment 6 Dr. David Alan Gilbert 2016-05-09 09:21:02 UTC
I can confirm 4.5.3-300.fc24 fixes it for me.

