Bug 1449037 - Dst qemu quit when migrate guest with hugepage and total memory is not a multiple of pagesize
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: qemu-kvm-rhev
Version: 7.4
Hardware: x86_64
OS: Linux
Target Milestone: rc
Assignee: Dr. David Alan Gilbert
QA Contact: xianwang
URL:
Whiteboard:
Depends On:
Blocks: 1376765
 
Reported: 2017-05-09 06:22 UTC by Yumei Huang
Modified: 2017-08-02 04:38 UTC (History)

Fixed In Version: qemu-kvm-rhev-2.9.0-6.el7
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-08-02 04:38:29 UTC




Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2017:2392 normal SHIPPED_LIVE Important: qemu-kvm-rhev security, bug fix, and enhancement update 2017-08-01 20:04:36 UTC

Description Yumei Huang 2017-05-09 06:22:19 UTC
Description of problem:
Boot a guest backed by hugepages whose total memory is not a multiple of the hugepage size, then do a local migration. The migration fails and the destination qemu process quits with the error message:

(qemu) qemu-kvm: Illegal RAM offset 40100000
qemu-kvm: error while loading state section id 4(ram)
qemu-kvm: load of migration failed: Invalid argument


Version-Release number of selected component (if applicable):
qemu-kvm-rhev-2.9.0-3.el7
3.10.0-661.el7.x86_64

How reproducible:
always

Steps to Reproduce:
1. Configure hugepages on the host

# cat /proc/meminfo  | grep -i huge
AnonHugePages:   1705984 kB
HugePages_Total:    3000
HugePages_Free:     3000
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB

# mount | grep mnt
none on /mnt/kvm_hugepage type hugetlbfs (rw,relatime,seclabel)


2. Boot src guest with hugepage and '-m 1025'

# /usr/libexec/qemu-kvm -m 1025 rhel74-1-1.qcow2 -mem-path /mnt/kvm_hugepage -monitor stdio -vnc :0


3. Boot dst guest with same cmdline and '-incoming tcp:0:5555'

# /usr/libexec/qemu-kvm -m 1025 rhel74-1-1.qcow2 -mem-path /mnt/kvm_hugepage -monitor stdio -vnc :1 -incoming tcp:0:5555

4. Start the migration from the src guest
(qemu)  migrate -d tcp:127.0.0.1:5555


Actual results:
Dst qemu quits with:
(qemu) qemu-kvm: Illegal RAM offset 40100000
qemu-kvm: error while loading state section id 4(ram)
qemu-kvm: load of migration failed: Invalid argument
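
The offset in the error message is hexadecimal. A quick calculation, offered as a sketch that assumes the 2048 kB hugepage size from step 1 and that guest RAM is a single 1025 MiB block, suggests the rejected offset is exactly the end of guest RAM and lies 1 MiB into the last hugepage:

```python
MiB = 1024 * 1024

offset = 0x40100000      # value printed in "Illegal RAM offset 40100000"
hugepage = 2 * MiB       # Hugepagesize: 2048 kB on the reporter's host

offset_mib = offset // MiB           # 1025, the same as '-m 1025'
exact = offset % MiB == 0            # True: exactly 1025 MiB, no remainder
into_hugepage = offset % hugepage    # 1 MiB past a hugepage boundary

print(offset_mib, exact, into_hugepage // MiB)   # 1025 True 1
```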


Expected results:
Migration succeeds and the guest works normally.

Additional info:
1. The same issue occurs with both 2M and 1G hugepages
2. Cannot reproduce with qemu-kvm-rhev-2.6.0-28.el7
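
To tell in advance whether a given '-m' value falls into the affected case, the host hugepage size can be read from /proc/meminfo and compared against the guest memory size. The helper below is only a sketch: the function names are hypothetical, the sample text is an excerpt of the meminfo output from step 1, and '-m' is taken in its default unit of megabytes.

```python
import re

def hugepage_size_kb(meminfo_text):
    """Extract the Hugepagesize value (in kB) from /proc/meminfo text."""
    m = re.search(r"^Hugepagesize:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    if m is None:
        raise ValueError("no Hugepagesize line found")
    return int(m.group(1))

def is_hugepage_multiple(mem_mib, hugepage_kb):
    """True if a guest of mem_mib MiB is a whole number of hugepages."""
    return (mem_mib * 1024) % hugepage_kb == 0

# Excerpt of the meminfo output shown in the reproduction steps.
SAMPLE = """HugePages_Total:    3000
Hugepagesize:       2048 kB
"""

size_kb = hugepage_size_kb(SAMPLE)
print(size_kb, is_hugepage_multiple(1025, size_kb))
```

On a live host the same functions could be fed the contents of /proc/meminfo instead of SAMPLE. For 1G hugepages the second argument would be 1048576.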

Comment 4 Dr. David Alan Gilbert 2017-05-16 17:32:25 UTC
Yes, I can repeat that here:
  2.6->2.9 works
  2.9->2.9 fails
  2.9->2.6 fails

I think the problem is that the new code makes sure it sends whole hugepages, but in this case the used length of the RAM block is probably not a multiple of the hugepage size.

There's then a fun question of what happens on postcopy.
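
The hypothesis in this comment can be modelled in a few lines. This is not QEMU code; it is a simplified sketch that assumes the sender walks a RAM block in 4096-byte pages, optionally rounds the end of the block up to a whole host page, and that the receiver rejects any offset at or beyond the block's used length. All names are hypothetical.

```python
MiB = 1024 * 1024
TARGET_PAGE = 4096  # assumed small-page granularity of the migration stream

def offsets_sent(used_length, host_page, round_to_host_page):
    """Return the page offsets a sender would transmit for one RAM block."""
    end = used_length
    if round_to_host_page:
        # Round the end up so that only whole host pages are sent.
        end = -(-used_length // host_page) * host_page
    return range(0, end, TARGET_PAGE)

def first_illegal_offset(used_length, offsets):
    """Return the first offset the receiver would reject, or None."""
    for off in offsets:
        if off >= used_length:
            return off
    return None

used = 1025 * MiB  # guest started with '-m 1025'
bad = first_illegal_offset(used, offsets_sent(used, 2 * MiB, True))
ok = first_illegal_offset(used, offsets_sent(used, 2 * MiB, False))
print(hex(bad), ok)   # 0x40100000 None
```

Under these assumptions the first rejected offset matches the value in the reported error, and the same offset results with a 1G host page, which is consistent with the report that both hugepage sizes are affected.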

Comment 5 Dr. David Alan Gilbert 2017-05-17 17:04:54 UTC
Fixes posted upstream:

0001-migration-Fix-non-multiple-of-page-size-migration.patch
0002-postcopy-Require-RAMBlocks-that-are-whole-pages.patch

Comment 6 Dr. David Alan Gilbert 2017-05-18 11:21:44 UTC
The upstream patches have Reviewed-by tags; posted downstream while waiting for the upstream merge.

Comment 7 Miroslav Rezanina 2017-05-23 08:15:48 UTC
Fix included in qemu-kvm-rhev-2.9.0-6.el7

Comment 9 Min Deng 2017-06-01 08:55:15 UTC
QE reproduced the bug on build
qemu-kvm-rhev-2.9.0-3.el7
Steps: please refer to comment 0.
Actual results:
[root@hp-dl385pg8-13 home]# /usr/libexec/qemu-kvm -m 1025 rhel74.qcow2 -mem-path /mnt/kvm_hugepage -monitor stdio -vnc :1 -incoming tcp:0:5555
QEMU 2.9.0 monitor - type 'help' for more information
(qemu) qemu-kvm: Illegal RAM offset 40100000
qemu-kvm: error while loading state section id 4(ram)
qemu-kvm: load of migration failed: Invalid argument
Expected results:
There is no error and the migration succeeds.

QE verified the fix on builds
qemu-kvm-rhev-2.9.0-7.el7.x86_64
kernel-3.10.0-671.el7.x86_64
Steps: please refer to comment 0.
Actual results:
Migration succeeded.
Expected results:
Migration succeeded.

In brief, the bug has been fixed. Thanks.

Comment 10 Min Deng 2017-06-01 08:56:21 UTC
Based on comment 9, QE is moving this bug to VERIFIED.

Comment 12 errata-xmlrpc 2017-08-02 04:38:29 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:2392

