Bug 1242940

Summary: balloon size is incorrect after restore/migration
Product: Red Hat Enterprise Linux 7
Reporter: Peter Krempa <pkrempa>
Component: libvirt
Assignee: Peter Krempa <pkrempa>
Status: CLOSED ERRATA
QA Contact: Virtualization Bugs <virt-bugs>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 7.2
CC: dyuan, fjin, honzhang, jsuchane, lhuang, lmiksik, pkrempa, rbalakri, tlavigne
Target Milestone: rc
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: libvirt-1.2.17-10.el7
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-11-19 06:48:50 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Peter Krempa 2015-07-14 13:02:24 UTC
Description of problem:
Libvirt reports an incorrect balloon size after a VM's state is reloaded, either restored from a save file or migrated in from another host.

Version-Release number of selected component (if applicable):
libvirt-1.2.17-2.el7

How reproducible:
100%

Steps to Reproduce:
1. Start VM with <currentMemory> less than <memory>
2. Wait until it boots and the kernel loads the balloon driver
3. virsh save/managedsave the VM
4. Restore the VM and check the <currentMemory> size

Actual results:
<currentMemory> will be equal to <memory>

Expected results:
<currentMemory> is updated to the actual size of the balloon
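
One way to check the result of step 4 is to compare the two elements in the `virsh dumpxml` output. The following is a minimal Python sketch, not part of libvirt. The helper name `balloon_sizes` and the sample XML are illustrative, and it assumes both elements are present and use the same unit (KiB in the dumps in this report).

```python
import xml.etree.ElementTree as ET


def balloon_sizes(domain_xml):
    """Return (memory, currentMemory) from a domain XML string.

    Hypothetical helper: assumes both elements exist and share one unit.
    """
    root = ET.fromstring(domain_xml)
    memory = int(root.findtext("memory"))
    current = int(root.findtext("currentMemory"))
    return memory, current


# Illustrative, trimmed domain XML (values chosen for the example).
SAMPLE = """
<domain type='kvm'>
  <name>demo</name>
  <memory unit='KiB'>1024000</memory>
  <currentMemory unit='KiB'>824000</currentMemory>
</domain>
"""

memory, current = balloon_sizes(SAMPLE)
# The bug shows up when current == memory after a restore even though
# the balloon was inflated (current < memory) before the save.
print(memory, current, current < memory)
```

Feeding it the XML captured before the save and after the restore shows whether the balloon size survived.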

Additional info:
Upstream fix:

commit c212e0c77986b0592f63e02d9ecd816aaf7aac18
Author: Peter Krempa <pkrempa>
Date:   Tue Jun 30 16:31:24 2015 +0200

    qemu: process: Improve update of maximum balloon state at startup
    
    In commit 641a145d73fdc3dd9350fd57b3d3247abf101c05 I've added code that
    resets the balloon memory value to full size prior to resuming the vCPUs
    since the size certainly was not reduced at that point.
    
    Since qemuProcessStart is used also in code paths with already booted
    up guests (migration, save/restore) the assumption is not entirely true
    since the guest might have already been running before.
    
    This patch adds a function that queries the monitor rather than using
    the full size since a balloon event would not be reissued in case we are
    recovering a saved migration state.
    
    Additionally the new function is used also when reconnecting to a VM
    after libvirtd restart since we might have missed a few balloon events
    while libvirtd was not running.
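
For reference, the monitor query involved here is presumably QMP `query-balloon` (the command named in a later comment on this bug), whose reply carries the balloon size in bytes in the `actual` field. Below is a minimal Python sketch of converting such a reply to a KiB value like the one libvirt reports. The function name and the sample reply are illustrative, and this is not libvirt code.

```python
import json


def balloon_kib_from_reply(reply_text):
    """Convert a QMP query-balloon reply (JSON text) to KiB.

    Illustrative helper; assumes the reply has the form
    {"return": {"actual": <bytes>}}.
    """
    reply = json.loads(reply_text)
    return reply["return"]["actual"] // 1024


# Sample reply constructed for this example: 824000 KiB expressed in bytes.
sample_reply = json.dumps({"return": {"actual": 824000 * 1024}})
print(balloon_kib_from_reply(sample_reply))
```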

Comment 3 Luyao Huang 2015-09-08 08:43:00 UTC
Hi Peter,

I can still reproduce this issue via migration with libvirt-1.2.17-8.el7.x86_64:

1. prepare a guest whose memory != currentMemory:

# virsh dumpxml test3 --inactive
<domain type='kvm'>
  <name>test3</name>
  <uuid>7347d748-f7ce-448f-8d49-3d29c9bcac30</uuid>
  <memory unit='KiB'>1024000</memory>
  <currentMemory unit='KiB'>824000</currentMemory>

2. start it

# virsh start test3
Domain test3 started

3. wait for currentMemory to change to 824000:

# virsh dominfo test3
Id:             4
Name:           test3
UUID:           7347d748-f7ce-448f-8d49-3d29c9bcac30
OS Type:        hvm
State:          running
CPU(s):         2
CPU time:       18.3s
Max memory:     1024000 KiB
Used memory:    824000 KiB
Persistent:     yes
Autostart:      disable
Managed save:   no
Security model: selinux
Security DOI:   0
Security label: system_u:system_r:svirt_t:s0:c692,c818 (enforcing)

# virsh dumpxml test3
<domain type='kvm' id='4'>
  <name>test3</name>
  <uuid>7347d748-f7ce-448f-8d49-3d29c9bcac30</uuid>
  <memory unit='KiB'>1024000</memory>
  <currentMemory unit='KiB'>824000</currentMemory>

4. migrate to the target host (which also uses libvirt-1.2.17-8.el7.x86_64):

# virsh migrate test3 qemu+ssh://test1/system --live

5. check the guest memory on the target host:

# virsh dominfo test3
Id:             15
Name:           test3
UUID:           7347d748-f7ce-448f-8d49-3d29c9bcac30
OS Type:        hvm
State:          running
CPU(s):         2
CPU time:       8.8s
Max memory:     1024000 KiB
Used memory:    1024000 KiB
Persistent:     yes
Autostart:      disable
Managed save:   no
Security model: selinux
Security DOI:   0
Security label: system_u:system_r:svirt_t:s0:c30,c758 (permissive)

# virsh dumpxml test3
<domain type='kvm' id='15'>
  <name>test3</name>
  <uuid>7347d748-f7ce-448f-8d49-3d29c9bcac30</uuid>
  <memory unit='KiB'>1024000</memory>
  <currentMemory unit='KiB'>1024000</currentMemory>


6. the same test with save/restore works well.
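
The source/target comparison above can also be done mechanically. This is a small Python sketch that parses `virsh dominfo` text. The helper name `dominfo_memory` is made up for illustration, the sample strings are trimmed from the outputs above, and it assumes the memory fields are reported in KiB as shown.

```python
import re


def dominfo_memory(dominfo_text):
    """Return (max_kib, used_kib) parsed from `virsh dominfo` output."""
    fields = {}
    for line in dominfo_text.splitlines():
        # Split on the first colon only; values such as the security
        # label contain further colons.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()

    def kib(name):
        match = re.fullmatch(r"(\d+) KiB", fields[name])
        if match is None:
            raise ValueError("unexpected value for %s: %r" % (name, fields[name]))
        return int(match.group(1))

    return kib("Max memory"), kib("Used memory")


# Trimmed excerpts of the dominfo outputs from steps 3 and 5.
SOURCE = """Name:           test3
Max memory:     1024000 KiB
Used memory:    824000 KiB
Security label: system_u:system_r:svirt_t:s0:c692,c818 (enforcing)
"""
TARGET = """Name:           test3
Max memory:     1024000 KiB
Used memory:    1024000 KiB
"""

src = dominfo_memory(SOURCE)
dst = dominfo_memory(TARGET)
# The migration bug is visible when the used memory differs across hosts.
print(src, dst, src[1] == dst[1])
```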

Comment 5 Peter Krempa 2015-09-21 09:23:51 UTC
At the start of a live migration, qemu doesn't know the actual balloon size until the state is fully restored. In this case we'll need to update the balloon value at the point where migration ends and the VM is resumed on the destination.

Comment 7 Peter Krempa 2015-09-23 13:02:44 UTC
commit d7a0386e229176ec67531aac1412b8a98914da8e
Author: Peter Krempa <pkrempa>
Date:   Wed Sep 23 14:19:06 2015 +0200

    qemu: Refresh memory size only on fresh starts
    
    Qemu unfortunately doesn't update internal state right after migration
    and so the actual balloon size as returned by 'query-balloon' is
    invalid for a while after the CPUs are started after migration. If we'd
    refresh our internal state at this point we would report invalid current
    memory size until the next balloon event would arrive.
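
As a rough illustration of the behaviour this commit message describes, here is a toy Python model. It is one reading of the message, not the libvirt implementation. The function and parameter names are invented, and it assumes that on a non-fresh start the value carried over with the domain definition is kept until a balloon event arrives.

```python
def reported_current_memory(fresh_start, monitor_kib, carried_kib):
    """Toy model of which balloon value is reported once the CPUs run.

    fresh_start -- True for a cold boot, False for an incoming
                   migration or a restore
    monitor_kib -- value a monitor query would return right now
    carried_kib -- value carried over with the domain definition
    """
    if fresh_start:
        # On a fresh start the monitor value is trusted.
        return monitor_kib
    # After migration the monitor value may be stale for a while, so
    # keep the carried-over value until a balloon event corrects it.
    return carried_kib


# Values mirror the reproducer: the monitor still claims the full size
# right after migration while the guest balloon is at 824000 KiB.
print(reported_current_memory(False, 1024000, 824000))
print(reported_current_memory(True, 1024000, 824000))
```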

Comment 9 Luyao Huang 2015-09-29 07:49:00 UTC
Verified this bug on libvirt-1.2.17-11.el7.x86_64:

1. prepare a VM whose memory != current memory:

# virsh dumpxml RHEL71-lhuang
...
  <maxMemory slots='16' unit='KiB'>55600128</maxMemory>
  <memory unit='KiB'>3145728</memory>
  <currentMemory unit='KiB'>3045728</currentMemory>
...

# virsh dominfo RHEL71-lhuang
Id:             19
Name:           RHEL71-lhuang
UUID:           e7041d2f-4811-4fae-93f8-20e029817e63
OS Type:        hvm
State:          running
CPU(s):         4
CPU time:       745.2s
Max memory:     3145728 KiB
Used memory:    3045728 KiB
Persistent:     yes
Autostart:      disable
Managed save:   no
Security model: selinux
Security DOI:   0
Security label: system_u:system_r:svirt_t:s0:c181,c837 (enforcing)


2. managedsave the VM, start it again, and check the current memory:

# virsh managedsave RHEL71-lhuang

Domain RHEL71-lhuang state saved by libvirt

# virsh start RHEL71-lhuang
Domain RHEL71-lhuang started


# virsh dumpxml RHEL71-lhuang
...
  <maxMemory slots='16' unit='KiB'>55600128</maxMemory>
  <memory unit='KiB'>3145728</memory>
  <currentMemory unit='KiB'>3045728</currentMemory>
...

3. migrate to another host (which also uses libvirt-1.2.17-11.el7.x86_64):

# virsh migrate RHEL71-lhuang --live qemu+ssh://test1/system

4. then check it on the target host:

# virsh dumpxml RHEL71-lhuang
...
  <maxMemory slots='16' unit='KiB'>55600128</maxMemory>
  <memory unit='KiB'>3145728</memory>
  <currentMemory unit='KiB'>3045728</currentMemory>
...

# virsh dominfo RHEL71-lhuang
Id:             15
Name:           RHEL71-lhuang
UUID:           e7041d2f-4811-4fae-93f8-20e029817e63
OS Type:        hvm
State:          running
CPU(s):         4
CPU time:       4.5s
Max memory:     3145728 KiB
Used memory:    3045728 KiB
Persistent:     no
Autostart:      disable
Managed save:   no
Security model: selinux
Security DOI:   0
Security label: system_u:system_r:svirt_t:s0:c336,c618 (enforcing)

5. create an internal snapshot and revert it:

# virsh dumpxml RHEL71-lhuang
...
  <maxMemory slots='16' unit='KiB'>55600128</maxMemory>
  <memory unit='KiB'>3145728</memory>
  <currentMemory unit='KiB'>3045728</currentMemory>
...

# virsh snapshot-create-as RHEL71-lhuang s2
Domain snapshot s2 created

# virsh destroy RHEL71-lhuang
Domain RHEL71-lhuang destroyed

# virsh snapshot-revert RHEL71-lhuang s2

6. recheck the current memory:

# virsh dumpxml RHEL71-lhuang
...
  <maxMemory slots='16' unit='KiB'>55600128</maxMemory>
  <memory unit='KiB'>3145728</memory>
  <currentMemory unit='KiB'>3045728</currentMemory>


# virsh dominfo RHEL71-lhuang
Id:             29
Name:           RHEL71-lhuang
UUID:           e7041d2f-4811-4fae-93f8-20e029817e63
OS Type:        hvm
State:          running
CPU(s):         4
CPU time:       4.7s
Max memory:     3145728 KiB
Used memory:    3045728 KiB
Persistent:     yes
Autostart:      disable
Managed save:   no
Security model: selinux
Security DOI:   0
Security label: system_u:system_r:svirt_t:s0:c270,c657 (enforcing)

Comment 11 errata-xmlrpc 2015-11-19 06:48:50 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-2202.html