Bug 927336 - QMP event shows incorrect balloon value when balloon size is greater than or equal to 4G
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: qemu-kvm
Version: 6.4
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: rc
Target Release: 6.5
Assignee: Luiz Capitulino
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Duplicates: 953908 1045345
Depends On:
Blocks: 1011600 958750
 
Reported: 2013-03-25 17:07 UTC by Francesco Ciocchetti
Modified: 2014-05-22 13:48 UTC (History)

Fixed In Version: qemu-kvm-0.12.1.2-2.362.el6
Doc Type: Bug Fix
Doc Text:
Previously, the qemu-kvm utility reported an incorrect memory size in the QMP event when the virtio balloon driver was used with more than 4 GiB of memory. This was the result of an integer overflow in the calculations, which has now been fixed.
Clone Of:
Clones: 958750
Environment:
Last Closed: 2013-11-21 06:47:23 UTC
Target Upstream Version:


Attachments
Example of VM startup with different MaxMemory settings and "virsh setmem" commands (14.66 KB, application/octet-stream)
2013-03-25 17:07 UTC, Francesco Ciocchetti


Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2013:1553 normal SHIPPED_LIVE Important: qemu-kvm security, bug fix, and enhancement update 2013-11-20 21:40:29 UTC
Red Hat Knowledge Base (Solution) 355763 None None None Never

Description Francesco Ciocchetti 2013-03-25 17:07:34 UTC
Created attachment 716090
Example of VM startup with different MaxMemory settings and "virsh setmem" commands

Description of problem:

When 8 GB or more of MaxMemory is allocated to a VM and the balloon driver is then used to adjust the amount of CurrentMemory the VM is using, the live XML for the VM is not updated correctly, and the values reported in the live XML and by 'virsh dominfo $ID' are wrong.

Note that 'virsh dommemstat $ID' reports the correct information, and the correct behavior of the balloon can also be confirmed from inside the VM.


Version-Release number of selected component (if applicable):
libvirt.x86_64 0.10.2-18.el6_4.2 @updates
libvirt-client.x86_64 0.10.2-18.el6_4.2 @updates
libvirt-python.x86_64 0.10.2-18.el6_4.2 @updates


How reproducible:
100% of the time 

Steps to Reproduce:
1. Start a VM with >=8GB of MaxMemory
2. virsh setmem $ID --live --config --size 4194304
3. virsh dominfo $ID (wrong information is reported)
4. virsh dommemstat $ID (correct information is reported)
  
Actual results:
 A wrong amount of RAM (different from what was set with setmem) is reported by virsh dominfo


Expected results:
 The correct amount of CurrentRam should be reported  

Additional info:

This broke in version 'libvirt-0.9.10-21.el6.3.7'.

In particular, the patch causing the issue is (in that version of libvirt) number 398: libvirt-Wire-up-handling-for-QMP-s-BALLOON_EVENT.patch
This patch was committed for https://bugzilla.redhat.com/show_bug.cgi?id=884713

For testing purposes I rebuilt that RPM without the patch described above, and the information was then displayed correctly by all the virsh commands.

Note that when the VM is started with 4 GB of RAM the problem does not show up.

Attached is a log showing the same VM started first with 12 GB of MaxMemory and then with 4 GB of MaxMemory, along with the effect of the setmem command in each case.

In both cases you can see the qemuProcessHandleBalloonChange function updating the balloon (though I think it only updates the stored information about it). When the VM is started with a MaxMemory of 12 GB, however, the value the function sets CurrentMemory to is wrong and differs from what was specified on the command line.

Even so, the balloon inside the VM is inflated/deflated properly to the correct amount.

Comment 1 Martin Kletzander 2013-04-09 12:15:34 UTC
Thanks a lot for tracking down the patch that exposed the problem. However, I must disappoint you: that patch only saves us from having to query the balloon info when working with it, because the data is now updated automatically from QEMU's event. And it is that event that seems broken to me:

# setmem 2 --live --config --size 8388608
does:
{"execute":"balloon","arguments":{"value":8589934592},"id":"libvirt-9"}
and qemu responds with:
{"timestamp": {"seconds": 1363986825, "microseconds": 528314}, "event": "BALLOON_CHANGE", "data": {"actual": 12883853312}}
{"timestamp": {"seconds": 1363986826, "microseconds": 352941}, "event": "BALLOON_CHANGE", "data": {"actual": 12884901888}}

But when running with 4GB of RAM:

# setmem 3 --live --config --size 1048576
does:
{"execute":"balloon","arguments":{"value":1073741824},"id":"libvirt-11"}
and qemu responds with:
{"timestamp": {"seconds": 1363987568, "microseconds": 789570}, "event": "BALLOON_CHANGE", "data": {"actual": 4293918720}}
{"timestamp": {"seconds": 1363987569, "microseconds": 408889}, "event": "BALLOON_CHANGE", "data": {"actual": 1073741824}}

So I'm reassigning to qemu-kvm to investigate further.
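
The comparison made in this comment can be sketched in a few lines of Python. This is only an illustration, not libvirt code: the helper name final_actual is hypothetical, and it assumes that the setmem size is given in KiB while the QMP "balloon" value is in bytes, which is what the two traces above suggest (8388608 becomes 8589934592, 1048576 becomes 1073741824).

```python
import json

KIB = 1024  # assumption: setmem --size is in KiB, QMP "balloon" takes bytes


def final_actual(qmp_lines):
    """Return "actual" from the last BALLOON_CHANGE event, or None if there is none."""
    last = None
    for line in qmp_lines:
        msg = json.loads(line)
        if msg.get("event") == "BALLOON_CHANGE":
            last = msg["data"]["actual"]
    return last


# Event lines copied from the two traces in this comment.
trace_12g = [
    '{"timestamp": {"seconds": 1363986825, "microseconds": 528314}, "event": "BALLOON_CHANGE", "data": {"actual": 12883853312}}',
    '{"timestamp": {"seconds": 1363986826, "microseconds": 352941}, "event": "BALLOON_CHANGE", "data": {"actual": 12884901888}}',
]
trace_4g = [
    '{"timestamp": {"seconds": 1363987568, "microseconds": 789570}, "event": "BALLOON_CHANGE", "data": {"actual": 4293918720}}',
    '{"timestamp": {"seconds": 1363987569, "microseconds": 408889}, "event": "BALLOON_CHANGE", "data": {"actual": 1073741824}}',
]

requested_12g = 8388608 * KIB  # setmem 2 --size 8388608
requested_4g = 1048576 * KIB   # setmem 3 --size 1048576

print(final_actual(trace_12g) == requested_12g)  # False: last event still reports 12 GiB
print(final_actual(trace_4g) == requested_4g)    # True
```

With the 12 GB guest the last event carries the boot-time size rather than the requested one, which is why the values libvirt derives from the event end up wrong.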

Comment 2 juzhang 2013-04-09 12:26:01 UTC
Hi Qunfang,

Would you please reproduce this issue by using qemu-kvm directly?

Comment 3 Qunfang Zhang 2013-04-10 07:26:10 UTC
I can reproduce this issue on qemu-kvm-0.12.1.2-2.355.el6_4.2.x86_64 with a direct qemu command line. When the memory value on the command line is >= 6G and the memory is then ballooned, the QMP event is not correct, though "info balloon" returns a correct value.
With mem < 6G the problem was not hit.

1. Boot a guest with mem >= 6G; take 12G as an example.

2. After the guest boots, balloon the memory down to a smaller value, e.g. 8G.

{"execute":"balloon","arguments":{"value":8589934592},"id":"libvirt-9"}
{"return": {}, "id": "libvirt-9"}
{"timestamp": {"seconds": 1365575550, "microseconds": 828844}, "event": "BALLOON_CHANGE", "data": {"actual": 12883853312}}

{"timestamp": {"seconds": 1365575551, "microseconds": 827992}, "event": "BALLOON_CHANGE", "data": {"actual": 11989417984}}
{"timestamp": {"seconds": 1365575552, "microseconds": 828569}, "event": "BALLOON_CHANGE", "data": {"actual": 11088691200}}
{"timestamp": {"seconds": 1365575553, "microseconds": 828087}, "event": "BALLOON_CHANGE", "data": {"actual": 10186915840}}
{"timestamp": {"seconds": 1365575554, "microseconds": 828927}, "event": "BALLOON_CHANGE", "data": {"actual": 9283043328}}
{"timestamp": {"seconds": 1365575555, "microseconds": 602622}, "event": "BALLOON_CHANGE", "data": {"actual": 12884901888}}
                                     ^^^^^^^^^^^ this is wrong

3. Check with "info balloon" in the monitor:
(qemu) info balloon 
balloon: actual=8192 ==> this is correct

As a result, the balloon event shows that the current memory is still 12G.

4. Tested with 6G mem on the command line: same problem.

5. Tested with 5G and 4G mem on the command line: no problem.

{"execute":"balloon","arguments":{"value":2147483648},"id":"libvirt-9"}
{"return": {}, "id": "libvirt-9"}
{"timestamp": {"seconds": 1365575658, "microseconds": 324791}, "event": "BALLOON_CHANGE", "data": {"actual": 4293918720}}

{"timestamp": {"seconds": 1365575659, "microseconds": 325085}, "event": "BALLOON_CHANGE", "data": {"actual": 3482320896}}
{"timestamp": {"seconds": 1365575660, "microseconds": 325291}, "event": "BALLOON_CHANGE", "data": {"actual": 2625634304}}
{"timestamp": {"seconds": 1365575660, "microseconds": 877129}, "event": "BALLOON_CHANGE", "data": {"actual": 2147483648}}
                                     ^^^^^^^^^^  this is correct.
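
A quick cross-check of the units in the 12G test, as a minimal Python sketch. It assumes that "info balloon" prints the size in MiB while the QMP event reports bytes; the helper name hmp_actual_to_bytes is made up for the illustration.

```python
MIB = 1 << 20
GIB = 1 << 30


def hmp_actual_to_bytes(actual_mib):
    """Convert the value printed by "info balloon" (assumed MiB) to bytes."""
    return actual_mib * MIB


requested = 8589934592     # "value" sent in the balloon command (12G test)
last_event = 12884901888   # "actual" in the final BALLOON_CHANGE event
hmp_bytes = hmp_actual_to_bytes(8192)  # "balloon: actual=8192"

print(hmp_bytes == requested)            # True: the monitor value matches the request
print((last_event - hmp_bytes) / GIB)    # 4.0: the event is off by exactly 4 GiB
```

The final event is off by exactly 4 GiB, i.e. 2^32 bytes, in this test.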

Comment 4 Luiz Capitulino 2013-04-18 17:11:35 UTC
Thanks for debugging this down to qemu-kvm, Qunfang.

What's happening is an integer overflow in the event code, which causes the original RAM size to be reported again. Also, it's not the RAM size itself that matters but the balloon size, which has to be >= 4G. So you'll also reproduce the problem with a VM that has 5G and goes down to 1G.

I've just posted the patch upstream, test package is on the way.
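
The overflow can be modelled in a few lines of Python. This is a sketch under stated assumptions, not the actual qemu-kvm code: it assumes the balloon size is kept as a count of 4 KiB pages and that converting it back to bytes happens in 32-bit arithmetic, so the result wraps once the balloon reaches 4 GiB. The names PFN_SHIFT, U32_MASK and reported_actual are illustrative only.

```python
PFN_SHIFT = 12          # assumption: balloon size is counted in 4 KiB pages
U32_MASK = 0xFFFFFFFF   # assumption: the byte conversion is done in 32 bits
GIB = 1 << 30


def reported_actual(ram_bytes, target_bytes, wrap32=True):
    """Model the "actual" value of the final BALLOON_CHANGE event."""
    balloon_pages = (ram_bytes - target_bytes) >> PFN_SHIFT
    balloon_bytes = balloon_pages << PFN_SHIFT
    if wrap32:
        balloon_bytes &= U32_MASK  # emulate the 32-bit wrap-around
    return ram_bytes - balloon_bytes


print(reported_actual(12 * GIB, 8 * GIB))                # 12884901888 (wrong, as in comment 3)
print(reported_actual(4 * GIB, 1 * GIB))                 # 1073741824  (balloon is 3G, no wrap)
print(reported_actual(5 * GIB, 1 * GIB))                 # 5368709120  (balloon is 4G, wraps)
print(reported_actual(12 * GIB, 8 * GIB, wrap32=False))  # 8589934592  (64-bit arithmetic)
```

In this model a 4 GiB balloon wraps to 0, so the event reports the full RAM size again, which matches the values seen in the traces in this bug.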

Comment 6 Luiz Capitulino 2013-04-19 15:17:21 UTC
*** Bug 953908 has been marked as a duplicate of this bug. ***

Comment 12 Francesco Ciocchetti 2013-04-30 14:16:06 UTC
Hi,

Is an RPM for this qemu-kvm-0.12.1.2-2.362.el6 version already available?

I would really love to be able to test it.

Thanks
Francesco

Comment 13 Ademar Reis 2013-04-30 14:50:17 UTC
(In reply to comment #12)
> Hi,
> 
> is an RPM for this qemu-kvm-0.12.1.2-2.362.el6 version already available ? 
> 
> I would really love to be able to test it , 
> 

Francesco, the package has been built and is going through our QA process. We can't distribute packages before they go through our QA, get proper gpg signatures and documentation via the errata.

If this issue is critical or in any way time sensitive to you or your organization, please raise a ticket through your regular Red Hat support channels so that they can give you a hotfix. For information on how to contact the Red Hat production support team, please visit: https://www.redhat.com/support/process/production/#howto

Comment 15 langfang 2013-05-09 03:20:46 UTC
Reproduced this problem with the following versions:
Host 
# uname -r
2.6.32-373.el6.x86_64
# rpm -q qemu-kvm
qemu-kvm-0.12.1.2-2.355.el6.x86_64
Guest:

2.6.32-373.el6.x86_64

Steps:
1. Boot guest with "-m 6G"
2. # telnet 10.66.7.148 4445
...
{"execute": "query-balloon"}
{"return": {"actual": 6442450944}}
{"execute":"balloon","arguments":{"value":2147483648},"id":"libvirt-9"}
{"return": {}, "id": "libvirt-9"}
{"timestamp": {"seconds": 1368066774, "microseconds": 749845}, "event": "BALLOON_CHANGE", "data": {"actual": 6441402368}}
{"timestamp": {"seconds": 1368066775, "microseconds": 749809}, "event": "BALLOON_CHANGE", "data": {"actual": 3876585472}}
{"timestamp": {"seconds": 1368066776, "microseconds": 427317}, "event": "BALLOON_CHANGE", "data": {"actual": 6442450944}}-->wrong size

Can also reproduce when booting with 5G and going down to 1G.

Verified this bug with the following versions:
Host:
# uname -r
2.6.32-373.el6.x86_64
# rpm -q qemu-kvm
qemu-kvm-0.12.1.2-2.362.el6.x86_64

Guest:
2.6.32-373.el6.x86_64

Results:
Scenario 1: boot with 5G, go down to 1G
..
{"execute": "query-balloon"}
{"return": {"actual": 5368709120}}
{"execute":"balloon","arguments":{"value":1073741824},"id":"libvirt-9"}
{"return": {}, "id": "libvirt-9"}
{"timestamp": {"seconds": 1368067600, "microseconds": 431894}, "event": "BALLOON_CHANGE", "data": {"actual": 5367660544}}
{"timestamp": {"seconds": 1368067601, "microseconds": 431629}, "event": "BALLOON_CHANGE", "data": {"actual": 2826960896}}
{"timestamp": {"seconds": 1368067602, "microseconds": 122977}, "event": "BALLOON_CHANGE", "data": {"actual": 1073741824}}


Scenario 2: boot with 5G, balloon to 7G
...
{"execute":"balloon","arguments":{"value":7516192768},"id":"libvirt-9"}
{"return": {}, "id": "libvirt-9"}
{"timestamp": {"seconds": 1368067966, "microseconds": 588221}, "event": "BALLOON_CHANGE", "data": {"actual": 806354944}}
{"timestamp": {"seconds": 1368067967, "microseconds": 588007}, "event": "BALLOON_CHANGE", "data": {"actual": 4543479808}}
{"timestamp": {"seconds": 1368067967, "microseconds": 825158}, "event": "BALLOON_CHANGE", "data": {"actual": 5368709120}}

Scenario 3: boot with 6G, go down to 1G
...
{"return": {"actual": 6442450944}}
{"execute":"balloon","arguments":{"value":1073741824},"id":"libvirt-9"}
{"return": {}, "id": "libvirt-9"}
{"timestamp": {"seconds": 1368068266, "microseconds": 411995}, "event": "BALLOON_CHANGE", "data": {"actual": 6441402368}}
{"timestamp": {"seconds": 1368068267, "microseconds": 412380}, "event": "BALLOON_CHANGE", "data": {"actual": 3958374400}}
{"timestamp": {"seconds": 1368068268, "microseconds": 412598}, "event": "BALLOON_CHANGE", "data": {"actual": 1473249280}}
{"timestamp": {"seconds": 1368068268, "microseconds": 579517}, "event": "BALLOON_CHANGE", "data": {"actual": 1073741824}}

Other scenarios: tried 7G --> 1G and 8G --> 2G; both work well.
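
The pass criterion used for the scenarios above can be written down as a small Python check. It is a sketch only: it assumes that the final event should report the requested value, capped at the boot memory when the request is larger (which is what the 5G to 7G run shows). The helper name expected_final is hypothetical.

```python
GIB = 1 << 30


def expected_final(boot_bytes, target_bytes):
    """Expected "actual" in the last event; assumes the target is capped at boot memory."""
    return min(boot_bytes, target_bytes)


# (name, boot memory, requested value, "actual" in the last event of the trace)
scenarios = [
    ("5G -> 1G", 5 * GIB, 1073741824, 1073741824),
    ("5G -> 7G", 5 * GIB, 7516192768, 5368709120),
    ("6G -> 1G", 6 * GIB, 1073741824, 1073741824),
]

for name, boot, target, observed in scenarios:
    status = "ok" if observed == expected_final(boot, target) else "MISMATCH"
    print(name, status)
```

All three traces recorded with qemu-kvm-0.12.1.2-2.362.el6 pass this check.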

Comment 18 errata-xmlrpc 2013-11-21 06:47:23 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2013-1553.html

Comment 19 Luiz Capitulino 2014-05-22 13:48:39 UTC
*** Bug 1045345 has been marked as a duplicate of this bug. ***

