Red Hat Bugzilla – Bug 927336
QMP event shows incorrect balloon value when balloon size is greater than or equal to 4G
Last modified: 2014-05-22 09:48:39 EDT
Created attachment 716090 [details]
Example of VM startup with different MaxMemory settings and "virsh setmem" commands

Description of problem:
When 8 GB or more of MaxMemory are allocated to a VM and the balloon driver is then used to adjust the amount of CurrentMemory the VM is using, something fails in updating the live XML for the VM, and the information reported in the live XML or by 'virsh dominfo $ID' is wrong. Note that 'virsh dommemstat $ID' reports the right information, and the correct behavior of the balloon can also be confirmed inside the VM.

Version-Release number of selected component (if applicable):
libvirt.x86_64          0.10.2-18.el6_4.2   @updates
libvirt-client.x86_64   0.10.2-18.el6_4.2   @updates
libvirt-python.x86_64   0.10.2-18.el6_4.2   @updates

How reproducible:
100% of the time

Steps to Reproduce:
1. Start a VM with >=8GB of MaxMemory
2. virsh setmem $ID --live --config --size 4194304
3. virsh dominfo $ID (wrong information will be reported)
4. virsh dommemstat $ID (right information will be reported)

Actual results:
A wrong amount of RAM (different from what was set with setmem) is reported by 'virsh dominfo'.

Expected results:
The correct amount of CurrentMemory should be reported.

Additional info:
This broke in version 'libvirt-0.9.10-21.el6.3.7'. In particular, the patch causing the issue is (in that version of libvirt) number 398: libvirt-Wire-up-handling-for-QMP-s-BALLOON_EVENT.patch

This patch was committed for https://bugzilla.redhat.com/show_bug.cgi?id=884713

For testing purposes I recompiled a version of that RPM without the patch described above, and the information was correctly displayed by all the virsh commands.

Note that when the VM is started with 4GB of RAM the problem does not show up.

Attached is a log with examples of starting the same VM first with 12GB of MaxMemory and then with 4GB of MaxMemory, and the effect of using the setmem command on it. You can see the qemuProcessHandleBalloonChange function updating the balloon (though I think it is actually only updating the information about it) in both cases, but when the VM is started with a MaxMemory of 12GB the value to which the function sets CurrentMemory is wrong and different from what was specified on the command line. Even so, the balloon inside the VM gets inflated/deflated properly to the correct amount.
Thanks a lot for finding the patch that caused the trouble. However, I must disappoint you: that patch only removes the need for us to ask for the balloon info when working with it; it just automatically updates the data according to QEMU's event. And that event is what seems broken to me:

# setmem 2 --live --config --size 8388608

does:

{"execute":"balloon","arguments":{"value":8589934592},"id":"libvirt-9"}

and qemu responds with:

{"timestamp": {"seconds": 1363986825, "microseconds": 528314}, "event": "BALLOON_CHANGE", "data": {"actual": 12883853312}}
{"timestamp": {"seconds": 1363986826, "microseconds": 352941}, "event": "BALLOON_CHANGE", "data": {"actual": 12884901888}}

But when running with 4GB of RAM:

# setmem 3 --live --config --size 1048576

does:

{"execute":"balloon","arguments":{"value":1073741824},"id":"libvirt-11"}

and qemu responds with:

{"timestamp": {"seconds": 1363987568, "microseconds": 789570}, "event": "BALLOON_CHANGE", "data": {"actual": 4293918720}}
{"timestamp": {"seconds": 1363987569, "microseconds": 408889}, "event": "BALLOON_CHANGE", "data": {"actual": 1073741824}}

So I'm reassigning to qemu-kvm to investigate further.
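Decoding the values in the 12G case makes the pattern visible: libvirt asks for exactly 8 GiB, but the second event reports the guest's full 12 GiB again (the first one reports 12 GiB minus 1 MiB), so the event value falls back to the unballooned RAM size rather than the new target. A quick sanity check of that arithmetic (just the numbers from the QMP output above, nothing QEMU-specific):

#include <assert.h>
#include <stdint.h>

int main(void)
{
    const uint64_t GiB = 1ULL << 30, MiB = 1ULL << 20;

    assert( 8589934592ULL ==  8 * GiB);           /* requested balloon target */
    assert(12883853312ULL == 12 * GiB - 1 * MiB); /* first event: only 1 MiB ballooned so far */
    assert(12884901888ULL == 12 * GiB);           /* second event: full RAM size again */
    return 0;
}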
Hi Qunfang, Would you please reproduce this issue by using qemu-kvm directly?
I can reproduce this issue on qemu-kvm-0.12.1.2-2.355.el6_4.2.x86_64 with a direct qemu command line. When the memory value on the command line is >= 6G and memory is then ballooned, the QMP event is not correct even though "info balloon" returns a correct value. Tested with mem < 6G; that does not hit this problem.

1. Boot up a guest with mem >= 6G, take 12G as an example.

2. After the guest boots up, balloon mem to a smaller value, like 8G.

{"execute":"balloon","arguments":{"value":8589934592},"id":"libvirt-9"}
{"return": {}, "id": "libvirt-9"}
{"timestamp": {"seconds": 1365575550, "microseconds": 828844}, "event": "BALLOON_CHANGE", "data": {"actual": 12883853312}}
{"timestamp": {"seconds": 1365575551, "microseconds": 827992}, "event": "BALLOON_CHANGE", "data": {"actual": 11989417984}}
{"timestamp": {"seconds": 1365575552, "microseconds": 828569}, "event": "BALLOON_CHANGE", "data": {"actual": 11088691200}}
{"timestamp": {"seconds": 1365575553, "microseconds": 828087}, "event": "BALLOON_CHANGE", "data": {"actual": 10186915840}}
{"timestamp": {"seconds": 1365575554, "microseconds": 828927}, "event": "BALLOON_CHANGE", "data": {"actual": 9283043328}}
{"timestamp": {"seconds": 1365575555, "microseconds": 602622}, "event": "BALLOON_CHANGE", "data": {"actual": 12884901888}}  <== this is wrong

3. (qemu) info balloon
balloon: actual=8192  ==> this is correct

As a result, the balloon event shows the current mem is still 12G.

4. Tested with 6G mem on the command line: same problem.

5. Tested with 5G and 4G mem on the command line: no problem.

{"execute":"balloon","arguments":{"value":2147483648},"id":"libvirt-9"}
{"return": {}, "id": "libvirt-9"}
{"timestamp": {"seconds": 1365575658, "microseconds": 324791}, "event": "BALLOON_CHANGE", "data": {"actual": 4293918720}}
{"timestamp": {"seconds": 1365575659, "microseconds": 325085}, "event": "BALLOON_CHANGE", "data": {"actual": 3482320896}}
{"timestamp": {"seconds": 1365575660, "microseconds": 325291}, "event": "BALLOON_CHANGE", "data": {"actual": 2625634304}}
{"timestamp": {"seconds": 1365575660, "microseconds": 877129}, "event": "BALLOON_CHANGE", "data": {"actual": 2147483648}}  <== this is correct
Thanks for debugging this down to qemu-kvm, Qunfang. What's happening is an integer overflow in the event code, which causes the original RAM size to be reported again. Also, it's not the exact RAM size that matters, but the balloon size, which has to be >= 4G. So you'll also reproduce the problem with a VM that has 5G and goes down to 1G. I've just posted the patch upstream; a test package is on the way.
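For anyone following along, the overflow is of the classic "32-bit value shifted past its width" kind: virtio-balloon accounts for the ballooned-out amount as a 32-bit count of 4 KiB pages, and shifting that count left by the page shift wraps to zero once the balloon reaches 4 GiB, so the reported value becomes ram_size minus nothing, i.e. the full RAM size again. Below is a minimal sketch of the failure mode and the obvious repair (widen the operand before shifting); it is illustrative only, not the actual qemu-kvm patch:

#include <stdint.h>
#include <stdio.h>

#define PFN_SHIFT 12  /* virtio-balloon accounts in 4 KiB pages */

int main(void)
{
    uint64_t ram_size = 12ULL << 30;                       /* 12 GiB guest */
    uint64_t target   =  8ULL << 30;                       /* balloon down to 8 GiB */
    uint32_t pages    = (ram_size - target) >> PFN_SHIFT;  /* 4 GiB worth of pages ballooned out */

    /* 32-bit shift: once 4 GiB worth of pages is ballooned out,
     * pages << PFN_SHIFT wraps to 0 and the event reports ram_size again. */
    printf("broken: %llu\n",
           (unsigned long long)(ram_size - (pages << PFN_SHIFT)));            /* 12884901888 */

    /* widening the page count before the shift yields the expected 8 GiB */
    printf("fixed:  %llu\n",
           (unsigned long long)(ram_size - ((uint64_t)pages << PFN_SHIFT)));  /* 8589934592 */

    return 0;
}

This also matches the observation that the trigger is the balloon size, not the guest RAM size: the wrap happens exactly when the ballooned-out amount crosses 4 GiB.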
*** Bug 953908 has been marked as a duplicate of this bug. ***
Hi,

is an RPM for this qemu-kvm-0.12.1.2-2.362.el6 version already available? I would really love to be able to test it.

Thanks,
Francesco
(In reply to comment #12)
> Hi,
>
> is an RPM for this qemu-kvm-0.12.1.2-2.362.el6 version already available?
>
> I would really love to be able to test it,

Francesco, the package has been built and is going through our QA process. We can't distribute packages before they go through our QA and get proper GPG signatures and documentation via the errata.

If this issue is critical or in any way time-sensitive to you or your organization, please raise a ticket through your regular Red Hat support channels so that they can give you a hotfix. For information on how to contact the Red Hat production support team, please visit: https://www.redhat.com/support/process/production/#howto
Reproduced this problem with the following versions:

Host:
# uname -r
2.6.32-373.el6.x86_64
# rpm -q qemu-kvm
qemu-kvm-0.12.1.2-2.355.el6.x86_64

Guest: 2.6.32-373.el6.x86_64

Steps:
1. Boot guest with "-M 6G"
2. # telnet 10.66.7.148 4445
...
{"execute": "query-balloon"}
{"return": {"actual": 6442450944}}
{"execute":"balloon","arguments":{"value":2147483648},"id":"libvirt-9"}
{"return": {}, "id": "libvirt-9"}
{"timestamp": {"seconds": 1368066774, "microseconds": 749845}, "event": "BALLOON_CHANGE", "data": {"actual": 6441402368}}
{"timestamp": {"seconds": 1368066775, "microseconds": 749809}, "event": "BALLOON_CHANGE", "data": {"actual": 3876585472}}
{"timestamp": {"seconds": 1368066776, "microseconds": 427317}, "event": "BALLOON_CHANGE", "data": {"actual": 6442450944}}  --> wrong size

Can also reproduce by booting with 5G and going down to 1G.

Verified this bug with the following versions:

Host:
# uname -r
2.6.32-373.el6.x86_64
# rpm -q qemu-kvm
qemu-kvm-0.12.1.2-2.362.el6.x86_64

Guest: 2.6.32-373.el6.x86_64

Results:

Scenario 1: boot with 5G, go down to 1G
...
{"execute": "query-balloon"}
{"return": {"actual": 5368709120}}
{"execute":"balloon","arguments":{"value":1073741824},"id":"libvirt-9"}
{"return": {}, "id": "libvirt-9"}
{"timestamp": {"seconds": 1368067600, "microseconds": 431894}, "event": "BALLOON_CHANGE", "data": {"actual": 5367660544}}
{"timestamp": {"seconds": 1368067601, "microseconds": 431629}, "event": "BALLOON_CHANGE", "data": {"actual": 2826960896}}
{"timestamp": {"seconds": 1368067602, "microseconds": 122977}, "event": "BALLOON_CHANGE", "data": {"actual": 1073741824}}

Scenario 2: boot with 5G, try to go up to 7G
...
{"execute":"balloon","arguments":{"value":7516192768},"id":"libvirt-9"}
{"return": {}, "id": "libvirt-9"}
{"timestamp": {"seconds": 1368067966, "microseconds": 588221}, "event": "BALLOON_CHANGE", "data": {"actual": 806354944}}
{"timestamp": {"seconds": 1368067967, "microseconds": 588007}, "event": "BALLOON_CHANGE", "data": {"actual": 4543479808}}
{"timestamp": {"seconds": 1368067967, "microseconds": 825158}, "event": "BALLOON_CHANGE", "data": {"actual": 5368709120}}

Scenario 3: boot with 6G, go down to 1G
...
{"return": {"actual": 6442450944}}
{"execute":"balloon","arguments":{"value":1073741824},"id":"libvirt-9"}
{"return": {}, "id": "libvirt-9"}
{"timestamp": {"seconds": 1368068266, "microseconds": 411995}, "event": "BALLOON_CHANGE", "data": {"actual": 6441402368}}
{"timestamp": {"seconds": 1368068267, "microseconds": 412380}, "event": "BALLOON_CHANGE", "data": {"actual": 3958374400}}
{"timestamp": {"seconds": 1368068268, "microseconds": 412598}, "event": "BALLOON_CHANGE", "data": {"actual": 1473249280}}
{"timestamp": {"seconds": 1368068268, "microseconds": 579517}, "event": "BALLOON_CHANGE", "data": {"actual": 1073741824}}

Other scenarios: tried 7G --> 1G and 8G --> 2G, both work well.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHSA-2013-1553.html
*** Bug 1045345 has been marked as a duplicate of this bug. ***