Bug 904160
Summary: | libvirt should re-enable memballoon stat reporting | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | Eric Blake <eblake> |
Component: | libvirt | Assignee: | John Ferlan <jferlan> |
Status: | CLOSED CURRENTRELEASE | QA Contact: | Virtualization Bugs <virt-bugs> |
Severity: | unspecified | Docs Contact: | |
Priority: | unspecified | ||
Version: | 7.0 | CC: | acathrow, armbru, berrange, cwei, dconsoli, dyuan, eblake, gsun, honzhang, lcapitulino, mzhan, pbonzini |
Target Milestone: | rc | ||
Target Release: | 7.0 | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | libvirt-1.1.1-1.el7 | Doc Type: | Bug Fix |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2014-06-13 09:21:57 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description
Eric Blake
2013-01-25 15:51:07 UTC
The new memory balloon reporting in QEMU, depends on their new QObject object model, which I believe KVM team have already rejected as a backport candidate due to its huge impact across the QEMU source tree. So I'm fairly certain this is going to have to be RHEL7 only material Luiz, is the qom model for memballoon stats something that will be backported to 6.5, or should we push out to RHEL 7? Daniel is right on comment 1. I'm not aware of any plans to backport QOM to RHEL6.5. If this is really needed in RHEL6.5, I guess I'd prefer to add a RHEL6 only extension w/o QOM deps. So, my take on this is: if there isn't a strong requirement behind this for RHEL6 (eg. a customer request) then let's push it to RHEL7. While investigating what it will take in order to re-enable this for libvirt I found that if the memballoon device is added using the "-device virtio-balloon-pci, ..." (etc.) syntax as opposed to the "-balloon virtio" syntax, then it's not possible to start/get the statistics. So my first question - is this intentional? I don't have the "history" that others do, but it seems the "-device ..." syntax is newer and preferred within many code paths. When memballoon support was added in libvirt the choice was to use -device (see qemu_conf.c within the following commit, search for QEMUD_CMD_FLAG_DEVICE): http://libvirt.org/git/?p=libvirt.git;a=commit;h=b2f1863533774a39b3cd5642c183ec842a897c9b I also note that when using "-balloon virtio" in my definition that unlike the description from the qemu change to reenable stats, the memballoon ends on "device[0]" rather than "device[1]". I assume that more or less means we cannot 'assume' specific [n] placement, rather we have to walk the results of a : "{"execute":"qom-list", "arguments": { "path": "/machine/peripheral-anon"}}" looking for "type": "child<virtio-balloon-pci>" in order to grab the "name" value, correct? Or is there a different/preferred mechanism? tks (In reply to John Ferlan from comment #4) > While investigating what it will take in order to re-enable this for libvirt > I found that if the memballoon device is added using the "-device > virtio-balloon-pci, ..." (etc.) syntax as opposed to the "-balloon virtio" > syntax, then it's not possible to start/get the statistics. > > So my first question - is this intentional? No, that's a bug, but it was introduced and fixed during 1.5 development cycle so I don't think it should be visible. Which QEMU version are you running? > I don't have the "history" that others do, but it seems the "-device ..." > syntax is newer and preferred within many code paths. > > When memballoon support was added in libvirt the choice was to use -device > (see qemu_conf.c within the following commit, search for > QEMUD_CMD_FLAG_DEVICE): > > http://libvirt.org/git/?p=libvirt.git;a=commit; > h=b2f1863533774a39b3cd5642c183ec842a897c9b > > > I also note that when using "-balloon virtio" in my definition that unlike > the description from the qemu change to reenable stats, the memballoon ends > on "device[0]" rather than "device[1]". I assume that more or less means we > cannot 'assume' specific [n] placement, rather we have to walk the results > of a : > > "{"execute":"qom-list", "arguments": { "path": "/machine/peripheral-anon"}}" > > looking for "type": "child<virtio-balloon-pci>" in order to grab the "name" > value, correct? I think so. Paolo or Markus, can you confirm this is correct? > > Or is there a different/preferred mechanism? > > tks I'm still a bit new at this, so maybe I did something wrong. I have an f18 host with an older qemu-kvm (1.2) installed. So I grabbed and built/install locally according to instructions I found (http://en.wikibooks.org/wiki/QEMU/Linux). My .git/config repo is: $ cat .git/config [core] repositoryformatversion = 0 filemode = true bare = false logallrefupdates = true [remote "origin"] url = git://git.qemu.org/qemu.git fetch = +refs/heads/*:refs/remotes/origin/* ... Then, I took the output of one of my guests running after using virsh start <domain> to get it started. Then I put that into a file and ran it directly changing the "/usr/bin/qemu-kvm" to the "/usr/local/bin/qemu-system-x86_64": /usr/local/bin/qemu-system-x86_64 -name rh64 \ -S -machine pc-1.2,accel=kvm \ -cpu SandyBridge,+vmx -m 1000 \ -smp 1,sockets=1,cores=1,threads=1 \ -uuid ab912f5d-f482-385a-cc68-6eb016e73234 \ -no-user-config -nodefaults \ -qmp tcp:localhost:4444,server --monitor stdio \ -rtc base=utc -no-shutdown \ -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 \ -drive file=/home/vm-images/rh64,if=none,id=drive-virtio-disk0,format=raw \ -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 \ -chardev pty,id=charserial0 \ -device isa-serial,chardev=charserial0,id=serial0 \ -device usb-tablet,id=input0 \ -vga cirrus \ -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5 In one terminal session I run: # ./qemu-kvm.rh64.sh QEMU waiting for connection on: tcp:[::1]:4444,server In another terminal session I do: # telnet localhost 4444 Trying ::1... Connected to localhost. Escape character is '^]'. {"QMP": {"version": {"qemu": {"micro": 50, "minor": 5, "major": 1}, "package": ""}, "capabilities": []}} which results in the following in the first one: char device redirected to /dev/pts/24 (label charserial0) QEMU 1.5.50 monitor - type 'help' for more information (qemu) VNC server running on `::1:5900' info version 1.5.50 (qemu) info balloon balloon: actual=1000 (qemu) back in the other session... { "execute": "qmp_capabilities" } {"return": {}} {"execute":"cont"} {"timestamp": {"seconds": 1371567769, "microseconds": 948070}, "event": "RESUME"} {"return": {}} {"execute":"qom-list", "arguments": { "path": "/machine/peripheral-anon"}} {"return": [{"name": "type", "type": "string"}]} If I change the last line to "-balloon virtio", I get: {"return": [{"name": "device[0]", "type": "child<virtio-balloon-pci>"}, {"name": "type", "type": "string"}]} {"execute":"qom-list", "arguments": { "path": "/machine/peripheral-anon/device[0]"}} {"return": [{"name": "virtio-bus", "type": "child<virtio-pci-bus>"}, {"name": "guest-stats-polling-interval", "type": "int"}, {"name": "guest-stats", "type": "guest statistics"}, {"name": "virtio-backend", "type": "child<virtio-balloon-device>"}, {"name": "parent_bus", "type": "link<bus>"}, {"name": "command_serr_enable", "type": "boolean"}, {"name": "legacy-command_serr_enable", "type": "legacy<on/off>"}, {"name": "multifunction", "type": "boolean"}, {"name": "legacy-multifunction", "type": "legacy<on/off>"}, {"name": "rombar", "type": "uint32"}, {"name": "romfile", "type": "string"}, {"name": "legacy-romfile", "type": "legacy<string>"}, {"name": "addr", "type": "int32"}, {"name": "legacy-addr", "type": "legacy<pci-devfn>"}, {"name": "class", "type": "uint32"}, {"name": "legacy-class", "type": "legacy<hex32>"}, {"name": "event_idx", "type": "boolean"}, {"name": "legacy-event_idx", "type": "legacy<on/off>"}, {"name": "indirect_desc", "type": "boolean"}, {"name": "legacy-indirect_desc", "type": "legacy<on/off>"}, {"name": "realized", "type": "bool"}, {"name": "type", "type": "string"}]} You have the correct version, so please forget what I say in comment 5. Try looking for the balloon virtio-balloon-pci device in the /machine/i440fx/pci.0/ path. For example: { "execute":"qom-list", "arguments": { "path": "/machine/i440fx/pci.0/" } } Ah - yes that makes more sense... here's my results: { "execute":"qom-list", "arguments": { "path": "/machine/i440fx/pci.0/" } } {"return": [{"name": "child[7]", "type": "link<virtio-balloon-pci>"}, {"name": "child[6]", "type": "link<virtio-blk-pci>"}, {"name": "child[5]", "type": "link<piix3-usb-uhci>"}, {"name": "child[4]", "type": "link<PIIX4_PM>"}, {"name": "child[3]", "type": "link<piix3-ide>"}, {"name": "child[2]", "type": "link<cirrus-vga>"}, {"name": "child[1]", "type": "link<PIIX3>"}, {"name": "child[0]", "type": "link<i440FX>"}, {"name": "type", "type": "string"}]} and { "execute":"qom-list", "arguments": { "path": "/machine/i440fx/pci.0/child[7]"} } {"return": [{"name": "virtio-bus", "type": "child<virtio-pci-bus>"}, {"name": "guest-stats-polling-interval", "type": "int"}, {"name": "guest-stats", "type": "guest statistics"}, {"name": "virtio-backend", "type": "child<virtio-balloon-device>"}, {"name": "parent_bus", "type": "link<bus>"}, {"name": "command_serr_enable", "type": "boolean"}, {"name": "legacy-command_serr_enable", "type": "legacy<on/off>"}, {"name": "multifunction", "type": "boolean"}, {"name": "legacy-multifunction", "type": "legacy<on/off>"}, {"name": "rombar", "type": "uint32"}, {"name": "romfile", "type": "string"}, {"name": "legacy-romfile", "type": "legacy<string>"}, {"name": "addr", "type": "int32"}, {"name": "legacy-addr", "type": "legacy<pci-devfn>"}, {"name": "class", "type": "uint32"}, {"name": "legacy-class", "type": "legacy<hex32>"}, {"name": "event_idx", "type": "boolean"}, {"name": "legacy-event_idx", "type": "legacy<on/off>"}, {"name": "indirect_desc", "type": "boolean"}, {"name": "legacy-indirect_desc", "type": "legacy<on/off>"}, {"name": "realized", "type": "bool"}, {"name": "type", "type": "string"}]} Now to "devise" a methodology know how to put together that path. It's too bad there wasn't a : {"execute":"qom-find", "arguments" : {"type": "child<virtio-balloon-device>"} } that would return just the path "/machine/i440fx/pci.0/child[7]" Having to recursively use qom-list to find what I'm looking for could be painful... Re comment#5: "-balloon virtio,PROPS..." is syntactic sugar for "-device virtio-balloon,PROPS,...", except "driver" properties in PROPS are silently ignored. I'd say any differences between the two forms are either undefined behavior (things that could also change whenever you upgrade QEMU), or bugs. I generally recommend to stick to unsugared syntax, especially in tools. Re comment#6: The two options you used are *not* equivalent! "-balloon virtio" is sugar for "-device virtio-balloon". "virtio-balloon" is an alias for "virtio-balloon-pci". "-device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5" sets additional properties id=balloon0,bus=pci.0,addr=0x5. bus=pci.0,addr=0x5 select a specific PCI address, which may or may not match the one the system picks for you if you don't select one. In my quick test (which doesn't match your test case), it changes the PCI address from bus 0, device 4 to bus 0, device 5. id=balloon0 sets the qdev ID. Anonymous devices (no qdev ID) are put in machine/peripheral/anon/device[N], where N is a number picked by the system. Named devices machine/peripheral/ID. As far as I know, libvirt always sets an ID. Makes sense. (In reply to Markus Armbruster from comment #9) > The two options you used are *not* equivalent! > > "-balloon virtio" is sugar for "-device virtio-balloon". > "virtio-balloon" is an alias for "virtio-balloon-pci". > > "-device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5" sets > additional properties id=balloon0,bus=pci.0,addr=0x5. > > bus=pci.0,addr=0x5 select a specific PCI address, which may or may not > match the one the system picks for you if you don't select one. In my > quick test (which doesn't match your test case), it changes the PCI > address from bus 0, device 4 to bus 0, device 5. > > id=balloon0 sets the qdev ID. Anonymous devices (no qdev ID) are put > in machine/peripheral/anon/device[N], where N is a number picked by > the system. Named devices machine/peripheral/ID. > > As far as I know, libvirt always sets an ID. Makes sense. Yep, libvirt will always use '-device virtio-balloon-pci' for any QEMU which is new enough to support the balloon stats reporting. So we do not need to consider the '-balloon virtio' scenario whatsoever. I have some working code, but wanted to "validate" things especially since I don't have the "prior model" as a reference point... Prior code essentially has the following to fill in data: virJSONValueObjectGetNumberUlong(data, "mem_swapped_in", &mem) stats[got].tag = VIR_DOMAIN_MEMORY_STAT_SWAP_IN; stats[got].val = (mem/1024); virJSONValueObjectGetNumberUlong(data, "mem_swapped_out", &mem) stats[got].tag = VIR_DOMAIN_MEMORY_STAT_SWAP_OUT; stats[got].val = (mem/1024); virJSONValueObjectGetNumberUlong(data, "major_page_faults", &mem) stats[got].tag = VIR_DOMAIN_MEMORY_STAT_MAJOR_FAULT; stats[got].val = mem; virJSONValueObjectGetNumberUlong(data, "minor_page_faults", &mem) stats[got].tag = VIR_DOMAIN_MEMORY_STAT_MINOR_FAULT; stats[got].val = mem; virJSONValueObjectGetNumberUlong(data, "free_mem", &mem) stats[got].tag = VIR_DOMAIN_MEMORY_STAT_UNUSED; stats[got].val = (mem/1024); virJSONValueObjectGetNumberUlong(data, "total_mem", &mem) stats[got].tag = VIR_DOMAIN_MEMORY_STAT_AVAILABLE; stats[got].val = (mem/1024); Taking a recent (eg, today 6/28/13) git update of the git://git.qemu.org/qemu.git and building/using in a local area, I get the following results from a guest where I haven't enabled collection yet: QEMU_MONITOR_RECV_REPLY: mon=0x7ff54800ac40 reply={"return": {"stats": {"stat-swap-out": -1, "stat-free-memory": -1, "stat-minor-faults": -1, "stat-major-faults": -1, "stat-total-memory": -1, "stat-swap-in": 0}, "last-update": 1372430096}, "id": "libvirt-10"} First off - I'm not sure I understand why 'stat-swap-in' is not -1, but that could be "environmental" too, although I do note that in multiple attempts to collect that the 'last-update' field stays the same. My primary question is should the new code use the same algorithm as before with regard to division of the 'stat-free-memory', 'stat-total-memory', 'stat-swap-in', and 'stat-swap-out' by 1024? That is - I assume they are being returned in 'bytes' and the libvirt code displayed differently. (In reply to John Ferlan from comment #11) > I have some working code, but wanted to "validate" things especially since I > don't have the "prior model" as a reference point... I think the most appropriate thing to do is to post an RFC version of your patch(es) to the libvirt list. Then you can also ask questions about the current code. It was posted yesterday: https://www.redhat.com/archives/libvir-list/2013-July/msg00108.html However, the questions/issues were directed more towards qemu than libvirt. I am following the prior example with regard to data manipulation. That is the 'swap' and 'memory' values are divided by 1024. The data I'm seeing shows two possible anomalies and I'd rather vet them now that I'm thinking about them! #1. When no collection was ever attempted - as above - you'll note that "stat-swap-in": 0 - which is odd since I'd expect it to be -1 just as every other value is -1. Is that expected? The issue is that in the event collection is never done, a swap-in return of 0 results in 3 values getting displayed in the output because I only avoid any data with -1. If that's expected - that's fine - I'll just be sure to document it so as to avoid future questions! #2. When collection is started neither stat-swap-in nor stat-swap-out has any value - they are both 0 (zero). That could be expected, but I wasn't sure. I note that the data from the qemu checkin note referenced above also has them at 0 (zero). I am going to assume this is OK, but sometimes assumptions like this aren't good, so I was double checking. Once libvirt exposes the data to access easier questions could arise. (In reply to John Ferlan from comment #13) > It was posted yesterday: > > https://www.redhat.com/archives/libvir-list/2013-July/msg00108.html > > However, the questions/issues were directed more towards qemu than libvirt. > I am following the prior example with regard to data manipulation. That is > the 'swap' and 'memory' values are divided by 1024. The stats values are all in bytes, so dividing by 1024 might make sense to end-users. > The data I'm seeing shows two possible anomalies and I'd rather vet them now > that I'm thinking about them! > > #1. When no collection was ever attempted - as above - you'll note that > "stat-swap-in": 0 - which is odd since I'd expect it to be -1 just as every > other value is -1. Is that expected? The issue is that in the event > collection is never done, a swap-in return of 0 results in 3 values getting > displayed in the output because I only avoid any data with -1. If that's > expected - that's fine - I'll just be sure to document it so as to avoid > future questions! It's not expected if polling wasn't enabled. I don't seem to be able to reproduce this though, can you share how you're doing it? > #2. When collection is started neither stat-swap-in nor stat-swap-out has > any value - they are both 0 (zero). That could be expected, but I wasn't > sure. I note that the data from the qemu checkin note referenced above also > has them at 0 (zero). I am going to assume this is OK, but sometimes > assumptions like this aren't good, so I was double checking. Once libvirt > exposes the data to access easier questions could arise. This is fine, it probably means the VM didn't swap (yet). w/r/t: stat-swap-in value Like I said - it could have been environmental - I had qemu 1.2 'installed' via yum on the host and a qemu 1.5 build in /usr/local and my guest was configured to use the /usr/local/bin/qemu-system-x86_64 (see above comment 6). After a reboot (for some f18 update) I no longer see the same thing - which is fine. I just wanted to be sure. I've posted a v2 upstream: https://www.redhat.com/archives/libvir-list/2013-July/msg00435.html just waiting on reviews now. Thanks for your help and validation on the data! Final changes have been pushed upstream, see http://libvirt.org/git/?p=libvirt.git;a=commit;h=57b65c58d0e935ce31a0d3ea3e820edc4ddc3d10 John Can reproduce this with build: libvirt-0.10.2-16.el6.x86_64 Verify with libvirt-1.1.1-1.el7.x86_64 Steps: 1. #virsh dommemstat $running_domain actual 1026048 swap_in 0 rss 315596 According above,the swap stats is included in the dommemstat list. So change the status to VERIFIED. Do more test about this feature Steps: 1. #virsh list Id Name State ---------------------------------------------------- 7 r6 running 2. #virsh dommemstat $r6 --period 3 --live 3. #virsh dommemstat r6 actual 1026048 swap_in 0 swap_out 0 major_fault 482 minor_fault 481205 unused 829644 available 998332 rss 328656 After restarting the domain: 4. #virsh dommemstat r6 actual 1026048 swap_in 0 rss 309756 5. #virsh dommemstat r6 --period 3 --config 6. #virsh dommemstat r6 actual 1026048 swap_in 0 rss 343980 After restarting the domain: 7. #virsh dommemstat r6 actual 1026048 swap_in 0 swap_out 0 major_fault 276 minor_fault 370852 unused 860888 available 998332 rss 313840 8. #virsh dommemstat r6 --period 10 --current actual 1026048 swap_in 0 swap_out 0 major_fault 485 minor_fault 487078 unused 827724 available 998332 rss 326200 As above steps,all the new functions have been verified. Do more test ablout this feature. Make it clearly that the period value take affect. Steps: 1. Start a guest: #virsh list Id Name State ---------------------------------------------------- 7 r6 running 2. #virsh dommemstat r6 actual 1026048 swap_in 0 rss 351096 3. #virsh dommemstat r6 --period 3 4. #virsh dommemstat r6 actual 1026048 swap_in 0 swap_out 0 major_fault 484 minor_fault 484572 unused 827856 available 998332 rss 340164 Now the period value takes affect. 5.As setting the <--period> to 0 will stop the balloon driver collection, but does not clear the statistics in the balloon driver. #virsh dommemstat r6 --period 0 #virsh dommemstat r6 actual 1026048 swap_in 0 swap_out 0 major_fault 484 minor_fault 484588 unused 827864 available 998332 rss 340164 6. After rebooting the guest #virsh dommemstat r6 actual 1026048 swap_in 444596224 rss 338248 Not quite sure if there's a question here or not, but libvirt doesn't control whether the stats are cleared or not - just the setting of the collection interval > 0 tells qemu to start collecting and setting to 0 says stop collecting. There is no "and zero out the current stats" type option. As I understand it - it's not a feature supported by qemu. What you show is expected behaviour... Determining whether statistics change or not is an exercise of continually fetching the data in a script, comparing the fields/data vs. a previous collection, and then displaying the difference. Depending on guest activity those stats may change or not. Depending on hypervisor there are different stats collected. Programmatically one could keep an array of the last 'n' collections and attempt to come up with some fancy algorithm to determine the rate of change. As for determining what value is currently used for stats period... That's a bit trickier and requires a bit of knowledge as to whether the guest was started with stats enabled or not and whether the "--live" switch was used on any adjustment or not... If a guest was started without stats, then a 'virsh dumpxml guest_name | grep "stats period"' would show nothing; however a 'virsh dumpxml guest_name --inactive | grep "stats period"' could show the most recently changed value - if it's not 0 (zero). If the "--live" switch was used with the 'virsh dommemstat guest_name --period 4' command, then the '4' would never show up in a configuration file. Confused yet? More simply said - there's no mechanism currently in the virsh display code to say this is the current value being used other than dumping the xml to see if there is a value set and even that may not be correct because the --live switch could override the value in the config file. If you want the display of the period, then that'd have to be a new bz with the reasoning behind needing to see the value. This request was resolved in Red Hat Enterprise Linux 7.0. Contact your manager or support representative in case you have further questions about the request. |