Bug 1172153 - [RFE] Collect CPU, IO and network accounting information from qemu cgroups
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: oVirt
Classification: Retired
Component: vdsm
Version: 4.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: 3.6.0
Assignee: Vinzenz Feenstra [evilissimo]
QA Contact: Nikolai Sednev
URL:
Whiteboard: sla
Depends On:
Blocks:
Reported: 2014-12-09 13:32 UTC by ernest.beinrohr
Modified: 2016-02-10 19:42 UTC

Fixed In Version:
Doc Type: Enhancement
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-11-04 13:00:03 UTC
oVirt Team: SLA
Embargoed:


Links
System        ID     Private  Priority  Status  Summary                                        Last Updated
oVirt gerrit  38366  0        master    MERGED  virt: Additional reporting of IO statistics    Never
oVirt gerrit  38412  0        master    MERGED  virt: Additional reporting of CPU usage in ns  Never

Description ernest.beinrohr 2014-12-09 13:32:36 UTC
This is a feature request:

Please add to VDSM the ability to collect cgroups accounting info. All my el6 hypervisors have cgroups enabled by default and the QEMU process generates very detailed information about the running VM.

On each hypervisor, the accounting info for every currently running VM is available in these files:

/cgroup/blkio/libvirt/qemu/$vm/blkio.throttle.io_serviced
/cgroup/blkio/libvirt/qemu/$vm/blkio.throttle.io_service_bytes
/cgroup/cpuacct/libvirt/qemu/$vm/cpuacct.usage


Here are some of my MRTG graphs drawn from this cgroup accounting data: https://imgur.com/a/Enafw


My idea is that this data should be collected on demand by vdsm and then graphed by oVirt in the management console.

Comment 1 Dan Kenigsberg 2014-12-09 18:25:31 UTC
Could you list all the cgroups elements that you find interesting?

oVirt cannot copy them blindly, since it would have to sum some of them across migrations.

FYI, network usage data is planned in bug 1066570 for 3.6.

Comment 2 ernest.beinrohr 2014-12-10 08:12:52 UTC
oVirt would need to track the migrations, of course. But it shouldn't be difficult; here is how I do it with MRTG:

On my MRTG server I periodically run this loop:

- for every hypervisor:
  - collect cgroup accounting info for every VM running there
- merge the outputs for my 6 hypervisors into a single data file
- save the current data

This way, even if a machine migrates, its counters are still picked up from whichever hypervisor it is running on.
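
The merge step of the loop above can be sketched as follows. This is only an illustration, not the reporter's actual script: the function name `merge_hypervisor_samples` and the shape of the input dictionary are assumptions made for the example. The point is that the merged data is keyed by VM name, so a VM keeps the same key no matter which hypervisor reported it.

```python
def merge_hypervisor_samples(samples_by_host):
    """Merge per-hypervisor samples into one mapping keyed by VM name.

    samples_by_host is assumed to look like {host: {vm_name: counters}},
    where counters is a dict of raw cgroup counter values.
    """
    merged = {}
    for host, vms in samples_by_host.items():
        for vm, counters in vms.items():
            # If a VM shows up on two hosts (e.g. mid-migration),
            # the host processed last wins in this simple sketch.
            merged[vm] = {"host": host, **counters}
    return merged


if __name__ == "__main__":
    samples = {
        "hv1": {"vm-a": {"cpu_ns": 1000}},
        "hv2": {"vm-b": {"cpu_ns": 2000}},
    }
    print(merge_hypervisor_samples(samples))
```

Because the key is the VM name rather than the host, a consumer reading the merged file sees one continuous series per VM across migrations.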

I collect this data from qemu ($vm is the name of the VM):

IOPS:
/cgroup/blkio/libvirt/qemu/$vm/blkio.throttle.io_serviced

Bytes read/written to disk:
/cgroup/blkio/libvirt/qemu/$vm/blkio.throttle.io_service_bytes

CPU shares accounting:
/cgroup/cpuacct/libvirt/qemu/$vm/cpuacct.usage
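
As a rough sketch of how these three files could be read, the snippet below parses them for one VM. It assumes the cgroup v1 layout under /cgroup shown above, that the blkio.throttle files contain lines of the form "<major>:<minor> <Operation> <value>" followed by a final "Total <value>" line, and that cpuacct.usage holds a single integer (CPU time in nanoseconds). The helper names `parse_blkio` and `read_vm_counters` are hypothetical and are not part of vdsm or libvirt.

```python
from pathlib import Path

# Assumption: cgroup v1 hierarchy mounted at /cgroup, as in the paths above.
CGROUP_ROOT = Path("/cgroup")


def parse_blkio(text):
    """Sum the per-device Read and Write counters of a blkio.throttle.* file.

    Data lines have three fields: "<major>:<minor> <Operation> <value>".
    The trailing "Total <value>" line has two fields and is skipped, as are
    the Sync/Async/Total per-device lines.
    """
    totals = {"Read": 0, "Write": 0}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 3 and fields[1] in totals:
            totals[fields[1]] += int(fields[2])
    return totals


def read_vm_counters(vm, root=CGROUP_ROOT):
    """Return the raw accounting counters for one VM (hypothetical helper)."""
    blkio = root / "blkio" / "libvirt" / "qemu" / vm
    cpuacct = root / "cpuacct" / "libvirt" / "qemu" / vm
    return {
        "io_ops": parse_blkio((blkio / "blkio.throttle.io_serviced").read_text()),
        "io_bytes": parse_blkio((blkio / "blkio.throttle.io_service_bytes").read_text()),
        # Single integer: CPU time consumed by the group, in nanoseconds.
        "cpu_ns": int((cpuacct / "cpuacct.usage").read_text()),
    }
```

All values returned are cumulative counters, so a consumer would still need to turn them into per-interval differences before graphing.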


PS: this is the script I use to get the stats from each hypervisor; it can be run directly on any hardware running qemu: https://gist.github.com/oernii/6caa43a942f1b3b2410a

Comment 3 Dan Kenigsberg 2014-12-22 09:53:59 UTC
Would I be correct that your statistics are being reset upon migration? Or are you accumulating them somehow across hosts?

Comment 4 ernest.beinrohr 2014-12-22 10:02:05 UTC
No, migration does NOT reset the stats. Of course the counters themselves are zeroed, but that affects the stats only for a short period of time. MRTG collects only the difference between the current state and the last one. If the number is negative, it knows the counters have been zeroed and uses the last number.
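
The difference-based approach described here could look roughly like the sketch below. It is not MRTG's implementation; the class name `CounterDelta` is made up for the example, and "uses the last number" is read here as "reuse the previous interval's difference", which is only one possible interpretation.

```python
class CounterDelta:
    """Turn a cumulative counter into per-interval differences.

    A negative difference is taken to mean the counter was zeroed
    (for example after a migration). In that case the previous
    interval's difference is reused (an assumed interpretation).
    """

    def __init__(self):
        self.last_value = None
        self.last_delta = 0

    def update(self, value):
        if self.last_value is None:
            delta = 0  # first sample: nothing to compare against yet
        else:
            delta = value - self.last_value
            if delta < 0:
                delta = self.last_delta  # counter reset detected
        self.last_value = value
        self.last_delta = delta
        return delta


if __name__ == "__main__":
    counter = CounterDelta()
    # The drop from 210 to 20 simulates a counter reset after migration.
    for sample in [100, 150, 210, 20, 70]:
        print(counter.update(sample))
```

With this scheme a reset distorts at most one interval, which matches the remark that migration affects the stats only briefly.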

oVirt could also collect only periodic diffs and from these calculate accounting information for VMs and even generate graphs.

Comment 5 Doron Fediuck 2015-01-12 16:00:39 UTC
Adjusting for 3.6.0 as best effort.

Comment 6 Martin Sivák 2015-02-17 11:07:09 UTC
I think the cgroup accounting info collection and migration should be done by libvirt. That is the only piece that knows the proper cgroup paths, manages them and cooperates with systemd when the groups have to be refreshed. It can also save and forward the numbers during migration.

We can then poll libvirt and present the data to the user.

Comment 7 Vinzenz Feenstra [evilissimo] 2015-03-05 08:16:31 UTC
No needinfo needed.

Comment 8 Nikolai Sednev 2015-05-07 12:01:22 UTC
Works for me on these components:
ovirt-release-master-001-0.7.master.noarch
ovirt-host-deploy-1.4.0-0.0.master.20150505205623.giteabc23b.el7.noarch
vdsm-4.17.0-743.gite5856da.el7.x86_64
ovirt-hosted-engine-setup-1.3.0-0.0.master.20150505102602.gitb2151c7.el7.noarch
sanlock-3.2.2-2.el7.x86_64
qemu-kvm-rhev-2.1.2-23.el7_1.2.x86_64
mom-0.4.3-1.el7.noarch
ovirt-hosted-engine-ha-1.3.0-0.0.master.20150424113553.20150424113551.git7c14f4c.el7.noarch
ovirt-engine-sdk-python-3.6.0.0-0.12.20150506.git1066fb3.el7.centos.noarch
libvirt-client-1.2.8-16.el7_1.2.x86_64

Comment 9 Sandro Bonazzola 2015-11-04 13:00:03 UTC
oVirt 3.6.0 has been released on November 4th, 2015 and should fix this issue.
If problems still persist, please open a new BZ and reference this one.

