Bug 715017
Summary: Report disk latency (read and write) for each storage device

| Field | Value |
|---|---|
| Product | Red Hat Enterprise Linux 6 |
| Reporter | Andrew Cathrow <acathrow> |
| Component | qemu-kvm |
| Assignee | chellwig <chellwig> |
| Status | CLOSED ERRATA |
| QA Contact | Virtualization Bugs <virt-bugs> |
| Severity | medium |
| Priority | high |
| Docs Contact | |
| Version | 6.2 |
| CC | bsarathy, cpelland, ehabkost, gjansen, iheim, juzhang, mkenneth, mshao, rbalakri, shuang, tburke, virt-maint, wdai |
| Target Milestone | rc |
| Target Release | --- |
| Hardware | Unspecified |
| OS | Unspecified |
| Whiteboard | |
| Fixed In Version | qemu-kvm-0.12.1.2-2.189.el6 |
| Doc Type | Bug Fix |
| Story Points | --- |
| Clone Of | |
| Environment | |
| Last Closed | 2011-12-06 15:52:39 UTC |
| Type | --- |
| Regression | --- |
| Mount Type | --- |
| Documentation | --- |
| CRM | |
| Verified Versions | |
| Category | --- |
| oVirt Team | --- |
| RHEL 7.3 requirements from Atomic Host | |
| Cloudforms Team | --- |
| Target Upstream Version | |
| Embargoed | |
| Bug Depends On | |
| Bug Blocks | 693510, 715019, 720379, 740918, 743047 |

Doc Text:
Cause: no mechanism to report block device read and write latency.
Consequence: the management system cannot report the average read and write latency for the block devices of virtual machines.
Change: implemented a mechanism so that qemu-kvm reports disk latency statistics in "info blockstats".
Result: the management system can fetch and use those statistics from qemu-kvm (a usage sketch follows below).
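To illustrate the Doc Text above: a management client can pull the new counters over QMP and turn them into average per-request latencies. The following is a minimal sketch, not the actual RHEV/vdsm code. It assumes a QMP socket at /tmp/qmp.sock (hypothetical path, e.g. a guest started with "-qmp unix:/tmp/qmp.sock,server,nowait") and assumes the query-blockstats reply carries the same *_operations and *_total_time_ns fields that appear in the "info blockstats" output quoted in the comments below.

```python
# Sketch: fetch block stats over QMP and derive average latency per request.
import json
import socket


def main(qmp_socket_path="/tmp/qmp.sock"):  # hypothetical socket path
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.connect(qmp_socket_path)
    rfile = sock.makefile("r")

    def command(name):
        # Send one QMP command and return the value of its "return" field,
        # skipping any asynchronous event lines that arrive in between.
        sock.sendall((json.dumps({"execute": name}) + "\n").encode())
        while True:
            reply = json.loads(rfile.readline())
            if "return" in reply:
                return reply["return"]

    rfile.readline()              # discard the QMP greeting banner
    command("qmp_capabilities")   # leave capabilities-negotiation mode

    for entry in command("query-blockstats"):
        stats = entry["stats"]
        # Total accumulated time divided by number of requests -> ns per request.
        avg_rd = stats["rd_total_time_ns"] / max(stats["rd_operations"], 1)
        avg_wr = stats["wr_total_time_ns"] / max(stats["wr_operations"], 1)
        print("%s: avg read %.2f ms, avg write %.2f ms"
              % (entry["device"], avg_rd / 1e6, avg_wr / 1e6))

    sock.close()


if __name__ == "__main__":
    main()
```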
Comment 3
Geert Jansen
2011-07-18 12:32:53 UTC
And according to Christoph we should add the flush counter too.

Christoph, Geert has a valid point that we need this on the hypervisor side - we can't integrate our measurements with every guest OS that is out there. The hypervisor management needs a simple mechanism that will work. What is the planned resolution for latency, and how is it aggregated for averages?

I now have code that basically implements the suggestion in comment 3. It'll need a bit more polishing, but I should be able to send it out to the list tomorrow. I'm not sure the reception of such a raw interface, more or less useless for qemu itself, will be overly nice.

Hi chellwig,
For the QE part, do we only need to check whether the latency info is provided through the qemu monitor and QMP? Is there anything else we need to test? Thanks, Suqin

Will also run O_DIRECT reads/writes on a raw device in the guest to check whether qemu reports the correct data.

Hi chellwig,
I tested with qemu-kvm-0.12.1.2-2.190.el6.x86_64; some of the data reported by qemu is incorrect.

1. blk: virtio
2. Test scenario: write/read with dd, get block info before and after dd, then subtract the before-dd data from the after-dd data.
3. Write operation
   1) # dd if=/dev/zero of=/mnt/test bs=1M count=1000
      1000+0 records in
      1000+0 records out
      1048576000 bytes (1.0 GB) copied, 7.31712 s, 143 MB/s
   2) Result
      a. wr_operations is incorrect:
         actual result: wr_operations: 9062 - 6976 = 2086
         expected result: around 1000
      b. wr_total_time_ns is incorrect:
         actual result: wr_total_time_ns: 874167049027 - 667511125049 = 206655923978 ns (about 206.656 s)
         expected result: around 7.31712 s
      c. wr_bytes is acceptable:
         actual result: 4198605824 - 3149878272 = 1048727552
         expected result: around 1048576000
   3) Detailed data:
      before dd:
      drive-virtio-disk1: rd_bytes=199632384 wr_bytes=3149878272 rd_operations=5131 wr_operations=6976 flush_operations=136 wr_total_time_ns=667511125049 rd_total_time_ns=7999186637 flush_total_time_ns=150010121
      after dd:
      drive-virtio-disk1: rd_bytes=199636480 wr_bytes=4198605824 rd_operations=5132 wr_operations=9062 flush_operations=138 wr_total_time_ns=874167049027 rd_total_time_ns=8002082444 flush_total_time_ns=172090043
4. Read operation
   1) # dd if=/mnt/test of=/dev/null
      2048000+0 records in
      2048000+0 records out
      1048576000 bytes (1.0 GB) copied, 2.54448 s, 412 MB/s
   2) Result
      a. rd_bytes is incorrect:
         actual result: rd_bytes: 713486848 - 666972672 = 46514176
         expected result: around 1048576000
      b. rd_operations is incorrect:
         actual result: 36777 - 36219 = 558
         expected result: around 1000
      c. rd_total_time_ns is incorrect:
         actual result: 55262579137 - 53854678137 = 1407901000 ns (about 1.408 s)
         expected result: around 2.54448 s
   3) Detailed data:
      before dd:
      drive-virtio-disk1: rd_bytes=666972672 wr_bytes=4565480448 rd_operations=36219 wr_operations=12882 flush_operations=233 wr_total_time_ns=994495533746 rd_total_time_ns=53854678137 flush_total_time_ns=358894900
      after dd:
      drive-virtio-disk1: rd_bytes=713486848 wr_bytes=4640609280 rd_operations=36777 wr_operations=13960 flush_operations=235 wr_total_time_ns=1020617582408 rd_total_time_ns=55262579137 flush_total_time_ns=381067762
5. ide1-cd0: I didn't attach a CD, but there is read data for ide1-cd0:
      ide1-cd0: rd_bytes=172 wr_bytes=0 rd_operations=5 wr_operations=0 flush_operations=0 wr_total_time_ns=0 rd_total_time_ns=85268 flush_total_time_ns=0
6. Detailed command line:
      /usr/libexec/qemu-kvm -monitor stdio -drive file=/home/RHEL-Server-6.2-32-virtio.qcow2,index=0,if=none,id=drive-virtio-disk1,media=disk,cache=none,format=qcow2,aio=native -device virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk1,id=virtio-disk1 -device virtio-net-pci,netdev=idOMdpQo,mac=0a:b3:b1:f0:9f:19,id=ndev00idOMdpQo,bus=pci.0,addr=0x3 -netdev tap,id=idOMdpQo,vhost=on,script='/usr/local/staf/test/RHEV/kvm-new/autotest/client/tests/kvm/scripts/qemu-ifup-switch',downscript='no' -m 2048 -smp 2,cores=1,threads=1,sockets=2 -cpu cpu64-rhel6,+sse2,+x2apic -vnc :1 -rtc base=utc,clock=host,driftfix=none -M rhel6.2.0 -boot order=cdn,once=c,menu=off -usbdevice tablet -no-kvm-pit-reinjection -enable-kvm

wr_bytes and rd_bytes reported by qemu are the same as what dd reports, but the operation and time values sometimes are not. Christoph, Kevin: is that acceptable?

Ran writes/reads on an attached disk, both ide and virtio, with qemu-kvm-0.12.1.2-2.193.el6.x86_64.

I. ide:
1. Write operation:
   [root@dhcp-66-83-25 ~]# dd if=/dev/zero of=/dev/sdb bs=1M count=1024 oflag=direct
   1024+0 records in
   1024+0 records out
   1073741824 bytes (1.1 GB) copied, 14.9952 s, 71.6 MB/s
   Reported by qemu:
   wr_bytes: 1073741824 ----> the same
   time: 14328843120 ns ----> almost the same
   operations: 2048 ----> the same
2. Read operation:
   [root@dhcp-66-83-25 ~]# dd if=/dev/sdb of=/dev/null bs=1M
   10240+0 records in
   10240+0 records out
   10737418240 bytes (11 GB) copied, 33.0387 s, 325 MB/s
   Reported by qemu:
   rd_bytes: 10737418240 ----> the same
   time: 12.936062195 s ----> less than dd
   operations: 131388 ----> almost 5 times dd's count

II. virtio:
1. Write operation:
   [root@localhost ~]# dd if=/dev/zero of=/dev/vdb bs=1M count=1024 oflag=direct
   1024+0 records in
   1024+0 records out
   1073741824 bytes (1.1 GB) copied, 10.9087 s, 98.4 MB/s
   Reported by qemu:
   wr_bytes: 1073741824 ----> the same
   operations: 3072 ----> more than dd, should be 2048
   time: 31.322404716 s ----> more than dd
2. Read operation:
   [root@localhost ~]# dd if=/dev/vdb of=/dev/null bs=1M
   10240+0 records in
   10240+0 records out
   10737418240 bytes (11 GB) copied, 15.2586 s, 704 MB/s
   Reported by qemu:
   rd_bytes: 10737418240 ----> the same
   operations: 109955 ----> almost 5 times dd's count
   time: 14.287022666 s ----> almost the same

According to Christoph's comment, this bug is fixed:
"If you really want to make sure the I/O operations are correct we'd have to write a kernel level test driver to issue I/Os 100% under our control. But the numbers really look like what I expected."
So closing this bug.

Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.
New Contents:
Cause: no mechanism to report block device read and write latency.
Consequence: the management system cannot report the average read and write latency for the block devices of virtual machines.
Change: implemented a mechanism so that qemu-kvm reports disk latency statistics in "info blockstats".
Result: the management system can fetch and use those statistics from qemu-kvm.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2011-1531.html
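For reference, the before/after subtraction the QE comments perform by hand is the general way a management layer turns these monotonically increasing counters into interval averages: average latency over an interval is delta(total_time_ns) / delta(operations) between two samples. The sketch below is illustrative only; the helper name is made up, and the sample numbers reuse the drive-virtio-disk1 write figures quoted in the test above.

```python
# Interval math behind the QE before/after comparison: the blockstats counters
# only ever grow, so per-request latency over a test window is the ratio of
# the time delta to the operation-count delta between two samples.

def interval_average_ns(before, after, op_prefix):
    """Average latency per request between two blockstats samples, in ns."""
    ops = after[op_prefix + "_operations"] - before[op_prefix + "_operations"]
    time_ns = after[op_prefix + "_total_time_ns"] - before[op_prefix + "_total_time_ns"]
    return time_ns / ops if ops else 0.0

# Numbers copied from the drive-virtio-disk1 write test quoted above.
before = {"wr_operations": 6976, "wr_total_time_ns": 667511125049}
after = {"wr_operations": 9062, "wr_total_time_ns": 874167049027}

avg_wr_ns = interval_average_ns(before, after, "wr")
print("average write latency: %.1f ms over %d requests"
      % (avg_wr_ns / 1e6, after["wr_operations"] - before["wr_operations"]))
```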