Bug 750202 - Completely different (and bogus) performance for two identical virtual disks

Field | Value
---|---
Product | Fedora
Component | qemu
Version | 15
Hardware | x86_64
OS | Linux
Status | CLOSED CANTFIX
Severity | high
Priority | unspecified
Reporter | Dennis Jacobfeuerborn <dennisml>
Assignee | Fedora Virtualization Maintainers <virt-maint>
QA Contact | Fedora Extras Quality Assurance <extras-qa>
CC | amit.shah, berrange, crobinso, dougsland, dwmw2, ehabkost, itamar, jaswinder, jforbes, knoel, kwolf, rjones, scottt.tw, tburke, virt-maint
Target Milestone | ---
Target Release | ---
Doc Type | Bug Fix
Last Closed | 2012-05-29 12:37:45 UTC
Attachments | libvirt log excerpt (attachment 531005)
Description (Dennis Jacobfeuerborn, 2011-10-31 10:46:44 UTC)
I forgot to mention that I run the Fedora virt-preview repo with the following package versions:

    qemu-kvm-0.15.0-4.fc15.x86_64
    libvirt-0.9.6-1.fc15.x86_64
    virt-manager-0.9.0-6.fc15.noarch

It would be good to know what the qemu command line looks like. Can you provide that, or maybe attach the libvirt log, which should contain it as well?

The trouble with the test is that you're going through the ext4 filesystem on the host. There could easily be fragmentation or alignment issues that affect the two files differently. I'd be interested to know:

- are the host partitions aligned?

      # parted /dev/sdb unit b print
      # parted /dev/sdb unit b print

- instead of using host files, use host LVs (or, if you like, properly aligned host partitions, but LVs are more flexible)
- do the SATA disks have 512 byte or 4K sectors?

http://libguestfs.org/virt-alignment-scan.1.html#linux_host_block_and_i_o_size

(In reply to comment #3)
> The trouble with the test is that you're going through the ext4 filesystem
> on the host. There could easily be fragmentation or alignment issues that
> affect the two files differently.

Fragmentation does not explain the gigantic difference between the results. The entire machine is just a few weeks old, the partitions the libvirt images reside on have seen almost no deletions, and they have 88% and 96% space free respectively, so fragmentation should be pretty much non-existent.

> I'd be interested to know:
>
> - are the host partitions aligned?
>   # parted /dev/sdb unit b print
>   # parted /dev/sdb unit b print

Since the drives have been partitioned identically, they are identically aligned or unaligned, so I would expect to see the same impact on performance, good or bad:

    [dennis@nexus ~]$ sudo parted /dev/sdb unit b print
    Model: ATA SAMSUNG HD103SJ (scsi)
    Disk /dev/sdb: 1000204886016B
    Sector size (logical/physical): 512B/512B
    Partition Table: msdos

    Number  Start           End              Size            Type     File system  Flags
     1      1048576B        419431448575B    419430400000B   primary  ntfs
     2      419431448576B   848928178175B    429496729600B   primary               raid
     3      848928178176B   1000204886015B   151276707840B   primary  ext4

    [dennis@nexus ~]$ sudo parted /dev/sdc unit b print
    Model: ATA SAMSUNG HD103SJ (scsi)
    Disk /dev/sdc: 1000204886016B
    Sector size (logical/physical): 512B/512B
    Partition Table: msdos

    Number  Start           End              Size            Type     File system  Flags
     1      1048576B        419431448575B    419430400000B   primary  ntfs
     2      419431448576B   848928178175B    429496729600B   primary               raid
     3      848928178176B   1000204886015B   151276707840B   primary  ext4

> - instead of using host files, use host LVs (or if you like,
>   properly aligned host partitions, but LVs are more flexible)

I use LVs on my servers, but here I use image files because I don't care about performance (for now).

> - do the SATA disks have 512 byte or 4K sectors?

They have a 512 byte physical sector size, but again, since these are identical drives I'm not sure why this should matter. I'm aware that image file vs. LV, aligned vs. unaligned, 4K vs. 512 byte sectors, etc. all have an impact on performance, and having administered servers for a decade now I'm familiar with all kinds of I/O patterns that could influence these measurements, from a sneaky background RAID verify to a major database system that really pounds the disks, but I have no explanation for the *magnitude* of the difference I'm seeing here.
It might be the case that due to some form of caching the numbers are inflated, or that because of the virtualization overhead the numbers are lower than on the host side, but in either case I would expect to get the same result on both disks. Likewise, if one of the physical drives were damaged in some weird way then that could be the culprit, but in that case I would expect to see the same difference in performance on the host side, yet on the host side the drives behave identically, as they should.

Lastly, the fact that when running the tests on /dev/vda I can see I/O on the host side on /dev/sdb, but when running them on /dev/vdb I can *not* see any I/O whatsoever on the host side, makes me wonder what is going on.

seekmark -s 500 -f /dev/vda (test takes about 5 seconds), iostat on the host:

    Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
    sda               1.00         0.00        11.20          0         56
    sdb             100.00       902.40         4.80       4512         24
    sdc               0.00         0.00         0.00          0          0

seekmark -s 500 -f /dev/vdb (test returns immediately), iostat on the host:

    Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
    sda               1.40         0.00        17.60          0         88
    sdb               0.00         0.00         0.00          0          0
    sdc               0.00         0.00         0.00          0          0

Notice how in the first case I get 100 tps, which for 500 seeks taking 5 seconds to run is exactly what I would expect. On /dev/vdb the host doesn't see any traffic at all!

Created attachment 531005 [details]
libvirt log excerpt
This is the libvirt log from the last run of the guest. It really just contains the command line.
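As an aside, here is a minimal sketch of the host-side checks discussed in the comments above, for anyone trying to reproduce the comparison. The device names and the seekmark flags are taken from this report; running iostat with a one-second interval is an assumption, not something specified here.

```
# Sector sizes as reported by the kernel (512/512 on these drives;
# Advanced Format disks would report a 4096-byte physical size)
cat /sys/block/sdb/queue/logical_block_size /sys/block/sdb/queue/physical_block_size

# Host: sample per-device activity every second while the guest test runs
iostat -d sdb sdc 1

# Guest: 500 random reads against each virtio disk
seekmark -s 500 -f /dev/vda
seekmark -s 500 -f /dev/vdb
```

Roughly 100 tps on the backing host device is what 500 seeks spread over about 5 seconds should look like; a guest run that completes instantly while the host shows zero tps is the anomaly described above.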
I'm only trying to help here. The way to understand the root cause of this problem is to gradually remove elements of complexity until one single change makes a difference. In this case I think it would be helpful to eliminate the host ext4 filesystem and see if that makes any difference. Maybe it won't, but you won't know until you've tried it. The partitions (/dev/sdb3, /dev/sdc3) are aligned, so I would just put /dev/vda and /dev/vdb directly on these partitions.

After moving all data off the partitions, dd'ing the image files onto them and reconfiguring the guest accordingly, I now get the expected result and see about 95 seeks/s on both disks. So I went ahead and reformatted the partitions and restored the files again. This time, even with the files, I got reasonable results. Next I added a new disk /dev/vdc backed by an image file on host /dev/sdc3 (just like /dev/vdb); this disk showed the strange behavior again. Lastly, I copied the image file into a new file and replaced the original image file with the copy, and then the disk showed the same correct behavior as the other two. Either the file is sparse even though I chose full allocation and "du -h" shows the file as using 2 GB of space, or the filesystem optimizes the I/O away because it sees the client accessing a block that is allocated but not yet initialized (due to ext4 pre-allocation) and as a result can simply hand back a block filled with zeroes without actually having to access the disk.

This package has changed ownership in the Fedora Package Database. Reassigning to the new owner of this component.

Dennis, are you still seeing this issue with a more recent Fedora? F15 is end of life in a month. If so, please comment to that effect and we can escalate from there.

No. As I mentioned in comment 7, this turned out to be a side effect of how ext4 allocates blocks, so this isn't a real problem, just something to look out for when benchmarking anything on ext4 (i.e. ensuring that you create non-sparse files is not enough to get reasonable results; you actually have to write to all blocks of the files before starting the benchmark).

Sorry Dennis, I missed that detail. Closing this as CANTFIX then.
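For completeness, a sketch of the workaround the conclusion above points at: rather than relying on preallocation, write every block of the image once before benchmarking, and check whether the extents are still marked as unwritten. The path and size below are placeholders, not values taken from this report.

```
# Write a 2 GiB image for real instead of only reserving its blocks
dd if=/dev/zero of=/var/lib/libvirt/images/disk.img bs=1M count=2048 conv=fsync

# A preallocated-but-unwritten file looks fully allocated to du, yet its
# extents still carry the "unwritten" flag, which is what lets ext4 answer
# reads with zeroes without ever touching the disk
du -h /var/lib/libvirt/images/disk.img
filefrag -v /var/lib/libvirt/images/disk.img | grep -c unwritten
```

Copying the image to a new file, as was done above, has the same effect, because the copy's data blocks are actually written out.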