Red Hat Bugzilla – Full Text Bug Listing
|Summary:||Completely different (and bogus) performance for two identical virtual disks|
|Product:||[Fedora] Fedora||Reporter:||Dennis Jacobfeuerborn <dennisml>|
|Component:||qemu||Assignee:||Fedora Virtualization Maintainers <virt-maint>|
|Status:||CLOSED CANTFIX||QA Contact:||Fedora Extras Quality Assurance <extras-qa>|
|Version:||15||CC:||amit.shah, berrange, crobinso, dougsland, dwmw2, ehabkost, itamar, jaswinder, jforbes, knoel, kwolf, rjones, scottt.tw, tburke, virt-maint|
|Fixed In Version:||Doc Type:||Bug Fix|
|Doc Text:||Story Points:||---|
|Last Closed:||2012-05-29 08:37:45 EDT||Type:||---|
|oVirt Team:||---||RHEL 7.3 requirements from Atomic Host:|
Description Dennis Jacobfeuerborn 2011-10-31 06:46:44 EDT
I've set up a CentOS 6 guest that has two virtual disks that show completely different (and bogus) behavior even though they are configured exactly the same.

Host configuration: CentOS 6 guest on a Fedora 15 host using KVM. Two identical 1TB SATA SAMSUNG HD103SJ host disks (/dev/sdb and /dev/sdc) are used for the volume images for the two guest disks. One separate 128GB SSD drive (/dev/sda) holds the Fedora system (i.e. the SATA disks are not used by the host system itself).

Guest configuration: /dev/vda and /dev/vdb are each 2GB in size and each placed on its own SATA host disk mentioned above, as raw, fully allocated image files. The guest disks are attached using virtio and cache=none.

Using hdparm and seekmark I get the following results:

seekmark:
/dev/vda: 130 seeks/s
/dev/vdb: 9615 seeks/s

hdparm -t:
/dev/vda: 95 MB/s
/dev/vdb: 1691 MB/s

/dev/vda looks ok, but the numbers for /dev/vdb make no sense at all. When I test /dev/vda I can hear the drive work and see the I/O on the host. When doing the same with /dev/vdb, both tests finish almost immediately and I don't see any I/O on the host side.

Testing the physical disks on the host shows the following numbers:

seekmark:
/dev/sdb: 74 seeks/s
/dev/sdc: 73 seeks/s

hdparm -t:
/dev/sdb: 145 MB/s
/dev/sdc: 138 MB/s

As you can see, the physical drives themselves behave pretty much identically, as expected. This bug is not about the actual absolute performance of the virtual disks but about the bogus behavior that I get even though the drives are identically configured.

Some mount settings for the backing disks:

[dennis@nexus ~]$ cat /proc/mounts|grep backup
/dev/sdb3 /mnt/backup01 ext4 rw,seclabel,relatime,user_xattr,acl,barrier=1,data=ordered 0 0
/dev/sdc3 /mnt/backup02 ext4 rw,seclabel,relatime,user_xattr,acl,barrier=1,data=ordered 0 0

Image files:

[dennis@nexus seekmark]$ ls -l /mnt/backup01/libvirt/images/gw1.img /mnt/backup02/libvirt/images/gw1-data.img
-rw-------. 1 root root 2097152000 Oct 26 14:07 /mnt/backup01/libvirt/images/gw1.img
-rw-------. 1 root root 2097152000 Oct 23 00:56 /mnt/backup02/libvirt/images/gw1-data.img

Disk definitions in the guest:

...
<disk type='file' device='disk'>
  <driver name='qemu' type='raw' cache='none'/>
  <source file='/mnt/backup01/libvirt/images/gw1.img'/>
  <target dev='vda' bus='virtio'/>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
</disk>
<disk type='file' device='disk'>
  <driver name='qemu' type='raw' cache='none'/>
  <source file='/mnt/backup02/libvirt/images/gw1-data.img'/>
  <target dev='vdb' bus='virtio'/>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
</disk>
...

I brought this up on the centos-virt mailing list but didn't get any feedback, which is why I'm opening this bug: I'm trying to benchmark various configuration settings but cannot do so reliably unless I can explain the behavior seen above.
Comment 1 Dennis Jacobfeuerborn 2011-10-31 06:50:50 EDT
I forgot to mention that I'm running the Fedora virt-preview repo with the following package versions:

qemu-kvm-0.15.0-4.fc15.x86_64
libvirt-0.9.6-1.fc15.x86_64
virt-manager-0.9.0-6.fc15.noarch
Comment 2 Kevin Wolf 2011-10-31 11:47:35 EDT
It would be good to know what the qemu command line looks like. Can you provide that or maybe attach the libvirt log that should contain it as well?
Comment 3 Richard W.M. Jones 2011-10-31 11:47:50 EDT
The trouble with the test is that you're going through the ext4 filesystem on the host. There could easily be fragmentation or alignment issues that affect the two files differently. I'd be interested to know:

- Are the host partitions aligned?
  # parted /dev/sdb unit b print
  # parted /dev/sdc unit b print
- Instead of using host files, use host LVs (or, if you like, properly aligned host partitions, but LVs are more flexible).
- Do the SATA disks have 512-byte or 4K sectors?
  http://libguestfs.org/virt-alignment-scan.1.html#linux_host_block_and_i_o_size
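For the sector-size question, the values can be read straight from sysfs or with blockdev. A minimal sketch, assuming the device name /dev/sdb from this report (substitute your own disk):

```shell
# Check whether a disk has 512-byte or 4K sectors. Device name
# /dev/sdb is taken from this report; adjust for your system.
cat /sys/block/sdb/queue/logical_block_size
cat /sys/block/sdb/queue/physical_block_size

# blockdev reports the same pair: --getss = logical sector size,
# --getpbsz = physical sector size (requires util-linux, usually root).
blockdev --getss --getpbsz /dev/sdb
```

A 512e/4Kn drive would show 512/4096 here; Dennis's drives later turn out to report 512/512.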
Comment 4 Dennis Jacobfeuerborn 2011-10-31 13:35:46 EDT
(In reply to comment #3)
> The trouble with the test is that you're going through the
> ext4 filesystem on the host. There could easily be
> fragmentation or alignment issues that affect the two files
> differently.

Fragmentation does not explain the gigantic difference between the results. The entire machine is just a few weeks old, the partitions the libvirt images reside on have seen almost no deletions, and they have 88% and 96% free space respectively, so fragmentation should be pretty much non-existent.

> I'd be interested to know:
>
> - are the host partitions aligned?
> # parted /dev/sdb unit b print
> # parted /dev/sdb unit b print

Since the drives have been partitioned identically, they are identically aligned or unaligned, so I would expect to see the same impact on performance, good or bad:

[dennis@nexus ~]$ sudo parted /dev/sdb unit b print
Model: ATA SAMSUNG HD103SJ (scsi)
Disk /dev/sdb: 1000204886016B
Sector size (logical/physical): 512B/512B
Partition Table: msdos

Number  Start          End             Size           Type     File system  Flags
 1      1048576B       419431448575B   419430400000B  primary  ntfs
 2      419431448576B  848928178175B   429496729600B  primary               raid
 3      848928178176B  1000204886015B  151276707840B  primary  ext4

[dennis@nexus ~]$ sudo parted /dev/sdc unit b print
Model: ATA SAMSUNG HD103SJ (scsi)
Disk /dev/sdc: 1000204886016B
Sector size (logical/physical): 512B/512B
Partition Table: msdos

Number  Start          End             Size           Type     File system  Flags
 1      1048576B       419431448575B   419430400000B  primary  ntfs
 2      419431448576B  848928178175B   429496729600B  primary               raid
 3      848928178176B  1000204886015B  151276707840B  primary  ext4

> - instead of using host files, use host LVs (or if you like,
> properly aligned host partitions, but LVs are more flexible)

I use LVs on my servers, but here I use image files because I don't care about performance (for now).

> - do the SATA disks have 512 byte or 4K sectors?

They have a 512-byte physical sector size, but again, since these are the same drives, I'm not sure why this should matter.
I'm aware that image file vs. LV, aligned vs. non-aligned, 4K vs. 512-byte sectors, etc. all have an impact on performance, and having administrated servers for a decade now I'm familiar with all kinds of I/O patterns that could influence these measurements, from the sneaky background RAID verify to the major database system that really pounds the disks, but I have no explanation for the *magnitude* of difference I'm seeing here.

It might be that due to some form of caching the numbers are inflated, or that because of the virtualization overhead the numbers are lower than on the host side, but in either case I would expect to get the same result on both disks. Likewise, if one of the physical drives was damaged in some weird way, that could be the culprit, but then I would expect to see the same difference in performance on the host side, yet there the drives behave identically, as they should.

Lastly, the fact that when running the tests on /dev/vda I can see I/O on the host side on /dev/sdb, but when running them on /dev/vdb I can *not* see any I/O whatsoever on the host side, makes me wonder what is going on:

seekmark -s 500 -f /dev/vda: (test takes about 5 seconds)

iostat on the host:
Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda               1.00         0.00        11.20          0         56
sdb             100.00       902.40         4.80       4512         24
sdc               0.00         0.00         0.00          0          0

seekmark -s 500 -f /dev/vdb: (test returns immediately)

iostat on the host:
Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda               1.40         0.00        17.60          0         88
sdb               0.00         0.00         0.00          0          0
sdc               0.00         0.00         0.00          0          0

Notice how in the first case I get 100 tps, which for 500 seeks taking 5 seconds to run is exactly what I would expect. On /dev/vdb the host doesn't see any traffic at all!
Comment 5 Dennis Jacobfeuerborn 2011-10-31 13:38:29 EDT
Created attachment 531005 [details] libvirt log excerpt This is the libvirt log from the last run of the guest. It really just contains the command line.
Comment 6 Richard W.M. Jones 2011-10-31 14:12:39 EDT
I'm only trying to help here. The way to understand the root cause of this problem is to gradually remove elements of complexity until one single change makes a difference. In this case I think it would be helpful to eliminate the host ext4 filesystem, and see if that makes any difference. Maybe it won't but you won't know until you've tried it. The partitions (/dev/sdb3, /dev/sdc3) are aligned, so I would just put /dev/vda and /dev/vdb directly on these partitions.
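Putting the guest disks directly on the partitions means switching the libvirt disk definitions from file-backed to block-backed. A sketch of what the changed element might look like, adapted from the XML in the description (untested; libvirt will auto-assign the PCI address if the <address> element is omitted):

```xml
<disk type='block' device='disk'>
  <driver name='qemu' type='raw' cache='none'/>
  <source dev='/dev/sdb3'/>
  <target dev='vda' bus='virtio'/>
</disk>
```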
Comment 7 Dennis Jacobfeuerborn 2011-10-31 15:30:41 EDT
After moving all data off the partitions, dd'ing the image files onto them, and reconfiguring the guest accordingly, I now get the expected result and see about 95 seeks/s on both disks. So I went ahead and reformatted the partitions and restored the files again. This time, even with the files, I got reasonable results.

Next I added a new disk /dev/vdc with an image file on host /dev/sdc3 (just like /dev/vdb). This disk showed the strange behavior again. Lastly, I copied the image file into a new file and then replaced the original image file with the copy, and then the disk showed the same correct behavior as the other two.

So either the file is sparse (even though I chose full allocation and "du -h" shows the file as using 2GB of space), or the filesystem optimizes the I/O away: it sees the client accessing a block that is allocated but not yet initialized (due to ext4 pre-allocation) and as a result can simply hand back a block filled with zeroes without actually having to access the disk.
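The two hypotheses here (sparse file vs. preallocated-but-unwritten blocks) can be told apart from userspace. A minimal sketch with hypothetical file names, using coreutils `truncate`, `dd`, and `stat`:

```shell
# Distinguish a sparse file from a fully written one: both report the
# same apparent size, but differ in allocated blocks.
set -e
dir=$(mktemp -d)
cd "$dir"

truncate -s 16M sparse.img                                  # sparse: no blocks allocated
dd if=/dev/zero of=written.img bs=1M count=16 status=none   # every block written

# %s = apparent size in bytes, %b = allocated 512-byte blocks.
stat -c '%n apparent=%s allocated_blocks=%b' sparse.img written.img
# sparse.img shows 0 allocated blocks; written.img shows ~32768.
```

A file preallocated with fallocate(1) would show allocated blocks like written.img, but on ext4 its extents stay flagged "unwritten" (visible with `filefrag -v`), so reads are satisfied with zeroes without touching the disk, which matches the missing host-side I/O observed above.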
Comment 8 Fedora Admin XMLRPC Client 2012-03-15 13:54:24 EDT
This package has changed ownership in the Fedora Package Database. Reassigning to the new owner of this component.
Comment 9 Cole Robinson 2012-05-28 20:37:15 EDT
Dennis, are you still seeing this issue with a more recent Fedora? F15 is end of life in a month. If so, please comment to that effect and we can escalate from there.
Comment 10 Dennis Jacobfeuerborn 2012-05-29 07:13:27 EDT
No. As I mentioned in comment 7, this turned out to be a side effect of how ext4 allocates blocks, so this isn't a real problem, just something to look out for when benchmarking anything on ext4: ensuring that you create non-sparse files is not enough to get reasonable results. You actually have to write to all blocks of the files before starting the benchmark.
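This takeaway can be turned into a small pre-flight step for disk benchmarks. A sketch (the path and 64 MiB size are examples only; in the bug's setup this would be the 2GB image under /mnt/backup02/libvirt/images):

```shell
# Fully initialize an image file before benchmarking it, so every
# block is both allocated and actually written. conv=fsync flushes the
# data to stable storage before dd exits.
img=$(mktemp)   # example target; substitute the real image path
dd if=/dev/zero of="$img" bs=1M count=64 conv=fsync status=none

# Sanity check: apparent size and on-disk usage now agree, so reads
# during the benchmark will hit real disk blocks instead of being
# short-circuited by the filesystem.
ls -l "$img"
du -h "$img"
```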
Comment 11 Cole Robinson 2012-05-29 08:37:45 EDT
Sorry Dennis, I missed that detail. Closing this as CANTFIX then.