+++ This bug was initially created as a clone of Bug #1243548 +++

Description of problem:

While running I/O workloads on a RHEL 7.1 VM on a RHEL 7.1 host using qemu-kvm-rhev, we realised that performance with aio=native is about 50% lower than with aio=threads. There are several requests to process, but AIO is processing them one at a time.

Systemtap logs to monitor:

virtio_queue_notify: ts=1436461565480052,vdev=140628771509224,n=0,vq=140628770978384
virtio_blk_handle_read: ts=1436461565480071,req=140628771310288,sector=0,nsectors=8
virtio_blk_handle_read: ts=1436461565480134,req=140628771703808,sector=8,nsectors=8
virtio_blk_handle_read: ts=1436461565480150,req=140628771788928,sector=16,nsectors=8
virtio_blk_handle_read: ts=1436461565480165,req=140628771838192,sector=24,nsectors=8
virtio_blk_handle_read: ts=1436461565480179,req=140628818049344,sector=32,nsectors=8
virtio_blk_handle_read: ts=1436461565480193,req=140628818098608,sector=40,nsectors=8
virtio_blk_handle_read: ts=1436461565480207,req=140628818147872,sector=48,nsectors=8
virtio_blk_handle_read: ts=1436461565480221,req=140628818197136,sector=56,nsectors=8
virtio_blk_handle_read: ts=1436461565480271,req=140628818246400,sector=64,nsectors=8

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Start a RHEL 7.1 VM on a RHEL 7.1 host with qemu-kvm-rhev.
2. Trigger an I/O workload on the VM that reads from a disk in chunks, for example a 4K sequential read.
3. Monitor systemtap traces for io_submit.

Actual results:
AIO processes only one request at a time.

Expected results:
AIO should batch I/O requests.

Additional info:

--- Additional comment from Pradeep Kumar Surisetty on 2015-11-11 23:20:11 EST ---

Stefan has provided an upstream fix for this.

Commit id: fc73548e444ae3239f6cef44a5200b5d2c3e85d1

The raw-posix block driver implements Linux AIO batching so multiple requests
can be submitted with a single io_submit(2) system call. Batching is currently
only used by virtio-scsi and virtio-blk-data-plane. Enable batching for
regular virtio-blk so the number of io_submit(2) system calls is reduced for
workloads with queue depth > 1.

--- Additional comment from Pradeep Kumar Surisetty on 2015-11-22 22:16:49 EST ---

But if we go with aio=threads, the user would have to take a huge performance impact, especially with xfs.

With files too, I see native performing better, especially on xfs.

Multi VM:
http://psuriset.github.io/pbench-graphs/multi_vm_xfs_ssd_native_vs_threads_raw_sync_iodepth_1_jobs_32.html

Single VM:
http://psuriset.github.io/pbench-graphs/single_vm_xfs_ssd_native_vs_threads_raw_sync_iodepth_1_jobs_32.html

Multi VM / qcow2:
http://psuriset.github.io/pbench-graphs/multi_vm_xfs_ssd_native_vs_threads_qcow2_sync_iodepth_1_jobs_32.html

Single VM / qcow2:
http://psuriset.github.io/pbench-graphs/single_vm_xfs_ssd_native_vs_threads_qcow2_sync_iodepth_1_jobs_32.html

--- Additional comment from Stefan Hajnoczi on 2015-11-26 00:15:28 EST ---

(In reply to Pradeep Kumar Surisetty from comment #7)
> But if we go with aio=threads, the user would have to take a huge
> performance impact, especially with xfs.
>
> With files too, I see native performing better, especially on xfs.
>
> Multi VM:
> http://psuriset.github.io/pbench-graphs/multi_vm_xfs_ssd_native_vs_threads_raw_sync_iodepth_1_jobs_32.html

ext4 behaves completely differently. Have you filed a bug against XFS?
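For context on what "batching" means at the syscall level in the upstream commit above: Linux AIO lets a caller prepare several iocbs and hand them all to the kernel in a single io_submit(2) call. The following is a minimal illustrative sketch using libaio; it is not QEMU code, and the device path (/dev/vdb), queue depth, and request size are arbitrary assumptions chosen to mirror the 4K sequential-read reproducer.

/* aio_batch.c -- minimal Linux AIO batching sketch (illustrative only, not QEMU code).
 * Assumed build command: gcc -o aio_batch aio_batch.c -laio
 * "/dev/vdb" is a hypothetical test device; adjust before running. */
#define _GNU_SOURCE
#include <fcntl.h>
#include <libaio.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define NR_REQS  8      /* queue depth > 1, where batching pays off */
#define REQ_SIZE 4096   /* 4K reads, as in the reproducer */

int main(void)
{
    io_context_t ctx;
    struct iocb iocbs[NR_REQS];
    struct iocb *ptrs[NR_REQS];
    struct io_event events[NR_REQS];
    void *bufs[NR_REQS];
    int fd, i, ret;

    fd = open("/dev/vdb", O_RDONLY | O_DIRECT);   /* hypothetical device */
    if (fd < 0) {
        perror("open");
        return 1;
    }

    memset(&ctx, 0, sizeof(ctx));
    ret = io_setup(NR_REQS, &ctx);
    if (ret < 0) {
        fprintf(stderr, "io_setup: %s\n", strerror(-ret));
        return 1;
    }

    /* Prepare NR_REQS sequential 4K reads... */
    for (i = 0; i < NR_REQS; i++) {
        ret = posix_memalign(&bufs[i], 4096, REQ_SIZE);
        if (ret != 0) {
            fprintf(stderr, "posix_memalign: %s\n", strerror(ret));
            return 1;
        }
        io_prep_pread(&iocbs[i], fd, bufs[i], REQ_SIZE, (long long)i * REQ_SIZE);
        ptrs[i] = &iocbs[i];
    }

    /* ...and submit them all with ONE io_submit(2) system call instead of eight. */
    ret = io_submit(ctx, NR_REQS, ptrs);
    if (ret != NR_REQS) {
        fprintf(stderr, "io_submit: %s\n", ret < 0 ? strerror(-ret) : "partial submit");
        return 1;
    }

    /* Wait for all completions. */
    ret = io_getevents(ctx, NR_REQS, NR_REQS, events, NULL);
    if (ret < 0) {
        fprintf(stderr, "io_getevents: %s\n", strerror(-ret));
        return 1;
    }

    io_destroy(ctx);
    close(fd);
    return 0;
}

The point of the sketch is simply that eight reads result in one io_submit(2) system call rather than eight, which is the behaviour the fix enables for regular virtio-blk with aio=native.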
After discussing with Yaniv K: we should just stop configuring this and trust libvirt/qemu to have sane defaults.
Karen, RHV is using this logic when creating the libvirt xml:

For block devices:

<driver cache="none" error_policy="stop" io="native" name="qemu" type="raw"/>

For files:

<driver cache="none" error_policy="stop" io="threads" name="qemu" type="raw"/>

What is the recommended configuration? Or should we let libvirt decide?
For block devices io="native" is consistently the best choice.

For files the results are mixed. QEMU and libvirt leave the choice up to the user. They do not automatically pick the best option (because it's not possible to know the answer in general).

io="native" tends to perform well on local files although the results are not always consistent. On remote file systems like NFS io="threads" has been the recommendation.
(In reply to Stefan Hajnoczi from comment #4)
> For block devices io="native" is consistently the best choice.
>
> For files the results are mixed. QEMU and libvirt leave the choice up to
> the user. They do not automatically pick the best option (because it's not
> possible to know the answer in general).

So what is the result of not specifying the io attribute?

<driver cache="none" error_policy="stop" name="qemu" type="raw"/>

Does it always use "native", or can the behavior change based on other conditions?

> io="native" tends to perform well on local files although the results are
> not always consistent.

We don't normally use local files, although we can optimize the local file case to use io="threads".

> On remote file systems like NFS io="threads" has
> been the recommendation.

This is the common case when using file based storage.
(In reply to Nir Soffer from comment #7)
> (In reply to Stefan Hajnoczi from comment #4)
> > For block devices io="native" is consistently the best choice.
> >
> > For files the results are mixed. QEMU and libvirt leave the choice up to
> > the user. They do not automatically pick the best option (because it's not
> > possible to know the answer in general).
>
> So what is the result of not specifying the io attribute?
>
> <driver cache="none" error_policy="stop" name="qemu" type="raw"/>
>
> Does it always use "native", or can the behavior change based on
> other conditions?

When <driver io=> is omitted QEMU always defaults to aio=threads.
Yaniv, according to the discussion here, it seems the premise of this BZ is wrong. Should we close it?