Description of problem:
I was using vdbench for testing at scale, but recently went back to basic FIO tests. I'm trying to work down through the layers to isolate as much as possible.
I'm now testing with a compute node that uses a single NVMe SSD plugged directly into a PCIe slot, to remove any RAID/array-controller variables.
The FIO tests I'm using are:
fio --name=randread --ioengine=libaio --iodepth=16 --rw=randread --bs=4k --direct=1 --size=512M --numjobs=3 --runtime=60 --group_reporting
fio --name=randwrite --ioengine=libaio --iodepth=16 --rw=randwrite --bs=4k --direct=1 --size=512M --numjobs=3 --runtime=60 --group_reporting
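As written these jobs hit files in the current working directory; when testing the bare device on the hypervisor, a variant that targets the block device explicitly may be cleaner (a sketch only; /dev/nvme0n1 is an assumed device path, adjust to your system):

# WARNING: pointing a write job at the raw device destroys any data on it.
fio --name=randread --ioengine=libaio --iodepth=16 --rw=randread --bs=4k \
    --direct=1 --size=512M --numjobs=3 --runtime=60 --group_reporting \
    --filename=/dev/nvme0n1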
When I run these on the underlying hypervisor, I get the following results:
WRITE: io=1536.0MB, aggrb=670445KB/s, minb=670445KB/s, maxb=670445KB/s, mint=2346msec, maxt=2346msec
READ: io=1536.0MB, aggrb=1326.5MB/s, minb=1326.5MB/s, maxb=1326.5MB/s, mint=1158msec, maxt=1158msec
But when I run the same tests in a QCOW-backed VM on the same NVMe device under OpenStack, this is what I get:
WRITE: io=1536.0MB, aggrb=111828KB/s, minb=111828KB/s, maxb=111828KB/s, mint=14065msec, maxt=14065msec
READ: io=1536.0MB, aggrb=41774KB/s, minb=41774KB/s, maxb=41774KB/s, mint=37651msec, maxt=37651msec
This seems like a huge penalty that I don't expect: roughly 6x slower on writes (~655 MB/s down to ~109 MB/s) and over 30x slower on reads (~1326 MB/s down to ~41 MB/s). I've found a few articles online suggesting there should be a penalty for using raw/qcow versus LVM, but it shouldn't be substantial.
I did find that OpenStack sets the disk "cache" mode to "none", which has some performance cost, but according to the documentation that's the only caching mode that supports live migration, which makes sense.
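For anyone reproducing this, a quick way to confirm the effective cache and io modes on a running instance is to inspect the libvirt domain XML (the domain name below is just an example):

# instance-00000001 is a placeholder; list real names with 'virsh list --all'.
virsh dumpxml instance-00000001 | grep -A2 '<driver'
# Expect something like:
#   <driver name='qemu' type='qcow2' cache='none' io='threads'/>

If I recall correctly, the cache mode can also be influenced per backend via the disk_cachemodes option in the [libvirt] section of nova.conf (e.g. disk_cachemodes = "file=none"), so that's worth checking too.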
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Launch an instance with a qcow-backed ephemeral disk on an NVMe-backed compute node.
2. Run the FIO tests above inside the guest.

Actual results:
Slow ephemeral disk performance with qcow-backed disks.

Expected results:
No substantial performance hit compared to LVM-backed disks.

Is it expected to have such a performance loss when using qcow-backed instance disks?
As an update, the customer configured a host to use raw instead of qcow today, launched a RHEL 7.2 guest instance, and was still seeing 25 MB/s for both read and write. Something is definitely not right here. The underlying disk is capable of 320 MB/s write and 700 MB/s read.
I found that if I change the io mode to "native" instead of "threads", I see an increase to 100 MB/s write and 300 MB/s read with preallocation of the space. There is still a hit if I don't preallocate, but writing to space that has previously been written is fine.
I did this based on Sébastien Han's article:
Although, the last warning in that article, about using io=native with sparse (versus non-sparse) files and causing corruption, scares me a bit. Could we get some clarification on that?
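For reference, a sketch of how the preallocation can be done ahead of time with qemu-img (the path and size are examples, and depending on the QEMU version only preallocation=metadata may be available for qcow2):

# 'falloc' reserves blocks via fallocate(); 'full' writes zeros (slower).
qemu-img create -f qcow2 -o preallocation=falloc /tmp/test-disk.qcow2 10G
qemu-img info /tmp/test-disk.qcow2   # 'disk size' should be close to 10G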
Just as an update from the customer:
All of these result in 25 MB/s for both read and write. As soon as I switch to LVM, the results go up to ~160 MB/s and ~400 MB/s. Something is definitely not right here.
I found that if I switch the io mode to "native" manually, I can get performance up to 100 MB/s and 300 MB/s. But Red Hat cautions against using io=native with sparse images, and OpenStack's default is io=threads, so I would have to hack the code per Sébastien's suggestion to accomplish this.
One of the Red Hat consultants we've worked with in the past (Jon Jozwiak) did a quick test in his Kilo environment and was seeing 100 MB/s with qcow and sparse. I just can't seem to find the problem.
Update from customer:
I believe I have resolved the problem on my own. It appears that our /dev/sdb1 partition was not properly aligned to the underlying device's block size.
I fixed this with the following parted command:
parted -a optimal /dev/sdb mkpart primary 0% 100%
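To double-check the alignment afterwards, something like this should work (partition number 1 assumed):

parted /dev/sdb align-check optimal 1
# Prints "1 aligned" when the partition starts on an optimal I/O boundary.
cat /sys/block/sdb/queue/optimal_io_size /sys/block/sdb/queue/minimum_io_size
# The partition's start (in bytes) should be a multiple of these values.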
Once I did that, our qcow performance jumped to 100 MB/s write and 450 MB/s read at 4k block sizes, which is much closer to what I would expect. Read was the same across the board regardless of LVM/raw/qcow; write was 100/130/180 MB/s for qcow/raw/LVM, respectively. Given this, we'll take the performance trade-off for the ability to migrate and thin-provision.
This bugzilla has been removed from the release and needs to be reviewed and triaged for another target release.
Per comment #5, the customer resolved the issue by realigning the partition with GNU parted.
So I'm setting the state of the bug to CLOSED NOTABUG.
Please feel free to re-open (with more data) if this reoccurs.