Bug 1803785 - Increase virtio-blk performance to pass SAP PBO test
Summary: Increase virtio-blk performance to pass SAP PBO test
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux Advanced Virtualization
Classification: Red Hat
Component: qemu-kvm
Version: ---
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: 8.0
Assignee: Stefan Hajnoczi
QA Contact: Tingting Mao
URL:
Whiteboard:
Depends On: 1806887
Blocks:
 
Reported: 2020-02-17 12:58 UTC by Nils Koenig
Modified: 2022-04-05 07:33 UTC
CC List: 13 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-08-17 07:27:09 UTC
Type: Bug
Target Upstream Version:
Embargoed:
nkoenig: needinfo-



Description Nils Koenig 2020-02-17 12:58:03 UTC
Using virtio-blk, e.g. with LUN passthrough, does not deliver the performance required to pass the SAP HANA certification test PBO (PB Offline, available on the SAP marketplace).

As a general rule of thumb, the overall IO degradation between the bare-metal and virtual workloads should be <10%.

There have been tests investigating the degradation:

https://docs.google.com/spreadsheets/d/17Fhlt4NtU-iQLC9jY38eO25qZXGqVG6MC5tY34imb7w/edit?ts=5dadaf6d#gid=1161109638

Stefan Hajnoczi has also investigated this issue and made some improvements,
but I am not sure what has been included in RHEL / RHV and with what ETA.

The test scenario is as follows:

1. Do a measurement with PBO on the bare metal system.

2. Set up a VM on the same machine according to the Best Practices Guide:
https://access.redhat.com/articles/4448131

3. Do a PBO measurement on the virtual machine.

4. The degradation between measurements 1) and 3) should be <10%.

As a first step, instead of running PBO, the raw IO performance can be compared (bare metal vs. virtual) at various block sizes (4k, 16k, 64k, 1M, 16M, 64M), with the degradation for both random and sequential IO staying <10%.
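For example, a pair of fio runs per block size could be used for this comparison (fio is not prescribed by the test plan; the device path and job parameters below are placeholders and would need to be adapted):

  # sequential read, 4k block size, direct IO against the raw device
  fio --name=seqread --filename=/dev/nvme0n1 --direct=1 --rw=read --bs=4k \
      --ioengine=libaio --iodepth=32 --numjobs=1 --runtime=60 --time_based \
      --group_reporting

  # random read, same parameters
  fio --name=randread --filename=/dev/nvme0n1 --direct=1 --rw=randread --bs=4k \
      --ioengine=libaio --iodepth=32 --numjobs=1 --runtime=60 --time_based \
      --group_reporting

Running the identical job on bare metal and inside the guest for each block size and comparing the reported bandwidth/IOPS gives the per-block-size degradation.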

Comment 2 Stefan Hajnoczi 2020-02-25 16:00:39 UTC
I have created bz1806887 to track QEMU code changes to optimize for high-IOPS storage devices.  This is ongoing work and will still take some time.

To give an overview of what is being done to improve performance:

1. Configuration changes have been identified that improve latency on local NVMe adapters.  These are now being verified by the Performance Team to confirm that they are widely applicable improvements.  Initially they can be configured manually by users (see the sketch after this list).  In the long run they should become the defaults so that VMs experience better performance by default.

2. Optimizations requiring code changes are being developed and benchmarked.  They have to go through the full cycle of upstreaming and backporting.
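To illustrate point 1, the manual configuration is roughly along these lines (a sketch only; the exact option values are assumptions, not verified recommendations): a dedicated IOThread with adaptive polling enabled, and the host block device opened with O_DIRECT and native AIO, e.g.:

  -object iothread,id=iothread0,poll-max-ns=32768 \
  -blockdev node-name=disk0,driver=host_device,filename=/dev/nvme0n1,cache.direct=on,aio=native \
  -device virtio-blk-pci,drive=disk0,iothread=iothread0

The poll-max-ns value and the device path are placeholders; the Performance Team verification mentioned above is what would establish the recommended settings.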

If you are curious about the findings so far, the main ones are:

1. Polling is critical to reducing overheads on local NVMe adapters.  Changes in QEMU and Linux are required.
2. QEMU's userspace NVMe driver delivers better performance thanks to bypassing the host kernel.  Work is needed to make it ready for production.
3. Enabling multiqueue virtio-blk improves latency by avoiding IPIs inside the guest.  This is because completion interrupts can be delivered on the vCPU that submitted the requests.
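A hedged sketch of how points 2 and 3 could map onto the QEMU command line (the PCI address, namespace, and queue count are placeholders, and the userspace NVMe driver requires the controller to be unbound from the host nvme driver and bound to vfio-pci first):

  -blockdev node-name=nvme0,driver=nvme,device=0000:01:00.0,namespace=1 \
  -device virtio-blk-pci,drive=nvme0,iothread=iothread0,num-queues=4

Here driver=nvme selects QEMU's userspace NVMe driver, which bypasses the host kernel block layer, and num-queues greater than 1 enables multiqueue virtio-blk so completions can be delivered on the submitting vCPU.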

Comment 6 RHEL Program Management 2021-08-17 07:27:09 UTC
After evaluating this issue, there are no plans to address it further or fix it in an upcoming release.  Therefore, it is being closed.  If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened.

Comment 8 Tingting Mao 2021-09-30 14:05:38 UTC
Agree to close it. Thanks.

