Bug 1416180

Summary: QEMU VFIO based block driver for NVMe devices
Product: Red Hat Enterprise Linux 7
Reporter: Ademar Reis <areis>
Component: qemu-kvm-rhev
Assignee: Fam Zheng <famz>
Status: CLOSED ERRATA
QA Contact: CongLi <coli>
Severity: high
Docs Contact:
Priority: high
Version: 7.3
CC: aliang, areis, chayang, dgibson, famz, juzhang, michen, mtessun, qzhang, virt-maint, xuwei, yama, yhong
Target Milestone: rc
Keywords: FutureFeature
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: qemu-kvm-rhev-2.12.0-4.el7
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1416182 1519004
Environment:
Last Closed: 2018-11-01 11:01:10 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1416182, 1519004, 1519005, 1590471, 1590472

Description Ademar Reis 2017-01-24 19:11:07 UTC
This BZ tracks the upstream work currently being done by Fam to introduce a VFIO based NVMe driver to QEMU:

https://lists.gnu.org/archive/html/qemu-devel/2016-12/msg02812.html

Date: Wed, 21 Dec 2016 00:31:35 +0800
From: Fam Zheng <famz>
Subject: [Qemu-devel] [PATCH 0/4] RFC: A VFIO based block driver for NVMe device

This series adds a new protocol driver that is intended to achieve about 20%
better performance for latency bound workloads (i.e. synchronous I/O) than
linux-aio when the guest is exclusively accessing an NVMe device, by talking to
the device directly instead of through kernel file system layers and its NVMe
driver.

This applies on top of Stefan's block-next tree which has the busy polling
patches - the new driver also supports it.

A git branch is also available as:

    https://github.com/famz/qemu nvme

See patch 4 for benchmark numbers.

Tests are done on QEMU's NVMe emulation and a real Intel P3700 SSD NVMe card.
Most of the dd/fio/mkfs/kernel build and OS installation tests work well, but a
weird write fault looking similar to [1] is consistently seen when installing
a RHEL 7.3 guest, which is still under investigation.

[1]: http://lists.infradead.org/pipermail/linux-nvme/2015-May/001840.html

Also, the ram notifier is not enough for hot plugged block device because in
that case the notifier is installed _after_ ram blocks are added so it won't
get the events.

Comment 1 Fam Zheng 2017-07-10 08:35:40 UTC
Upstream progress update:

https://lists.gnu.org/archive/html/qemu-devel/2017-07/msg00995.html

Comment 2 Fam Zheng 2018-01-17 06:19:48 UTC
Upstream patches:

https://lists.gnu.org/archive/html/qemu-devel/2018-01/msg03375.html

Comment 4 Yongxue Hong 2018-06-09 01:20:29 UTC
Reproduced on ppc64le:

[root@ibm-p8-garrison-04 ~]# uname -r
3.10.0-862.3.2.el7.ppc64le
[root@ibm-p8-garrison-04 ~]# /usr/libexec/qemu-kvm  -version
QEMU emulator version 2.12.0 (qemu-kvm-rhev-2.12.0-3.el7)
Copyright (c) 2003-2017 Fabrice Bellard and the QEMU Project developers

[root@ibm-p8-garrison-04 ~]# lspci | grep NVMe
0001:01:00.0 Non-Volatile memory controller: HGST, Inc. Ultrastar SN100 Series NVMe SSD (rev 05)

[root@ibm-p8-garrison-04 ~]# /usr/libexec/qemu-kvm -device vfio-pci,id=pf,host=0001:01:00.0  -drive file=nvme://0001:01:00.0/1
qemu-kvm: -drive file=nvme://0001:01:00.0/1: Driver 'nvme' is not whitelisted

Comment 9 Yongxue Hong 2018-06-13 04:15:34 UTC
Reproduced with the rpm from comment 8:

QEMU emulator version 2.12.0 (qemu-kvm-rhev-2.12.0-3.el7.dwg201806131316)
Copyright (c) 2003-2017 Fabrice Bellard and the QEMU Project developers

[root@ibm-p8-garrison-03 cmds]# lspci | grep NVMe
0001:01:00.0 Non-Volatile memory controller: HGST, Inc. Ultrastar SN100 Series NVMe SSD (rev 05)

[root@ibm-p8-garrison-03 cmds]# /usr/libexec/qemu-kvm -device vfio-pci,id=pf,host=0001:01:00.0  -drive file=nvme://0001:01:00.0/1
qemu-kvm: -drive file=nvme://0001:01:00.0/1: VFIO IOMMU check failed

The qemu boot command failed with the SN100 Series NVMe SSD device on ppc64le.

Comment 10 Fam Zheng 2018-06-13 06:20:50 UTC
The right syntax:

/usr/libexec/qemu-kvm -drive file=nvme://0001:01:00.0/1,if=none,id=drive0 -device virtio-blk,drive=drive0

(remove -device vfio-pci,... part).
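For scripted testing, the argument pair above can be generated rather than typed by hand. The following is only a minimal sketch: the helper name nvme_drive_args and the default drive id are hypothetical, and the nvme://<PCI address>/<namespace> form is taken solely from the command line in this comment.

```python
def nvme_drive_args(pci_address, namespace, drive_id="drive0"):
    """Build the -drive/-device argument pair shown in comment 10.

    The helper name and defaults are illustrative assumptions; only the
    option strings themselves come from the bug report.
    """
    # The nvme:// URL names the host PCI address and the NVMe namespace.
    url = f"nvme://{pci_address}/{namespace}"
    return [
        "-drive", f"file={url},if=none,id={drive_id}",
        "-device", f"virtio-blk,drive={drive_id}",
    ]


# Print the full command line without running it.
print(" ".join(["/usr/libexec/qemu-kvm"] + nvme_drive_args("0001:01:00.0", 1)))
```

Note that no -device vfio-pci argument is generated: the nvme:// block driver opens the device itself.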

Comment 11 Yongxue Hong 2018-06-13 06:44:51 UTC
(In reply to Fam Zheng from comment #10)
> The right syntax:
> 
> /usr/libexec/qemu-kvm -drive file=nvme://0001:01:00.0/1,if=none,id=drive0
> -device virtio-blk,drive=drive0
> 
> (remove -device vfio-pci,... part).

Hi Fam,

[root@ibm-p8-garrison-03 ~]# /usr/libexec/qemu-kvm -drive file=nvme://0001:01:00.0/1,if=none,id=drive0 -device virtio-blk,drive=drive0
qemu-kvm: -drive file=nvme://0001:01:00.0/1,if=none,id=drive0: VFIO IOMMU check failed

The command still fails.

Thanks.

Comment 12 Fam Zheng 2018-06-13 06:57:24 UTC
Please check whether IOMMU is enabled on your system.
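One rough way to do that check from the host is to look at sysfs. The sketch below is an assumption-laden illustration, not what QEMU's own check does: it only tests whether /sys/kernel/iommu_groups has entries and whether the PCI device has an iommu_group link. The function name iommu_status and the sysfs_root parameter are hypothetical, and "VFIO IOMMU check failed" may have other causes that this does not detect.

```python
import os


def iommu_status(pci_address, sysfs_root="/sys"):
    """Report whether IOMMU groups exist and which group the device is in.

    sysfs_root is a hypothetical parameter so the function can be pointed
    at a test directory; on a real host it stays at /sys.
    """
    groups_dir = os.path.join(sysfs_root, "kernel", "iommu_groups")
    try:
        groups = os.listdir(groups_dir)
    except OSError:
        # Directory missing or unreadable: treat as no IOMMU groups.
        groups = []

    # A device behind an active IOMMU has an iommu_group symlink.
    group_link = os.path.join(
        sysfs_root, "bus", "pci", "devices", pci_address, "iommu_group"
    )
    device_group = None
    if os.path.exists(group_link):
        device_group = os.path.basename(os.path.realpath(group_link))

    return {"iommu_groups_present": bool(groups), "device_group": device_group}


print(iommu_status("0001:01:00.0"))
```

If iommu_groups_present is False or device_group is None, the host likely has no usable IOMMU group for the device, which would be consistent with the failure reported above.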

Comment 13 Miroslav Rezanina 2018-06-19 18:32:50 UTC
Fix included in qemu-kvm-rhev-2.12.0-4.el7

Comment 14 Yongxue Hong 2018-06-20 02:58:41 UTC
(In reply to Miroslav Rezanina from comment #13)
> Fix included in qemu-kvm-rhev-2.12.0-4.el7

(workspace) [root@ibm-p8-garrison-03 vmt]# /usr/libexec/qemu-kvm --version
QEMU emulator version 2.12.0 (qemu-kvm-rhev-2.12.0-4.el7)
Copyright (c) 2003-2017 Fabrice Bellard and the QEMU Project developers
(workspace) [root@ibm-p8-garrison-03 vmt]# /usr/libexec/qemu-kvm -drive file=nvme://0001:01:00.0/1,if=none,id=drive0 -device virtio-blk,drive=drive0
qemu-kvm: -drive file=nvme://0001:01:00.0/1,if=none,id=drive0: VFIO IOMMU check failed

It still fails with the latest qemu.

Comment 16 CongLi 2018-08-30 08:48:47 UTC
Tested nvme:// with basic block testing and block migration (drive mirror + nbd) testing.

NVMe SSD: Intel P3700

Known issue:
Bug 1587992 - nvme:// image creation support

Set this bug to 'VERIFIED'.

Comment 18 errata-xmlrpc 2018-11-01 11:01:10 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:3443