Bug 1013157 - backport block-layer dataplane implementation
Summary: backport block-layer dataplane implementation
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: qemu-kvm-rhev
Version: 7.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assignee: Stefan Hajnoczi
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Duplicates: 1101572 (view as bug list)
Depends On:
Blocks: 824644 1101569 824648 824649 824650 1001564 1029596 1030582 1086307 1101574 1101577 1252481
 
Reported: 2013-09-27 23:09 UTC by Ademar Reis
Modified: 2015-08-11 14:15 UTC
CC List: 8 users

Fixed In Version: qemu-2.1
Doc Type: Enhancement
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-03-05 09:42:39 UTC
Target Upstream Version:
Embargoed:




Links:
Red Hat Product Errata RHSA-2015:0624 (normal, SHIPPED_LIVE): Important: qemu-kvm-rhev security, bug fix, and enhancement update (last updated 2015-03-05 14:37:36 UTC)

Description Ademar Reis 2013-09-27 23:09:00 UTC
Stefan is working on the proper data-plane implementation that covers all of the block layer, and this BZ is for the backport of that work. The current implementation duplicates block-layer code and is very limited, serving only as a tech-preview experiment.

Below are some data-plane features which are missing in the current version of qemu-kvm.

 * Image formats (qcow2)
   Dataplane currently only supports raw image files.  It should be
   possible to use image formats like qcow2.

 * Protocols (iSCSI, GlusterFS, Ceph, NBD)
   Dataplane currently only supports local files using Linux AIO.
   Network protocols like NBD, iSCSI, GlusterFS, and Ceph should be
   supported.

 * I/O throttling
   I/O throttling requires the new AioContext timer support which is
   currently being merged upstream.

 * Hot unplug
   Due to the way bdrv_in_use() is currently used to prevent hot unplug,
   the virtio-blk-pci adapter cannot be hot unplugged.  The hot unplug
   command will fail with EBUSY.

 * iothreads as objects
   The user should be able to define the number of iothreads and bind
   devices to specific threads.  iothreads should be discoverable using
   a QMP query-iothreads command, which includes thread IDs suitable for
   CPU affinity setting.

 * Runtime NBD exports
   The runtime NBD server should not interfere with dataplane.  This
   will allow image fleecing and other NBD export users to coexist with
   dataplane.

 * Block jobs
   Block jobs should not interfere with dataplane.  Currently block jobs
   cannot be started (EBUSY).

This BZ will be closed once the first patch series implementing the bulk of the work is backported and the new implementation is ready for wider testing. If necessary, extra BZs will be opened to track any functionality still missing at that point.
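
For reference, the existing tech-preview code enables dataplane per device via the x-data-plane=on property, whereas the new implementation binds a device to an iothread object. A minimal sketch of the two invocations (image path, IDs, and addresses are placeholders):

old (tech-preview):
e.g:...-drive file=disk.img,if=none,id=drive0,format=raw,cache=none,aio=native -device virtio-blk-pci,drive=drive0,x-data-plane=on

new (block-layer dataplane):
e.g:...-object iothread,id=iothread0 -drive file=disk.img,if=none,id=drive0,format=raw,cache=none,aio=native -device virtio-blk-pci,drive=drive0,iothread=iothread0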

Comment 1 Ademar Reis 2013-12-06 23:00:55 UTC
*** Bug 824647 has been marked as a duplicate of this bug. ***

Comment 2 Stefan Hajnoczi 2014-07-28 09:12:01 UTC
We will get this for free when RHEV switches to QEMU 2.1.0.

Comment 4 Sibiao Luo 2014-09-01 05:34:49 UTC
(In reply to Ademar Reis from comment #0)
> Stefan is working on the proper data-plane implementation that covers all of
> the block layer and this BZ is for the backport of such work. The current
> implementation duplicates blocklayer code and is very limited, serving only
> as a tech-preview experiment.

Verified this issue for the supported items quoted below.
host info:
# uname -r && rpm -q qemu-kvm-rhev && rpm -q seabios
3.10.0-145.el7.x86_64
qemu-kvm-rhev-2.1.0-2.el7.x86_64
seabios-1.7.5-4.el7.x86_64
guest info:
# uname -r
3.10.0-145.el7.x86_64

> Below are some data-plane features which are missing in the current version
> of qemu-kvm.
> 
>  * Image formats (qcow2)
>    Dataplane currently only supports raw image files.  It should be
>    possible to use image formats like qcow2.
> 
1.create a qcow2 data disk image.
# qemu-img create -f qcow2 /home/my-data-disk.qcow2 10G
Formatting '/home/my-data-disk.qcow2', fmt=qcow2 size=10737418240 encryption=off cluster_size=65536 lazy_refcounts=off
2.launch a KVM guest with the qcow2 data disk.
e.g:...-drive file=/home/my-data-disk.qcow2,if=none,id=drive-data-disk,format=qcow2,cache=none,aio=native,werror=stop,rerror=stop -device virtio-blk-pci,drive=drive-data-disk,id=data-disk,config-wce=off,x-data-plane=on,bus=pci.0,addr=0x7
3.verify the disk can be detected in the guest and works well with format/dd.
# fdisk -l                                      <------------detect it in guest
# mkfs.ext4 /dev/vda                            <------------successfully
# dd if=/dev/zero of=/dev/vda bs=1M count=1000  <------------successfully

>  * Protocols (iSCSI, GlusterFS, Ceph, NBD)
>    Dataplane currently only supports local files using Linux AIO.
>    Network protocols like NBD, iSCSI, GlusterFS, and Ceph should be
>    supported.
> 
- iSCSI:
e.g.:...-drive file=/dev/disk/by-path/ip-10.66.33.253:3260-iscsi-iqn.2014.sluo.com:iscsi.storage.1-lun-1,if=none,id=drive-data-disk,format=raw,cache=none,aio=native,werror=stop,rerror=stop -device virtio-blk-pci,drive=drive-data-disk,id=data-disk,config-wce=off,x-data-plane=on,bus=pci.0,addr=0x7

Verify the disk can be detected in the guest and works well with format/dd.
# fdisk -l                                     <------------detect it in guest
# mkfs.ext4 /dev/vda                           <------------successfully
# dd if=/dev/zero of=/dev/vda bs=1M count=1000 <------------successfully

- Libiscsi:
e.g:...-drive file=iscsi://10.66.33.253:3260/iqn.2014.sluo.com:iscsi.storage.1/1,if=none,id=drive-data-disk,format=raw,cache=none,aio=native,werror=stop,rerror=stop -device virtio-blk-pci,drive=drive-data-disk,id=data-disk,config-wce=off,x-data-plane=on,bus=pci.0,addr=0x7

Verify the disk can be detected in the guest and works well with format/dd.
# fdisk -l                                     <------------detect it in guest
# mkfs.ext4 /dev/vda                           <------------successfully
# dd if=/dev/zero of=/dev/vda bs=1M count=1000 <------------successfully

- GlusterFS:
# qemu-img create -f qcow2 gluster://10.66.106.22/sluo_volume/my-data-disk.qcow2 10G
Formatting 'gluster://10.66.106.22/sluo_volume/my-data-disk.qcow2', fmt=qcow2 size=10737418240 encryption=off cluster_size=65536 

# qemu-img info gluster://10.66.106.22/sluo_volume/my-data-disk.qcow2
image: gluster://10.66.106.22/sluo_volume/my-data-disk.qcow2
file format: qcow2
virtual size: 10G (10737418240 bytes)
disk size: 140K
cluster_size: 65536
Format specific information:
    compat: 0.10

e.g:...-drive file=gluster://10.66.106.22/sluo_volume/my-data-disk.qcow2,if=none,id=drive-data-disk,format=qcow2,cache=none,aio=native,werror=stop,rerror=stop -device virtio-blk-pci,drive=drive-data-disk,id=data-disk,config-wce=off,x-data-plane=on,bus=pci.0,addr=0x7

Verify the disk can be detected in the guest and works well with format/dd.
# fdisk -l                                     <------------detect it in guest
# mkfs.ext4 /dev/vda                           <------------successfully
# dd if=/dev/zero of=/dev/vda bs=1M count=1000 <------------successfully

- NBD:
# nbd-server 12345 /home/my-data-disk.qcow2

** (process:28046): WARNING **: Specifying an export on the command line is deprecated.

** (process:28046): WARNING **: Please use a configuration file instead.

e.g:...-drive file=nbd:10.66.11.154:12345,if=none,id=drive-data-disk,format=qcow2,cache=none,aio=native,werror=stop,rerror=stop -device virtio-blk-pci,drive=drive-data-disk,id=data-disk,config-wce=off,x-data-plane=on,bus=pci.0,addr=0x7

Verify the disk can be detected in the guest and works well with format/dd.
# fdisk -l                                     <------------detect it in guest
# mkfs.ext4 /dev/vda                           <------------successfully
# dd if=/dev/zero of=/dev/vda bs=1M count=1000 <------------successfully

- Ceph:

I have not used Ceph before, so I am leaving it aside for now; I will investigate and test it later.
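
For completeness, a Ceph-backed disk would go through QEMU's rbd protocol driver; an untested sketch (pool name and image name are placeholders, rest of the command line mirrors the iSCSI case above):

e.g:...-drive file=rbd:rbd/my-data-disk.img,if=none,id=drive-data-disk,format=raw,cache=none,werror=stop,rerror=stop -device virtio-blk-pci,drive=drive-data-disk,id=data-disk,config-wce=off,x-data-plane=on,bus=pci.0,addr=0x7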

>  * I/O throttling
>    I/O throttling requires the new AioContext timer support which is
>    currently being merged upstream.
> 

>  * Hot unplug
>    Due to the way bdrv_in_use() is currently used to prevent hot unplug,
>    the virtio-blk-pci adapter cannot be hot unplugged.  The hot unplug
>    command will fail with EBUSY.
> 
e.g:...-drive file=/dev/disk/by-path/ip-10.66.33.253:3260-iscsi-iqn.2014.sluo.com:iscsi.storage.1-lun-1,if=none,id=drive-data-disk,format=raw,cache=none,aio=native,werror=stop,rerror=stop -device virtio-blk-pci,drive=drive-data-disk,id=data-disk,config-wce=off,x-data-plane=on,bus=pci.0,addr=0x7
(qemu) device_del data-disk
(qemu) info block    <-------hot unplug succeeded.
guest ]# fdisk -l    <-------disk has disappeared from the guest.
guest ]# dmesg       <-------no errors seen.

>  * iothreads as objects
>    The user should be able to define the number of iothreads and bind
>    devices to specific threads.  iothreads should be discoverable using
>    a QMP query-iothreads command, which includes thread IDs suitable for
>    CPU affinity setting.
> 
e.g:...-object iothread,id=iothread0 -drive file=/home/my-data-disk.qcow2,if=none,id=drive-data-disk,format=qcow2,cache=none,aio=native,werror=stop,rerror=stop -device virtio-blk-pci,drive=drive-data-disk,id=data-disk,iothread=iothread0,bus=pci.0,addr=0x7

iothreads are discoverable via QMP:
{"execute":"query-iothreads"}
{"return": [{"thread-id": 6072, "id": "iothread0"}]}

>  * Runtime NBD exports
>    The runtime NBD server should not interfere with dataplane.  This
>    will allow image fleecing and other NBD export users to coexist with
>    dataplane.
> 
1).launch a KVM guest with data-plane.
e.g:...-object iothread,id=iothread0 -drive file=/home/my-data-disk.qcow2,if=none,id=drive-data-disk0,format=qcow2,cache=none,aio=native,werror=stop,rerror=stop -device virtio-blk-pci,drive=drive-data-disk0,id=data-disk0,iothread=iothread0,bus=pci.0,addr=0x7
2).start the NBD server (nbd_server_start [-a] [-w] host:port -- serve block devices on the given host and port) and export a block device via NBD.
{"execute":"qmp_capabilities"}
{"return": {}}
{ "execute": "nbd-server-start", "arguments": { "addr": { "type": "inet", "data": { "host": "10.66.11.154", "port": "1234" } } } }
{"return": {}}
{ "execute": "nbd-server-add", "arguments": { "device": "drive-data-disk0", "writable": true } }
{"return": {}}
3).read/write to the nbd target.
# qemu-img info nbd://10.66.11.154:1234/drive-data-disk0
image: 
file format: raw
virtual size: 10G (10737418240 bytes)
disk size: unavailable
# qemu-io -c 'read 512 1024' nbd://10.66.11.154:1234/drive-data-disk0
read 1024/1024 bytes at offset 512
1 KiB, 1 ops; 0.0003 sec (2.713 MiB/sec and 2777.7778 ops/sec)
# qemu-io -c 'write 512 1024' nbd://10.66.11.154:1234/drive-data-disk0
wrote 1024/1024 bytes at offset 512
1 KiB, 1 ops; 0.0005 sec (1.922 MiB/sec and 1968.5039 ops/sec)
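
When done, the runtime export can be torn down again without stopping the guest; a sketch using the standard QMP command:

{ "execute": "nbd-server-stop" }
{"return": {}}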

>  * Block jobs
>    Block jobs should not interfere with dataplane.  Currently block jobs
>    cannot be started (EBUSY).
> 
e.g.:...-object iothread,id=iothread0 -drive file=/home/my-data-disk.qcow2,if=none,id=drive-data-disk,format=qcow2,cache=none,aio=native,werror=stop,rerror=stop -device virtio-blk-pci,drive=drive-data-disk,id=data-disk,iothread=iothread0,bus=pci.0,addr=0x7

- do drive-mirror:
{ "execute": "drive-mirror", "arguments": { "device": "drive-data-disk", "target": "/root/sn1", "format": "qcow2", "mode": "absolute-paths", "sync": "full", "speed": 1000000000, "on-source-error": "stop", "on-target-error": "stop" } }
{"error": {"class": "GenericError", "desc": "Device 'drive-data-disk' is busy: block device is in use by data plane"}}

- do live snapshot (blockdev-snapshot-sync):
{ "execute": "blockdev-snapshot-sync", "arguments": { "device": "drive-data-disk","snapshot-file": "/home/snap", "format": "qcow2" } }
{"error": {"class": "GenericError", "desc": "Device 'drive-data-disk' is busy: block device is in use by data plane"}}

> This BZ will be closed once the first patch-series implementing the bulk of
> the work is backported and the new implementation is ready for wider
> testing. If necessary, extra BZs will be open to track missing
> functionalities by then.

Best Regards,
sluo

Comment 5 Sibiao Luo 2014-09-01 06:20:01 UTC
(In reply to Sibiao Luo from comment #4)
> (In reply to Ademar Reis from comment #0)
> > Stefan is working on the proper data-plane implementation that covers all of
> > the block layer and this BZ is for the backport of such work. The current
> > implementation duplicates blocklayer code and is very limited, serving only
> > as a tech-preview experiment.
> 
> Verify this issue with the following supported parts.
> host info:
> # uname -r && rpm -q qemu-kvm-rhev && rpm -q seabios
> 3.10.0-145.el7.x86_64
> qemu-kvm-rhev-2.1.0-2.el7.x86_64
> seabios-1.7.5-4.el7.x86_64
> guest info:
> # uname -r
> 3.10.0-145.el7.x86_64
> 
> 
> >  * I/O throttling
> >    I/O throttling requires the new AioContext timer support which is
> >    currently being merged upstream.
> > 
> 
1).launch a KVM guest with data-plane and I/O throttling set on the drive.
e.g:...-object iothread,id=iothread0 -drive file=/home/my-data-disk.qcow2,if=none,id=drive-data-disk0,format=qcow2,cache=none,aio=native,werror=stop,rerror=stop,bps=1024000,bps_rd=0,bps_wr=0,iops=1024000,iops_rd=0,iops_wr=0 -device virtio-blk-pci,drive=drive-data-disk0,id=data-disk0,iothread=iothread0,bus=pci.0,addr=0x7
(qemu) info block
drive-system-disk: /home/RHEL-7.0-20140507.0-Server-x86_64.qcow2 (qcow2)

drive-data-disk0: /home/my-data-disk.qcow2 (qcow2)
    I/O throttling:   bps=1024000 bps_rd=0 bps_wr=0 bps_max=102400 bps_rd_max=0 bps_wr_max=0 iops=1024000 iops_rd=0 iops_wr=0 iops_max=102400 iops_rd_max=0 iops_wr_max=0 iops_size=0
...
2).query the block info via HMP/QMP monitor.
{"execute":"query-block"}
{"return": [...{"io-status": "ok", "device": "drive-data-disk0", "locked": false, "removable": false, "inserted": {"iops_rd": 0, "detect_zeroes": "off", "image": {"virtual-size": 10737418240, "filename": "/home/my-data-disk.qcow2", "cluster-size": 65536, "format": "qcow2", "actual-size": 140320768, "format-specific": {"type": "qcow2", "data": {"compat": "1.1", "lazy-refcounts": false}}, "dirty-flag": false}, "iops_wr": 0, "ro": false, "backing_file_depth": 0, "drv": "qcow2", "bps_max": 102400, "iops": 1024000, "bps_wr": 0, "encrypted": false, "bps": 1024000, "bps_rd": 0, "iops_max": 102400, "file": "/home/my-data-disk.qcow2", "encryption_key_missing": false}, "type": "unknown"}, {"io-status": "ok", "device": "ide1-cd0", "locked": false, "removable": true, "tray_open": false, "type": "unknown"}, {"device": "floppy0", "locked": false, "removable": true, "tray_open": false, "type": "unknown"}, {"device": "sd0", "locked": false, "removable": true, "tray_open": false, "type": "unknown"}]}
3).run fio against the disk in the guest.
# fio --filename=/dev/vda --direct=1 --rw=randrw --bs=100K --size=10M --name=test --iodepth=100 --ioengine=libaio
test: (g=0): rw=randrw, bs=100K-100K/100K-100K/100K-100K, ioengine=libaio, iodepth=100
fio-2.1.10
Starting 1 process
Jobs: 1 (f=1): [m] [3.2% done] [0KB/0KB/0KB /s] [0/0/0 iops] [eta 05m:31s]
test: (groupid=0, jobs=1): err= 0: pid=4230: Mon Sep  1 02:17:19 2014
  read : io=4900.0KB, bw=501659B/s, iops=4, runt= 10002msec
    slat (usec): min=6, max=34, avg=10.53, stdev= 4.11
    clat (msec): min=1, max=10001, avg=9390.61, stdev=2413.35
     lat (msec): min=1, max=10001, avg=9390.62, stdev=2413.35
    clat percentiles (usec):
     |  1.00th=[ 1352],  5.00th=[99840], 10.00th=[10027008], 20.00th=[10027008],
     | 30.00th=[10027008], 40.00th=[10027008], 50.00th=[10027008], 60.00th=[10027008],
     | 70.00th=[10027008], 80.00th=[10027008], 90.00th=[10027008], 95.00th=[10027008],
     | 99.00th=[10027008], 99.50th=[10027008], 99.90th=[10027008], 99.95th=[10027008],
     | 99.99th=[10027008]
    bw (KB  /s): min=   39, max=   39, per=7.98%, avg=39.00, stdev= 0.00
  write: io=5300.0KB, bw=542611B/s, iops=5, runt= 10002msec
    slat (usec): min=8, max=22, avg=15.28, stdev= 3.32
    clat (usec): min=10000K, max=10001K, avg=10000757.38, stdev=421.06
     lat (usec): min=10000K, max=10001K, avg=10000772.89, stdev=418.90
    clat percentiles (msec):
     |  1.00th=[10028],  5.00th=[10028], 10.00th=[10028], 20.00th=[10028],
     | 30.00th=[10028], 40.00th=[10028], 50.00th=[10028], 60.00th=[10028],
     | 70.00th=[10028], 80.00th=[10028], 90.00th=[10028], 95.00th=[10028],
     | 99.00th=[10028], 99.50th=[10028], 99.90th=[10028], 99.95th=[10028],
     | 99.99th=[10028]
    lat (msec) : 2=0.98%, 20=0.98%, 250=0.98%, >=2000=97.06%
  cpu          : usr=0.00%, sys=0.02%, ctx=93, majf=0, minf=29
  IO depths    : 1=1.0%, 2=2.0%, 4=3.9%, 8=7.8%, 16=15.7%, 32=31.4%, >=64=38.2%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=75.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=25.0%
     issued    : total=r=49/w=53/d=0, short=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=100

Run status group 0 (all jobs):
   READ: io=4900KB, aggrb=489KB/s, minb=489KB/s, maxb=489KB/s, mint=10002msec, maxt=10002msec
  WRITE: io=5300KB, aggrb=529KB/s, minb=529KB/s, maxb=529KB/s, mint=10002msec, maxt=10002msec

Disk stats (read/write):
  vda: ios=0/0, merge=0/0, ticks=0/0, in_queue=0, util=0.00%
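
Note the aggregate throughput (READ 489KB/s + WRITE 529KB/s, about 1018KB/s) is roughly the configured bps limit of 1024000 bytes/s (1000 KiB/s), so the throttle is being enforced under dataplane. The limits can also be adjusted at runtime via QMP; a sketch (the doubled bps value is a placeholder):

{ "execute": "block_set_io_throttle", "arguments": { "device": "drive-data-disk0", "bps": 2048000, "bps_rd": 0, "bps_wr": 0, "iops": 0, "iops_rd": 0, "iops_wr": 0 } }
{"return": {}}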

Best Regards,
sluo

Comment 9 Stefan Hajnoczi 2014-11-18 15:54:36 UTC
*** Bug 1101572 has been marked as a duplicate of this bug. ***

Comment 10 Stefan Hajnoczi 2014-11-18 15:55:28 UTC
Current dataplane status:

Supported:
 * Image formats
 * I/O throttling
 * Block jobs
 * GlusterFS, RBD, iSCSI, NBD

Unsupported:
 * External and internal snapshots
 * QMP 'transaction' command
 * Eject
 * qcow2 encryption

Comment 12 errata-xmlrpc 2015-03-05 09:42:39 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2015-0624.html

