Bug 1013157
Summary: | backport block-layer dataplane implementation | |
---|---|---|---
Product: | Red Hat Enterprise Linux 7 | Reporter: | Ademar Reis <areis>
Component: | qemu-kvm-rhev | Assignee: | Stefan Hajnoczi <stefanha>
Status: | CLOSED ERRATA | QA Contact: | Virtualization Bugs <virt-bugs>
Severity: | unspecified | Docs Contact: |
Priority: | unspecified | |
Version: | 7.0 | CC: | hhuang, jenifer.golmitz, juzhang, knoel, michen, rbalakri, sluo, virt-maint
Target Milestone: | rc | Keywords: | FutureFeature
Target Release: | --- | |
Hardware: | Unspecified | |
OS: | Unspecified | |
Whiteboard: | | |
Fixed In Version: | qemu-2.1 | Doc Type: | Enhancement
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2015-03-05 09:42:39 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Bug Depends On: | | |
Bug Blocks: | 824644, 824648, 824649, 824650, 1001564, 1029596, 1030582, 1086307, 1101569, 1101574, 1101577, 1252481 | |
Description
Ademar Reis
2013-09-27 23:09:00 UTC
*** Bug 824647 has been marked as a duplicate of this bug. ***

We will get this for free when RHEV switches to QEMU 2.1.0.

(In reply to Ademar Reis from comment #0)
> Stefan is working on the proper data-plane implementation that covers all
> of the block layer and this BZ is for the backport of such work. The
> current implementation duplicates block-layer code and is very limited,
> serving only as a tech-preview experiment.

Verified this issue with the following packages.

host info:
# uname -r && rpm -q qemu-kvm-rhev && rpm -q seabios
3.10.0-145.el7.x86_64
qemu-kvm-rhev-2.1.0-2.el7.x86_64
seabios-1.7.5-4.el7.x86_64

guest info:
# uname -r
3.10.0-145.el7.x86_64

> Below are some data-plane features which are missing in the current
> version of qemu-kvm.
>
> * Image formats (qcow2)
>   Dataplane currently only supports raw image files. It should be
>   possible to use image formats like qcow2.

1). Create a qcow2 data disk image.
# qemu-img create -f qcow2 /home/my-data-disk.qcow2 10G
Formatting '/home/my-data-disk.qcow2', fmt=qcow2 size=10737418240 encryption=off cluster_size=65536 lazy_refcounts=off

2). Attach the qcow2 data disk to a KVM guest.
e.g.: ...-drive file=/home/my-data-disk.qcow2,if=none,id=drive-data-disk,format=qcow2,cache=none,aio=native,werror=stop,rerror=stop -device virtio-blk-pci,drive=drive-data-disk,id=data-disk,config-wce=off,x-data-plane=on,bus=pci.0,addr=0x7

3). Verify the disk is detected in the guest and works with mkfs/dd.
# fdisk -l    <------------ detected in guest
# mkfs.ext4 /dev/vda    <------------ successful
# dd if=/dev/zero of=/dev/vda bs=1M count=1000    <------------ successful

> * Protocols (iSCSI, GlusterFS, Ceph, NBD)
>   Dataplane currently only supports local files using Linux AIO.
>   Network protocols like NBD, iSCSI, GlusterFS, and Ceph should be
>   supported.

- iSCSI:
e.g.: ...-drive file=/dev/disk/by-path/ip-10.66.33.253:3260-iscsi-iqn.2014.sluo.com:iscsi.storage.1-lun-1,if=none,id=drive-data-disk,format=raw,cache=none,aio=native,werror=stop,rerror=stop -device virtio-blk-pci,drive=drive-data-disk,id=data-disk,config-wce=off,x-data-plane=on,bus=pci.0,addr=0x7

Verify the disk is detected in the guest and works with mkfs/dd.
# fdisk -l    <------------ detected in guest
# mkfs.ext4 /dev/vda    <------------ successful
# dd if=/dev/zero of=/dev/vda bs=1M count=1000    <------------ successful

- Libiscsi:
e.g.: ...-drive file=iscsi://10.66.33.253:3260/iqn.2014.sluo.com:iscsi.storage.1/1,if=none,id=drive-data-disk,format=raw,cache=none,aio=native,werror=stop,rerror=stop -device virtio-blk-pci,drive=drive-data-disk,id=data-disk,config-wce=off,x-data-plane=on,bus=pci.0,addr=0x7

Verify the disk is detected in the guest and works with mkfs/dd.
# fdisk -l    <------------ detected in guest
# mkfs.ext4 /dev/vda    <------------ successful
# dd if=/dev/zero of=/dev/vda bs=1M count=1000    <------------ successful

- GlusterFS:
# qemu-img create -f qcow2 gluster://10.66.106.22/sluo_volume/my-data-disk.qcow2 10G
Formatting 'gluster://10.66.106.22/sluo_volume/my-data-disk.qcow2', fmt=qcow2 size=10737418240 encryption=off cluster_size=65536
# qemu-img info gluster://10.66.106.22/sluo_volume/my-data-disk.qcow2
image: gluster://10.66.106.22/sluo_volume/my-data-disk.qcow2
file format: qcow2
virtual size: 10G (10737418240 bytes)
disk size: 140K
cluster_size: 65536
Format specific information:
    compat: 0.10

e.g.: ...-drive file=gluster://10.66.106.22/sluo_volume/my-data-disk.qcow2,if=none,id=drive-data-disk,format=qcow2,cache=none,aio=native,werror=stop,rerror=stop -device virtio-blk-pci,drive=drive-data-disk,id=data-disk,config-wce=off,x-data-plane=on,bus=pci.0,addr=0x7

Verify the disk is detected in the guest and works with mkfs/dd.
# fdisk -l    <------------ detected in guest
# mkfs.ext4 /dev/vda    <------------ successful
# dd if=/dev/zero of=/dev/vda bs=1M count=1000    <------------ successful

- NBD:
# nbd-server 12345 /home/my-data-disk.qcow2
** (process:28046): WARNING **: Specifying an export on the command line is deprecated.
** (process:28046): WARNING **: Please use a configuration file instead.

e.g.: ...-drive file=nbd:10.66.11.154:12345,if=none,id=drive-data-disk,format=qcow2,cache=none,aio=native,werror=stop,rerror=stop -device virtio-blk-pci,drive=drive-data-disk,id=data-disk,config-wce=off,x-data-plane=on,bus=pci.0,addr=0x7

Verify the disk is detected in the guest and works with mkfs/dd.
# fdisk -l    <------------ detected in guest
# mkfs.ext4 /dev/vda    <------------ successful
# dd if=/dev/zero of=/dev/vda bs=1M count=1000    <------------ successful

- Ceph: I have not used Ceph before, so I am leaving it aside for now and will investigate and test it later. (An untested sketch of what such a test might look like follows below.)
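(Untested sketch only, assuming a qemu-kvm-rhev build with rbd support and an already-deployed Ceph cluster; the pool name "rbd", the image name "my-data-disk", and the monitor address are hypothetical. QEMU's rbd driver accepts rbd:pool/image[:option=value...] URIs.)

# qemu-img create -f raw rbd:rbd/my-data-disk 10G

e.g.: ...-drive file=rbd:rbd/my-data-disk:mon_host=10.66.106.22,if=none,id=drive-data-disk,format=raw,cache=none,werror=stop,rerror=stop -device virtio-blk-pci,drive=drive-data-disk,id=data-disk,config-wce=off,x-data-plane=on,bus=pci.0,addr=0x7

The same fdisk/mkfs/dd checks as in the other protocol tests would then apply inside the guest.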
> * I/O throttling
>   I/O throttling requires the new AioContext timer support which is
>   currently being merged upstream.

(I/O throttling is verified in a follow-up comment below.)

> * Hot unplug
>   Due to the way bdrv_in_use() is currently used to prevent hot unplug,
>   the virtio-blk-pci adapter cannot be hot unplugged. The hot unplug
>   command will fail with EBUSY.

e.g.: ...-drive file=/dev/disk/by-path/ip-10.66.33.253:3260-iscsi-iqn.2014.sluo.com:iscsi.storage.1-lun-1,if=none,id=drive-data-disk,format=raw,cache=none,aio=native,werror=stop,rerror=stop -device virtio-blk-pci,drive=drive-data-disk,id=data-disk,config-wce=off,x-data-plane=on,bus=pci.0,addr=0x7

(qemu) device_del data-disk
(qemu) info block    <------- hot unplug succeeded
guest]# fdisk -l     <------- disk has disappeared from the guest
guest]# dmesg        <------- no errors reported

> * iothreads as objects
>   The user should be able to define the number of iothreads and bind
>   devices to specific threads. iothreads should be discoverable using
>   a QMP query-iothreads command, which includes thread IDs suitable for
>   CPU affinity setting.

e.g.: ...-object iothread,id=iothread0 -drive file=/home/my-data-disk.qcow2,if=none,id=drive-data-disk,format=qcow2,cache=none,aio=native,werror=stop,rerror=stop -device virtio-blk-pci,drive=drive-data-disk,id=data-disk,iothread=iothread0,bus=pci.0,addr=0x7

Discover iothreads via QMP:
{"execute":"query-iothreads"}
{"return": [{"thread-id": 6072, "id": "iothread0"}]}
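(Not part of the original test, but the thread-id values returned by query-iothreads are host TIDs, so the CPU-affinity use case mentioned in the quoted feature text can be sketched with standard tools; the target CPU number here is arbitrary.)

# taskset -pc 2 6072    <------- pin iothread0 (TID 6072 from the output above) to host CPU 2
# taskset -p 6072       <------- re-read the affinity mask to confirm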
> * Runtime NBD exports
>   The runtime NBD server should not interfere with dataplane. This
>   will allow image fleecing and other NBD export users to coexist with
>   dataplane.

1). Launch a KVM guest with data-plane.
e.g.: ...-object iothread,id=iothread0 -drive file=/home/my-data-disk.qcow2,if=none,id=drive-data-disk0,format=qcow2,cache=none,aio=native,werror=stop,rerror=stop -device virtio-blk-pci,drive=drive-data-disk0,id=data-disk0,iothread=iothread0,bus=pci.0,addr=0x7

2). Start the built-in NBD server and export the block device via QMP.
(HMP help: nbd_server_start [-a] [-w] host:port -- serve block devices on the given host and port)

{"execute":"qmp_capabilities"}
{"return": {}}
{ "execute": "nbd-server-start", "arguments": { "addr": { "type": "inet", "data": { "host": "10.66.11.154", "port": "1234" } } } }
{"return": {}}
{ "execute": "nbd-server-add", "arguments": { "device": "drive-data-disk0", "writable": true } }
{"return": {}}

3). Read/write the NBD export.
# qemu-img info nbd://10.66.11.154:1234/drive-data-disk0
image:
file format: raw
virtual size: 10G (10737418240 bytes)
disk size: unavailable

# qemu-io -c 'read 512 1024' nbd://10.66.11.154:1234/drive-data-disk0
read 1024/1024 bytes at offset 512
1 KiB, 1 ops; 0.0003 sec (2.713 MiB/sec and 2777.7778 ops/sec)
# qemu-io -c 'write 512 1024' nbd://10.66.11.154:1234/drive-data-disk0
wrote 1024/1024 bytes at offset 512
1 KiB, 1 ops; 0.0005 sec (1.922 MiB/sec and 1968.5039 ops/sec)
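(Teardown was not shown in the original test; as a sketch, the embedded NBD server can be stopped again with the standard QMP command once verification is done:)

{ "execute": "nbd-server-stop" }
{"return": {}}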
> * Block jobs
>   Block jobs should not interfere with dataplane. Currently block jobs
>   cannot be started (EBUSY).

e.g.: ...-object iothread,id=iothread0 -drive file=/home/my-data-disk.qcow2,if=none,id=drive-data-disk,format=qcow2,cache=none,aio=native,werror=stop,rerror=stop -device virtio-blk-pci,drive=drive-data-disk,id=data-disk,iothread=iothread0,bus=pci.0,addr=0x7

- drive-mirror:
{ "execute": "drive-mirror", "arguments": { "device": "drive-data-disk", "target": "/root/sn1", "format": "qcow2", "mode": "absolute-paths", "sync": "full", "speed": 1000000000, "on-source-error": "stop", "on-target-error": "stop" } }
{"error": {"class": "GenericError", "desc": "Device 'drive-data-disk' is busy: block device is in use by data plane"}}

- live snapshot (blockdev-snapshot-sync):
{ "execute": "blockdev-snapshot-sync", "arguments": { "device": "drive-data-disk", "snapshot-file": "/home/snap", "format": "qcow2" } }
{"error": {"class": "GenericError", "desc": "Device 'drive-data-disk' is busy: block device is in use by data plane"}}

> This BZ will be closed once the first patch series implementing the bulk
> of the work is backported and the new implementation is ready for wider
> testing. If necessary, extra BZs will be opened to track missing
> functionality by then.

Best Regards,
sluo

(In reply to Sibiao Luo from comment #4)
> (In reply to Ademar Reis from comment #0)
> > Stefan is working on the proper data-plane implementation that covers
> > all of the block layer and this BZ is for the backport of such work.
> > The current implementation duplicates block-layer code and is very
> > limited, serving only as a tech-preview experiment.
>
> Verified this issue with the following packages.
> host info:
> # uname -r && rpm -q qemu-kvm-rhev && rpm -q seabios
> 3.10.0-145.el7.x86_64
> qemu-kvm-rhev-2.1.0-2.el7.x86_64
> seabios-1.7.5-4.el7.x86_64
> guest info:
> # uname -r
> 3.10.0-145.el7.x86_64

> > * I/O throttling
> >   I/O throttling requires the new AioContext timer support which is
> >   currently being merged upstream.

1). Launch a KVM guest with data-plane and I/O throttling.
e.g.: ...-object iothread,id=iothread0 -drive file=/home/my-data-disk.qcow2,if=none,id=drive-data-disk0,format=qcow2,cache=none,aio=native,werror=stop,rerror=stop,bps=1024000,bps_rd=0,bps_wr=0,iops=1024000,iops_rd=0,iops_wr=0 -device virtio-blk-pci,drive=drive-data-disk0,id=data-disk0,iothread=iothread0,bus=pci.0,addr=0x7

(qemu) info block
drive-system-disk: /home/RHEL-7.0-20140507.0-Server-x86_64.qcow2 (qcow2)
drive-data-disk0: /home/my-data-disk.qcow2 (qcow2)
    I/O throttling:   bps=1024000 bps_rd=0 bps_wr=0 bps_max=102400 bps_rd_max=0 bps_wr_max=0 iops=1024000 iops_rd=0 iops_wr=0 iops_max=102400 iops_rd_max=0 iops_wr_max=0 iops_size=0
...

2). Query the block info via the QMP monitor.
{"execute":"query-block"}
{"return": [...{"io-status": "ok", "device": "drive-data-disk0", "locked": false, "removable": false, "inserted": {"iops_rd": 0, "detect_zeroes": "off", "image": {"virtual-size": 10737418240, "filename": "/home/my-data-disk.qcow2", "cluster-size": 65536, "format": "qcow2", "actual-size": 140320768, "format-specific": {"type": "qcow2", "data": {"compat": "1.1", "lazy-refcounts": false}}, "dirty-flag": false}, "iops_wr": 0, "ro": false, "backing_file_depth": 0, "drv": "qcow2", "bps_max": 102400, "iops": 1024000, "bps_wr": 0, "encrypted": false, "bps": 1024000, "bps_rd": 0, "iops_max": 102400, "file": "/home/my-data-disk.qcow2", "encryption_key_missing": false}, "type": "unknown"}, {"io-status": "ok", "device": "ide1-cd0", "locked": false, "removable": true, "tray_open": false, "type": "unknown"}, {"device": "floppy0", "locked": false, "removable": true, "tray_open": false, "type": "unknown"}, {"device": "sd0", "locked": false, "removable": true, "tray_open": false, "type": "unknown"}]}

3). Run fio against the disk in the guest.
# fio --filename=/dev/vda --direct=1 --rw=randrw --bs=100K --size=10M --name=test --iodepth=100 --ioengine=libaio
test: (g=0): rw=randrw, bs=100K-100K/100K-100K/100K-100K, ioengine=libaio, iodepth=100
fio-2.1.10
Starting 1 process
Jobs: 1 (f=1): [m] [3.2% done] [0KB/0KB/0KB /s] [0/0/0 iops] [eta 05m:31s]
test: (groupid=0, jobs=1): err= 0: pid=4230: Mon Sep  1 02:17:19 2014
  read : io=4900.0KB, bw=501659B/s, iops=4, runt= 10002msec
    slat (usec): min=6, max=34, avg=10.53, stdev= 4.11
    clat (msec): min=1, max=10001, avg=9390.61, stdev=2413.35
     lat (msec): min=1, max=10001, avg=9390.62, stdev=2413.35
    clat percentiles (usec):
     |  1.00th=[ 1352],  5.00th=[99840], 10.00th=[10027008], 20.00th=[10027008],
     | 30.00th=[10027008], 40.00th=[10027008], 50.00th=[10027008], 60.00th=[10027008],
     | 70.00th=[10027008], 80.00th=[10027008], 90.00th=[10027008], 95.00th=[10027008],
     | 99.00th=[10027008], 99.50th=[10027008], 99.90th=[10027008], 99.95th=[10027008],
     | 99.99th=[10027008]
    bw (KB  /s): min=   39, max=   39, per=7.98%, avg=39.00, stdev= 0.00
  write: io=5300.0KB, bw=542611B/s, iops=5, runt= 10002msec
    slat (usec): min=8, max=22, avg=15.28, stdev= 3.32
    clat (usec): min=10000K, max=10001K, avg=10000757.38, stdev=421.06
     lat (usec): min=10000K, max=10001K, avg=10000772.89, stdev=418.90
    clat percentiles (msec):
     |  1.00th=[10028],  5.00th=[10028], 10.00th=[10028], 20.00th=[10028],
     | 30.00th=[10028], 40.00th=[10028], 50.00th=[10028], 60.00th=[10028],
     | 70.00th=[10028], 80.00th=[10028], 90.00th=[10028], 95.00th=[10028],
     | 99.00th=[10028], 99.50th=[10028], 99.90th=[10028], 99.95th=[10028],
     | 99.99th=[10028]
    lat (msec) : 2=0.98%, 20=0.98%, 250=0.98%, >=2000=97.06%
  cpu          : usr=0.00%, sys=0.02%, ctx=93, majf=0, minf=29
  IO depths    : 1=1.0%, 2=2.0%, 4=3.9%, 8=7.8%, 16=15.7%, 32=31.4%, >=64=38.2%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=75.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=25.0%
     issued    : total=r=49/w=53/d=0, short=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=100

Run status group 0 (all jobs):
   READ: io=4900KB, aggrb=489KB/s, minb=489KB/s, maxb=489KB/s, mint=10002msec, maxt=10002msec
  WRITE: io=5300KB, aggrb=529KB/s, minb=529KB/s, maxb=529KB/s, mint=10002msec, maxt=10002msec

Disk stats (read/write):
  vda: ios=0/0, merge=0/0, ticks=0/0, in_queue=0, util=0.00%
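(Observation added for clarity; the original comment did not spell this out. The measured rates confirm the throttle: bps=1024000 is a total cap of 1024000 bytes/s, roughly 1000 KiB/s, and fio reports READ 489 KB/s + WRITE 529 KB/s = 1018 KB/s aggregate, i.e. the combined read+write bandwidth is held right at the configured total-bps limit.)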
Best Regards,
sluo

*** Bug 1101572 has been marked as a duplicate of this bug. ***

Current dataplane status:

Supported:
* Image formats
* I/O throttling
* Block jobs
* GlusterFS, RBD, iSCSI, NBD

Unsupported:
* External and internal snapshots
* QMP 'transaction' command
* Eject
* qcow2 encryption

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2015-0624.html