Bug 1672682 - virtio-blk: add discard and write zeroes support (libvirt)
Summary: virtio-blk: add discard and write zeroes support (libvirt)
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Linux Advanced Virtualization
Classification: Red Hat
Component: libvirt
Version: ---
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assignee: Libvirt Maintainers
QA Contact: gaojianan
URL:
Whiteboard:
Depends On: 1672680 1692939
Blocks:
Reported: 2019-02-05 15:57 UTC by Stefano Garzarella
Modified: 2023-03-14 19:56 UTC (History)

Fixed In Version:
Doc Type: Enhancement
Doc Text:
Clone Of: 1672680
Environment:
Last Closed: 2020-01-03 20:18:26 UTC
Type: Feature Request
Target Upstream Version:
Embargoed:


Description Stefano Garzarella 2019-02-05 15:57:59 UTC
+++ This bug was initially created as a clone of Bug #1672680 +++

Add support for the DISCARD and WRITE ZEROES commands, which were introduced in the virtio-blk protocol to improve performance when using an SSD backend.
The Linux guest driver already supports these features.

Comment 1 Peter Krempa 2019-02-11 12:55:57 UTC
We already support passing those options to qemu for the virtio-blk-pci device:

    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2' discard='unmap' detect_zeroes='unmap'/>
      <source file='/tmp/snap'/>
      <target dev='vda' bus='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x0a' function='0x0'/>
    </disk>

which translates into:

-drive file=/tmp/snap,format=qcow2,if=none,id=drive-virtio-disk0,discard=unmap,detect-zeroes=unmap
-device virtio-blk-pci,scsi=off,bus=pci.0,addr=0xa,drive=drive-virtio-disk0,id=virtio-disk0

Note that this depends on qemu adding the feature.
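
As a rough illustration of the mapping above (a sketch only, not libvirt's actual implementation; the function name drive_options is made up for this example), the two <driver> attributes correspond one-to-one to -drive suboptions, with detect_zeroes spelled with a hyphen on the qemu side:

```python
import xml.etree.ElementTree as ET

# Hypothetical helper: maps libvirt <driver> attributes to qemu -drive suboptions.
ATTR_TO_QEMU_OPT = (
    ("discard", "discard"),
    ("detect_zeroes", "detect-zeroes"),
)


def drive_options(disk_xml):
    """Return the discard-related -drive suboptions for one <disk> element."""
    driver = ET.fromstring(disk_xml).find("driver")
    if driver is None:
        return []
    opts = []
    for xml_attr, qemu_opt in ATTR_TO_QEMU_OPT:
        value = driver.get(xml_attr)
        if value is not None:  # attribute absent: nothing is passed to qemu
            opts.append(f"{qemu_opt}={value}")
    return opts


DISK_XML = """
<disk type='file' device='disk'>
  <driver name='qemu' type='qcow2' discard='unmap' detect_zeroes='unmap'/>
  <source file='/tmp/snap'/>
  <target dev='vda' bus='virtio'/>
</disk>
"""

if __name__ == "__main__":
    print(",".join(drive_options(DISK_XML)))
```

If an attribute is missing from the XML, the sketch emits nothing for it, so qemu's own default applies.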

Comment 2 Stefano Garzarella 2019-02-12 09:11:43 UTC
(In reply to Peter Krempa from comment #1)
> We already support passing those options to qemu for the virtio-blk-pci
> device:
> 
>     <disk type='file' device='disk'>
>       <driver name='qemu' type='qcow2' discard='unmap'
> detect_zeroes='unmap'/>
>       <source file='/tmp/snap'/>
>       <target dev='vda' bus='virtio'/>
>       <address type='pci' domain='0x0000' bus='0x00' slot='0x0a'
> function='0x0'/>
>     </disk>
> 
> which translates into:
> 
> -drive
> file=/tmp/snap,format=qcow2,if=none,id=drive-virtio-disk0,discard=unmap,
> detect-zeroes=unmap
> -device
> virtio-blk-pci,scsi=off,bus=pci.0,addr=0xa,drive=drive-virtio-disk0,
> id=virtio-disk0
> 
> Note that this depends on qemu adding the feature.

Thanks Peter,
just a note: the QEMU patch series will add some new parameters for the virtio-blk
device (i.e. discard, write-zeroes, max-discard-sectors, max-write-zeroes-sectors).
The discard and write-zeroes parameters should be used to enable (default) or
disable the new features from the guest's perspective.

Comment 3 Peter Krempa 2019-02-12 09:21:39 UTC
Okay, so it looks like we don't support "write-zeroes". What is the purpose and benefit of enabling that option?

Comment 4 Stefano Garzarella 2019-02-12 09:35:19 UTC
(In reply to Peter Krempa from comment #3)
> Okay, so it looks like we don't support "write-zeroes". What is the
> purpose and benefit of enabling that option?

With this feature, the guest driver can specify the sectors to fill with zeroes
(e.g. for security reasons you may want to be sure that some sectors are cleaned before discarding them).
Using this command should be faster than using the write command with a buffer full of zeroes.

Comment 5 Peter Krempa 2019-02-12 11:42:05 UTC
So is this operation executed by the SSD firmware or emulated in qemu?

Comment 6 Stefano Garzarella 2019-02-12 12:46:53 UTC
(In reply to Peter Krempa from comment #5)
> So is this operation executed by the SSD firmware or emulated in qemu?

It depends on the host OS and block backend: for example, if you use a raw file on a Linux host,
the QEMU backend can use the BLKZEROOUT ioctl to talk directly to the device driver
(and then to the SSD firmware).

If the feature is not supported, it is emulated by writing a bounce buffer of zeroes.
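
The two paths described above can be sketched roughly as follows. This is not QEMU's code, only an illustration under a few assumptions: the helper name zero_range and the 1 MiB chunk size are invented, and the ioctl number is the Linux value of BLKZEROOUT from <linux/fs.h>, i.e. _IO(0x12, 127). BLKZEROOUT is a block-device ioctl, so on a regular file the call fails and the sketch falls back to writing a buffer of zeroes:

```python
import fcntl
import os
import struct

# Linux value of BLKZEROOUT: _IO(0x12, 127) from <linux/fs.h>.
BLKZEROOUT = 0x127F


def zero_range(fd, offset, length, chunk=1 << 20):
    """Zero `length` bytes at `offset`; returns which path was taken.

    Hypothetical helper: tries the block-device ioctl first, then falls
    back to writing a bounce buffer of zeroes.
    """
    try:
        # The ioctl takes a uint64_t[2] = {start, length}, both in bytes.
        fcntl.ioctl(fd, BLKZEROOUT, struct.pack("QQ", offset, length))
        return "ioctl"
    except OSError:
        pass  # not a block device, or the ioctl is not supported

    zeros = bytes(min(chunk, length))  # the bounce buffer
    done = 0
    while done < length:
        done += os.pwrite(fd, zeros[: length - done], offset + done)
    return "bounce-buffer"


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryFile() as f:
        f.write(b"\xff" * 8192)
        f.flush()
        print(zero_range(f.fileno(), 4096, 4096))
```

On a real block device the ioctl needs the appropriate permissions and destroys the data in the given range, so it should only be tried on a scratch device.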

Comment 7 Peter Krempa 2019-02-12 12:52:12 UTC
So I don't understand then why that isn't enabled by default. Generally we try to give the users options to configure what makes sense to configure and this in my opinion does not make sense. It should be just always enabled. Are there any drawbacks I don't see?

Comment 8 Stefano Garzarella 2019-02-12 13:28:18 UTC
(In reply to Peter Krempa from comment #7)
> So I don't understand then why that isn't enabled by default. Generally we
> try to give the users options to configure what makes sense to configure and
> this in my opinion does not make sense. It should be just always enabled.
> Are there any drawbacks I don't see?

I think that there are no drawbacks. Indeed, it will be enabled by default in QEMU. (Sorry if I didn't explain this well.)
We added the parameters in QEMU to handle the migration issue.

Comment 9 Peter Krempa 2019-02-12 13:34:14 UTC
I'm not sure what you mean by migration issue. If it does not work automagically when migrating from a new version, which automatically enables this feature, to an older version, which does not support it, you can't enable it by default as it will break migrations.

Comment 10 Stefano Garzarella 2019-02-12 14:04:13 UTC
To migrate from a new version (e.g. 4.0) to an older version (e.g. 3.1), you can use the machine type (e.g. "pc-q35-3.1") in a new QEMU.
In this case, we must disable all new features not available in 3.1 (like discard and write_zeroes for virtio-blk), and the common way to do that is through parameters.
So, when you run "qemu -machine pc-q35-3.1" the discard and write_zeroes will be automagically disabled to allow a safe migration.

Comment 11 Peter Krempa 2019-02-12 14:19:17 UTC
Ah cool, so if the feature behaves correctly during migration depending on the machine type and is enabled by default for new machine types I don't think that libvirt should expose a way to disable it.

Comment 12 Stefano Garzarella 2019-02-13 08:36:29 UTC
(In reply to Peter Krempa from comment #11)
> Ah cool, so if the feature behaves correctly during migration depending on
> the machine type and is enabled by default for new machine types I don't
> think that libvirt should expose a way to disable it.

Makes sense.

Thanks,
Stefano

Comment 13 gaojianan 2019-02-22 09:07:40 UTC
I can't find the upstream or downstream patch link in either this bug or Bug #1672680, so can you give me the patch link or tell me where it is in this BZ?
Thank you.

Comment 14 Daniel Berrangé 2019-02-22 11:06:51 UTC
There is no libvirt patch for this bug yet.

Comment 15 Daniel Berrangé 2019-02-22 11:08:09 UTC
Sorry, submitted too soon. I meant to say the feature was already supported in libvirt so required no extra work.

Comment 16 gaojianan 2019-02-25 09:20:00 UTC
(In reply to Peter Krempa from comment #3)
> Okay, so it looks like we don't support "write-zeroes". What is the
> purpose and benefit of enabling that option?

I have a question here about "write-zeroes".
From comment 1 I know we already support "discard=unmap,detect-zeroes=unmap", but do "write-zeroes" and "detect-zeroes" mean the same thing?
I checked libvirt and we only have "detect-zeroes", but in qemu I found "write_zeroes".
So I want to confirm this.

Thank you.

Comment 17 Peter Krempa 2019-02-25 09:33:54 UTC
Please read the discussion above. It clarifies what 'write-zeroes' is about, that it's supposed to be auto-enabled by qemu, and that it does not seem worthwhile to be able to disable 'write-zeroes'.

Comment 18 gaojianan 2019-02-26 01:46:59 UTC
(In reply to Peter Krempa from comment #17)
> Please read the discussion above. It clarifies what 'write-zeroes' is about,
> that it's supposed to be auto-enabled by qemu, and that it does not seem
> worthwhile to be able to disable 'write-zeroes'.

Okay, I see what you mean.
I also want to confirm: if I don't specify "discard" and "detect-zeroes" in the guest's XML, what default values will we get in libvirt or qemu?
And how can I confirm that they really take effect?

Thank you!

Comment 21 Peter Krempa 2019-03-07 07:49:22 UTC
(In reply to gaojianan from comment #18)

[...]

> I also want to confirm: if I don't specify "discard" and "detect-zeroes"
> in the guest's XML, what default values will we get in libvirt or qemu?
> And how can I confirm that they really take effect?

Both 'discard' and 'detect-zeroes' are disabled by default in qemu, so if you don't specify them they should be disabled.

Note that 'detect-zeroes' is not relevant in context of this bug.

The discard option, if enabled, shows up as '-drive ...,id=drive-virtio-disk0,discard=unmap,...' on the command line or "discard"="unmap" in case of -blockdev. AFAIK this can't be queried from qemu at this point.
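
Since the setting is only visible on the command line, one way to check it is to parse the -drive arguments of the running qemu process. The helper below is a hypothetical sketch (the name drive_discard_settings is invented); it assumes the legacy -drive syntax shown above and does not handle -blockdev JSON or escaped commas (",,") inside option values:

```python
import shlex


def drive_discard_settings(cmdline):
    """Map each -drive id to its discard= value on a qemu command line."""
    argv = shlex.split(cmdline)
    settings = {}
    # Look at each (flag, value) pair of consecutive arguments.
    for flag, value in zip(argv, argv[1:]):
        if flag != "-drive":
            continue
        opts = dict(item.split("=", 1) for item in value.split(",") if "=" in item)
        settings[opts.get("id", "?")] = opts.get("discard", "(not set)")
    return settings


if __name__ == "__main__":
    example = (
        "-drive file=/tmp/snap,format=qcow2,if=none,id=drive-virtio-disk0,"
        "discard=unmap,detect-zeroes=unmap "
        "-device virtio-blk-pci,scsi=off,bus=pci.0,addr=0xa,"
        "drive=drive-virtio-disk0,id=virtio-disk0"
    )
    print(drive_discard_settings(example))
```

A drive reported as "(not set)" relies on qemu's default for the option.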

Comment 23 gaojianan 2019-03-11 07:58:40 UTC
Verified on:
qemu-kvm-3.1.0-18.module+el8+2834+fa8bb6e2.x86_64
libvirt-5.0.0-6.module+el8+2860+4e0fe96a.x86_64

1. Start a VM with the following disk XML:
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2' discard='unmap' detect_zeroes='unmap'/>
      <source file='/tmp/scsi'/>
      <backingStore/>
      <target dev='vdb' bus='virtio'/>
      <alias name='virtio-disk1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x0a' function='0x0'/>
    </disk>
    <disk type='file' device='cdrom'>
      <driver name='qemu' type='raw' discard='ignore' detect_zeroes='unmap'/>
      <source file='/var/lib/libvirt/images/boot1.iso'/>
      <backingStore/>
      <target dev='hdc' bus='sata'/>
      <readonly/>
      <alias name='sata0-0-2'/>
      <address type='drive' controller='0' bus='0' target='0' unit='2'/>
    </disk>

2. Start the guest and check the qemu command line:
[root@nssguest ~]# ps aux |grep rhel8.0
qemu     12035  7.5  8.0 3962160 632132 ?      Sl   15:47   0:39 /usr/libexec/qemu-kvm -name guest=rhel8.0,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-5-rhel8.0/master-key.aes -machine pc-q35-rhel8.0.0,accel=kvm,usb=off,vmport=off,dump-guest-core=off -cpu SandyBridge-IBRS,vme=on,ss=on,pcid=on,hypervisor=on,arat=on,tsc_adjust=on,umip=on,ssbd=on,xsaveopt=on -m 2049 -realtime mlock=off -smp 2,sockets=2,cores=1,threads=1 -uuid 67f477a2-d089-4ef1-ab35-54bbd0f1e27f -no-user-config -nodefaults -chardev socket,id=charmonitor,fd=26,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=delay -no-hpet -no-shutdown -global ICH9-LPC.disable_s3=1 -global ICH9-LPC.disable_s4=1 -boot strict=on -device pcie-root-port,port=0x10,chassis=1,id=pci.1,bus=pcie.0,multifunction=on,addr=0x2 -device pcie-root-port,port=0x11,chassis=2,id=pci.2,bus=pcie.0,addr=0x2.0x1 -device pcie-root-port,port=0x12,chassis=3,id=pci.3,bus=pcie.0,addr=0x2.0x2 -device pcie-root-port,port=0x13,chassis=4,id=pci.4,bus=pcie.0,addr=0x2.0x3 -device pcie-root-port,port=0x14,chassis=5,id=pci.5,bus=pcie.0,addr=0x2.0x4 -device pcie-root-port,port=0x15,chassis=6,id=pci.6,bus=pcie.0,addr=0x2.0x5 -device pcie-root-port,port=0x16,chassis=7,id=pci.7,bus=pcie.0,addr=0x2.0x6 -device pcie-root-port,port=0x17,chassis=8,id=pci.8,bus=pcie.0,addr=0x2.0x7 -device pcie-pci-bridge,id=pci.9,bus=pci.1,addr=0x0 -device pcie-root-port,port=0x19,chassis=10,id=pci.10,bus=pcie.0,addr=0x4 -device qemu-xhci,p2=15,p3=15,id=usb,bus=pci.2,addr=0x0 -device virtio-scsi-pci,id=scsi0,bus=pci.7,addr=0x0 -device virtio-serial-pci,id=virtio-serial0,bus=pci.3,addr=0x0 -drive file=/var/lib/libvirt/images/RHEL-8.0-x86_64-latest.qcow2.3,format=qcow2,if=none,id=drive-virtio-disk0 -device virtio-blk-pci,scsi=off,bus=pci.8,addr=0x0,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -drive 
file=/tmp/scsi,format=qcow2,if=none,id=drive-virtio-disk1,discard=unmap,detect-zeroes=unmap -device virtio-blk-pci,scsi=off,bus=pcie.0,addr=0xa,drive=drive-virtio-disk1,id=virtio-disk1 -drive file=/var/lib/libvirt/images/boot1.iso,format=raw,if=none,id=drive-sata0-0-2,media=cdrom,readonly=on,discard=ignore,detect-zeroes=on -device ide-cd,bus=ide.2,drive=drive-sata0-0-2,id=sata0-0-2 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -chardev socket,id=charchannel0,fd=28,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.0 -chardev spicevmc,id=charchannel1,name=vdagent -device virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,name=com.redhat.spice.0 -device usb-tablet,id=input0,bus=usb.0,port=1 -spice port=5900,addr=127.0.0.1,disable-ticketing,image-compression=off,seamless-migration=on -device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vram64_size_mb=0,vgamem_mb=16,max_outputs=1,bus=pcie.0,addr=0x1 -device ich9-intel-hda,id=sound0,bus=pcie.0,addr=0x1b -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -chardev spicevmc,id=charredir0,name=usbredir -device usb-redir,chardev=charredir0,id=redir0,bus=usb.0,port=2 -chardev spicevmc,id=charredir1,name=usbredir -device usb-redir,chardev=charredir1,id=redir1,bus=usb.0,port=3 -device virtio-balloon-pci,id=balloon0,bus=pci.5,addr=0x0 -object rng-random,id=objrng0,filename=/dev/urandom -device virtio-rng-pci,rng=objrng0,id=rng0,bus=pci.6,addr=0x0 -sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny -msg timestamp=on


Worked as expected

Comment 24 gaojianan 2019-09-16 09:02:21 UTC
(In reply to Peter Krempa from comment #1)
> We already support passing those options to qemu for the virtio-blk-pci
> device:
> 
>     <disk type='file' device='disk'>
>       <driver name='qemu' type='qcow2' discard='unmap'
> detect_zeroes='unmap'/>
>       <source file='/tmp/snap'/>
>       <target dev='vda' bus='virtio'/>
>       <address type='pci' domain='0x0000' bus='0x00' slot='0x0a'
> function='0x0'/>
>     </disk>
> 
> which translates into:
> 
> -drive
> file=/tmp/snap,format=qcow2,if=none,id=drive-virtio-disk0,discard=unmap,
> detect-zeroes=unmap
> -device
> virtio-blk-pci,scsi=off,bus=pci.0,addr=0xa,drive=drive-virtio-disk0,
> id=virtio-disk0
> 
> Note that this depends on qemu adding the feature.

Sorry for raising a question after verifying this bug.
I reproduced it with a newer qemu version, qemu-kvm-4.1.0-9.module+el8.1.0+4210+23b2046a.x86_64,
using the same disk XML as yours. However, even with the much older qemu version
qemu-kvm-1.5.3-167.el7.x86_64, I get the same qemu command line: "-drive file=/var/lib/libvirt/images/RHEL-8.1-x86_64-latest.qcow2,format=qcow2,if=none,id=drive-virtio-disk0,discard=unmap,detect-zeroes=unmap -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0xa,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 "
So are there any differences between the two qemu versions, or did I do something wrong?

Comment 25 Peter Krempa 2019-09-16 09:15:28 UTC
This feature was introduced a very long time ago, and libvirt didn't add any checks for whether qemu supports it. Thus we always try to pass it to qemu, even if qemu didn't support it previously. We only returned an error if qemu refused to start with such a configuration.

Comment 26 Marina Kalinin 2020-01-03 20:15:33 UTC
Shouldn't this BZ be closed ERRATA https://access.redhat.com/errata/RHBA-2019:3723 ?

Comment 27 Ademar Reis 2020-01-03 20:18:26 UTC
(In reply to Marina Kalinin from comment #26)
> Shouldn't this BZ be closed ERRATA
> https://access.redhat.com/errata/RHBA-2019:3723 ?

Indeed. This feature is enabled and was verified in 8.1.0, so closing it (thanks Marina).

Comment 28 Devin Henderson 2020-07-10 17:11:04 UTC
I'm still seeing

EXT4-fs (vda2): mounting with "discard" option, but the device does not support discard

in my KVM guest when using virtio-blk and discard='unmap'. It works if I switch it to virtio-scsi. Does this not work with 'raw' storage format? (Underlying storage is an LVM)

Thanks

Comment 29 Stefano Garzarella 2020-07-14 08:49:30 UTC
(In reply to Devin Henderson from comment #28)
> I'm still seeing
> 
> EXT4-fs (vda2): mounting with "discard" option, but the device does not
> support discard
> 
> in my KVM guest when using virtio-blk and discard='unmap'. It works if I
> switch it to virtio-scsi. Does this not work with 'raw' storage format?
> (Underlying storage is an LVM)
> 

What's your environment? (kernel version, QEMU version)
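
While collecting that information, it may also help to check inside the guest what the block layer reports for the disk. The sketch below reads the Linux sysfs queue limits; a discard_max_bytes of 0 means the kernel sees no discard support on that device. The helper names are made up for illustration, and for a partition such as vda2 the parent disk (vda) is the one to query:

```python
from pathlib import Path

QUEUE_LIMITS = ("discard_max_bytes", "discard_granularity", "write_zeroes_max_bytes")


def guest_discard_limits(dev, sysfs_root="/sys/block"):
    """Read the discard/write-zeroes queue limits of a block device.

    `sysfs_root` is a parameter only so the helper can be pointed at a
    fake tree; attributes missing on older kernels are reported as None.
    """
    queue = Path(sysfs_root) / dev / "queue"
    limits = {}
    for name in QUEUE_LIMITS:
        path = queue / name
        limits[name] = int(path.read_text().strip()) if path.exists() else None
    return limits


def supports_discard(dev, sysfs_root="/sys/block"):
    """True if the kernel advertises a non-zero discard limit for `dev`."""
    return bool(guest_discard_limits(dev, sysfs_root)["discard_max_bytes"])


if __name__ == "__main__":
    import sys

    device = sys.argv[1] if len(sys.argv) > 1 else "vda"
    print(device, guest_discard_limits(device), supports_discard(device))
```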

