Description of problem:
Today, vdsm uses the "dd" Linux command to wipe volumes. The problem with using "dd" to wipe volumes is that it is very slow (~ 7 minutes to wipe a 10GB volume on NetApp in my environment). To zero volumes more efficiently, vdsm can use the "blkdiscard" command from the util-linux package, which can run up to ~ 10 times faster.

Version-Release number of selected component (if applicable):
7cf1dbe1b669e9dab203b33baae34192bf01e114

Steps to Reproduce:
1. Create a disk on a block storage domain.
2. Set its "Wipe After Delete" property to true.
3. Remove the disk and see in the vdsm log that it is done very slowly. You can do that by calculating the time that passes between the log message "Zero volume thread started for volume <volume_id>" and the log message "Zero volume <volume_id> task <task_id> completed".

Actual results:
In my environment it takes ~ 7 minutes to wipe a 10GB disk.

Expected results:
Should be quicker, if possible.

Additional info:
Calling "blkdiscard -z <block_device>" should work at least as fast as "dd", and up to ~ 10 times faster, as it issues "write same" if the device supports it.
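For a rough manual comparison of the two approaches outside of vdsm, something like the following can be run against a scratch LV. The device path is a placeholder and the dd options are illustrative only, not the exact command line vdsm uses:

# Current approach - stream zeros over the device with dd (illustrative flags):
$ time dd if=/dev/zero of=/dev/<vg_name>/<lv_name> bs=1M oflag=direct

# Proposed approach - let the kernel zero the device, using "write same" when the storage supports it:
$ time blkdiscard -z /dev/<vg_name>/<lv_name>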
No, I tried to figure out what we should do so that it won't happen, but we decided that we should be done with bug 1241106 first, and then come back to this one. The last thing I saw was that the command failed with timeouts only when I ran it from vdsm. I guess that we should run it with a higher priority, or maybe run it in a different way (not like we run dd today). Anyway, right now there's no need for a bug.
Bug 1475780 was opened to switch the default zero method to "blkdiscard".
Tested on ovirt-engine-4.2.0-0.0.master.20170828065003.git0619c76.el7.centos

Wipe after delete took 1.5 minutes for a 10GB disk on a block domain.

Is there any chance to move the messages
Zero volume thread started for volume <VOL_ID>
and
Zero volume thread finished for volume <VOL_ID>
to be on INFO logger level, or should I open a new bug for that request?
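To measure the wipe duration on the host, the two messages can be pulled from the vdsm log and their timestamps compared. The path below assumes the default vdsm log location:

$ grep "Zero volume" /var/log/vdsm/vdsm.log
# The wipe time is the difference between the timestamps of the
# "thread started" and "finished"/"completed" lines for the same volume ID.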
(In reply to Raz Tamir from comment #5)
> Tested on ovirt-engine-4.2.0-0.0.master.20170828065003.git0619c76.el7.centos
>
> wipe after delete took 1.5 minutes for a 10GB disk on a block domain

On the same setup/storage, how long does it take to delete a 10GB disk with the "old" method?

> Is there any change to move the messages
> Zero volume thread started for volume <VOL_ID>
> and
> Zero volume thread finished for volume <VOL_ID>
> to be on INFO logger level or should I open a new bug for that request?

Let's have a new BZ for this please
(In reply to Allon Mureinik from comment #6)
> (In reply to Raz Tamir from comment #5)
> > Tested on ovirt-engine-4.2.0-0.0.master.20170828065003.git0619c76.el7.centos
> >
> > wipe after delete took 1.5 minutes for a 10GB disk on a block domain
> On the same setup/storage, how long does it take to delete a 10GB disk with
> the "old" method?

~ 5 minutes

> > Is there any change to move the messages
> > Zero volume thread started for volume <VOL_ID>
> > and
> > Zero volume thread finished for volume <VOL_ID>
> > to be on INFO logger level or should I open a new bug for that request?
> Let's have a new BZ for this please

https://bugzilla.redhat.com/show_bug.cgi?id=1487151
(In reply to Raz Tamir from comment #7)
> (In reply to Allon Mureinik from comment #6)
> > (In reply to Raz Tamir from comment #5)
> > > Tested on ovirt-engine-4.2.0-0.0.master.20170828065003.git0619c76.el7.centos
> > >
> > > wipe after delete took 1.5 minutes for a 10GB disk on a block domain
> > On the same setup/storage, how long does it take to delete a 10GB disk with
> > the "old" method?
> ~ 5 minutes

A x3 improvement for a 10GB disk? I'll take that!

> > > Is there any change to move the messages
> > > Zero volume thread started for volume <VOL_ID>
> > > and
> > > Zero volume thread finished for volume <VOL_ID>
> > > to be on INFO logger level or should I open a new bug for that request?
> > Let's have a new BZ for this please
> https://bugzilla.redhat.com/show_bug.cgi?id=1487151

Thanks!
(In reply to Raz Tamir from comment #7)
> (In reply to Allon Mureinik from comment #6)
> > (In reply to Raz Tamir from comment #5)
> > > Tested on ovirt-engine-4.2.0-0.0.master.20170828065003.git0619c76.el7.centos
> > >
> > > wipe after delete took 1.5 minutes for a 10GB disk on a block domain

That's a bit slow - ~113MBps for discard?

> > On the same setup/storage, how long does it take to delete a 10GB disk with
> > the "old" method?
> ~ 5 minutes

And that's VERY slow! 34MBps?!?! Something wrong with that storage. Or my math.

My laptop (SSD, 16.5G):

[ykaul@ykaul sosreport-dvrhvm01.cbec.gov.in-20170831074327]$ time sudo blkdiscard --zero /dev/sda3

real    1m47.121s
user    0m0.005s
sys     0m0.167s

But:

[ykaul@ykaul sosreport-dvrhvm01.cbec.gov.in-20170831074327]$ time sudo blkdiscard /dev/sda3
[sudo] password for ykaul:

real    0m4.236s
user    0m0.019s
sys     0m0.017s

So perhaps my SSD doesn't support write_same?

[ykaul@ykaul sosreport-dvrhvm01.cbec.gov.in-20170831074327]$ sudo sg_inq -p 0xb0 /dev/sda
VPD INQUIRY: Block limits page (SBC)
  Maximum compare and write length: 0 blocks
  Optimal transfer length granularity: 1 blocks
  Maximum transfer length: 0 blocks
  Optimal transfer length: 0 blocks
  Maximum prefetch transfer length: 0 blocks
  Maximum unmap LBA count: 0
  Maximum unmap block descriptor count: 0
  Optimal unmap granularity: 1
  Unmap granularity alignment valid: 0
  Unmap granularity alignment: 0
  Maximum write same length: 0x3fffc0 blocks
  Maximum atomic transfer length: 0
  Atomic alignment: 0
  Atomic transfer length granularity: 0

> > > Is there any change to move the messages
> > > Zero volume thread started for volume <VOL_ID>
> > > and
> > > Zero volume thread finished for volume <VOL_ID>
> > > to be on INFO logger level or should I open a new bug for that request?
> > Let's have a new BZ for this please
> https://bugzilla.redhat.com/show_bug.cgi?id=1487151
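For reference, another quick way to check whether the kernel will use WRITE SAME for a given device is the block queue's sysfs attribute; "sda" below is just a placeholder device name:

$ cat /sys/block/sda/queue/write_same_max_bytes
# A value of 0 means the kernel will not issue WRITE SAME for this device,
# so "blkdiscard -z" ends up writing zeros the slow way.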
This bugzilla is included in oVirt 4.2.0 release, published on Dec 20th 2017. Since the problem described in this bug report should be resolved in that release, it has been closed with a resolution of CURRENT RELEASE. If the solution does not work for you, please open a new bug report.