Bug 1563794
| Summary: | vdostats displays incorrect information when used with logical volumes | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | David Galloway <dgallowa> |
| Component: | vdo | Assignee: | Bryan Gurney <bgurney> |
| Status: | CLOSED NOTABUG | QA Contact: | Filip Suba <fsuba> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 7.5 | CC: | awalsh, bgurney, corwin, dgallowa, hansjoerg.maurer, limershe, pasik |
| Target Milestone: | rc | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2020-05-12 19:05:20 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description
David Galloway
2018-04-04 17:39:47 UTC
The "vdostats" command displays a df-style output, and in this output, the "Used" column contains, according to the man page for vdostats(8), "The total number of 1K blocks used on a VDO volume". This is more than just the blocks used by the data itself; it also includes VDO's metadata. If you run "vdostats --verbose vdo_sda", you'll see many more statistics for the VDO volume. Among those statistics will be "data blocks used" ("the number of physical blocks currently in use by a VDO volume to store data") and "overhead blocks used" ("the number of physical blocks currently in use by a VDO volume to store VDO metadata"). The "Used" statistic in "vdostats" corresponds to "(data blocks used + overhead blocks used) * 4096 / 1024". (In "vdostats --verbose", this is also shown as "1K-blocks used".) "vdostats --verbose" also reports a statistic called "logical blocks used"; this tracks the number of logical blocks currently mapped.

Also, don't forget to mount with "mount -o discard", in order to ensure that the OSD will perform discards on the device, which will reclaim space. If you have a chance, can you run the command "vdostats --verbose vdo_sda | grep blocks" to show the block statistics for your VDO volume?

Ah, the 'mount -o discard' may need to be added to ceph-volume then, since that was the tool used to create the OSD. Here's the output requested.
[root@reesi004 ~]# vdostats --verbose vdo_sda | grep blocks
  data blocks used            : 113125669
  overhead blocks used        : 1879291
  logical blocks used         : 133219360
  physical blocks             : 976754646
  logical blocks              : 973838646
  1K-blocks                   : 3907018584
  1K-blocks used              : 460019840
  1K-blocks available         : 3446998744
  compressed blocks written   : 9318395
  journal blocks batching     : 0
  journal blocks started      : 2277538
  journal blocks writing      : 0
  journal blocks written      : 2277538
  journal blocks committed    : 2277538
  slab journal blocks written : 105095
  slab summary blocks written : 104693
  reference blocks written    : 38655

Note that more data was written since I opened the bug.

# ceph osd df | grep 'ID\|^55'
ID CLASS WEIGHT  REWEIGHT SIZE  USE  AVAIL %USE  VAR  PGS
55   hdd 3.65810  1.00000 3745G 542G 3203G 14.48 0.58   0

# vdostats --hu
Device              Size   Used Available Use% Space saving%
/dev/mapper/vdo_sda 3.6T 438.7G      3.2T  11%           15%

I also opened a bug for Ceph to make it aware of VDO OSDs: https://tracker.ceph.com/issues/23554

Bug filed for ceph-volume: https://tracker.ceph.com/issues/23581

So I see that the "logical blocks used" statistic is 133219360 (this is the number of 4096-byte blocks that are used in the logical space of the VDO volume). If I compare that to the "data blocks used" statistic of 113125669, and calculate the savings percentage:

(133219360 - 113125669) / 133219360.0

...the result is "0.1508316...", which corresponds to the "Space saving%" value of "15%".

One question that I have for the "ceph osd df" command: does the "USE" statistic add the used space for the "db" device (in this OSD, /dev/journals/sda) and the block device (/dev/vg_sda/lv_sda)?

The VDO statistic "logical blocks used" tracks the number of logical blocks currently mapped. A block will be "mapped" until it is discarded, which could occur manually on a filesystem via "fstrim", or by the filesystem driver for a filesystem mounted via "mount -o discard".
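The arithmetic above can be checked directly against the quoted counters. A standalone sketch (not part of vdostats itself) that recomputes the df-style "Used" column and the savings percentage:

```python
# Recompute vdostats' df-style numbers from the verbose counters quoted above.
# VDO's block size on this volume is 4096 bytes ("block size : 4096").
BLOCK_SIZE = 4096

data_blocks_used = 113125669      # physical blocks holding user data
overhead_blocks_used = 1879291    # physical blocks holding VDO metadata
logical_blocks_used = 133219360   # logical blocks currently mapped

# "Used" in 1K blocks = (data blocks used + overhead blocks used) * 4096 / 1024
used_1k = (data_blocks_used + overhead_blocks_used) * BLOCK_SIZE // 1024
print(used_1k)  # 460019840, matching the "1K-blocks used" line

# Space saving% = (logical blocks used - data blocks used) / logical blocks used
saving = (logical_blocks_used - data_blocks_used) / logical_blocks_used
print(f"{saving:.4f}")  # 0.1508, i.e. the reported 15%
```

This confirms that "Used" counts VDO metadata as well as data, while the saving percentage is computed from logical vs. physical data blocks only.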
Another question (which I think we should ask Sage): is there a way to configure Bluestore to issue discards to a VDO volume, in order to reclaim space from blocks on the VDO volume that are no longer used?

I did see a pull request in Luminous for adding discard support to bluestore (https://github.com/ceph/ceph/pull/14727), but it was in the context of discarding blocks on solid state drives (and it is disabled by default). Discarding will also be important for a VDO volume, since it's a thin-provisioned device.

Now that I think of it, you would be able to see on your test system if any discards were sent to the VDO volume. If you run "vdostats --verbose", there should be a statistic called "bios in discard"; does it have a number that is greater than zero?

(In reply to Bryan Gurney from comment #6)
> Another question (which I think we should ask Sage): is there a way to
> configure Bluestore to issue discards to a VDO volume, in order to reclaim
> space from blocks on the VDO volume that are no longer used?
>
> I did see a pull request in Luminous for adding discard support to bluestore
> (https://github.com/ceph/ceph/pull/14727), but it was in the context of
> discarding blocks on solid state drives (and it is disabled by default).
> Discarding will also be important for a VDO volume, since it's a
> thin-provisioned device.

I filed a ticket to have the volume/disk preparation tool (ceph-volume) mount VDO devices using '-o discard': https://tracker.ceph.com/issues/23581

(In reply to Bryan Gurney from comment #7)
> Now that I think of it, you would be able to see on your test system if any
> discards were sent to the VDO volume. If you run "vdostats --verbose",
> there should be a statistic called "bios in discard"; does it have a number
> that is greater than zero?
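That check can also be scripted: `vdostats --verbose` prints simple "name : value" lines, so a few lines of parsing are enough to read "bios in discard". This is only a sketch; the helper name is mine, and the sample text stands in for real vdostats output:

```python
def parse_vdostats(text):
    """Parse 'name : value' lines (as printed by `vdostats --verbose`) into a dict."""
    stats = {}
    for line in text.splitlines():
        if ":" not in line:
            continue
        name, _, value = line.partition(":")
        stats[name.strip()] = value.strip()
    return stats

# Sample lines in the same format as the output quoted in this bug.
sample = """\
bios in read : 24592
bios in write : 133386744
bios in discard : 0
"""

stats = parse_vdostats(sample)
discards = int(stats["bios in discard"])
print("discards received" if discards > 0 else "no discards received yet")
```

In practice the text would come from running `vdostats --verbose` on a host with VDO installed (e.g. via the subprocess module).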
[root@reesi004 ~]# vdostats --verbose /dev/mapper/vdo_sda
/dev/mapper/vdo_sda :
  version : 26
  release version : 131337
  data blocks used : 113125669
  overhead blocks used : 1879291
  logical blocks used : 133219360
  physical blocks : 976754646
  logical blocks : 973838646
  1K-blocks : 3907018584
  1K-blocks used : 460019840
  1K-blocks available : 3446998744
  used percent : 11
  saving percent : 15
  block map cache size : 134217728
  write policy : async
  block size : 4096
  completed recovery count : 0
  read-only recovery count : 0
  operating mode : normal
  recovery progress (%) : N/A
  compressed fragments written : 28111173
  compressed blocks written : 9318395
  compressed fragments in packer : 0
  slab count : 1861
  slabs opened : 272
  slabs reopened : 0
  journal disk full count : 0
  journal commits requested count : 2645565
  journal entries batching : 0
  journal entries started : 266606857
  journal entries writing : 0
  journal entries written : 266606857
  journal entries committed : 266606857
  journal blocks batching : 0
  journal blocks started : 2277538
  journal blocks writing : 0
  journal blocks written : 2277538
  journal blocks committed : 2277538
  slab journal disk full count : 0
  slab journal flush count : 12761
  slab journal blocked count : 0
  slab journal blocks written : 105095
  slab journal tail busy count : 0
  slab summary blocks written : 104693
  reference blocks written : 38655
  block map dirty pages : 3160
  block map clean pages : 29608
  block map free pages : 0
  block map failed pages : 0
  block map incoming pages : 0
  block map outgoing pages : 0
  block map cache pressure : 0
  block map read count : 133244416
  block map write count : 133220519
  block map failed reads : 0
  block map failed writes : 0
  block map reclaimed : 0
  block map read outgoing : 0
  block map found in cache : 203948315
  block map discard required : 132653
  block map wait for page : 62351199
  block map fetch required : 32768
  block map pages loaded : 165421
  block map pages saved : 162272
  block map flush count : 161721
  invalid advice PBN count : 0
  no space error count : 0
  read only error count : 0
  instance : 1
  512 byte emulation : off
  current VDO IO requests in progress : 0
  maximum VDO IO requests in progress : 2000
  dedupe advice valid : 762443
  dedupe advice stale : 0
  dedupe advice timeouts : 0
  flush out : 166229
  write amplification ratio : 0.87
  bios in read : 24592
  bios in write : 133386744
  bios in discard : 0
  bios in flush : 166225
  bios in fua : 4
  bios in partial read : 0
  bios in partial write : 0
  bios in partial discard : 0
  bios in partial flush : 0
  bios in partial fua : 0
  bios out read : 767487
  bios out write : 103807477
  bios out discard : 0
  bios out flush : 0
  bios out fua : 0
  bios meta read : 165482
  bios meta write : 12504681
  bios meta discard : 0
  bios meta flush : 640090
  bios meta fua : 1
  bios journal read : 0
  bios journal write : 2443939
  bios journal discard : 0
  bios journal flush : 166401
  bios journal fua : 0
  bios page cache read : 165421
  bios page cache write : 486230
  bios page cache discard : 0
  bios page cache flush : 323958
  bios page cache fua : 0
  bios out completed read : 767487
  bios out completed write : 103807477
  bios out completed discard : 0
  bios out completed flush : 0
  bios out completed fua : 0
  bios meta completed read : 165482
  bios meta completed write : 12173036
  bios meta completed discard : 0
  bios meta completed flush : 308445
  bios meta completed fua : 1
  bios journal completed read : 0
  bios journal completed write : 2277538
  bios journal completed discard : 0
  bios journal completed flush : 0
  bios journal completed fua : 0
  bios page cache completed read : 165421
  bios page cache completed write : 324509
  bios page cache completed discard : 0
  bios page cache completed flush : 162237
  bios page cache completed fua : 0
  bios acknowledged read : 24592
  bios acknowledged write : 133386744
  bios acknowledged discard : 0
  bios acknowledged flush : 166225
  bios acknowledged fua : 4
  bios acknowledged partial read : 3546
  bios acknowledged partial write : 0
  bios acknowledged partial discard : 0
  bios acknowledged partial flush : 0
  bios acknowledged partial fua : 0
  bios in progress read : 0
  bios in progress write : 0
  bios in progress discard : 0
  bios in progress flush : 0
  bios in progress fua : 0
  read cache accesses : 0
  read cache hits : 0
  read cache data hits : 0
  KVDO module bytes used : 1468797760
  KVDO module peak bytes used : 1468800640
  KVDO module bios used : 37286
  KVDO module peak bio count : 37574
  entries indexed : 65180178
  posts found : 761330
  posts not found : 131918538
  queries found : 0
  queries not found : 0
  updates found : 28111255
  updates not found : 0
  current dedupe queries : 0
  maximum dedupe queries : 1037

Hi David,

I can see from the stats that "bios in discard" is 0, confirming that the VDO volume has not received any discards.

Have you tried a test with the "mount -o discard" option?

Hi,

I observe the same behavior with thin LVM on top of VDO. I have mounted the filesystem with discard:

/dev/mapper/VolGroupData-lvdata /data xfs rw,relatime,attr2,discard,inode64,sunit=1024,swidth=2048,noquota 0 0

vdostats
Device           1K-blocks     Used Available Use% Space saving%
/dev/mapper/vdo1 314570752 46212788 268357964  14%           99%

When I write files to it, the occupied space grows:

vdostats --hu
Device            Size  Used Available Use% Space saving%
/dev/mapper/vdo1 300.0G 45.3G    254.7G  15%           78%
...
vdostats --hu
Device            Size  Used Available Use% Space saving%
/dev/mapper/vdo1 300.0G 47.9G    252.1G  15%           64%

When I delete the data, it does not get discarded.
vdostats shows the same output, and vdostats --verbose | grep discard gives:

  block map discard required : 0
  bios in discard : 0
  bios in partial discard : 0
  bios out discard : 0
  bios meta discard : 0
  bios journal discard : 0
  bios page cache discard : 0
  bios out completed discard : 0
  bios meta completed discard : 0
  bios journal completed discard : 0
  bios page cache completed discard : 0
  bios acknowledged discard : 0
  bios acknowledged partial discard : 0
  bios in progress discard : 0

Even after an fstrim -v /data:

/data: 1,8 TiB (1931789565952 bytes) trimmed

nothing changes. The LVs are created with discards=passdown:

lvs -o lv_name,pool_lv,discards,data_percent
  LV          Pool Discards Data%
  lvhomelocal
  lvroot
  lvvar
  lvdata      pool passdown 0,05
  pool             passdown 0,05
  pool_meta0

If I run blkdiscard /dev/mapper/VolGroupData-pool (which takes a long time), the discarded number grows:

vdostats --verbose | grep discard
  block map discard required : 0
  bios in discard : 22750953
  bios in partial discard : 0
  bios out discard : 0
  bios meta discard : 0
  bios journal discard : 0
  bios page cache discard : 0
  bios out completed discard : 0
  bios meta completed discard : 0
  bios journal completed discard : 0
  bios page cache completed discard : 0
  bios acknowledged discard : 22749493
  bios acknowledged partial discard : 0
  bios in progress discard : 1460

Why does the discard not run automatically when deleting files, and why does the space saving stay at 36% after the manual discard?

Regards,
Hansjörg

This may be related to https://bugzilla.redhat.com/show_bug.cgi?id=1600156:

kernel: device-mapper: thin: Data device (dm-5) max discard sectors smaller than a block: Disabling discard passdown.
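The size mismatch in that kernel message can be made concrete: dm-thin wants to pass discards down in units of one pool chunk, while kvdo's default max_discard_sectors is 8, i.e. one 4096-byte VDO block. A sketch of the arithmetic, using the 1,00m pool chunk size and the kvdo default of 8 reported in this thread:

```python
# Sketch: why dm-thin disables discard passdown on top of kvdo.
SECTOR_SIZE = 512        # bytes per 512-byte sector (the block-layer unit)
VDO_BLOCK_SIZE = 4096    # VDO's block size, per the stats in this bug

# Thin-pool chunk size from `lvs -o name,chunksize`: 1,00m = 1 MiB.
chunk_sectors = (1 * 1024 * 1024) // SECTOR_SIZE
print(chunk_sectors)  # 2048, the value echoed into /sys/kvdo/max_discard_sectors

# kvdo's default max_discard_sectors covers exactly one VDO block...
default_max_discard_sectors = 8
print(default_max_discard_sectors * SECTOR_SIZE == VDO_BLOCK_SIZE)  # True

# ...which is smaller than the pool chunk, so dm-thin disables passdown.
print(default_max_discard_sectors < chunk_sectors)  # True
```

Raising max_discard_sectors to the chunk size (2048 here) makes the two limits compatible, which is exactly the workaround described below.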
The problem that thin-LVM discards are not passed down to kvdo can be fixed as follows:

[root@rmc-cs57 ~]# lvs -o name,chunksize
  LV          Chunk
  lvhomelocal     0
  lvroot          0
  lvvar           0
  lvdata          0
  pool        1,00m
  pool_meta0      0

-> The chunk size is 1,00m = 1024k = 2048 * 512 bytes.

An

echo 2048 > /sys/kvdo/max_discard_sectors

and both sizes fit. This has to be done:

- after the kvdo module is loaded
- before the VDO device is started and the LVM is assembled

After that, the "max discard sectors smaller than a block: Disabling discard passdown" message disappears, and removing files in the LV frees space in the VDO volume.

BUT: This is not stable in our case under heavy load. If I copy about half a million files to the LVM (XFS on top of it) and remove them, I get errors like:

Nov 14 16:03:22 rmc-cs57 kernel: INFO: task xfsaild/dm-8:1175 blocked for more than 120 seconds.
Nov 14 16:03:22 rmc-cs57 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 14 16:03:22 rmc-cs57 kernel: xfsaild/dm-8    D ffff8ad5d484eeb0     0  1175      2 0x00000000
Nov 14 16:03:22 rmc-cs57 kernel: Call Trace:
Nov 14 16:03:22 rmc-cs57 kernel: [<ffffffff8d718f39>] schedule+0x29/0x70
Nov 14 16:03:22 rmc-cs57 kernel: [<ffffffff8d7168a9>] schedule_timeout+0x239/0x2c0
Nov 14 16:03:22 rmc-cs57 kernel: [<ffffffff8d060603>] ? x2apic_send_IPI_mask+0x13/0x20
Nov 14 16:03:22 rmc-cs57 kernel: [<ffffffff8d0d1d8c>] ? try_to_wake_up+0x18c/0x350
Nov 14 16:03:22 rmc-cs57 kernel: [<ffffffff8d7192ed>] wait_for_completion+0xfd/0x140
Nov 14 16:03:22 rmc-cs57 kernel: [<ffffffff8d0d2010>] ? wake_up_state+0x20/0x20
Nov 14 16:03:22 rmc-cs57 kernel: [<ffffffff8d0b68bd>] flush_work+0xfd/0x190
Nov 14 16:03:22 rmc-cs57 kernel: [<ffffffff8d0b36b0>] ? move_linked_works+0x90/0x90
Nov 14 16:03:22 rmc-cs57 kernel: [<ffffffffc05ced7a>] xlog_cil_force_lsn+0x8a/0x210 [xfs]
Nov 14 16:03:22 rmc-cs57 kernel: [<ffffffff8d0a697e>] ? try_to_del_timer_sync+0x5e/0x90
Nov 14 16:03:22 rmc-cs57 kernel: [<ffffffffc05ccce5>] _xfs_log_force+0x85/0x2c0 [xfs]
Nov 14 16:03:22 rmc-cs57 kernel: [<ffffffff8d0a6220>] ? internal_add_timer+0x70/0x70
Nov 14 16:03:22 rmc-cs57 kernel: [<ffffffffc05d8fdc>] ? xfsaild+0x16c/0x6f0 [xfs]
Nov 14 16:03:22 rmc-cs57 kernel: [<ffffffffc05ccf4c>] xfs_log_force+0x2c/0x70 [xfs]
Nov 14 16:03:22 rmc-cs57 kernel: [<ffffffffc05d8e70>] ? xfs_trans_ail_cursor_first+0x90/0x90 [xfs]
Nov 14 16:03:22 rmc-cs57 kernel: [<ffffffffc05d8fdc>] xfsaild+0x16c/0x6f0 [xfs]
Nov 14 16:03:22 rmc-cs57 kernel: [<ffffffffc05d8e70>] ? xfs_trans_ail_cursor_first+0x90/0x90 [xfs]
Nov 14 16:03:22 rmc-cs57 kernel: [<ffffffff8d0bdf21>] kthread+0xd1/0xe0
Nov 14 16:03:22 rmc-cs57 kernel: [<ffffffff8d0bde50>] ? insert_kthread_work+0x40/0x40
Nov 14 16:03:22 rmc-cs57 kernel: [<ffffffff8d7255dd>] ret_from_fork_nospec_begin+0x7/0x21
Nov 14 16:03:22 rmc-cs57 kernel: [<ffffffff8d0bde50>] ? insert_kthread_work+0x40/0x40

and the discard does not free the VDO volume completely.

Even if I place the XFS directly on top of VDO and mount it with the discard mount option (without thin LVM, and with the default /sys/kvdo/max_discard_sectors of 8), I get XFS errors when removing hundreds of thousands of files:

Nov 15 11:44:33 rmc-cs57 kernel: INFO: task rm:30539 blocked for more than 120 seconds.
Nov 15 11:44:33 rmc-cs57 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 15 11:44:33 rmc-cs57 kernel: rm              D ffff97fc672f8fd0     0 30539   7743 0x00000080
Nov 15 11:44:33 rmc-cs57 kernel: Call Trace:
Nov 15 11:44:33 rmc-cs57 kernel: [<ffffffffb2d18f39>] schedule+0x29/0x70
Nov 15 11:44:33 rmc-cs57 kernel: [<ffffffffb2d168a9>] schedule_timeout+0x239/0x2c0
Nov 15 11:44:33 rmc-cs57 kernel: [<ffffffffb26cea5a>] ? check_preempt_curr+0x8a/0xa0
Nov 15 11:44:33 rmc-cs57 kernel: [<ffffffffb26cea89>] ? ttwu_do_wakeup+0x19/0xe0
Nov 15 11:44:33 rmc-cs57 kernel: [<ffffffffb26d1d8c>] ? try_to_wake_up+0x18c/0x350
Nov 15 11:44:33 rmc-cs57 kernel: [<ffffffffb2d192ed>] wait_for_completion+0xfd/0x140
Nov 15 11:44:33 rmc-cs57 kernel: [<ffffffffb26d2010>] ? wake_up_state+0x20/0x20
Nov 15 11:44:33 rmc-cs57 kernel: [<ffffffffb26b68bd>] flush_work+0xfd/0x190
Nov 15 11:44:33 rmc-cs57 kernel: [<ffffffffb26b36b0>] ? move_linked_works+0x90/0x90
Nov 15 11:44:33 rmc-cs57 kernel: [<ffffffffc0256d7a>] xlog_cil_force_lsn+0x8a/0x210 [xfs]
Nov 15 11:44:33 rmc-cs57 kernel: [<ffffffffb26d7979>] ? select_task_rq_fair+0x549/0x700
Nov 15 11:44:33 rmc-cs57 kernel: [<ffffffffc0255110>] _xfs_log_force_lsn+0x80/0x340 [xfs]
Nov 15 11:44:33 rmc-cs57 kernel: [<ffffffffb2955f34>] ? __radix_tree_lookup+0x84/0xf0
Nov 15 11:44:33 rmc-cs57 kernel: [<ffffffffc024303c>] ? __xfs_iunpin_wait+0x9c/0x150 [xfs]
Nov 15 11:44:33 rmc-cs57 kernel: [<ffffffffc0255404>] xfs_log_force_lsn+0x34/0x70 [xfs]
Nov 15 11:44:33 rmc-cs57 kernel: [<ffffffffc02462a9>] ? xfs_iunpin_wait+0x19/0x20 [xfs]

If I mount the XFS without the discard option, the 'rm' finishes without errors, and if I do an fstrim afterwards, the VDO volume gets cleaned. The 'rm' with the discard mount option takes about twice as long as an 'rm' without the discard mount option AND a subsequent fstrim.

There seems to be a performance bottleneck in the interaction between the XFS discard mount option and VDO.

Regards,
Hansjörg

Mass migration to Filip.

(In reply to Bryan Gurney from comment #11)
> Hi David,
>
> I can see from the stats that "bios in discard" is 0, confirming that the
> VDO volume has not received any discards.
>
> Have you tried a test with the "mount -o discard" option?

The discard mount option was added to the tool that creates OSDs for Ceph, so as far as I know, this is resolved from my perspective.

https://tracker.ceph.com/issues/23581