Bug 1846036 - LVMVDO volume does not reclaim disk space, eventually becomes read-only & fsck reports filesystem error
Summary: LVMVDO volume does not reclaim disk space, eventually becomes read-only & fsck reports filesystem error
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: lvm2
Version: 8.2
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: rc
Target Release: 8.0
Assignee: LVM and device-mapper development team
QA Contact: cluster-qe@redhat.com
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2020-06-10 16:01 UTC by Petr Beranek
Modified: 2021-09-07 11:57 UTC
CC List: 7 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-06-16 13:59:33 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Attachments: none


Links:
  Red Hat Issue Tracker RHELPLAN-46257 - Private: 0, Priority: None, Status: None, Summary: None, Last Updated: 2021-09-07 11:57:36 UTC

Description Petr Beranek 2020-06-10 16:01:21 UTC
Description of problem:
When writing to and deleting data from an LVMVDO volume, the volume does
not reclaim the space freed by the deletion. When checking volume space
utilization via `lvs', the "Data%" value never decreases (but e.g. the
`df' command does reflect the data deletion and reports the expected
values). Eventually, when the "Data%" value of `vdo_pool' reaches 100.00,
the LVMVDO volume ends up in read-only mode and `fsck' reports a
filesystem error.


Version-Release number of selected component (if applicable):
lvm2-2.03.08-3.el8.x86_64 (RHEL 8.2.1)
lvm2-2.03.09-2.el8.x86_64 (RHEL 8.3.0)
(other versions were not tested)


How reproducible:
always (see steps below)


Steps to Reproduce:
prerequisites: 3 disks, 5GB each
1. pvcreate /dev/sd{a,b,c}
2. vgcreate vdovg /dev/sd{a,b,c}
3. lvcreate --type vdo -n vdolv -L 10G -V 20G vdovg/vdo_pool
4. mkfs.ext4 -E nodiscard /dev/vdovg/vdolv
5. mkdir /mnt/vdolv
6. mount /dev/vdovg/vdolv /mnt/vdolv/
*  check logical volume stats using `lvs' and `df -h' (see the check sketch right after these steps)
7. dd if=/dev/urandom of=/mnt/vdolv/urandom_data.bin bs=1G count=5 iflag=fullblock
   # the overall result is exactly the same if you copy the random
   # data using `cp' instead of writing it directly with the `dd' command
*  check logical volume stats using `lvs' and `df -h'
8. rm /mnt/vdolv/urandom_data.bin
*  check logical volume stats using `lvs' and `df -h'
9. repeat step #7
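For reference, a minimal sketch of the check run at each point marked with '*' above (the -o column selection is just one way to watch Data%, not necessarily the exact commands originally used):

  lvs -a -o lv_name,lv_size,data_percent vdovg   # Data% of vdo_pool never decreases after the rm
  df -h /mnt/vdolv                               # df, by contrast, reflects the deletion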


Actual results:
dd: error writing '/mnt/vdolv/urandom_data.bin': Read-only file system
2+0 records in
1+0 records out 
1384009728 bytes (1.4 GB, 1.3 GiB) copied, 41.5512 s, 33.3 MB/s

[root@virt-175 ~]# lvs 
  LV       VG            Attr       LSize   Pool     Origin Data%  Meta%  Move Log Cpy%Sync Convert
  root     rhel_virt-175 -wi-ao----  <6.20g
  swap     rhel_virt-175 -wi-ao---- 820.00m
  vdo_pool vdovg         dwi-------  10.00g                 100.00
  vdolv    vdovg         vwi-aov---  20.00g vdo_pool        29.94

[root@virt-175 ~]# df -h
Filesystem                       Size  Used Avail Use% Mounted on
devtmpfs                         991M     0  991M   0% /dev
tmpfs                           1011M     0 1011M   0% /dev/shm
tmpfs                           1011M   22M  989M   3% /run
tmpfs                           1011M     0 1011M   0% /sys/fs/cgroup
/dev/mapper/rhel_virt--175-root  6.2G  3.1G  3.2G  49% /
/dev/vda1                       1014M  194M  821M  20% /boot
tmpfs                            202M     0  202M   0% /run/user/0
/dev/mapper/vdovg-vdolv           20G  1.4G   18G   8% /mnt/vdolv

[root@virt-175 ~]# umount /mnt/vdolv/
[root@virt-175 ~]# fsck.ext4 /dev/vdovg/vdolv
e2fsck 1.45.4 (23-Sep-2019)
/dev/vdovg/vdolv: recovering journal
Superblock needs_recovery flag is clear, but journal has data.
Run journal anyway<y>? yes
fsck.ext4: unable to set superblock flags on /dev/vdovg/vdolv


/dev/vdovg/vdolv: ********** WARNING: Filesystem still has errors **********


Expected results:
All data (5GB) has been written. The `vdolv' volume is available for further use.

Comment 1 Zdenek Kabelac 2020-06-16 13:58:54 UTC
Getting to step 8: since the filesystem is mounted without immediate discard (and this is usually the recommended way), after 'rm' it is the user's responsibility to initiate trimming to release fs blocks, by running fstrim as an extra step after step 8.
(But fstrim is quite a SLOW operation with VDO volumes compared to Thin volumes.)
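
For illustration, a minimal sketch of that manual trim step, using the names from the reproducer (the discard mount option is shown only as an alternative, not as a recommendation):

  # return blocks freed by rm back to the VDO pool
  fstrim -v /mnt/vdolv
  # Data% of vdovg/vdo_pool should drop again afterwards
  lvs -o lv_name,data_percent vdovg

  # alternative: mount with immediate discard so rm frees pool space automatically
  mount -o discard /dev/vdovg/vdolv /mnt/vdolv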

So the report looks like a misuse of VDO volumes, but let's add a few more comments:

The primary goal *IS* to avoid hitting the out-of-space state. Once the user reaches a 'full' pool (applies to both Thin & VDO), the user has to deal with the consequences. The most usable 'recovery' scenario is to extend the pool to accommodate more of the user's data.

Once the pool is out of space, there is no easy way to 'repair' e.g. a filesystem located on such a device,
as there are no free blocks left for the filesystem fsck operation to write (with VDO the situation is even worse,
since even an overwrite of an already owned block may require a few new blocks in the pool).

Unlike filesystems like btrfs/zfs, the combination of ext4 & a provisioned device works with two different entities, each with its own view of free space.

So to avoid the out-of-space state above, the user should enable/use autoextension of the VDO pool device for when more data is written than the pool currently has available.
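
A minimal sketch of what enabling that autoextension could look like, assuming the vdo_pool_autoextend_* settings documented for the installed lvm2 version in lvm.conf(5)/lvmvdo(7); the threshold and percent values are examples only:

  # /etc/lvm/lvm.conf, activation section
  activation {
      # autoextend vdo_pool once it is 70% full ...
      vdo_pool_autoextend_threshold = 70
      # ... by 20% of its current size
      vdo_pool_autoextend_percent = 20
  }

  # make sure dmeventd monitoring is enabled for the pool
  lvchange --monitor y vdovg/vdo_pool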

The user cannot treat out-of-space on a VDO device as a case 'similar' to an out-of-space filesystem - these 2 cases are very different!

When the filesystem 'exhausts' ALL blocks of the provisioned device (be it Thin or VDO), it may not be able to further update even its metadata. The user has basically reached a 'dead end': unmounting (when possible) is required, and before fsck can run, *new space* has to be added to the pool so the 'repair' can proceed and acquire new empty blocks in the pool.

Once the filesystem is 'repaired/fixed', the user can run e.g. 'fstrim' to reclaim free blocks and return them back to the pool.
(With a VDO volume, fstrim can be a very lengthy/slow operation for a big VDO pool!)
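
Putting the recovery flow from this comment into a minimal sketch (names follow the reproducer; the +5G extension is an arbitrary example and assumes the VG still has free extents):

  umount /mnt/vdolv                  # if it is not already unmounted
  lvextend -L +5G vdovg/vdo_pool     # add *new space* to the pool first
  fsck.ext4 -f /dev/vdovg/vdolv      # repair can now allocate free blocks in the pool
  mount /dev/vdovg/vdolv /mnt/vdolv
  fstrim -v /mnt/vdolv               # reclaim freed blocks back to the pool (can be slow on VDO)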

Comment 3 Petr Beranek 2020-06-19 13:18:04 UTC
Thank you, Zdenek, for the clarification. This important detail, that the user is
responsible for discarding unused fs blocks, is missing from the current lvmvdo(7)
manpage and does not seem to be obvious from the product itself. Our
users/customers should be explicitly warned about this, or better, we should
also provide recommendations on how to deal with it.

This risk may be mitigated by proper volume monitoring/autoextension, but
in any case we should not expect that all our users/customers always have
sufficient VDO expertise.

Therefore I have opened a bug (https://bugzilla.redhat.com/show_bug.cgi?id=1849009)
related to the current lvmvdo documentation.

