Bug 2119039 - Enhance user experience with corner case of full VDOPOOL
Keywords:
Status: NEW
Alias: None
Product: Red Hat Enterprise Linux 9
Classification: Red Hat
Component: lvm2
Version: 9.1
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assignee: Zdenek Kabelac
QA Contact: cluster-qe
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2022-08-17 10:40 UTC by Zdenek Kabelac
Modified: 2023-08-10 15:41 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Type: Bug
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker RHELPLAN-131322 0 None None None 2022-08-17 10:43:05 UTC

Description Zdenek Kabelac 2022-08-17 10:40:39 UTC
When the VDOPOOL becomes full with the current version of lvm2, the only way for a user to discover this state is to check for 100% data usage of the VDOPOOL with lvs.
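
A minimal sketch of that manual check, assuming the volume group is named 'vg' as in the example in this report. The helper names pool_is_full and report_full_pools are hypothetical and are not part of lvm2; only the lvs call with the data_percent field is real.

```shell
# Hypothetical helper: succeeds (exit 0) when the given data usage
# percentage, as printed by lvs in the data_percent field, is 100 or more.
# An empty value (an LV without data usage) counts as "not full".
pool_is_full() {
    awk -v p="$1" 'BEGIN { exit !(p + 0 >= 100) }'
}

# Hypothetical helper: prints every LV in the given VG whose data usage
# has reached 100%. Needs root and a real volume group to run.
report_full_pools() {
    lvs -a --noheadings -o lv_name,data_percent "$1" |
    while read -r name pct; do
        if pool_is_full "$pct"; then
            echo "$name is full"
        fi
    done
}

# Usage (assumption: VG name 'vg'):
#   report_full_pools vg
```

This only polls the current state, so a pool that was full temporarily and has since been trimmed would not be reported.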

However, this is not a good way to inform the user about such a highly problematic situation.

The associated problems: the fullness might be a 'temporary' issue, so the user might not even be aware that the full-pool error state ever occurred. It is not even reported by the kvdo target. The only observable signs are the report from 'dmeventd' monitoring and the kernel write error message, which may be hard to associate with it.

A small example of how to examine the situation:

# Create a small VDO pool with an 'overprovisioned' 200MiB volume,
# while the vdopool can only store <130MiB
#
# lvcreate --vdo -V200M -L2.9G --vdosettings 'vdoslabsizemb=128' -n lv vg 

# Now start writing urandom data to this volume with 'dd'

# dd if=/dev/urandom of=/dev/vg/lv bs=1M count=140 status=progress

For the user, this operation ends with a 'success' return code of 0.
In parallel, the kernel reports some 'async' write errors in dmesg.

The situation improves if the option 'conv=fdatasync' or 'oflag=direct' is used with dd, so that the userspace app at least recognizes the write error.
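
A minimal sketch of the 'conv=fdatasync' variant, assuming GNU dd. The helper name write_synced is hypothetical. Because dd flushes the data before exiting, a failed write should show up in its exit status instead of only in dmesg.

```shell
# Hypothetical helper: write COUNT MiB from SRC to DST and force the data
# to stable storage before dd exits, so write errors reach the caller.
# Arguments: SRC DST COUNT
write_synced() {
    dd if="$1" of="$2" bs=1M count="$3" conv=fdatasync status=none
}

# Usage against the volume from the example above (needs root):
#   if ! write_synced /dev/urandom /dev/vg/lv 140; then
#       echo "write failed, check the pool usage with lvs" >&2
#   fi
```

This only tells the writing application that something went wrong; it does not change what lvm2 itself reports about the pool.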

Yet we can still easily 'miss' the error state on the lvm2 side, as even a simple 'TRIM' on the device might make such a VDO LV look normal, although after a pool has been overfilled, fsck should always be run.

Enhance this use case. Similarities can probably be found with thin-pool out-of-data_space/out_of_metadata_space handling.

