Bug 1665654
Summary: | lvm needs to increase default value of migration_threshold and better document migration_threshold | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 8 | Reporter: | nikhil kshirsagar <nkshirsa> |
Component: | lvm2 | Assignee: | Zdenek Kabelac <zkabelac> |
lvm2 sub component: | Cache Logical Volumes | QA Contact: | cluster-qe <cluster-qe> |
Status: | CLOSED ERRATA | Docs Contact: | |
Severity: | urgent | ||
Priority: | urgent | CC: | agk, cmarthal, heinzm, jbrassow, loberman, mcsontos, msnitzer, pasik, prajnoha, zkabelac |
Version: | 8.0 | Keywords: | Bugfix, Triaged |
Target Milestone: | rc | ||
Target Release: | 8.2 | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | lvm2-2.03.09-4.el8 | Doc Type: | If docs needed, set a value |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2020-11-04 02:00:20 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | |||
Bug Blocks: | 1679810, 1755139, 1825061 |
Description
nikhil kshirsagar
2019-01-12 03:29:22 UTC
Items I think we need for this bz:

i) When a new cache is created, set the migration_threshold to a value that allows several blocks to migrate at once. I suggest:

    migration_threshold = max(2048, chunk_size * 10)

ii) Describe migration_threshold in the man page for lvchange.

This issue has already been resolved for 2.03.03 with libdm 1.02.159, and for stable release 2.02.184 with libdm 1.02.156 (March 2019). libdm automatically guards the passed setting and adjusts migration_threshold to be at least 8 times bigger than chunk_size.

However, the usefulness of caches with chunk sizes in the range of MB is questionable, and it should be benchmarked whether they really bring the expected benefits.

I don't see any default changes between rhel7.7 and current 8.2 regarding migration_threshold. In comment #0, #1, and #2 it was discussed adding migration_threshold to lvmcache(7) and/or lvm.conf. Nothing has been added and I still don't see anything listed in '-o cache_settings'.

    kernel-4.18.0-185.el8    BUILT: Fri Feb 28 17:18:25 CST 2020
    lvm2-2.03.08-2.el8    BUILT: Mon Feb 24 11:21:38 CST 2020
    lvm2-libs-2.03.08-2.el8    BUILT: Mon Feb 24 11:21:38 CST 2020
    device-mapper-1.02.169-2.el8    BUILT: Mon Feb 24 11:21:38 CST 2020
    device-mapper-libs-1.02.169-2.el8    BUILT: Mon Feb 24 11:21:38 CST 2020
    device-mapper-event-1.02.169-2.el8    BUILT: Mon Feb 24 11:21:38 CST 2020
    device-mapper-event-libs-1.02.169-2.el8    BUILT: Mon Feb 24 11:21:38 CST 2020

    [root@hayes-02 ~]# grep migration_threshold /etc/lvm/lvm.conf
    [root@hayes-02 ~]#

    # RHEL8.2
    [root@hayes-02 ~]# lvs -a -o +devices,cache_settings
    LV                 VG           Attr       LSize   Pool         Origin          Data% Meta% Move Log Cpy%Sync Convert Devices             CacheSettings
    [POOL_cpool]       cache_sanity Cwi---C--- 500.00m                              0.10  1.17           0.00             POOL_cpool_cdata(0)
    [POOL_cpool_cdata] cache_sanity Cwi-ao---- 500.00m                                                                    /dev/sdo1(4)
    [POOL_cpool_cmeta] cache_sanity ewi-ao----   8.00m                                                                    /dev/sdo1(2)
    corigin            cache_sanity Cwi-a-C---   4.00g [POOL_cpool] [corigin_corig] 0.10  1.17           0.00             corigin_corig(0)
    [corigin_corig]    cache_sanity owi-aoC---   4.00g                                                                    /dev/sdn1(0)
    [lvol0_pmspare]    cache_sanity ewi-------   8.00m                                                                    /dev/sdo1(0)

The default size 2048 still appears to be the same as well.

    # RHEL7.7
    [root@hayes-01 ~]# dmsetup status
    cache_sanity-corigin: 0 8388608 cache 8 24/2048 128 9/8000 0 49 0 0 0 9 0 3 metadata2 writethrough no_discard_passdown 2 migration_threshold 2048 mq 10 random_threshold 0 sequential_threshold 0 discard_promote_adjustment 0 read_promote_adjustment 0 write_promote_adjustment 0 rw -

    # RHEL8.2
    [root@hayes-02 ~]# dmsetup status
    cache_sanity-corigin: 0 8388608 cache 8 24/2048 128 8/8000 5 51 0 0 0 8 0 3 metadata2 writethrough no_discard_passdown 2 migration_threshold 2048 mq 10 random_threshold 0 sequential_threshold 0 discard_promote_adjustment 0 read_promote_adjustment 0 write_promote_adjustment 0 rw -

Please provide devel unit testing results when moving this back to ON_QA.

The outcome of the committed patch/solution in lvm2 was that we put a protection at the libdm level. Whenever a DM cache table line is submitted by ANY user of libdm (i.e. the lvm2 tool), the table line's migration threshold is modified to always ensure that at least 8 cache chunks can be migrated.

This is good as long as the 'cache chunks' are not getting absurdly large. If you set the cache chunk size to 128M, the migration threshold would ensure that at least 1G can be migrated, which would probably considerably limit the bandwidth remaining available to a disk user.

For this reason we are considering limiting the maximum cache chunk size to smaller values. We are not yet sure about the final max value; maybe 16MiB sounds like a reasonable default.
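To make the arithmetic above concrete, here is a minimal sketch of the 8-chunk floor as it is described in this bug. It is not the libdm source: the function names and the constants `MIN_CHUNKS` and `SECTOR_BYTES` are made up for illustration, and sectors are assumed to be 512 bytes.

```python
SECTOR_BYTES = 512   # assumption: device-mapper sectors are 512 bytes
MIN_CHUNKS = 8       # floor described in this bug: at least 8 cache chunks


def guarded_migration_threshold(requested_sectors: int, chunk_sectors: int) -> int:
    """Raise the requested threshold so at least MIN_CHUNKS chunks fit in it."""
    return max(requested_sectors, MIN_CHUNKS * chunk_sectors)


def proposed_default(chunk_sectors: int) -> int:
    """The reporter's original suggestion: max(2048, chunk_size * 10)."""
    return max(2048, chunk_sectors * 10)


def sectors_to_mib(sectors: int) -> float:
    """Convert a sector count to MiB."""
    return sectors * SECTOR_BYTES / (1024 * 1024)


if __name__ == "__main__":
    # 64 KiB chunks (128 sectors) and 128 MiB chunks (262144 sectors)
    for chunk_sectors in (128, 262144):
        threshold = guarded_migration_threshold(2048, chunk_sectors)
        print(chunk_sectors, threshold, sectors_to_mib(threshold))
```

With the 64 KiB chunks (128 sectors) seen in the status lines above, the floor is only 1024 sectors, so the default of 2048 is left alone. With 128 MiB chunks the floor becomes 2097152 sectors, which is the 1 GiB mentioned in the comment.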
Pushed an update of the man page mentioning 'migration_threshold' for cache pools in lvmcache.7, with a couple more words and an explanation of how to access migration_threshold from the kernel via 'lvs -o+kernel_cache_settings,cache_settings VG/LV':

https://www.redhat.com/archives/lvm-devel/2020-June/msg00041.html

Marking verified with the caveat that, without devel results on default migration threshold increases, all I did this time around was verify that there was something in the man page in the latest rpms.

    kernel-4.18.0-232.el8    BUILT: Mon Aug 10 02:17:54 CDT 2020
    lvm2-2.03.09-5.el8    BUILT: Wed Aug 12 15:51:50 CDT 2020
    lvm2-libs-2.03.09-5.el8    BUILT: Wed Aug 12 15:51:50 CDT 2020
    lvm2-lockd-2.03.09-5.el8    BUILT: Wed Aug 12 15:51:50 CDT 2020
    device-mapper-1.02.171-5.el8    BUILT: Wed Aug 12 15:51:50 CDT 2020
    device-mapper-libs-1.02.171-5.el8    BUILT: Wed Aug 12 15:51:50 CDT 2020
    device-mapper-event-1.02.171-5.el8    BUILT: Wed Aug 12 15:51:50 CDT 2020
    device-mapper-event-libs-1.02.171-5.el8    BUILT: Wed Aug 12 15:51:50 CDT 2020

    lvcreate --wipesignatures y -L 4G -n corigin cache_sanity @slow
    lvcreate -L 2G -n fs_B_pool cache_sanity @fast
    lvcreate -L 12M -n fs_B_pool_meta cache_sanity @fast
    lvconvert --yes --type cache-pool --cachepolicy smq --cachemode writethrough -c 32 --poolmetadata cache_sanity/fs_B_pool_meta cache_sanity/fs_B_pool
      WARNING: Converting cache_sanity/fs_B_pool and cache_sanity/fs_B_pool_meta to cache pool's data and metadata volumes with metadata wiping.
      THIS WILL DESTROY CONTENT OF LOGICAL VOLUME (filesystem etc.)
    lvconvert --yes --type cache --cachemetadataformat 1 --cachepool cache_sanity/fs_B_pool cache_sanity/corigin

    [root@hayes-02 ~]# lvs -o+kernel_cache_settings,cache_settings
    LV           VG           Attr       LSize   Pool              Origin          Data% Meta% Move Log Cpy%Sync Convert KCacheSettings           CacheSettings
    corigin      cache_sanity owi-aoC---   4.00g [fs_B_pool_cpool] [corigin_corig] 0.94  6.58           0.00             migration_threshold=2048
    write_snap_1 cache_sanity swi-aos--- 500.00m                   corigin         75.25
    write_snap_2 cache_sanity swi-aos--- 500.00m                   corigin         7.75
    write_snap_3 cache_sanity swi-aos--- 500.00m                   corigin         0.45
    write_snap_4 cache_sanity swi-aos--- 500.00m                   corigin         0.44
    write_snap_5 cache_sanity swi-aos--- 500.00m                   corigin         0.45

    [root@hayes-02 ~]# dmsetup status
    cache_sanity-write_snap_4-cow: 0 1024000 linear
    cache_sanity-corigin-real: 0 8388608 cache 8 202/3072 64 613/65536 2943 2624 21 0 0 501 0 2 writethrough no_discard_passdown 2 migration_threshold 2048 smq 0 rw -
    cache_sanity-write_snap_5-cow: 0 1024000 linear
    cache_sanity-corigin: 0 8388608 snapshot-origin
    cache_sanity-write_snap_1-cow: 0 1024000 linear
    cache_sanity-write_snap_5: 0 8388608 snapshot 4608/1024000 256
    cache_sanity-fs_B_pool_cpool_cdata: 0 4194304 linear
    cache_sanity-write_snap_4: 0 8388608 snapshot 4480/1024000 256
    cache_sanity-corigin_corig: 0 8388608 linear
    cache_sanity-fs_B_pool_cpool_cmeta: 0 24576 linear
    cache_sanity-write_snap_2-cow: 0 1024000 linear
    cache_sanity-write_snap_3: 0 8388608 snapshot 4608/1024000 256
    cache_sanity-write_snap_2: 0 8388608 snapshot 79360/1024000 256
    cache_sanity-write_snap_3-cow: 0 1024000 linear
    cache_sanity-write_snap_1: 0 8388608 snapshot 770560/1024000 384

Using a chunk size that is too large can result in wasteful use of the cache, in which small reads and writes cause large sections of an LV to be stored in the cache. It can also require increasing migration threshold which defaults to 2048 sectors (1 MiB). Lvm2 ensures migration threshold is at least 8 chunks in size. This may in some cases result in very high bandwidth load of transferring data between the cache LV and its cache origin LV. However, choosing a chunk size that is too small can result in more overhead trying to manage the numerous chunks that become mapped into the cache. Overhead can include both excessive CPU time searching for chunks, and excessive memory tracking chunks.

Command to display the chunk size:

    lvs -o+chunksize VG/LV

lvm.conf(5) cache_pool_chunk_size controls the default chunk size.

The default value is shown by:

    lvmconfig --type default allocation/cache_pool_chunk_size

Checking migration threshold (in sectors) of running cached LV:

    lvs -o+kernel_cache_settings VG/LV

dm-cache migration threshold

Migrating data between the origin and cache LV uses bandwidth. The user can set a throttle to prevent more than a certain amount of migration occurring at any one time. Currently dm-cache is not taking any account of normal io traffic going to the devices.

User can set migration threshold via cache policy settings as "migration_threshold=<#sectors>" to set the maximum number of sectors being migrated, the default being 2048 sectors (1MiB).

Command to set migration threshold to 2MiB (4096 sectors):

    lvcreate --cachepolicy 'migration_threshold=4096' VG/LV

Command to display the migration threshold:

    lvs -o+kernel_cache_settings,cache_settings VG/LV
    lvs -o+chunksize VG/LV

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (lvm2 bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4546
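For anyone checking this on their own systems: the chunk size and migration threshold can also be read straight out of a `dmsetup status` line like the ones quoted in this bug. The following is a rough sketch, not an official tool. It assumes the dm-cache status layout in which the third field after the `cache` target name is the cache block size in sectors and `migration_threshold` appears as a key followed by its value. The function name and the returned keys are hypothetical.

```python
# Sample line copied from the dmsetup status output quoted in this bug.
SAMPLE = (
    "cache_sanity-corigin-real: 0 8388608 cache 8 202/3072 64 613/65536 "
    "2943 2624 21 0 0 501 0 2 writethrough no_discard_passdown "
    "2 migration_threshold 2048 smq 0 rw -"
)


def parse_cache_status(line: str) -> dict:
    """Extract chunk size and migration threshold from one status line.

    Raises ValueError if the line is not a dm-cache target or carries
    no migration_threshold key.
    """
    name, rest = line.split(":", 1)
    fields = rest.split()
    # Layout: <start> <length> <target type> <target-specific fields...>
    if len(fields) < 3 or fields[2] != "cache":
        raise ValueError("not a dm-cache status line")
    args = fields[3:]
    # Assumed layout: metadata block size, used/total metadata blocks,
    # then the cache block (chunk) size in sectors.
    chunk_sectors = int(args[2])
    # Core args are key/value pairs; list.index raises ValueError if absent.
    idx = args.index("migration_threshold")
    threshold = int(args[idx + 1])
    return {
        "name": name.strip(),
        "chunk_sectors": chunk_sectors,
        "migration_threshold": threshold,
        "chunks_per_threshold": threshold // chunk_sectors,
    }


if __name__ == "__main__":
    print(parse_cache_status(SAMPLE))
```

For the sample line, the chunk size is 64 sectors (32 KiB, matching `-c 32`) and the threshold is 2048 sectors, so 32 chunks fit within the threshold, well above the 8-chunk minimum.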