Back to bug 1794715

Who When What Removed Added
Matt Benjamin (redhat) 2020-01-24 12:56:56 UTC Status NEW ASSIGNED
Matt Benjamin (redhat) 2020-01-24 12:58:31 UTC Flags needinfo?(jjose)
Gaurav Sitlani 2020-01-24 13:10:10 UTC CC gsitlani
Jerrin Jose 2020-01-24 13:27:34 UTC Flags needinfo?(jjose)
Matt Benjamin (redhat) 2020-01-24 19:00:42 UTC Flags needinfo?(jjose)
Matt Benjamin (redhat) 2020-01-24 21:49:42 UTC Flags needinfo?(jjose)
Mike Hackett 2020-01-24 22:15:34 UTC CC mhackett
Karun Josy 2020-01-25 05:19:32 UTC CC kjosy
Flags needinfo?(jjose) needinfo?(jjose)
Ashish Singh 2020-01-27 05:21:37 UTC CC assingh
Matt Benjamin (redhat) 2020-01-27 20:59:15 UTC Flags needinfo?(jjose)
Bob Emerson 2020-01-29 15:54:34 UTC CC roemerso
Jerrin Jose 2020-02-03 08:12:41 UTC CC tserlin
Flags needinfo?(mbenjamin)
Jerrin Jose 2020-02-03 10:32:12 UTC Flags needinfo?(jjose)
Matt Benjamin (redhat) 2020-02-18 22:28:10 UTC Target Release 5.* 4.1
Flags needinfo?(mbenjamin)
Yaniv Kaul 2020-02-19 16:46:50 UTC Priority unspecified medium
Yaniv Kaul 2020-03-10 10:47:08 UTC Keywords Performance
Matt Benjamin (redhat) 2020-03-24 12:21:06 UTC Status ASSIGNED POST
Hemanth Kumar 2020-03-26 10:50:07 UTC Status POST MODIFIED
Fixed In Version ceph-14.2.8-23.el8cp, ceph-14.2.8-16.el7cp
CC ceph-qe-bugs
Flags needinfo?(ceph-qe-bugs)
CC hyelloji
Flags needinfo?(ceph-qe-bugs) needinfo-
errata-xmlrpc 2020-03-26 14:19:35 UTC Status MODIFIED ON_QA
Karen Norteman 2020-04-03 14:12:25 UTC CC knortema
Blocks 1816167
Doc Type If docs needed, set a value Bug Fix
Karen Norteman 2020-04-14 18:07:16 UTC Flags needinfo?(mbenjamin)
Matt Benjamin (redhat) 2020-04-14 18:24:58 UTC Doc Text Cause:
In addition to other possible bottlenecks, RGW lifecycle processing performance was severely constrained by lack of parallelism.

Consequence:
Environments with many buckets/containers and/or very many objects (especially both at once) had no way to scale lifecycle processing up with the increasing workload of objects/buckets.

Fix:
This change adds explicit parallelism and tuning parameters for scaling it when the workload is likely to be large. Parallelism now has two dimensions: a single radosgw instance can have several lifecycle processing threads (number controlled by rgw_lc_max_worker, default 3), and each thread has multiple work-pool threads executing lifecycle work (number controlled by rgw_lc_max_wp_worker). A single bucket can be processed by only one lc-worker at a time, but that worker can allocate many work-pool threads to consume it. In addition, this change improves the allocation of lc "shards" (partitions) to workers, which should increase overall throughput.

Result:
Preliminary testing showed that lc-worker parallelism significantly accelerates overall processing.
Ranjini M N 2020-04-16 10:07:07 UTC CC rmandyam
Docs Contact rmandyam
Ranjini M N 2020-04-28 10:17:59 UTC Doc Text Cause:
In addition to other possible bottlenecks, RGW lifecycle processing performance was severely constrained by lack of parallelism.

Consequence:
Environments with many buckets/containers and/or very many objects (especially both at once) had no way to scale lifecycle processing up with the increasing workload of objects/buckets.

Fix:
This change adds explicit parallelism and tuning parameters for scaling it when the workload is likely to be large. Parallelism now has two dimensions: a single radosgw instance can have several lifecycle processing threads (number controlled by rgw_lc_max_worker, default 3), and each thread has multiple work-pool threads executing lifecycle work (number controlled by rgw_lc_max_wp_worker). A single bucket can be processed by only one lc-worker at a time, but that worker can allocate many work-pool threads to consume it. In addition, this change improves the allocation of lc "shards" (partitions) to workers, which should increase overall throughput.

Result:
Preliminary testing showed that lc-worker parallelism significantly accelerates overall processing.
.Improved throughput of Object Gateway lifecycle processing

Previously, Object Gateway lifecycle processing performance was constrained by a lack of parallelism, so environments with many buckets or containers could not scale lifecycle processing as the workload of objects or buckets increased. With this update, parallelism has two dimensions: a single Object Gateway instance can have several lifecycle processing threads, and each thread has multiple work-pool threads executing the lifecycle work. Additionally, this update improves the allocation of `shards` to workers, thereby increasing overall throughput.
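
The two dimensions of parallelism described in the Doc Text can be illustrated with a minimal Python sketch. This is not the RGW implementation (which is C++); it only models the shape of the scheme under stated assumptions. The function names (`expire`, `process_bucket`, `process_shard`, `run_lifecycle`) and the shard layout are hypothetical. The `lc_workers` argument stands in for rgw_lc_max_worker and `wp_workers` for rgw_lc_max_wp_worker; the value 3 for `wp_workers` is chosen for illustration only.

```python
from concurrent.futures import ThreadPoolExecutor


def expire(obj):
    # Placeholder for the per-object lifecycle action (hypothetical).
    return f"expired:{obj}"


def process_bucket(objects, wp_workers):
    # Second dimension: the lc-worker that owns this bucket fans the
    # bucket's objects out to its own pool of work-pool threads.
    with ThreadPoolExecutor(max_workers=wp_workers) as pool:
        return list(pool.map(expire, objects))


def process_shard(shard, wp_workers):
    # A shard maps bucket name -> list of objects. The owning lc-worker
    # handles the buckets one at a time, so a bucket is never processed
    # by two lc-workers at once.
    return {bucket: process_bucket(objs, wp_workers)
            for bucket, objs in shard.items()}


def run_lifecycle(shards, lc_workers=3, wp_workers=3):
    # First dimension: several lc-worker threads, each taking whole shards.
    with ThreadPoolExecutor(max_workers=lc_workers) as workers:
        results = workers.map(lambda s: process_shard(s, wp_workers), shards)
    merged = {}
    for result in results:
        merged.update(result)
    return merged


if __name__ == "__main__":
    demo_shards = [{"bucket-a": ["obj1", "obj2"]}, {"bucket-b": ["obj3"]}]
    print(run_lifecycle(demo_shards, lc_workers=2, wp_workers=2))
```

In this model, raising `lc_workers` helps when there are many buckets spread over many shards, while raising `wp_workers` helps when individual buckets hold very many objects.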
Matt Benjamin (redhat) 2020-04-28 10:47:20 UTC Flags needinfo?(mbenjamin)
errata-xmlrpc 2020-05-19 15:12:32 UTC Status ON_QA RELEASE_PENDING
errata-xmlrpc 2020-05-19 17:32:06 UTC Status RELEASE_PENDING CLOSED
Resolution --- ERRATA
Last Closed 2020-05-19 17:32:06 UTC
errata-xmlrpc 2020-05-19 17:32:51 UTC Link ID Red Hat Product Errata RHSA-2020:2231
