Description of problem:

The RGW team believes there are at least 10 commits in rhcs-4.x that address serious bucket listing correctness and performance issues and that are eligible for backport to rhcs-3.x. We believe that backports of these commits should be included in the upcoming rhcs-3.3z7 release, to reduce the cost of support.

List of commits:

commit 67c36d4a7ff2995a3ad235ea1774274f6397ebf7
Author: J. Eric Ivancich <ivancich>
Date:   Fri Dec 4 18:16:08 2020 -0500

    rgw: in ordered bucket listing skip namespaced entries when possible

    When listing non-namespaced entries in the bucket index, the code
    would march through the namespaced entries in blocks, requesting
    all of them from the CLS layer. When there were many namespaced
    entries, it would significantly affect the performance of ordered
    listing.

    This commit adds code to advance the marker passed to lower layers
    to skip past namespaced entries. This is challenging in that
    non-namespaced entries can appear in the middle of the namespaced
    entries. We'll ignore the issue of instance tags in names to
    simplify the following discussion.

    Non-namespaced entries are indexed by "name". Namespaced entries
    are indexed by _namespace_name, using underscores to surround the
    namespace. The challenge comes with entries such as "_name", where
    the name begins with an underscore. In that case we index them by
    "__name", quoting the underscore with another.

    Now the extra challenge comes due to the lexical ordering of the
    following:

        ASP
        _BAT_cat
        __DOG
        _eel_FOX
        goat

    Note that the namespaced entries are in positions 2 and 4, and the
    non-namespaced entries are in positions 1, 3, and 5. So when
    skipping past the namespaced entries, we have to be careful not to
    skip past the non-namespaced entries that begin with underscore.

    Additional code clean-ups done as well.

    Signed-off-by: J. Eric Ivancich <ivancich>

    Resolves: rhbz#1883283

commit e70d08c483c15067c5cf2d7c7f8d4fe1b6192bf3
Author: J. Eric Ivancich <ivancich>
Date:   Sat Nov 21 11:10:35 2020 -0500

    rgw: during GC defer, prevent new GC enqueue

    With the new queue-based GC code, when a GC defer operation is
    performed, it adds an "urgent" record to prevent GC from occurring,
    whether there's a GC entry or not (it's not checked). But either
    way the code *also* adds a new GC entry to the queue to cause GC to
    occur at a later time. This would be incorrect if there is no GC
    entry to begin with.

    This will cause GC to delete tail objects when there has been no
    user-initiated delete. In other words a READ operation can result
    in a permanent DELETE of portions of large objects.

    This is a temporary fix for this bug. It marks the code in error
    and prevents GC defer operations from taking place at all as a
    temporary measure.

    Signed-off-by: J. Eric Ivancich <ivancich>

    Resolves: rhbz#1892644

commit 340360faf951c92cc3a3ca5ca57842ac6b4e72ec
Author: J. Eric Ivancich <ivancich>
Date:   Wed Oct 21 10:32:30 2020 -0400

    tools/rados: flush formatter periodically during json output of `rados ls`

    While `rados ls` is emitting object info through a json formatter,
    flush the formatter once at least 4096 bytes are buffered for
    output.

    Signed-off-by: J. Eric Ivancich <ivancich>
    (cherry picked from commit 1548ef7a97559f17023f17842dab51d47cef89df)

    Resolves: rhbz#1883590

commit 7011bf3f64b183243b788323eb76cdd114a3ed07
Author: J. Eric Ivancich <ivancich>
Date:   Fri Oct 9 16:06:55 2020 -0400

    rgw: rgw-orphan-list should use "plain" formatted `rados ls` output

    The previous version that used "json-pretty" output for `rados ls`
    added complications due to json's escaping of special characters.
    So this version returns to the "plain" output for `rados ls` but
    deals with entries (oids) that might have namespaces and/or
    locators as well.

    Signed-off-by: J. Eric Ivancich <ivancich>
    (cherry picked from commit 5b994f90594208dab81045732099a03686819b30)

    Resolves: rhbz#1883590

commit 501239bb32c4ccf867610625c5fa049b98a43a61
Author: J. Eric Ivancich <ivancich>
Date:   Tue Jun 9 23:12:22 2020 -0400

    rgw: use exponential back-off for retries after bucketinfo update race

    Use a simple exponential back-off mechanism during write races of
    bucketinfo updates to note bucket index reshard status.

    Signed-off-by: J. Eric Ivancich <ivancich>

    Resolves: rhbz#1846504

commit 7b46439c2a0a5f67c11c7edbbd0252f944f2c045
Author: J. Eric Ivancich <ivancich>
Date:   Tue Sep 15 14:20:04 2020 -0400

    rgw: advance pseudo-folders properly in delimited ordered listing

    The code mistakenly uses the current marker to figure out how to
    skip past a pseudo-directory. This could allow for some entries in
    a bucket to be skipped. The code should have used the current
    pseudo-directory to determine what to skip past.

    Signed-off-by: J. Eric Ivancich <ivancich>
    (cherry picked from commit c3f346e3157ef56254b199ea46e26016166451e4)

    Resolves: rhbz#1874645

commit 6d1105f33479a6f7a2ff110a8540e75c027d2bdf
Author: J. Eric Ivancich <ivancich>
Date:   Mon Sep 14 19:33:51 2020 -0400

    Revert "rgw: fix list bucket with delimiter wrongly skip some special keys"

    This reverts commit 04b15cef88c5d50ce18911f63c63fa094101ced0.

    While this did fix https://tracker.ceph.com/issues/40905, it did so
    in an unnecessarily complex manner. So we're reverting it to more
    easily apply a cleaner solution.

    Signed-off-by: J. Eric Ivancich <ivancich>
    (cherry picked from commit 130a74a60802d8b0db15dc0d5c9fb6164d78d72d)

    Resolves: rhbz#1874645

commit ae82ee1df4e86c0c56570230aaf58cb1f0a5a33a
Author: J. Eric Ivancich <ivancich>
Date:   Tue Oct 6 15:21:02 2020 -0400

    rgw: allow rgw-orphan-list to note when rados objects are in namespace

    Currently namespaces and locators are ignored when `rados ls` is
    run by rgw-orphan-list to record RADOS's known objects. However
    there have been cases where RADOS objects have a locator, and when
    one is included in the listing, the script does not handle it
    correctly.

    Now when objects have locators, we will prevent their output from
    entering the .intermediate file.

    Additionally we do not expect RGW data objects to be in RADOS
    namespaces, so when a namespaced object is detected, we'll error
    out with a message.

    Signed-off-by: J. Eric Ivancich <ivancich>
    (cherry picked from commit ddf52016fa03ba192f242ad641a5c8e5a95035a1)

    Resolves: rhbz#1883590

commit eb8e59276bec45831fbe123f5c51a9e54270c745
Author: J. Eric Ivancich <ivancich>
Date:   Tue Oct 6 12:42:22 2020 -0400

    rgw: fix setting of namespace in ordered and unordered bucket listing

    The namespace is not always set correctly during bucket listing.
    This can, for example, cause the listing of incomplete multipart
    uploads, which are in the _multipart_ namespace, to not paginate
    correctly, and cause entries to be re-listed.

    Signed-off-by: J. Eric Ivancich <ivancich>
    (cherry picked from commit bd6f163f366753e8ec42b85a53334f4bf78916bd)

    Resolves: rhbz#1883283

commit 1e15dde27c8078f93a98003a3c592255a902af92
Author: J. Eric Ivancich <ivancich>
Date:   Thu Oct 1 13:33:01 2020 -0400

    rgw: radosgw-admin should paginate internally when listing bucket

    Currently `radosgw-admin bucket list ...`, when listing a bucket,
    asks for the value of "--max-entries" internally. To list a large
    bucket entirely the user would have to set "--max-entries" to a
    large value (e.g., 10000000). Internally this doesn't paginate, so
    it will try to produce the entire list at once. This can consume a
    lot of memory, and there are known cases where this induces an
    out-of-memory crash.

    So now we'll set a maximum pagination size of 10,000. So even with
    large values of "--max-entries" it will still be able to produce
    the full listing without stressing memory, because it will ask for
    at most 10,000 entries at a time.

    Signed-off-by: J. Eric Ivancich <ivancich>
    (cherry picked from commit 6d033061bf9eaebf3dab37b9ed45de22ce6fa6b7)

    Resolves: rhbz#1883283
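To illustrate the index-key ordering described in commit 67c36d4a above, here is a minimal Python sketch. It is not the RGW code: `index_key`, `is_namespaced`, and `skip_namespace` are hypothetical helpers, instance tags are ignored as in the commit message, and the "seek to the character after the closing underscore" trick is only one assumed way a marker could be advanced past a namespace.

```python
import bisect


def index_key(name, namespace=""):
    """Build a simplified bucket-index key (instance tags ignored)."""
    if namespace:
        # Namespaced entries: underscores surround the namespace.
        return "_" + namespace + "_" + name
    if name.startswith("_"):
        # Plain names that begin with "_" are quoted with another "_".
        return "_" + name
    return name


def is_namespaced(key):
    """A single leading underscore marks a namespace; "__" is a quoted name."""
    return key.startswith("_") and not key.startswith("__")


def skip_namespace(sorted_keys, key):
    """Return the position of the first key after the namespace `key` is in.

    Assumed approach: every key in the namespace starts with "_<ns>_", so
    seeking to "_<ns>" followed by the character after "_" lands just past
    all of them.
    """
    ns_end = key.index("_", 1)  # position of the closing underscore
    bound = key[:ns_end] + chr(ord("_") + 1)
    return bisect.bisect_left(sorted_keys, bound)


# The five entries from the commit message, as (name, namespace) pairs.
entries = [("ASP", ""), ("cat", "BAT"), ("_DOG", ""), ("FOX", "eel"), ("goat", "")]
keys = sorted(index_key(name, ns) for name, ns in entries)
plain_keys = [k for k in keys if not is_namespaced(k)]

print(keys)
print(plain_keys)
print(keys[skip_namespace(keys, "_BAT_cat")])
```

In this sketch, skipping past the BAT namespace lands on `__DOG` rather than jumping straight to `goat`, which is the case the commit message warns about.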
6 bucket-listing-related commits added to ceph-3.3-rhel-patches.

In addition to those listed above, I added one more (here's the commit):

Author: J. Eric Ivancich <ivancich>
Date:   Fri Jul 19 16:10:59 2019 -0400

    rgw: mitigate bucket list with max-entries excessively high

    When listing a bucket with radosgw-admin, the user can specify the
    maximum number of entries. That number can be unreasonably large,
    and can affect the performance and memory availability. For
    example:

        radosgw-admin bucket list --bucket mybucket1 --max-entries=10000000

    This has the potential for creating large data structures at
    multiple levels in the call stack of the radosgw(-admin) process,
    potentially causing the process to run out of memory. This change
    limits the maximum number of entries requested in all but the
    high-level code to help mitigate this issue.

    Signed-off-by: J. Eric Ivancich <ivancich>
    (cherry picked from commit 300429c9e98a27e17c2a20ade82c6c63ac276c20)

    Conflicts: variable converted from static constexpr to static const
    in light of varying compiler versions.

    Resolves: rhbz#1915078
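As a rough illustration of the max-entries mitigation, the Python sketch below caps each lower-level request while still honouring a very large caller-supplied limit. It is an assumption-laden model, not the radosgw-admin implementation: `list_bucket`, `make_fake_backend`, and `fetch_page` are hypothetical names, and the cap of 10,000 is taken from the internal-pagination commit in the description.

```python
import bisect

PAGE_CAP = 10_000  # pagination size named in the radosgw-admin commit


def list_bucket(fetch_page, max_entries, page_cap=PAGE_CAP):
    """Yield up to max_entries keys, never asking for more than page_cap at once."""
    remaining = max_entries
    marker = ""
    while remaining > 0:
        page, truncated = fetch_page(marker, min(page_cap, remaining))
        yield from page
        remaining -= len(page)
        if not truncated or not page:
            break
        marker = page[-1]  # resume after the last key returned


def make_fake_backend(keys):
    """Hypothetical stand-in for the lower layers: serves sorted keys after a marker."""
    calls = []

    def fetch_page(marker, limit):
        calls.append(limit)  # record how many entries each request asked for
        start = bisect.bisect_right(keys, marker)
        page = keys[start:start + limit]
        return page, start + limit < len(keys)

    return fetch_page, calls


keys = [f"obj{i:06d}" for i in range(25_000)]
fetch_page, calls = make_fake_backend(keys)
listed = list(list_bucket(fetch_page, max_entries=10_000_000))
print(len(listed), calls)
```

Because the listing is a generator, the caller can stream entries out as they arrive instead of holding the full result in memory.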
Note: one commit listed above is not included:

    rgw: during GC defer, prevent new GC enqueue

The bug that it was intended to address did not appear in 3.3z6, so it's unnecessary.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: Red Hat Ceph Storage 3.3 Security and Bug Fix Update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:1518
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days