Bug 1472868 - With RGWs configured in a load balancer, quota stats cache doesn't work
Status: CLOSED ERRATA
Product: Red Hat Ceph Storage
Classification: Red Hat
Component: RGW
Version: 2.3
Hardware: x86_64
OS: Linux
Priority: low
Severity: low
Target Milestone: rc
Target Release: 3.1
Assigned To: Orit Wasserman
QA Contact: vidushi
Depends On: 1523246
Blocks: 1473188 1473436 1584264
Reported: 2017-07-19 10:55 EDT by Benjamin Schmaus
Modified: 2018-09-26 14:17 EDT
Fixed In Version: RHEL: ceph-12.2.5-12.el7cp Ubuntu: ceph_12.2.5-3redhat1xenial
Doc Type: Bug Fix
Doc Text:
.Quota stats cache is no longer invalid
Previously in {product}, quota values were sometimes not decremented properly. This could cause quota-exceeded errors when the quota had not actually been exceeded. With this update to Ceph, quota values are decremented properly and no incorrect errors are printed.
Last Closed: 2018-09-26 14:16:41 EDT
Type: Bug

External Trackers
Tracker                    ID              Priority  Status  Summary  Last Updated
Ceph Project Bug Tracker   20661           None      None    None     2017-07-19 10:55 EDT
Ceph Project Bug Tracker   20934           None      None    None     2017-08-21 12:07 EDT
Red Hat Product Errata     RHBA-2018:2819  None      None    None     2018-09-26 14:17 EDT

Description Benjamin Schmaus 2017-07-19 10:55:01 EDT
Description of problem:

With RGWs configured behind a load balancer, the quota stats cache can end up holding unbounded values. We have found errors like the ones below in our clusters running Jewel. This happens when PUT and DELETE operations do not hit the same RGW, and an eventual update_stats() from a DELETE operation tries to decrement the stats cache. It is easy to verify: put the RGWs behind a load balancer (I used HAProxy in round-robin mode), enable the user quota, and run a script that uploads and deletes objects.

20 quota: can't use cached stats, exceeded soft threshold (num objs): 18446744073709551615 >= 190000

10 quota exceeded: stats.num_kb_rounded=18446744073709549572 size_kb=1024 user_quota.max_size_kb=5242880000
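
The two counters in these log lines sit just below 2^64, which is what a decrement past zero would produce if the cached stats are unsigned 64-bit integers. That width is an assumption inferred from the values, not something stated in this report. A minimal Python sketch that reinterprets the logged values as signed numbers (the helper names are illustrative only):

```python
U64 = 1 << 64  # assumed counter width: unsigned 64-bit


def as_signed64(value: int) -> int:
    """Reinterpret an unsigned 64-bit value as two's-complement signed."""
    return value - U64 if value >= (1 << 63) else value


def wrapping_sub(counter: int, amount: int) -> int:
    """Subtract the way an unsigned 64-bit counter would, wrapping below zero."""
    return (counter - amount) % U64


# Values copied from the log lines above.
num_objs = 18446744073709551615
num_kb_rounded = 18446744073709549572

print(as_signed64(num_objs))        # -1
print(as_signed64(num_kb_rounded))  # -2044
print(wrapping_sub(0, 1))           # 18446744073709551615
```

Read this way, the cache believes the user holds -1 objects and -2044 KB, so any comparison against the quota limits treats the user as far over quota.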


Version-Release number of selected component (if applicable):
2.3

How reproducible:
90%

Comment 8 Orit Wasserman 2017-08-21 12:07:52 EDT
An additional fix is needed; upstream PR:
https://github.com/ceph/ceph/pull/17116
Comment 9 shilpa 2017-09-11 03:58:41 EDT
@orit,

Could you please suggest a reproducer for this?
Comment 10 Orit Wasserman 2017-09-11 04:31:56 EDT
(In reply to shilpa from comment #9)
> @orit,
> 
> Could you please suggest a reproducer for this?

Hi Shilpa,
You will need a cluster with at least two radosgw instances.
You will need to configure a load balancer, for example HAProxy.
Use a script that does several object PUTs and then several object DELETEs.
The load balancer will send the requests to different radosgw instances, and you should hit the bug.
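
The effect of these steps can be sketched with a toy model. This is a simulation under stated assumptions, not the radosgw implementation: the `GatewayCache` class, the unsigned 64-bit counter, the caches starting at zero, and strict round-robin dispatch are all hypothetical simplifications. No real cluster, load balancer, or S3 client is involved.

```python
from itertools import cycle

U64 = 1 << 64  # assumed counter width: unsigned 64-bit


class GatewayCache:
    """Toy per-gateway quota cache (hypothetical, not the radosgw class)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.num_objs = 0  # simplification: each cache starts empty

    def put(self) -> None:
        self.num_objs = (self.num_objs + 1) % U64

    def delete(self) -> None:
        # Wraps below zero, as an unsigned counter would.
        self.num_objs = (self.num_objs - 1) % U64


def run(ops, gateways):
    """Dispatch each operation to the next gateway in round-robin order."""
    round_robin = cycle(gateways)
    for op in ops:
        gateway = next(round_robin)
        if op == "put":
            gateway.put()
        else:
            gateway.delete()
    return {gw.name: gw.num_objs for gw in gateways}


gateways = [GatewayCache("rgw1"), GatewayCache("rgw2")]
result = run(["put"] * 3 + ["delete"] * 3, gateways)
print(result)  # {'rgw1': 1, 'rgw2': 18446744073709551615}
```

In this model rgw1 sees two PUTs and one DELETE while rgw2 sees one PUT and two DELETEs, so rgw2 decrements past zero even though the true object count is back at zero.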
Comment 19 Gregory Meno 2017-10-03 16:23:33 EDT
Moving back to assigned based on Marcus' comments above.

Matt/Orit, is this something we are able to fix and retest in the next week?
Comment 32 John Brier 2018-08-31 15:28:09 EDT
Thanks Orit!
Comment 34 errata-xmlrpc 2018-09-26 14:16:41 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:2819
