Bug 2108707 - [Workload-DFG][RHCS 5.2 ][Greenfield] [SingleSite] put workload fails to update bucket index at a rate of around 0.05% (5,000 per 10,000,000)
Summary: [Workload-DFG][RHCS 5.2 ][Greenfield] [SingleSite] put workload fails to upda...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: RGW
Version: 5.2
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 5.3z1
Assignee: J. Eric Ivancich
QA Contact: Anuchaithra
Docs Contact: Akash Raj
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2022-07-19 18:28 UTC by Vikhyat Umrao
Modified: 2023-02-07 03:38 UTC
CC: 15 users

Fixed In Version: ceph-16.2.10-8.el8cp
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-01-11 17:39:52 UTC
Embargoed:
akraj: needinfo-


Attachments
radosgw-admin bucket stats with debug_rgw = 20 and debug_ms = 1 (4.58 MB, text/plain)
2022-07-19 18:32 UTC, Vikhyat Umrao


Links
Red Hat Issue Tracker RHCEPH-4834 (last updated 2022-07-19 18:32:17 UTC)
Red Hat Product Errata RHSA-2023:0076 (last updated 2023-01-11 17:40:47 UTC)

Description Vikhyat Umrao 2022-07-19 18:28:45 UTC
Description of problem:
[Workload DFG][RHCS 5.2 ][Greenfield]radosgw-admin bucket stats does not report correct object counts per bucket

Version-Release number of selected component (if applicable):
RHCS 5.2 16.2.8-75.el8cp

How reproducible:
Almost always (reproduced approximately 3 times)

Steps to Reproduce:
1. Deploy an RHCS 5.2 cluster
2. Run a Cosbench *S3* fill workload on 5 buckets - 100M objects/bucket
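
For reference, a Cosbench workload of this shape could look roughly like the following (a minimal sketch, not the actual workload file used in this test; the endpoint, credentials, worker count, and attribute details are placeholders):

<?xml version="1.0" encoding="UTF-8"?>
<workload name="fill-5-buckets" description="S3 fill, 100M objects per bucket">
  <storage type="s3" config="accesskey=KEY;secretkey=SECRET;endpoint=http://haproxy:5000" />
  <workflow>
    <workstage name="init">
      <!-- create bucket1..bucket5 -->
      <work type="init" workers="1" config="cprefix=bucket;containers=r(1,5)" />
    </workstage>
    <workstage name="fill">
      <!-- write primary1..primary100000000 into each bucket -->
      <work name="fill" workers="64" type="normal" division="object" totalOps="500000000">
        <operation type="write" ratio="100"
          config="cprefix=bucket;containers=r(1,5);oprefix=primary;objects=r(1,100000000);sizes=h(1|2|25,2|4|40,4|8|25,8|256|10)KB" />
      </work>
    </workstage>
  </workflow>
</workload>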


Actual results:
- Bucket stats should report 100M objects per bucket, but they return different (lower) object counts.
- We have occasionally seen Cosbench write fewer objects than requested, but only by around 1 to 2%; we determined that the issue here is not on the PUT path but in bucket stats reporting.

[root@f11-h21-000-6049p 9b786f86-0692-11ed-b24c-ac1f6ba409aa]# ceph --version
ceph version 16.2.8-75.el8cp (3b7a6413ab943a935f85f10a25d51a88093d15e9) pacific (stable)

[root@f11-h21-000-6049p 9b786f86-0692-11ed-b24c-ac1f6ba409aa]# radosgw-admin --version
ceph version 16.2.8-75.el8cp (3b7a6413ab943a935f85f10a25d51a88093d15e9) pacific (stable)

[root@f11-h21-000-6049p 9b786f86-0692-11ed-b24c-ac1f6ba409aa]# radosgw-admin bucket stats | grep num_objects
                "num_objects": 85348169
                "num_objects": 46337304
                "num_objects": 49697801
                "num_objects": 61534844
                "num_objects": 66544315

Expected results:
- Bucket stats should report the correct number of objects


Additional info:

- At first we suspected write failures, i.e. that the Cosbench job might be failing to PUT the correct number of objects, so we started troubleshooting along the full request path:

Cosbench Drivers(clients) -> HAPROXYs -> RGW Servers -> OSDs 

- At no point along this path did we find errors indicating failing PUTs.
- The following log line, taken from one of the RGW daemons, gave us more confidence that this is not a PUT failure but a `radosgw-admin` reporting error:

2022-07-18T21:18:06.491+0000 7f3397b3e700  1 ====== req done req=0x7f336b713630 op status=0 http_status=200 latency=0.011999994s ======
2022-07-18T21:18:06.491+0000 7f3397b3e700  1 beast: 0x7f336b713630: 172.16.37.9 - johndoe [18/Jul/2022:21:18:06.479 +0000] "PUT /bucket5/primary94415552 HTTP/1.1" 200 1000 - "aws-sdk-java/1.10.76 Linux/4.18.0-372.9.1.el8.x86_64 OpenJDK_64-Bit_Server_VM/25.332-b09/1.8.0_332" - latency=0.011999994s
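
One way to rule out failed PUTs at scale is to scan the beast access logs for PUT requests that did not return HTTP 200 (a sketch, assuming the log format shown above; the log file path is a placeholder):

# Count PUT requests whose HTTP status is not 200. In the beast access log,
# the status code is the first token after the quoted request line.
grep 'beast:' /var/log/ceph/ceph-rgw.log | grep '"PUT ' \
    | awk -F'"' '$3 !~ /^ 200 /' | wc -l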

- Please note the object name *primary94415552* in bucket5. Cosbench writes object names incrementally, primary1 through primary100000000, for the 100M objects per bucket, so a successful PUT of primary94415552 means the workload progressed well past 94M objects in that bucket. Yet, as the stats above show, no bucket reports a count anywhere near even the 90M mark.

- We also verified the object's existence via s3cmd:
root@f10-h29-b02-5039ms:~
# s3cmd info s3://bucket5/primary94415552
s3://bucket5/primary94415552 (object):
   File size: 1000
   Last mod:  Mon, 18 Jul 2022 21:18:06 GMT
   MIME type: application/octet-stream
   Storage:   STANDARD
   MD5 sum:   0f6ad7b5d75f51d3486167a48f713a5c
   SSE:       none
   Policy:    none
   CORS:      none
   ACL:       John Doe: FULL_CONTROL
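
To extend this spot check across the name range, a few object names can be probed directly (a sketch; the sample names below are arbitrary points in the primary1..primary100000000 range):

for n in 1 25000000 50000000 75000000 94415552 100000000; do
    # s3cmd info exits non-zero when the object does not exist
    s3cmd info "s3://bucket5/primary${n}" >/dev/null 2>&1 \
        && echo "primary${n}: present" \
        || echo "primary${n}: missing"
done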


- Then we checked at the RADOS layer with `ceph df`, and the whole picture became clear: this is a `radosgw-admin` reporting issue.


[root@f11-h21-000-6049p 9b786f86-0692-11ed-b24c-ac1f6ba409aa]# ceph df
--- RAW STORAGE ---
CLASS     SIZE    AVAIL    USED  RAW USED  %RAW USED
hdd    579 TiB  549 TiB  30 TiB    30 TiB       5.26
TOTAL  579 TiB  549 TiB  30 TiB    30 TiB       5.26
 
--- POOLS ---
POOL                       ID   PGS   STORED  OBJECTS     USED  %USED  MAX AVAIL
device_health_metrics       1     1  202 KiB      341  606 KiB      0    169 TiB
.rgw.root                   2    32  1.3 KiB        4   48 KiB      0    169 TiB
default.rgw.control         6   256      0 B        8      0 B      0    169 TiB
default.rgw.buckets.data    7  8192   13 TiB     500M   19 TiB   3.57    338 TiB
default.rgw.buckets.index   8   512  117 GiB    6.12k  350 GiB   0.07    169 TiB
default.rgw.log             9   256  5.2 MiB      290   17 MiB      0    169 TiB
default.rgw.meta           10   256  1.8 MiB       13  5.5 MiB      0    169 TiB
  

- Please note the default.rgw.buckets.data pool OBJECTS count of 500M. This is a small-object workload (Cosbench histogram sizes=h(1|2|25,2|4|40,4|8|25,8|256|10)KB, i.e. 25% of objects at 1-2 KB, 40% at 2-4 KB, 25% at 4-8 KB, and 10% at 8-256 KB),
  + with bluestore_min_alloc_size of 4K, most of these objects end up as 4K allocations at the BlueStore layer. Every object in this size range is stored as a single RADOS head object, so 5 buckets x 100M objects = 500M RADOS objects; this again confirms that all PUTs succeeded.

- The following s3cmd command is still running:

root@f10-h29-b02-5039ms:~
# s3cmd ls s3://bucket5 | wc -l
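
While that listing runs, an index-independent count can be attempted directly at the RADOS layer, since RGW prefixes each head object in the data pool with the bucket marker (a sketch; the marker lookup assumes the "marker" field of `radosgw-admin bucket stats`, and listing a 500M-object pool this way is slow):

# Look up the bucket marker for bucket5, then count its head objects in the
# data pool. The marker is used as a plain prefix; dots in it are treated as
# regex wildcards here, which is acceptable for a rough count.
marker=$(radosgw-admin bucket stats --bucket=bucket5 | grep '"marker"' | cut -d'"' -f4)
rados -p default.rgw.buckets.data ls | grep -c "^${marker}_primary"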


We also tried running the command from different radosgw-admin client versions, to rule out a client version incompatibility, but that does not appear to be the case: the stats come back the same regardless of client, so the problem is on the RGW side itself, in how the stats are read.


### 5.1.z2 radosgw-admin ###


[root@f11-h21-000-6049p 9b786f86-0692-11ed-b24c-ac1f6ba409aa]# cephadm --image registry-proxy.engineering.redhat.com/rh-osbs/rhceph:ceph-5.1-rhel-8-containers-candidate-20720-20220624151827 shell
Inferring fsid 9b786f86-0692-11ed-b24c-ac1f6ba409aa
[ceph: root@f11-h21-000-6049p /]# radosgw-admin --version
ceph version 16.2.7-126.el8cp (fe0af61d104d48cb9d116cde6e593b5fc8c197e4) pacific (stable)
[ceph: root@f11-h21-000-6049p /]# radosgw-admin bucket stats | egrep 'bucket"|num_objects|num_shards'
                "num_objects": 85348169
                "num_objects": 46337304
                "num_objects": 49697801
                "num_objects": 61534844
                "num_objects": 66544315

[ceph: root@f11-h21-000-6049p /]# ceph --version
ceph version 16.2.7-126.el8cp (fe0af61d104d48cb9d116cde6e593b5fc8c197e4) pacific (stable)

[ceph: root@f11-h21-000-6049p /]# ceph df
--- RAW STORAGE ---
CLASS     SIZE    AVAIL    USED  RAW USED  %RAW USED
hdd    579 TiB  549 TiB  30 TiB    30 TiB       5.26
TOTAL  579 TiB  549 TiB  30 TiB    30 TiB       5.26
 
--- POOLS ---
POOL                       ID   PGS   STORED  OBJECTS     USED  %USED  MAX AVAIL
device_health_metrics       1     1  202 KiB      341  606 KiB      0    169 TiB
.rgw.root                   2    32  1.3 KiB        4   48 KiB      0    169 TiB
default.rgw.control         6   256      0 B        8      0 B      0    169 TiB
default.rgw.buckets.data    7  8192   13 TiB     500M   19 TiB   3.57    338 TiB
default.rgw.buckets.index   8   512  117 GiB    6.12k  350 GiB   0.07    169 TiB
default.rgw.log             9   256  5.2 MiB      290   17 MiB      0    169 TiB
default.rgw.meta           10   256  1.8 MiB       13  5.5 MiB      0    169 TiB


### 5.0.z4 radosgw-admin ###

[root@f11-h21-000-6049p 9b786f86-0692-11ed-b24c-ac1f6ba409aa]# cephadm --image registry-proxy.engineering.redhat.com/rh-osbs/rhceph:5-103 shell
Inferring fsid 9b786f86-0692-11ed-b24c-ac1f6ba409aa
[ceph: root@f11-h21-000-6049p /]# ceph --version
ceph version 16.2.0-152.el8cp (e456e8b705cb2f4a779689a0d80b122bcb0d67c9) pacific (stable)
[ceph: root@f11-h21-000-6049p /]# radosgw-admin --version
ceph version 16.2.0-152.el8cp (e456e8b705cb2f4a779689a0d80b122bcb0d67c9) pacific (stable)
[ceph: root@f11-h21-000-6049p /]# radosgw-admin bucket stats | egrep 'bucket"|num_objects|num_shards'
                "num_objects": 85348169
                "num_objects": 46337304
                "num_objects": 49697801
                "num_objects": 61534844
                "num_objects": 66544315
[ceph: root@f11-h21-000-6049p /]# ceph df
--- RAW STORAGE ---
CLASS     SIZE    AVAIL    USED  RAW USED  %RAW USED
hdd    579 TiB  549 TiB  30 TiB    30 TiB       5.26
TOTAL  579 TiB  549 TiB  30 TiB    30 TiB       5.26
 
--- POOLS ---
POOL                       ID   PGS   STORED  OBJECTS     USED  %USED  MAX AVAIL
device_health_metrics       1     1  202 KiB      341  606 KiB      0    169 TiB
.rgw.root                   2    32  1.3 KiB        4   48 KiB      0    169 TiB
default.rgw.control         6   256      0 B        8      0 B      0    169 TiB
default.rgw.buckets.data    7  8192   13 TiB     500M   19 TiB   3.57    338 TiB
default.rgw.buckets.index   8   512  117 GiB    6.12k  350 GiB   0.07    169 TiB
default.rgw.log             9   256  5.2 MiB      290   17 MiB      0    169 TiB
default.rgw.meta           10   256  1.8 MiB       13  5.5 MiB      0    169 TiB
[ceph: root@f11-h21-000-6049p /]#


### 5.2 radosgw-admin in cephadm shell ###

[root@f11-h21-000-6049p 9b786f86-0692-11ed-b24c-ac1f6ba409aa]# cephadm shell
Inferring fsid 9b786f86-0692-11ed-b24c-ac1f6ba409aa
Using recent ceph image registry-proxy.engineering.redhat.com/rh-osbs/rhceph@sha256:fd15c0d46b97fb812679f218f77182aa4b1bb0a039dea88d3986c02311e4f23f
[ceph: root@f11-h21-000-6049p /]# ceph --version
ceph version 16.2.8-75.el8cp (3b7a6413ab943a935f85f10a25d51a88093d15e9) pacific (stable)
[ceph: root@f11-h21-000-6049p /]# radosgw-admin --version
ceph version 16.2.8-75.el8cp (3b7a6413ab943a935f85f10a25d51a88093d15e9) pacific (stable)
[ceph: root@f11-h21-000-6049p /]# radosgw-admin bucket stats | egrep 'bucket"|num_objects|num_shards'
                "num_objects": 85348169
                "num_objects": 46337304
                "num_objects": 49697801
                "num_objects": 61534844
                "num_objects": 66544315
[ceph: root@f11-h21-000-6049p /]# ceph df
--- RAW STORAGE ---
CLASS     SIZE    AVAIL    USED  RAW USED  %RAW USED
hdd    579 TiB  549 TiB  30 TiB    30 TiB       5.26
TOTAL  579 TiB  549 TiB  30 TiB    30 TiB       5.26
 
--- POOLS ---
POOL                       ID   PGS   STORED  OBJECTS     USED  %USED  MAX AVAIL
device_health_metrics       1     1  202 KiB      341  606 KiB      0    169 TiB
.rgw.root                   2    32  1.3 KiB        4   48 KiB      0    169 TiB
default.rgw.control         6   256      0 B        8      0 B      0    169 TiB
default.rgw.buckets.data    7  8192   13 TiB     500M   19 TiB   3.57    338 TiB
default.rgw.buckets.index   8   512  117 GiB    6.12k  350 GiB   0.07    169 TiB
default.rgw.log             9   256  5.2 MiB      290   17 MiB      0    169 TiB
default.rgw.meta           10   256  1.8 MiB       13  5.5 MiB      0    169 TiB


Then finally we did capture the debug logs:

[root@f11-h21-000-6049p 9b786f86-0692-11ed-b24c-ac1f6ba409aa]# radosgw-admin bucket stats --debug-rgw=20 --debug-ms=1 &> bucket.stats.rgw20.ms.1.txt
[root@f11-h21-000-6049p 9b786f86-0692-11ed-b24c-ac1f6ba409aa]# cat bucket.stats.rgw20.ms.1.txt | egrep 'bucket"|num_objects|num_shards'
                "num_objects": 85348169
                "num_objects": 46337304
                "num_objects": 49697801
                "num_objects": 61534844
                "num_objects": 66544315

Comment 1 Vikhyat Umrao 2022-07-19 18:32:26 UTC
Created attachment 1898183 [details]
radosgw-admin bucket stats with debug_rgw = 20 and debug_ms = 1

Comment 3 Matt Benjamin (redhat) 2022-07-19 18:50:33 UTC
Thanks, Vikhyat.

Yes, it's important to know if this behavior is a regression relative to 5.0z4.

Matt

Comment 4 Vikhyat Umrao 2022-07-19 18:55:32 UTC
(In reply to Matt Benjamin (redhat) from comment #3)
> Thanks, Vikhyat.
> 
> Yes, it's important to know if this behavior is a regression relative to
> 5.0z4.
> 
> Matt

Thank you, Matt.

Comment 5 Vikhyat Umrao 2022-07-19 19:03:16 UTC
(In reply to Matt Benjamin (redhat) from comment #3)
> Thanks, Vikhyat.
> 
> Yes, it's important to know if this behavior is a regression relative to
> 5.0z4.
> 
> Matt

Yes, it is a regression; as I mentioned in the chat, 5.0.z4 is working fine.

Comment 127 errata-xmlrpc 2023-01-11 17:39:52 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: Red Hat Ceph Storage 5.3 security update and Bug Fix), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:0076

