Bug 2210278 - OMAP statistics are not gathered even after deep-scrub [NEEDINFO]
Summary: OMAP statistics are not gathered even after deep-scrub
Keywords:
Status: ASSIGNED
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: RADOS
Version: 6.1
Hardware: x86_64
OS: Linux
Importance: unspecified high
Target Milestone: ---
Target Release: 6.1z2
Assignee: Brad Hubbard
QA Contact: Pawan
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2023-05-26 12:37 UTC by Harsh Kumar
Modified: 2023-07-24 01:59 UTC
CC List: 7 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Embargoed:
rzarzyns: needinfo? (rfriedma)


Attachments


Links
Red Hat Issue Tracker RHCEPH-6748 (last updated 2023-05-26 12:38:07 UTC)

Description Harsh Kumar 2023-05-26 12:37:24 UTC
Description of problem:
OMAP entries do not show up in the 'ceph df' statistics even after deep-scrubbing of the pool has completed.
Reference (from 'ceph pg dump'):
Omap statistics are gathered during deep scrub and may be inaccurate soon afterwards depending on utilization. See http://docs.ceph.com/en/latest/dev/placement-group/#omap-statistics for further details.

Key points -
1. OMAP entries written to a pool show up automatically in the 'ceph df' stats once a large enough number of OMAP entries has been written to the pool (observed previously and still true)
2. OMAP stats should show up once a deep-scrub is performed on the pool (was working previously, no longer true)
3. Even without a deep-scrub, restarting OSDs that are part of the pool's acting PG sets caused OMAP entries to be recognized and displayed in the 'ceph df' stats. (was working previously, no longer true)

As of now, with ceph version 16.2.10-172.el8cp and ceph version 17.2.6-65.el9cp, OMAP entries are accounted for only after both a deep-scrub and an OSD restart are performed.
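
For reference, the OMAP accounting above is checked by reading the per-pool stored_omap / omap_bytes_used counters from 'ceph df detail --format json' (the same fields shown in the JSON dumps further down). A minimal Python sketch of that check, assuming a client node with the ceph CLI and an admin keyring; the pool name is the one from this report:

    # check_pool_omap.py - minimal sketch of the stats check used in this report
    import json
    import subprocess

    def pool_omap_stats(pool_name):
        # 'ceph df detail --format json' reports per-pool stats, including stored_omap
        out = subprocess.check_output(["ceph", "df", "detail", "--format", "json"])
        for pool in json.loads(out)["pools"]:
            if pool["name"] == pool_name:
                return pool["stats"]["stored_omap"], pool["stats"]["omap_bytes_used"]
        raise ValueError("pool %s not found" % pool_name)

    # Expected to become non-zero after deep-scrub; observed to stay (0, 0) until an OSD restart
    print(pool_omap_stats("re_pool_3"))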

Version-Release number of selected component (if applicable):
ceph version 17.2.6-65.el9cp (9b65890b2351d108c4d5fa7a6be7011e9e3d2966) quincy
ceph version 16.2.10-172.el8cp (00a157ecd158911ece116ae43095de793ed9f389) pacific

How reproducible:
3/3

Steps to Reproduce:
1. Configure a Quincy / Pacific Cluster
2. Create a replicated pool with default config
3. Use the python script attached to write objects and OMAP entries to the pool from a client (a minimal python-rados sketch of the same operation follows the steps)
 - curl -k https://raw.githubusercontent.com/red-hat-storage/cephci/master/utility/generate_omap_entries.py -O
 - pip3 install docopt
 - python3 generate_omap_entries.py --pool <pool-name> --start 0 --end 20 --key-count 1000
4. Once the script has written 20,000 OMAP entries, check the 'ceph df detail' output; the pool will show 20 objects, but OMAP entries will be 0
5. Trigger deep-scrub on the concerned pool
 - ceph osd pool deep-scrub <pool-name>
6. Once deep-scrubbing has completed, check the 'ceph df detail' stats again; the expectation is that OMAP entries are now listed for the pool, but they actually remain 0.
7. Choose an OSD from the acting set of any of the PGs belonging to the pool; log in to the OSD node, then stop and disable the OSD systemd service
8. Use ceph-objectstore-tool to list the OMAP entries made for an object that is part of a PG whose acting set contains the chosen OSD.
 - ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-11/ --pgid 6.a "omap_obj_4828_14" list-omap
 All the OMAP entries for the object will be displayed
9. Enable and start the OSD service that was stopped in step 7
10. Check the 'ceph df detail' stats again; OMAP entries will now be visible against the pool.
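
Note on step 3: the cephci script linked above drives the OMAP writes via python-rados. A minimal sketch of the same operation (object and key names here are illustrative, not the exact ones generated by the script):

    # write_omap_sketch.py - illustrative python-rados sketch of step 3
    import rados

    cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
    cluster.connect()
    ioctx = cluster.open_ioctx("re_pool_3")
    try:
        for obj_num in range(20):                                  # 20 objects, as in step 4
            oid = "omap_obj_%d" % obj_num                          # illustrative object name
            ioctx.write_full(oid, b"omap test object")             # create the object
            keys = tuple("key_%d" % i for i in range(1000))        # 1000 keys per object
            vals = tuple(b"value_%d" % i for i in range(1000))
            with rados.WriteOpCtx() as write_op:
                ioctx.set_omap(write_op, keys, vals)               # queue the omap k/v pairs
                ioctx.operate_write_op(write_op, oid)              # apply them to the object
    finally:
        ioctx.close()
        cluster.shutdown()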

Actual results:
Ceph does not have a mechanism to explicitly monitor and account for OMAP entries on objects in a replicated pool, but it does recognize new OMAP entries once the concerned pool is deep-scrubbed. However, it was observed that OMAP entries were not recognized and accounted for in the 'ceph df detail' stats even after scrubbing and deep-scrubbing the pool multiple times.

Expected results:
OMAP entries should show up in the 'ceph df detail' stats after deep-scrubbing of the concerned pool has completed, without the need for an OSD restart.
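
For completeness, one way to confirm that the deep-scrub has actually completed before re-checking the stats is to wait for the last_deep_scrub_stamp of every PG in the pool to advance. A minimal sketch, assuming the JSON layout of 'ceph pg ls-by-pool ... --format json' (the field names are an assumption and may differ slightly between releases):

    # wait_deep_scrub_sketch.py - sketch only; verify the JSON field names on the target release
    import json
    import subprocess
    import time

    POOL = "re_pool_3"

    def deep_scrub_stamps(pool):
        out = subprocess.check_output(["ceph", "pg", "ls-by-pool", pool, "--format", "json"])
        # assumed layout: {"pg_stats": [{"pgid": "...", "last_deep_scrub_stamp": "..."}, ...]}
        return {pg["pgid"]: pg["last_deep_scrub_stamp"] for pg in json.loads(out)["pg_stats"]}

    before = deep_scrub_stamps(POOL)
    subprocess.check_call(["ceph", "osd", "pool", "deep-scrub", POOL])         # step 5
    while any(deep_scrub_stamps(POOL)[pgid] == stamp for pgid, stamp in before.items()):
        time.sleep(10)                                                         # wait until every PG is re-stamped
    # re-check the pool stats once all PGs have been deep-scrubbed
    subprocess.check_call(["ceph", "df", "detail"])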

Additional info:

List of objects and their PGs in the pool 're_pool_3' - 
# for i in `rados ls -p re_pool_3`; do ceph osd map re_pool_3 $i; done
osdmap e107 pool 're_pool_3' (6) object 'omap_obj_4828_8' -> pg 6.948d8c40 (6.0) -> up ([12,4,10], p12) acting ([12,4,10], p12)
osdmap e107 pool 're_pool_3' (6) object 'omap_obj_4828_17' -> pg 6.c82764a8 (6.8) -> up ([11,14,18], p11) acting ([11,14,18], p11)
osdmap e107 pool 're_pool_3' (6) object 'omap_obj_4828_15' -> pg 6.c539e064 (6.4) -> up ([12,16,15], p12) acting ([12,16,15], p12)
osdmap e107 pool 're_pool_3' (6) object 'omap_obj_4828_9' -> pg 6.e888b26c (6.c) -> up ([0,16,18], p0) acting ([0,16,18], p0)
osdmap e107 pool 're_pool_3' (6) object 'omap_obj_4828_3' -> pg 6.54f1f05c (6.1c) -> up ([7,14,15], p7) acting ([7,14,15], p7)
osdmap e107 pool 're_pool_3' (6) object 'omap_obj_4828_6' -> pg 6.150be582 (6.2) -> up ([19,3,5], p19) acting ([19,3,5], p19)
osdmap e107 pool 're_pool_3' (6) object 'omap_obj_4828_14' -> pg 6.3ae1d28a (6.a) -> up ([11,18,4], p11) acting ([11,18,4], p11)
osdmap e107 pool 're_pool_3' (6) object 'omap_obj_4828_7' -> pg 6.eb2ff3ea (6.a) -> up ([11,18,4], p11) acting ([11,18,4], p11)
osdmap e107 pool 're_pool_3' (6) object 'omap_obj_4828_11' -> pg 6.33c2583a (6.1a) -> up ([4,8,5], p4) acting ([4,8,5], p4)
osdmap e107 pool 're_pool_3' (6) object 'omap_obj_4828_4' -> pg 6.bfbecf9e (6.1e) -> up ([4,13,6], p4) acting ([4,13,6], p4)
osdmap e107 pool 're_pool_3' (6) object 'omap_obj_4828_19' -> pg 6.ed1607c5 (6.5) -> up ([19,6,13], p19) acting ([19,6,13], p19)
osdmap e107 pool 're_pool_3' (6) object 'omap_obj_4828_16' -> pg 6.f101bce5 (6.5) -> up ([19,6,13], p19) acting ([19,6,13], p19)
osdmap e107 pool 're_pool_3' (6) object 'omap_obj_4828_0' -> pg 6.54776933 (6.13) -> up ([3,9,17], p3) acting ([3,9,17], p3)
osdmap e107 pool 're_pool_3' (6) object 'omap_obj_4828_5' -> pg 6.4cc7260b (6.b) -> up ([15,1,4], p15) acting ([15,1,4], p15)
osdmap e107 pool 're_pool_3' (6) object 'omap_obj_4828_10' -> pg 6.f2fab59b (6.1b) -> up ([17,3,9], p17) acting ([17,3,9], p17)
osdmap e107 pool 're_pool_3' (6) object 'omap_obj_4828_18' -> pg 6.7fd10dbb (6.1b) -> up ([17,3,9], p17) acting ([17,3,9], p17)
osdmap e107 pool 're_pool_3' (6) object 'omap_obj_4828_1' -> pg 6.9346df27 (6.7) -> up ([3,19,5], p3) acting ([3,19,5], p3)
osdmap e107 pool 're_pool_3' (6) object 'omap_obj_4828_12' -> pg 6.a6b8eeb7 (6.17) -> up ([18,10,2], p18) acting ([18,10,2], p18)
osdmap e107 pool 're_pool_3' (6) object 'omap_obj_4828_13' -> pg 6.219c2a6f (6.f) -> up ([5,3,17], p5) acting ([5,3,17], p5)
osdmap e107 pool 're_pool_3' (6) object 'omap_obj_4828_2' -> pg 6.b76947ff (6.1f) -> up ([5,12,1], p5) acting ([5,12,1], p5)

The output of ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-11/ --pgid 6.a "omap_obj_4828_14" list-omap is attached as omap_keys_omap_obj_4828_14.txt; it contains the list of 1000 OMAP entries present on the object "omap_obj_4828_14".
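
For comparison, the same OMAP keys are also visible from the client side without stopping any OSD, e.g. via 'rados -p re_pool_3 listomapkeys omap_obj_4828_14' or the equivalent python-rados read op, which confirms the entries exist even while 'ceph df' reports stored_omap = 0. A minimal sketch of the python-rados variant:

    # list_omap_sketch.py - count OMAP keys on an object from the client side
    import rados

    cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
    cluster.connect()
    ioctx = cluster.open_ioctx("re_pool_3")
    try:
        with rados.ReadOpCtx() as read_op:
            # fetch up to 2000 key/value pairs, starting from the beginning, no prefix filter
            omap_iter, ret = ioctx.get_omap_vals(read_op, "", "", 2000)
            ioctx.operate_read_op(read_op, "omap_obj_4828_14")
            keys = [k for k, _ in omap_iter]
        print("omap_obj_4828_14 holds %d omap keys" % len(keys))   # 1000 per the attached list
    finally:
        ioctx.close()
        cluster.shutdown()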


Key Point #3 above talks about OMAP entries getting recognized just by restarting an OSD.
Test logs -
Pacific - http://magna002.ceph.redhat.com/cephci-jenkins/cephci-run-WKW5AV/Omap_creations_on_objects_0.log
Quincy - http://magna002.ceph.redhat.com/cephci-jenkins/cephci-run-1Q1T5A/Omap_creations_on_objects_0.log

The same tests are now failing, as an OSD restart alone is no longer sufficient to refresh the OMAP stats.

'ceph df detail' stats for the pool after writing OMAP entries and deep-scrub -
{
            "name": "re_pool_3",
            "id": 6,
            "stats": {
                "stored": 0,
                "stored_data": 0,
                "stored_omap": 0,
                "objects": 20,
                "kb_used": 0,
                "bytes_used": 0,
                "data_bytes_used": 0,
                "omap_bytes_used": 0,
                "percent_used": 0,
                "max_avail": 178747604992,
                "quota_objects": 0,
                "quota_bytes": 0,
                "dirty": 0,
                "rd": 0,
                "rd_bytes": 0,
                "wr": 20,
                "wr_bytes": 491520,
                "compress_bytes_used": 0,
                "compress_under_bytes": 0,
                "stored_raw": 0,
                "avail_raw": 509430663513
            }
}

'ceph df detail' stats for the pool after writing OMAP entries, deep-scrub, and OSD restart -
{
            "name": "re_pool_3",
            "id": 6,
            "stats": {
                "stored": 26606,
                "stored_data": 0,
                "stored_omap": 26606,
                "objects": 20,
                "kb_used": 78,
                "bytes_used": 79818,
                "data_bytes_used": 0,
                "omap_bytes_used": 79818,
                "percent_used": 1.5668155128878425e-07,
                "max_avail": 169809379328,
                "quota_objects": 0,
                "quota_bytes": 0,
                "dirty": 0,
                "rd": 0,
                "rd_bytes": 0,
                "wr": 20,
                "wr_bytes": 491520,
                "compress_bytes_used": 0,
                "compress_under_bytes": 0,
                "stored_raw": 79818,
                "avail_raw": 509428123993
            }
}

The attachments contain the stdout of 'ceph df detail' and 'ceph pg dump' after every step of the BZ reproduction.
Cluster logs are also attached.

Comment 1 RHEL Program Management 2023-05-26 12:37:36 UTC
Please specify the severity of this bug. Severity is defined here:
https://bugzilla.redhat.com/page.cgi?id=fields.html#bug_severity.

