Bug 2182397

Summary: RGW Metadata gone for ~20,000 buckets after RHCS 5.3 upgrade
Product: Red Hat Ceph Storage [Red Hat Storage]
Component: RGW
Version: 5.3
Target Release: 6.0z1
Hardware: Unspecified
OS: Unspecified
Status: CLOSED DUPLICATE
Severity: unspecified
Priority: unspecified
Reporter: Bipin Kunal <bkunal>
Assignee: Matt Benjamin (redhat) <mbenjamin>
QA Contact: Madhavi Kasturi <mkasturi>
CC: ceph-eng-bugs, cephqe-warriors
Last Closed: 2023-03-28 15:08:46 UTC
Bug Blocks: 2174235

Description Bipin Kunal 2023-03-28 14:25:41 UTC
This bug was initially created as a copy of Bug #2174235

I am copying this bug because: 



Description of problem:  RGW Metadata gone for ~20,000 buckets after RHCS 5.3 upgrade

As an example, we have this bucket:

    {
        "bucket": "wscnhpro-2-142-p",
        "num_shards": 1,
        "tenant": "",
        "zonegroup": "bc76b7c5-0b80-4c3e-b50f-cb6c0c6a752f",
        "placement_rule": "flash",
        "explicit_placement": {
            "data_pool": "",
            "data_extra_pool": "",
            "index_pool": ""
        },
        "id": "da1f0613-46b7-49fb-a853-c64192fb7f5b.2763177.114",
        "marker": "da1f0613-46b7-49fb-a853-c64192fb7f5b.2763177.114",
        "index_type": "Normal",
        "owner": "renachservice-storagebsa",
        "ver": "0#1",
        "master_ver": "0#0",
        "mtime": "2023-02-27T20:56:23.810068Z",
        "creation_time": "2021-05-19T16:10:46.301591Z",
        "max_marker": "0#",
        "usage": {},
        "bucket_quota": {
            "enabled": false,
            "check_on_raw": false,
            "max_size": -1,
            "max_size_kb": 0,
            "max_objects": -1
        }
    },

Please note:

* The usage is an empty object, {}, which usually indicates a never-used bucket.
* The shard count is 1.
* The mtime is yesterday, during or just after the 4.3 to 5.3 upgrade.
* The same pattern exists for about 20,000 buckets (a rough way to count them is sketched below).
* See the bucket stats output in Support Shell.
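One rough way to count buckets showing this pattern (a sketch only: it assumes radosgw-admin and jq are available on an admin node, and it will also match genuinely never-used buckets, so it only approximates the affected set):

radosgw-admin bucket stats | jq '[.[] | select(.usage == {})] | length'

The mtime of each match can then be compared against the upgrade window to narrow the list.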

And this:

[root@dfessrvbpcm00001 /]# rados -p i02.rgw.flash.index listomapkeys .dir.da1f0613-46b7-49fb-a853-c64192fb7f5b.2013442.245.1.0
{Gives no output}

But if they run:

rados -p i02.rgw.flash.data ls | grep da1f0613-46b7-49fb-a853-c64192fb7f5b.2013442.245

they get back millions of object names, so the data objects are still present even though the bucket index has no omap entries.
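As an additional cross-check (a sketch, reusing the bucket name and instance id from the example bucket stats above), the bucket entrypoint and bucket-instance metadata records can be inspected directly:

radosgw-admin metadata get bucket:wscnhpro-2-142-p
radosgw-admin metadata get bucket.instance:wscnhpro-2-142-p:da1f0613-46b7-49fb-a853-c64192fb7f5b.2763177.114

This shows whether the records still exist and which instance id the bucket name currently resolves to.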


Version-Release number of selected component (if applicable):

# ceph versions
{
    "mon": {
        "ceph version 16.2.10-94.el8cp (48ce8ed67474ea50f10c019b9445be7f49749d23) pacific (stable)": 3
    },
    "mgr": {
        "ceph version 16.2.10-94.el8cp (48ce8ed67474ea50f10c019b9445be7f49749d23) pacific (stable)": 3
    },
    "osd": {
        "ceph version 16.2.10-94.el8cp (48ce8ed67474ea50f10c019b9445be7f49749d23) pacific (stable)": 762
    },
    "mds": {
        "ceph version 16.2.10-94.el8cp (48ce8ed67474ea50f10c019b9445be7f49749d23) pacific (stable)": 2
    },
    "rgw": {
        "ceph version 16.2.10-94.el8cp (48ce8ed67474ea50f10c019b9445be7f49749d23) pacific (stable)": 7
    },
    "overall": {
        "ceph version 16.2.10-94.el8cp (48ce8ed67474ea50f10c019b9445be7f49749d23) pacific (stable)": 777
    }
}


How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:  RGW metadata is missing for ~20,000 buckets; their bucket index objects return no omap keys even though the data objects are still present.


Expected results:  The bucket metadata remains intact after the upgrade.


Additional info:  The site is essentially down and the customer is escalating to management, FYI.

Comment 2 Bipin Kunal 2023-03-28 15:08:46 UTC

*** This bug has been marked as a duplicate of bug 2175299 ***