Bug 2108394

Summary: rgw: object lock not enforced in some circumstances
Product: [Red Hat Storage] Red Hat Ceph Storage
Reporter: Matt Benjamin (redhat) <mbenjamin>
Component: RGW
Assignee: Matt Benjamin (redhat) <mbenjamin>
Status: CLOSED ERRATA
QA Contact: Madhavi Kasturi <mkasturi>
Severity: high
Docs Contact: Akash Raj <akraj>
Priority: high
Version: 5.1
CC: akraj, anarnold, cbodley, ceph-eng-bugs, cephqe-warriors, kbader, kdreyer, kkeithle, mbenjamin, mkasturi, vereddy
Target Milestone: ---   
Target Release: 5.3   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: ceph-16.2.10-2.el8cp
Doc Type: Bug Fix
Doc Text:
.The object version access is corrected preventing object lock violation

Previously, inadvertent slicing of version information would occur in some call paths, causing any object version protected by object lock to be deleted contrary to policy. With this release, the object version access is corrected, thereby preventing object lock violation.
Last Closed: 2023-01-11 17:39:52 UTC
Type: Bug
Bug Blocks: 2126049    

Description Matt Benjamin (redhat) 2022-07-19 01:29:39 UTC
From upstream tracker (Igor Fedotov):

Well, some more findings on the issue:
1) Apparently this issue is present in Pacific minor releases 16.2.6 through 16.2.9.
Not sure about earlier ones, but it definitely does not exist in Quincy or the main branch.

2) After comparing the behavior of the delete-version operation between Quincy and Pacific, I realized that the RGWRados::get_obj_state_impl() function receives a different object reference (rgw_obj& obj). In Pacific, the result of the obj.key.get_oid() call lacks the proper object version specification (although it is present originally in RGWDeleteObj::execute()), and this ultimately causes get_obj_state_impl() to return ENOENT.

More investigation reveals the following implementation:

int RGWRadosObject::get_obj_state(const DoutPrefixProvider *dpp, RGWObjectCtx *rctx, RGWBucket& bucket, RGWObjState **state, optional_yield y, bool follow_olh)
{
  rgw_obj obj(bucket.get_key(), key.name);

  return store->getRados()->get_obj_state(dpp, rctx, bucket.get_info(), obj, state, follow_olh, y);
}

This is apparently the culprit, as it builds an incomplete object reference from key.name alone, which lacks the version reference.
And indeed the Quincy release implementation is different:
{
  return store->getRados()->get_obj_state(dpp, rctx, bucket.get_info(), get_obj(), state, follow_olh, y);
}

Updating to the latter implementation fixes the bug for me.
But I'm not completely sure this is 100% correct and that there are no side effects.
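
To make the difference between the two implementations concrete, here is a minimal standalone sketch. It does not use the real RGW types: ObjKey, Obj, ObjState, Store, the oid format, and both make_ref_* helpers are hypothetical stand-ins invented for illustration. It only shows how rebuilding a reference from the name alone drops the version and turns a lookup of an existing version into ENOENT, while passing the full key through finds it.

```cpp
#include <cassert>
#include <cerrno>
#include <map>
#include <string>

// Hypothetical, simplified stand-in for an object key (not the real rgw_obj_key).
struct ObjKey {
  std::string name;
  std::string instance;  // version id; empty means no specific version

  // Simplified oid for this sketch only; the real RGW encoding differs.
  std::string get_oid() const {
    return instance.empty() ? name : name + ":" + instance;
  }
};

// Hypothetical stand-in for an object reference (bucket + key).
struct Obj {
  std::string bucket;
  ObjKey key;
};

struct ObjState {
  bool locked = false;
};

// Toy backing store: maps "bucket/oid" to the state of that object version.
using Store = std::map<std::string, ObjState>;

// Toy state lookup: returns -ENOENT when the oid is not in the store.
int get_obj_state(const Store& store, const Obj& obj, ObjState* state) {
  auto it = store.find(obj.bucket + "/" + obj.key.get_oid());
  if (it == store.end()) {
    return -ENOENT;
  }
  *state = it->second;
  return 0;
}

// Mirrors the Pacific snippet: the reference is rebuilt from key.name only,
// so the version (instance) is silently dropped.
Obj make_ref_name_only(const std::string& bucket, const ObjKey& key) {
  return Obj{bucket, ObjKey{key.name, ""}};
}

// Mirrors the Quincy snippet: the full key, including the version, is kept.
Obj make_ref_full_key(const std::string& bucket, const ObjKey& key) {
  return Obj{bucket, key};
}

// Usage check: one locked version exists; only the full-key reference finds it.
bool check_sketch() {
  Store store;
  store["bkt/photo.jpg:v1"].locked = true;
  ObjKey key{"photo.jpg", "v1"};
  ObjState st;
  assert(get_obj_state(store, make_ref_name_only("bkt", key), &st) == -ENOENT);
  assert(get_obj_state(store, make_ref_full_key("bkt", key), &st) == 0);
  return st.locked;
}
```

In this toy model the name-only reference never reaches the state of version v1, so the caller never sees that the version is locked. That matches the symptom described above, though the real call path in RGW is more involved.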

Comment 1 RHEL Program Management 2022-07-19 01:29:45 UTC
Please specify the severity of this bug. Severity is defined here:
https://bugzilla.redhat.com/page.cgi?id=fields.html#bug_severity.

Comment 19 errata-xmlrpc 2023-01-11 17:39:52 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: Red Hat Ceph Storage 5.3 security update and Bug Fix), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:0076