To more directly address the customer's question: locks are a way for processes running in parallel to coordinate their access to shared objects/data. We would not want multiple RGW processes to simultaneously process the same reshard log, so the first one to try acquires the lock, any others are locked out for the duration, and finally the first one releases the lock. The customer diagnosed exactly this when they wrote: "Enable rgw debug log on the first rgw node in test env, find that the error msg is logged when another RGW daemon already acquired lock for reshard.000000000x:"

The links to an analogous situation with LC (lifecycle) logs are relevant in that, although they come from a different subsystem of RGW, it's ultimately the same underlying issue. I think the best course is to mark these messages INFO rather than WARNING or ERROR, so they don't raise unnecessary concern. In that case, remaining at log level 0 would not be an issue. I'll put together a fix and target it for 5.1.

Eric
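To make the try-lock pattern concrete, here is a minimal C sketch against the librados C API. This is not the actual RGW reshard code; the function name, lock name, cookie, and lease duration are all illustrative. It shows why losing the race for the lock is expected behavior that warrants only an informational message, while other lock failures remain genuine errors:

/* Minimal sketch (C, librados) of the try-lock pattern described above.
 * Names here are illustrative only; the real RGW reshard code differs. */
#include <errno.h>
#include <stdio.h>
#include <rados/librados.h>

/* Try to take the per-shard reshard-log lock; return 0 if we got it,
 * a negative errno otherwise. Losing the race to another daemon is
 * normal operation, so it is reported at INFO level, not as an error. */
static int try_process_reshard_log(rados_ioctx_t ioctx, const char *oid)
{
    struct timeval dur = { .tv_sec = 60, .tv_usec = 0 };  /* lock lease */
    int r = rados_lock_exclusive(ioctx, oid, "reshard_lock",
                                 "my-cookie", "reshard", &dur, 0);
    if (r == -EBUSY || r == -EEXIST) {
        /* Another RGW daemon already holds the lock: expected, skip. */
        fprintf(stderr, "INFO: %s already locked by another daemon, skipping\n",
                oid);
        return r;
    }
    if (r < 0) {
        /* Anything else is a genuine error worth surfacing. */
        fprintf(stderr, "ERROR: failed to lock %s: %d\n", oid, r);
        return r;
    }

    /* ... process the reshard log entries for this shard ... */

    rados_unlock(ioctx, oid, "reshard_lock", "my-cookie");
    return 0;
}

The key design point is that the caller treats -EBUSY/-EEXIST as a skip condition rather than a failure, which is exactly why the corresponding log message belongs at INFO rather than WARNING or ERROR.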
The upstream PR to address this can be found at https://github.com/ceph/ceph/pull/40862.
The commit used from the PR linked in comment #4 is 6d3dee37791ad427a3435c493a1d7874ba075674.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: Red Hat Ceph Storage 5.1 Security, Enhancement, and Bug Fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:1174