Description of problem:
The customer is observing frequent "Connection timed out" errors in the RGW logs. As a result, jobs are cancelled and users have to rerun them. There is no apparent pattern to the timeouts; according to the customer, they occur while copying large amounts of data via RGW.

~~~
2023-05-30T06:02:26.098+0000 7f8cbf8b5700 1 req 2251681747474232490 10.014173508s int RGWSI_Notify::robust_notify(const DoutPrefixProvider*, RGWSI_RADOS::Obj&, const RGWCacheNotifyInfo&, optional_yield):402 Notify failed on object xx.rgw.meta:users.uid:xxx: (110) Connection timed out
2023-05-30T06:02:26.098+0000 7f8cbf8b5700 1 req 2251681747474232490 10.014173508s int RGWSI_Notify::robust_notify(const DoutPrefixProvider*, RGWSI_RADOS::Obj&, const RGWCacheNotifyInfo&, optional_yield):418 Invalidating obj=xx.rgw.meta:users.uid:xxx tries=0
~~~

Version-Release number of selected component (if applicable):
RHCS 5.3z3 (16.2.10-160.el8cp)
RHEL 8.4

How reproducible:
Customer environment specific

Steps to Reproduce:
NA

Actual results:
"Connection timed out" messages appear in the RGW logs.

Expected results:
No "Connection timed out" messages should appear in the RGW logs.

Additional info:
The customer is using the Hadoop credential provider for OIDC+STS authentication.
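Since no pattern has been identified so far, one way to look for one is to count the `robust_notify` failures per hour and per object. The following is only a minimal sketch, assuming the RGW log lines have the same layout as the excerpt above; the regular expression and the function name `summarize_notify_failures` are illustrative and not part of any Ceph tooling.

```python
import re
from collections import Counter

# Assumed layout, based only on the two log lines quoted in this report:
# <timestamp> <thread> <level> req <id> <elapsed>s <signature>:<line> \
#   Notify failed on object <obj>: (<errno>) <message>
NOTIFY_FAILED = re.compile(
    r"^(?P<hour>\d{4}-\d{2}-\d{2}T\d{2}):\d{2}:\d{2}\.\d+\+\d{4} "
    r".*robust_notify\(.*\):\d+ Notify failed on object (?P<obj>\S+): "
    r"\((?P<errno>\d+)\) (?P<msg>.+)$"
)


def summarize_notify_failures(lines):
    """Count 'Notify failed' entries per hour and per object.

    `lines` is any iterable of log lines, for example an open file object.
    Lines that do not match the assumed layout are skipped.
    """
    by_hour = Counter()
    by_object = Counter()
    for line in lines:
        match = NOTIFY_FAILED.match(line.strip())
        if match is None:
            continue
        by_hour[match.group("hour")] += 1
        by_object[match.group("obj")] += 1
    return by_hour, by_object
```

Passing an open RGW log file to `summarize_notify_failures` returns two counters that can be compared against the times of the large copy jobs. Only the "Notify failed" line is counted; the follow-up "Invalidating obj=" line is ignored so that each failure is counted once.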
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Red Hat Ceph Storage 7.0 Bug Fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2023:7780
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days