Bug 2214981

Summary: [CEE/sd][RGW] RGWSI_Notify::robust_notify(const DoutPrefixProvider*, RGWSI_RADOS::Obj&, const RGWCacheNotifyInfo&, optional_yield):402 Notify failed on object: (110) Connection timed out
Product: [Red Hat Storage] Red Hat Ceph Storage
Reporter: Tridibesh Chakraborty <trchakra>
Component: RGW
Assignee: Matt Benjamin (redhat) <mbenjamin>
Status: ASSIGNED
QA Contact: Madhavi Kasturi <mkasturi>
Severity: medium
Docs Contact:
Priority: unspecified
Version: 5.3
CC: aemerson, bkunal, cbodley, ceph-eng-bugs, cephqe-warriors, ckulal, mbenjamin, mkogan, vumrao
Target Milestone: ---
Flags: aemerson: needinfo-
       mbenjamin: needinfo? (mkogan)
       trchakra: needinfo? (mbenjamin)
       trchakra: needinfo? (cbodley)
Target Release: 7.0
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 2228874
Environment:
Last Closed:
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 2228874, 2228875, 2230445

Description Tridibesh Chakraborty 2023-06-14 10:41:59 UTC
Description of problem:
The customer is observing frequent "Connection timed out" errors in the RGW logs. Because of these errors, jobs are getting cancelled and users have to rerun them. There is no clear pattern to the timeouts, but according to the customer they occur while copying large amounts of data via RGW.

~~~
2023-05-30T06:02:26.098+0000 7f8cbf8b5700  1 req 2251681747474232490 10.014173508s int RGWSI_Notify::robust_notify(const DoutPrefixProvider*, RGWSI_RADOS::Obj&, const RGWCacheNotifyInfo&, optional_yield):402 Notify failed on object xx.rgw.meta:users.uid:xxx: (110) Connection timed out
2023-05-30T06:02:26.098+0000 7f8cbf8b5700  1 req 2251681747474232490 10.014173508s int RGWSI_Notify::robust_notify(const DoutPrefixProvider*, RGWSI_RADOS::Obj&, const RGWCacheNotifyInfo&, optional_yield):418 Invalidating obj=xx.rgw.meta:users.uid:xxx tries=0
~~~
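
For triage, a rough sketch of how these failures could be tallied per object from an RGW log. The regular expression is an assumption derived only from the two sample lines above (not from the RGW source), and `summarize_notify_failures` is a hypothetical helper name; adjust both if the customer's log format differs.

```python
import re
from collections import Counter

# Assumed layout, inferred from the sample lines above:
# <timestamp> <thread> <level> req <id> <elapsed>s ... robust_notify(...):<line>
#   Notify failed on object <pool:object>: (<errno>) <message>
NOTIFY_FAILED = re.compile(
    r"^(?P<ts>\S+)\s+\S+\s+\d+\s+req\s+(?P<req>\d+)\s+(?P<elapsed>[\d.]+)s\s+"
    r".*robust_notify\(.*\):\d+\s+Notify failed on object\s+(?P<obj>\S+):\s+"
    r"\((?P<errno>\d+)\)\s+(?P<msg>.*)$"
)


def summarize_notify_failures(lines):
    """Count 'Notify failed' entries per (object, errno) pair."""
    counts = Counter()
    for line in lines:
        match = NOTIFY_FAILED.match(line.strip())
        if match:
            counts[(match["obj"], int(match["errno"]))] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage: python notify_summary.py < rgw.log
    for (obj, errno), n in summarize_notify_failures(sys.stdin).most_common():
        print(f"{n:6d}  errno={errno}  {obj}")
```

Grouping by object may help show whether the timeouts concentrate on a few `users.uid` entries or are spread across many.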

Version-Release number of selected component (if applicable):
RHCS 5.3z3 (16.2.10-160.el8cp)
RHEL 8.4

How reproducible:
Specific to the customer environment

Steps to Reproduce:
NA

Actual results:
"Connection timed out" errors appear in the RGW logs.

Expected results:
There should be no "Connection timed out" messages.

Additional info:

The customer is using a Hadoop credential provider for OIDC+STS authentication.