Description of problem (please be detailed as possible and provide log snippets):
[DR] OSDs are getting OOM-killed when running IO

Version of all relevant components (if applicable):
OCP version:- 4.10.0-0.nightly-2022-03-23-153617
ODF version:- 4.10.0-208
CEPH version:- ceph version 16.2.7-76.el8cp (f4d6ada772570ae8b05c62ad79e222fbd3f04188) pacific (stable)

Does this issue impact your ability to continue to work with the product (please explain in detail what is the user impact)?
Yes. PVCs take a long time to reach the Bound state.

Is there any workaround available to the best of your knowledge?

Rate from 1 - 5 the complexity of the scenario you performed that caused this bug (1 - very simple, 5 - very complex)?

Can this issue reproducible?

Can this issue reproduce from the UI?

If this is a regression, please provide more details to justify this:

Steps to Reproduce:
1. Deploy RDR cluster
2. Run IO for 2-3 days (min. 100 PVCs/pods)
3. Check OSD pod status

Actual results:
rook-ceph-osd-0-6c4cdb77f4-p4f7m   2/2   Running   66 (18m ago)   5d4h   10.131.0.36   vmware-dccp-one-4ch4f-worker-m8w42   <none>   <none>
rook-ceph-osd-1-69fb9b6d9f-7xtjt   2/2   Running   7 (111m ago)   5d4h   10.128.2.23   vmware-dccp-one-4ch4f-worker-hxbhv   <none>   <none>
rook-ceph-osd-2-dd8fb4c89-xb77w    2/2   Running   2 (85m ago)    5d4h   10.129.2.25   vmware-dccp-one-4ch4f-worker-56xcv   <none>   <none>

Output from dmesg -T:
[Tue Mar 29 15:18:30 2022] Memory cgroup out of memory: Killed process 3671834 (ceph-osd) total-vm:6175716kB, anon-rss:5225984kB, file-rss:34368kB, shmem-rss:0kB, UID:167 pgtables:11100kB oom_score_adj:-997

Expected results:

Additional info:
$ oc adm top nodes
NAME                                 CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
vmware-dccp-one-4ch4f-master-0       760m         21%    8649Mi          58%
vmware-dccp-one-4ch4f-master-1       858m         24%    7500Mi          50%
vmware-dccp-one-4ch4f-master-2       871m         24%    9735Mi          65%
vmware-dccp-one-4ch4f-worker-56xcv   3389m        21%    8705Mi          13%
vmware-dccp-one-4ch4f-worker-hxbhv   3640m        23%    13309Mi         21%
vmware-dccp-one-4ch4f-worker-m8w42   3618m        23%    12912Mi         20%
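
For triage, the "Memory cgroup out of memory" line from dmesg above can be split into its fields to see how much resident memory ceph-osd held when it was killed. This is only a minimal Python sketch, assuming the kernel message format shown in this report; the helper name parse_oom_line is hypothetical and not part of any ODF or Ceph tooling.

```python
import re

# Sample line copied from the dmesg -T output in this report.
LINE = (
    "[Tue Mar 29 15:18:30 2022] Memory cgroup out of memory: "
    "Killed process 3671834 (ceph-osd) total-vm:6175716kB, "
    "anon-rss:5225984kB, file-rss:34368kB, shmem-rss:0kB, "
    "UID:167 pgtables:11100kB oom_score_adj:-997"
)

# Matches the killed PID and command name.
OOM_RE = re.compile(r"Killed process (?P<pid>\d+) \((?P<comm>[^)]+)\)")
# Matches every "<name>:<number>kB" field (total-vm, anon-rss, ...).
FIELD_RE = re.compile(r"(?P<key>[\w-]+):(?P<kb>\d+)kB")


def parse_oom_line(line):
    """Return pid, command and kB-valued fields, or None if not an OOM kill line."""
    m = OOM_RE.search(line)
    if m is None:
        return None
    fields = {f.group("key"): int(f.group("kb")) for f in FIELD_RE.finditer(line)}
    return {"pid": int(m.group("pid")), "comm": m.group("comm"), "kb": fields}


info = parse_oom_line(LINE)
# Resident set size is the sum of the three *-rss counters.
rss_kb = sum(info["kb"][k] for k in ("anon-rss", "file-rss", "shmem-rss"))
print(f"{info['comm']} pid={info['pid']} rss={rss_kb / 1024**2:.2f} GiB")
```

The resulting RSS can then be compared against the memory limit configured on the OSD pod, which is not stated in this report.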
Given the lack of detail on this ticket, I am going to agree that this seems to be a duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=2021931 and close this one. Please reopen it if you have additional information that conflicts with this assumption.

*** This bug has been marked as a duplicate of bug 2021931 ***