Description of problem:

[RGW] Bucket/object deletion is leaving orphaned RADOS objects behind.

The workload DFG QA cluster that we are using for RHCS 4.1 release criteria
testing (https://bugzilla.redhat.com/show_bug.cgi?id=1824263) has accumulated
a large number of orphaned RADOS objects. The cluster ran a 10-hour COSBench
workload during the RHCS 4.0 to RHCS 4.1 upgrade and almost filled up. The
team decided to free space by deleting some objects from the COSBench client,
but that did not reclaim any space, so we then deleted all the buckets with
"radosgw-admin bucket rm", which removed every bucket. We also ran
"radosgw-admin gc process --include-all", which processed all pending GC
entries (the GC list is now empty), yet the cluster still has not reclaimed
the space. Listing the data pool with "rados ls" shows that the remaining
objects are all shadow objects.

Version-Release number of selected component (if applicable):

[root@f09-h17-b05-5039ms ~]# radosgw-admin gc process --include-all
[root@f09-h17-b05-5039ms ~]# radosgw-admin gc list --include-all
[]
[root@f09-h17-b05-5039ms ~]# ceph df
RAW STORAGE:
    CLASS     SIZE        AVAIL       USED        RAW USED     %RAW USED
    hdd       567 TiB     234 TiB     333 TiB      333 TiB         58.70
    TOTAL     567 TiB     234 TiB     333 TiB      333 TiB         58.70

POOLS:
    POOL                          ID      STORED      OBJECTS     USED        %USED     MAX AVAIL
    .rgw.root                     140     1.2 KiB           4     768 KiB         0        46 TiB
    default.rgw.control           141         0 B           8         0 B         0        46 TiB
    default.rgw.meta              142       578 B           5     768 KiB         0        46 TiB
    default.rgw.log               143     512 MiB         207     1.5 GiB         0        46 TiB
    default.rgw.index             144         0 B           0         0 B         0        46 TiB
    default.rgw.buckets.data      146     192 TiB      50.00M     289 TiB     67.48        93 TiB
    default.rgw.buckets.index     147         0 B           0         0 B         0        46 TiB

[root@f09-h17-b05-5039ms ~]# radosgw-admin bucket list
[]
[root@f09-h17-b05-5039ms ~]# radosgw-admin bucket stats
[]
[root@f09-h17-b05-5039ms ~]#
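As a reference point, Nautilus-era radosgw-admin ships an orphan-scanning
facility that cross-checks RADOS objects in a data pool against bucket index
entries. A minimal sketch follows (the pool name is taken from the "ceph df"
output above; the job id is arbitrary; this was not run on the cluster above):

# Scan the data pool for RADOS objects not referenced by any bucket index.
# On a ~50M-object pool this can take a long time.
radosgw-admin orphans find --pool=default.rgw.buckets.data --job-id=orphan-scan-1

# List known scan jobs and their state.
radosgw-admin orphans list-jobs

# Remove the scan's own intermediate data when done; this does not delete
# the orphans themselves.
radosgw-admin orphans finish --job-id=orphan-scan-1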
During the upgrade, the daemons on one OSD node, along with the mons and
mgrs, were upgraded:

[root@f09-h17-b05-5039ms ~]# ceph versions
{
    "mon": {
        "ceph version 14.2.8-50.el7cp (53387608e81e6aa2487c952a604db06faa5b2cd0) nautilus (stable)": 3
    },
    "mgr": {
        "ceph version 14.2.8-50.el7cp (53387608e81e6aa2487c952a604db06faa5b2cd0) nautilus (stable)": 3
    },
    "osd": {
        "ceph version 14.2.4-51.el7cp (db63624068590e593c47150c7574d08c1ec0d3e4) nautilus (stable)": 264,
        "ceph version 14.2.8-50.el7cp (53387608e81e6aa2487c952a604db06faa5b2cd0) nautilus (stable)": 24
    },
    "mds": {},
    "rgw": {
        "ceph version 14.2.4-51.el7cp (db63624068590e593c47150c7574d08c1ec0d3e4) nautilus (stable)": 11,
        "ceph version 14.2.8-50.el7cp (53387608e81e6aa2487c952a604db06faa5b2cd0) nautilus (stable)": 1
    },
    "overall": {
        "ceph version 14.2.4-51.el7cp (db63624068590e593c47150c7574d08c1ec0d3e4) nautilus (stable)": 275,
        "ceph version 14.2.8-50.el7cp (53387608e81e6aa2487c952a604db06faa5b2cd0) nautilus (stable)": 31
    }
}
We captured a listing of the RADOS data pool:

# rados -p default.rgw.buckets.data ls > rados.list.txt
# du -sh rados.list.txt
4.2G    rados.list.txt

[root@f09-h17-b05-5039ms ~]# cat rados.list.txt | wc -l
50004910
[root@f09-h17-b05-5039ms ~]# cat rados.list.txt | grep shadow | wc -l
50004910
[root@f09-h17-b05-5039ms ~]#

The above confirms that all ~50M objects remaining in the pool are shadow
objects.
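As an additional sanity check (a sketch only; this output was not captured on
the cluster), one can confirm that no bucket instance metadata survived the
bucket removals and inspect the shape of the leftover object names:

# Should print an empty list once every bucket has been removed with
# "radosgw-admin bucket rm".
radosgw-admin metadata list bucket.instance

# Shadow objects carry the originating bucket's marker as a name prefix,
# followed by "__shadow_..."; a few sample names show the pattern.
head -5 rados.list.txt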
Here's some preliminary analysis.

The number of orphans listed in /root/rados.list.txt is 50,004,910.

All of the orphans are "shadow" objects.

It looks like all of those objects came from 5 buckets. Here is the result of
lopping off everything from "_shadow" onwards, sorting what is left, and
running it through "uniq -c" (a pipeline sketch appears after this comment):

16978508 987371de-e3d9-45cf-b9b8-3c1a19cabd59.11841.1_
 3675110 987371de-e3d9-45cf-b9b8-3c1a19cabd59.11856.1_
 6175225 987371de-e3d9-45cf-b9b8-3c1a19cabd59.11862.1_
 6151689 987371de-e3d9-45cf-b9b8-3c1a19cabd59.21284.1_
17024378 987371de-e3d9-45cf-b9b8-3c1a19cabd59.21287.1_

In the narratives above, 5 buckets/containers are mentioned, and three are
mentioned by name -- mycontainers3, mycontainers5, and mycontainers6
(sometimes without the "s" -- mycontainer6).

Some questions...

1. How many buckets were there over the life of this cluster? If only 5, why
is there a bucket named "mycontainers6"?

2. Would it be fair to say that we do not know at this point whether this is
an issue with:
  a) 4.0,
  b) 4.1, or
  c) the upgrade from 4.0 to 4.1 while the workload is running?

If that's not fair, what above answers the question?

Eric
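For reference, a minimal sketch of the pipeline described in the comment
above, assuming GNU sed/sort/uniq and the rados.list.txt capture from earlier
in this thread:

# Drop everything from "_shadow" onward, leaving each object's bucket-marker
# prefix, then count the shadow objects per marker.
sed 's/_shadow.*//' rados.list.txt | sort | uniq -c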
(In reply to J. Eric Ivancich from comment #20)

Thanks Eric. Responses inline.

> Here's some preliminary analysis.
>
> The number of orphans listed in /root/rados.list.txt is 50,004,910.
>
> All of the orphans are "shadow" objects.
>
> It looks like all of those objects came from 5 buckets. Here is the result
> of lopping off everything from "_shadow" onwards, sorting what is left, and
> running it through "uniq -c":
>
> 16978508 987371de-e3d9-45cf-b9b8-3c1a19cabd59.11841.1_
>  3675110 987371de-e3d9-45cf-b9b8-3c1a19cabd59.11856.1_
>  6175225 987371de-e3d9-45cf-b9b8-3c1a19cabd59.11862.1_
>  6151689 987371de-e3d9-45cf-b9b8-3c1a19cabd59.21284.1_
> 17024378 987371de-e3d9-45cf-b9b8-3c1a19cabd59.21287.1_
>
> In the narratives above, 5 buckets/containers are mentioned, and three are
> mentioned by name -- mycontainers3, mycontainers5, and mycontainers6
> (sometimes without the "s" -- mycontainer6).
>
> Some questions...
>
> 1. How many buckets were there over the life of this cluster? If only 5,
> why is there a bucket named "mycontainers6"?

Yes -- mycontainers6 is from the RHCS 4.1 cluster, on which Rachana
reproduced the issue; the details are in comment#15. Everything before
comment#15 is from the RHCS 4.0 cluster, which had 5 containers,
mycontainers1 through mycontainers5.

> 2. Would it be fair to say that we do not know at this point whether this
> is an issue with:
>   a) 4.0,
>   b) 4.1, or
>   c) the upgrade from 4.0 to 4.1 while the workload is running?
>
> If that's not fair, what above answers the question?

It is reproducible on both RHCS 4.0 and RHCS 4.1; comment#15 covers the
RHCS 4.1 mycontainers6 bucket.

-- vikhyat
Thank you, Vikhyat, for clarifying.

I believe I've reproduced the issue on 4.1 with a much simpler test case
(i.e., no COSBench), which will allow me to trace what's going on. I'll keep
all of you posted.

Eric
(In reply to J. Eric Ivancich from comment #22)
> Thank you, Vikhyat, for clarifying.
>
> I believe I've reproduced the issue on 4.1 with a much simpler test case
> (i.e., no COSBench), which will allow me to trace what's going on. I'll
> keep all of you posted.
>
> Eric

Thank you, Eric.
I apologize. What I thought was a reproducer is not reproducing the issue.
Back to the drawing board....

Eric
Thank you, Thomas, for getting the build out so quickly!

Eric