Bug 1333643
| Summary: | Files present in the .shard folder even after deleting all the vms from the UI | | |
| --- | --- | --- | --- |
| Product: | [Red Hat Storage] Red Hat Gluster Storage | Reporter: | RamaKasturi <knarra> |
| Component: | sharding | Assignee: | Krutika Dhananjay <kdhananj> |
| Status: | CLOSED ERRATA | QA Contact: | RamaKasturi <knarra> |
| Severity: | medium | Docs Contact: | |
| Priority: | high | | |
| Version: | rhgs-3.1 | CC: | asrivast, pkarampu, rhinduja, sabose |
| Target Milestone: | --- | Keywords: | ZStream |
| Target Release: | RHGS 3.1.3 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | glusterfs-3.7.9-6 | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | | |
| : | 1335858 (view as bug list) | Environment: | |
| Last Closed: | 2016-06-23 05:21:18 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 1258386, 1311817, 1335858, 1337837, 1337839 | | |
Description
RamaKasturi
2016-05-06 05:25:17 UTC
sosreports can be found in the link below:
===========================================
http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/HC/1333643/

RCA:

Handling of hardlinks in the sharding xlator:
The sharding translator creates hardlinks only on the base file. When the last link is deleted, it deletes all the shard files associated with that base file. So when an unlink fop reaches shard, it performs a lookup (the equivalent of stat for user applications) to find out the number of links. If the number of links is greater than 1, it just deletes that link but does not touch the shards. If the number of links is 1, it goes ahead and deletes all the shards as well.

What happens in the RHEV setup on VM deletion:
Kasturi (QE) selected all the VMs at once and performed a delete. The data of each VM is moved to a remove_me_xyz directory, and the host that owns that VM starts doing the unlinks in the background. It seems the VM images on this setup have hardlinks, and those links are shared between different hosts.

[root@sulphur vmstore-brick1]# find . -type f | xargs getfattr -d -m. -e hex | grep trusted.gfid | sort | uniq -c | grep -v " 2 "
     17 trusted.gfid=0x026de0b69bea4b148b6fb7261eb6c030
      1 trusted.gfid=0x122c529ce3ab440f87edc2cf9ef79051
      1 trusted.gfid=0x1383cf29af214231bbaafd3120fd4c27
     16 trusted.gfid=0x13fd032531be41d4a5e8cff2aff02aca
     16 trusted.gfid=0x14215e871c8245ec9a344a8bb028d86f
      1 trusted.gfid=0x14807a3c2c0b4f6b92cea4a042856971
     16 trusted.gfid=0x2b294ed3423c4e43afee404315a316ab
      1 trusted.gfid=0x445a3b7cfb444fb290352b9087dc105b
      1 trusted.gfid=0x4a21a35d946d4fe88283f3a2a29992cd
      1 trusted.gfid=0x4ae3729419894b3bbb08467e32a088e7
      1 trusted.gfid=0x655f5fdf9a59499bb3d329a7ab23ff5e
      1 trusted.gfid=0x710d1bbe94e54a60ac64d916b1df672c
      1 trusted.gfid=0x97abfcae67ff420797ad3014ce084e2e
      1 trusted.gfid=0xc5816599efb547cca48a4a58a9751844
     17 trusted.gfid=0xd939b5a7aa29453e814e32c0cf6e2608
      1 trusted.gfid=0xd94592fb54ef46dbbf3da881ac94e465
     17 trusted.gfid=0xdec2404ec7574494a78598ea4bd31381
      1 trusted.gfid=0xfff5f116ef9140078c6536f23197695f

[root@sulphur vmstore-brick1]# ls -lih .glusterfs/02/6d/026de0b6-9bea-4b14-8b6f-b7261eb6c030
469762591 -rw-rw----. 17 vdsm kvm 512M Apr 13 12:10 .glusterfs/02/6d/026de0b6-9bea-4b14-8b6f-b7261eb6c030

[root@sulphur vmstore-brick1]# find . -inum 469762591
./.glusterfs/02/6d/026de0b6-9bea-4b14-8b6f-b7261eb6c030
./297a9b9c-4396-4b30-8bfe-976a67d49a74/images/8d0b02a5-d0f1-42e4-bc10-9baa9d0d5506/fe3a369e-567b-4d46-a82d-5dd309af0f5d
./297a9b9c-4396-4b30-8bfe-976a67d49a74/images/5308d74e-28e2-4158-b2f9-af898e14e6d0/fe3a369e-567b-4d46-a82d-5dd309af0f5d
./297a9b9c-4396-4b30-8bfe-976a67d49a74/images/7ca57bac-c614-402e-9bb4-0d172d69c476/fe3a369e-567b-4d46-a82d-5dd309af0f5d
./297a9b9c-4396-4b30-8bfe-976a67d49a74/images/5ee9dc38-f4a5-41ec-bd95-878165c022b1/fe3a369e-567b-4d46-a82d-5dd309af0f5d
./297a9b9c-4396-4b30-8bfe-976a67d49a74/images/99f4c43a-dc2f-46f1-b4b4-2cd289f43757/fe3a369e-567b-4d46-a82d-5dd309af0f5d
./297a9b9c-4396-4b30-8bfe-976a67d49a74/images/bedfd824-6560-454a-9ed1-6baf3e592428/fe3a369e-567b-4d46-a82d-5dd309af0f5d
./297a9b9c-4396-4b30-8bfe-976a67d49a74/images/9965825b-3e79-434a-8eec-b85994901ce5/fe3a369e-567b-4d46-a82d-5dd309af0f5d
./297a9b9c-4396-4b30-8bfe-976a67d49a74/images/548b241a-7958-40f4-b84a-161628f4e0e5/fe3a369e-567b-4d46-a82d-5dd309af0f5d
./297a9b9c-4396-4b30-8bfe-976a67d49a74/images/394f07f9-baab-4299-ba76-a13b9d2a949f/fe3a369e-567b-4d46-a82d-5dd309af0f5d
./297a9b9c-4396-4b30-8bfe-976a67d49a74/images/6e97619b-bbfb-4856-b63c-05727f84556f/fe3a369e-567b-4d46-a82d-5dd309af0f5d
./297a9b9c-4396-4b30-8bfe-976a67d49a74/images/ff21f8ba-5038-42f6-9b29-d56d87d87b21/fe3a369e-567b-4d46-a82d-5dd309af0f5d
./297a9b9c-4396-4b30-8bfe-976a67d49a74/images/3f61feef-0eb3-4cee-875b-d2e22dc8e7e3/fe3a369e-567b-4d46-a82d-5dd309af0f5d
./297a9b9c-4396-4b30-8bfe-976a67d49a74/images/a96768b4-548c-4c9d-80ac-fb4552e97a94/fe3a369e-567b-4d46-a82d-5dd309af0f5d
./297a9b9c-4396-4b30-8bfe-976a67d49a74/images/ae9715b8-c6a7-4125-948f-877fba1a7757/fe3a369e-567b-4d46-a82d-5dd309af0f5d
./297a9b9c-4396-4b30-8bfe-976a67d49a74/images/f8de7123-c638-4ee7-9ce0-c87bad52141c/fe3a369e-567b-4d46-a82d-5dd309af0f5d
./297a9b9c-4396-4b30-8bfe-976a67d49a74/images/280f77cd-63b8-4cc9-9449-b10d1c5c36e1/fe3a369e-567b-4d46-a82d-5dd309af0f5d

When all the VMs are deleted together, different hosts can end up deleting hardlinks in parallel from their respective remove_me_xxx directories. For example, host1 can delete "./297a9b9c-4396-4b30-8bfe-976a67d49a74/images/280f77cd-63b8-4cc9-9449-b10d1c5c36e1/fe3a369e-567b-4d46-a82d-5dd309af0f5d" while host2 deletes "./297a9b9c-4396-4b30-8bfe-976a67d49a74/images/f8de7123-c638-4ee7-9ce0-c87bad52141c/fe3a369e-567b-4d46-a82d-5dd309af0f5d", and these are hardlinks of each other. At the moment each shard instance queries the link count by doing a lookup, both of them see a number greater than 1, so both go ahead and delete only their own link. One of those unlinks is in fact the last one, but shard has no clue about this because the link count it queried beforehand was greater than 1. As a result shard never deletes the shards, and that is the cause of this bug. We have also seen files with a hardlink count of '1', which suggests there is a bug even in the posix xlator: it skips deleting the gfid-link because it thinks it is not removing the last link.

Pranith
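The check-then-unlink race described above is easy to reproduce on any local filesystem, outside gluster. The script below is only an illustration of that race, not the shard xlator's code; the file names, the temporary directory, and the sleep used to widen the window are all made up for the example. Each "host" samples the link count of its own hardlink first and, because both see a count of 2, neither believes it removed the last link.

#!/bin/bash
# Minimal sketch of the check-then-unlink race (all names hypothetical,
# not taken from the affected setup).
dir=$(mktemp -d)
echo data > "$dir/base"            # stands in for the base file
ln "$dir/base" "$dir/link1"        # hardlink removed by "host1"
ln "$dir/base" "$dir/link2"        # hardlink removed by "host2"
rm "$dir/base"                     # only hardlinks remain, as with the VM images above

delete_as_host() {
    local link=$1
    # Lookup first: how many links does the file have right now?
    local nlink
    nlink=$(stat -c %h "$link")
    sleep 1                        # widen the race window for the demo
    rm "$link"
    if [ "$nlink" -le 1 ]; then
        echo "$link: was the last link, would delete the shards here"
    else
        echo "$link: nlink was $nlink, skipping shard deletion"
    fi
}

delete_as_host "$dir/link1" &      # "host1"
delete_as_host "$dir/link2" &      # "host2"
wait
ls -A "$dir"                       # empty: the file is gone, yet neither
                                   # "host" thought it removed the last link
rmdir "$dir"

In the real setup the same thing happens inside shard's unlink path: the link count is looked up before the unlink, so two racing unlinks can both observe a count greater than 1 and both skip the shard cleanup.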
Upstream patch: http://review.gluster.org/14334

Moving the bug to POST.

https://code.engineering.redhat.com/gerrit/74766 <-- d/s patch

Verified and works fine with build glusterfs-3.7.9-6.el7rhgs.x86_64. Ran the following tests:

1) Directly created VMs from a template and deleted them from the UI. All the shards under .shard corresponding to the VM images got deleted, except the shards for the template, since the template itself is not deleted.

2) Created a pool from a template, then launched VMs from it and deleted them from the UI. All the shards under .shard corresponding to the VM images got deleted, except the shards for the template, since the template itself is not deleted.

Moving this to verified; will reopen this bug if it is hit again.
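For reference, the check behind this kind of verification can be run directly on a brick: shards belonging to a base file are stored under the brick's .shard directory with names of the form <gfid-of-base-file>.<shard-number>, so once a VM image and all of its links are gone, no entries for its gfid should remain there. The commands below are only a rough sketch; the brick path is a placeholder, and the gfid is simply the 17-link example from the RCA above.

#!/bin/bash
# Rough leftover-shard check (hypothetical brick path; adjust for your setup).
BRICK=/rhgs/vmstore-brick1

# List the distinct base-file gfids that still have shards on this brick.
# After deleting all the VMs, only gfids of images that were not deleted
# (e.g. the template) are expected to remain.
ls "$BRICK/.shard" 2>/dev/null | sed 's/\.[0-9]*$//' | sort -u

# Count the leftover shards for one specific gfid (example value from above).
GFID=026de0b6-9bea-4b14-8b6f-b7261eb6c030
ls "$BRICK/.shard/$GFID".* 2>/dev/null | wc -l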
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2016:1240