Bug 1291566
Summary: | first file created after hot tier full fails to create, but later ends up as a stale erroneous file (file with ???????????)
---|---
Product: | [Red Hat Storage] Red Hat Gluster Storage
Component: | tier
Version: | rhgs-3.1
Status: | CLOSED CURRENTRELEASE
Severity: | high
Priority: | high
Reporter: | Joseph Elwin Fernandes <josferna>
Assignee: | Joseph Elwin Fernandes <josferna>
QA Contact: | Nag Pavan Chilakam <nchilaka>
CC: | dlambrig, nchilaka, rcyriac, rhs-bugs, rkavunga, sankarshan, storage-qa-internal
Keywords: | Reopened, Triaged, ZStream
Target Milestone: | ---
Target Release: | RHGS 3.1.2
Hardware: | Unspecified
OS: | Unspecified
Fixed In Version: | glusterfs-3.8rc2
Doc Type: | Bug Fix
Story Points: | ---
Clone Of: | 1289163
Clones: | 1293348
Last Closed: | 2016-06-16 13:50:45 UTC
Type: | Bug
Regression: | ---
Mount Type: | ---
Documentation: | ---
Category: | ---
oVirt Team: | ---
Cloudforms Team: | ---
Bug Depends On: | 1289163
Bug Blocks: | 1277154, 1293348

Comment 1
Vijay Bellur
2015-12-15 08:05:53 UTC
REVIEW: http://review.gluster.org/12969 (tier/dht : Properly free file descriptors during data migration) posted (#2) for review on master by Joseph Fernandes

COMMIT: http://review.gluster.org/12969 committed in master by Dan Lambright (dlambrig)
------
commit 9691ea1b203c82386ececc3c5ea9adad39304d7b
Author: Joseph Fernandes <josferna>
Date: Tue Dec 15 13:32:29 2015 +0530

    tier/dht : Properly free file descriptors during data migration

    While tier migration, free src and dst fd's when create of destination or open of source fails.

    Change-Id: I62978a669c6c9fbab5fed9df2716b9b2ba00ddf1
    BUG: 1291566
    Signed-off-by: Joseph Fernandes <josferna>
    Reviewed-on: http://review.gluster.org/12969
    Tested-by: Gluster Build System <jenkins.com>
    Tested-by: NetBSD Build System <jenkins.org>
    Reviewed-by: N Balachandran <nbalacha>
    Reviewed-by: Dan Lambright <dlambrig>
    Tested-by: Dan Lambright <dlambrig>

*** Bug 1289163 has been marked as a duplicate of this bug. ***

Following is my finding on the latest build (where the fix is supposed to be available):
[root@zod dummy]# rpm -qa|grep gluster
glusterfs-api-3.7.5-13.el7rhgs.x86_64
glusterfs-client-xlators-3.7.5-13.el7rhgs.x86_64
glusterfs-server-3.7.5-13.el7rhgs.x86_64
glusterfs-3.7.5-13.el7rhgs.x86_64
glusterfs-cli-3.7.5-13.el7rhgs.x86_64
glusterfs-debuginfo-3.7.5-12.el7rhgs.x86_64
glusterfs-libs-3.7.5-13.el7rhgs.x86_64
glusterfs-fuse-3.7.5-13.el7rhgs.x86_64
[root@zod dummy]#
I used the same steps to validate and observed the following issues
(refer to the steps mentioned at the beginning):
1) At steps 3 and 4: previously, creating a file that exceeded the disk capacity failed with an input/output error once the disk limit was hit. Now that no longer happens; instead, the mount point hangs. A CLEAR CASE OF REGRESSION.
2) Keeping that state as it is, I opened another terminal on the same client and tried step 5. This is what I see:
a) The database entry is made for the file even though the file create fails, as shown below:
[root@rhs-client1 bztest]# touch x1
touch: cannot touch ‘x1’: No space left on device
[root@rhs-client1 bztest]# ls
gogy5 gogy7 gogy8 gony leg1 leg2 leg3 leg4 leg5 new1 new2 new3 new4 tile1 tile2 tile3 tile4 x10 x2 x3 x4 x5 x6 x7 x8 x9
====>database entry is there <========================
>>>>>>>>>>>> HOTBRICK#2 <<<<<<<<==
cd1833b2-abfe-446a-8090-87abba9a7a6c|1450961884|272248|0|0|0|0|0|0|0|0
de8c1aad-8b0f-46dd-8890-9c6af4588b5e|1450962132|339551|0|0|0|0|0|0|0|0
5ba281ad-e194-4d75-a0b3-f68bd409020a|1450962204|62682|0|0|0|0|0|0|0|0
cd1833b2-abfe-446a-8090-87abba9a7a6c|00000000-0000-0000-0000-000000000001|new1|0|0
de8c1aad-8b0f-46dd-8890-9c6af4588b5e|00000000-0000-0000-0000-000000000001|x2|0|0
5ba281ad-e194-4d75-a0b3-f68bd409020a|00000000-0000-0000-0000-000000000001|x1|0|0
###############################
Thu Dec 24 18:37:48 IST 2015
/dummy/brick101/bztest_hot:
b) However, I don't see the file getting created.
Conclusion: It is a partial fix
Created attachment 1109206 [details]
qe validation log
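
Observations 2 a) and b) above amount to a cross-check between the names recorded in the database dump and the names actually listed on the mount. A minimal sketch of that check follows. It assumes that the 5-field rows of the dump are laid out as GFID|parent GFID|name|flag|flag; that layout is read off the dump in this report and is not confirmed anywhere in the thread. The function names are hypothetical.

```python
def names_in_dump(dump_lines):
    """Collect file names from the 5-field rows of the dump.

    Assumption: 5-field rows are GFID|parent GFID|name|flag|flag.
    Rows with any other field count are ignored.
    """
    names = set()
    for line in dump_lines:
        fields = line.strip().split("|")
        if len(fields) == 5:
            names.add(fields[2])
    return names


def stale_db_entries(dump_lines, listed_names):
    """Return names that have a database row but do not appear in the listing."""
    return sorted(names_in_dump(dump_lines) - set(listed_names))


# Rows copied from the HOTBRICK#2 dump in this report.
DUMP = """\
cd1833b2-abfe-446a-8090-87abba9a7a6c|1450961884|272248|0|0|0|0|0|0|0|0
de8c1aad-8b0f-46dd-8890-9c6af4588b5e|1450962132|339551|0|0|0|0|0|0|0|0
5ba281ad-e194-4d75-a0b3-f68bd409020a|1450962204|62682|0|0|0|0|0|0|0|0
cd1833b2-abfe-446a-8090-87abba9a7a6c|00000000-0000-0000-0000-000000000001|new1|0|0
de8c1aad-8b0f-46dd-8890-9c6af4588b5e|00000000-0000-0000-0000-000000000001|x2|0|0
5ba281ad-e194-4d75-a0b3-f68bd409020a|00000000-0000-0000-0000-000000000001|x1|0|0
""".splitlines()

# Output of `ls` on the client mount, copied from this report.
LISTED = (
    "gogy5 gogy7 gogy8 gony leg1 leg2 leg3 leg4 leg5 new1 new2 new3 new4 "
    "tile1 tile2 tile3 tile4 x10 x2 x3 x4 x5 x6 x7 x8 x9"
).split()

print(stale_db_entries(DUMP, LISTED))  # ['x1']
```

Note that the dump comes from a single hot brick while the listing comes from the client mount, so this only mirrors the comparison made in the comment; it is not a general consistency check.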
Moving BZ to failed_qa due to it being a partial fix.

(In reply to nchilaka from comment #8)
> 1) At steps 3 and 4: previously, creating a file that exceeded the disk
> capacity failed with an input/output error once the disk limit was hit.
> Now that no longer happens; instead, the mount point hangs.
> A CLEAR CASE OF REGRESSION.

I suggest you create a new bug for this.

> 2) Keeping that state as it is, I opened another terminal on the same client
> and tried step 5. This is what I see:
> a) The database entry is made for the file even though the file create fails
> b) However, I don't see the file getting created.
>
> Conclusion: It is a partial fix

This is a clear issue of recording in the wind path and not the unwind path. This cannot be fixed right now, as it would require substantial code changes in CTR. Bug https://bugzilla.redhat.com/show_bug.cgi?id=1289118 was deferred for the same reason.
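
The comment above attributes the leftover database entry to recording in the wind path rather than the unwind path. The difference can be illustrated with a small sketch. This is not CTR code: `FullBrick`, `create_record_on_wind`, `create_record_on_unwind` and `try_create` are hypothetical names, and a Python set stands in for the database.

```python
import errno


class FullBrick:
    """Stand-in for a brick with no free space: every create fails."""

    def create(self, name):
        raise OSError(errno.ENOSPC, "No space left on device")


def create_record_on_wind(brick, db, name):
    """Record the entry before the outcome of the create is known."""
    db.add(name)
    brick.create(name)


def create_record_on_unwind(brick, db, name):
    """Record the entry only after the create has succeeded."""
    brick.create(name)  # raises on failure, so the next line is skipped
    db.add(name)


def try_create(create_fn, brick, name):
    """Run one create attempt and return the resulting database contents."""
    db = set()
    try:
        create_fn(brick, db, name)
    except OSError:
        pass  # the caller would see "No space left on device"
    return db


print(try_create(create_record_on_wind, FullBrick(), "x1"))    # {'x1'}
print(try_create(create_record_on_unwind, FullBrick(), "x1"))  # set()
```

With recording on the way down, the failed create still leaves `x1` in the database, which matches observation 2 a). Recording only once the result comes back avoids that, at the cost of restructuring where the bookkeeping happens.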
(In reply to Joseph Elwin Fernandes from comment #11)
> > 1) At steps 3 and 4: previously, creating a file that exceeded the disk
> > capacity failed with an input/output error once the disk limit was hit.
> > Now that no longer happens; instead, the mount point hangs.
> > A CLEAR CASE OF REGRESSION.
>
> I suggest you create a new bug for this.

I was able to reproduce this issue and am looking into it. Please create a new bug for it.

Raised a new bug, "1295293 - first file created after hot tier full fails to create, but gets database entry", for the former part, which was not fixed.

As the latter part was fixed and a new bug has been raised for the former part of the problem, moving BZ to verified.

Changing the title according to my previous comment.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-0193.html

This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.8.0, please open a new bug report.

glusterfs-3.8.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://blog.gluster.org/2016/06/glusterfs-3-8-released/
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user
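
For readers who want a picture of the fix described in the commit message for review 12969 (free the source and destination file descriptors when creating the destination or opening the source fails), here is a minimal sketch of that cleanup pattern. It is written in Python for brevity and is not the GlusterFS code, which is C; `open_pair` is a hypothetical helper.

```python
import os


def open_pair(src_path, dst_path):
    """Open the source for reading and create the destination.

    If either step fails, close whatever was already opened before
    re-raising, so that no descriptor is leaked on the error path.
    """
    src_fd = dst_fd = -1
    try:
        src_fd = os.open(src_path, os.O_RDONLY)
        dst_fd = os.open(dst_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
        return src_fd, dst_fd
    except OSError:
        for fd in (src_fd, dst_fd):
            if fd >= 0:
                os.close(fd)
        raise
```

On success the caller owns both descriptors and must close them after the copy; on failure nothing stays open.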