Bug 1291566 - first file created after hot tier full fails to create, but later ends up as a stale erroneous file (file with ???????????)
Status: CLOSED CURRENTRELEASE
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: tier
Version: 3.1
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: RHGS 3.1.2
Assigned To: Joseph Elwin Fernandes
QA Contact: nchilaka
Keywords: Reopened, Triaged, ZStream
Duplicates: 1289163
Depends On: 1289163
Blocks: 1277154 1293348
Reported: 2015-12-15 03:01 EST by Joseph Elwin Fernandes
Modified: 2016-09-17 11:37 EDT

See Also:
Fixed In Version: glusterfs-3.8rc2
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 1289163
Clones: 1293348
Environment:
Last Closed: 2016-06-16 09:50:45 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments
qe validation log (100.01 KB, text/plain)
2015-12-24 08:23 EST, nchilaka
Comment 1 Vijay Bellur 2015-12-15 03:05:53 EST
REVIEW: http://review.gluster.org/12969 (tier/dht : Multiple issues in HOT-TIER full) posted (#1) for review on master by Joseph Fernandes
Comment 2 Vijay Bellur 2015-12-20 09:21:36 EST
REVIEW: http://review.gluster.org/12969 (tier/dht : Properly free file descriptors during data migration) posted (#2) for review on master by Joseph Fernandes
Comment 3 Vijay Bellur 2015-12-21 08:20:13 EST
COMMIT: http://review.gluster.org/12969 committed in master by Dan Lambright (dlambrig@redhat.com) 
------
commit 9691ea1b203c82386ececc3c5ea9adad39304d7b
Author: Joseph Fernandes <josferna@redhat.com>
Date:   Tue Dec 15 13:32:29 2015 +0530

    tier/dht : Properly free file descriptors during data migration
    
    While tier migration, free src and dst fd's when create of
    destination or open of source fails.
    
    Change-Id: I62978a669c6c9fbab5fed9df2716b9b2ba00ddf1
    BUG: 1291566
    Signed-off-by: Joseph Fernandes <josferna@redhat.com>
    Reviewed-on: http://review.gluster.org/12969
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Tested-by: NetBSD Build System <jenkins@build.gluster.org>
    Reviewed-by: N Balachandran <nbalacha@redhat.com>
    Reviewed-by: Dan Lambright <dlambrig@redhat.com>
    Tested-by: Dan Lambright <dlambrig@redhat.com>
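The cleanup rule in the commit message above (release the source and destination descriptors when the source open or the destination create fails) can be illustrated with a minimal sketch. This is not the GlusterFS code, which is C inside the tier/dht migration path. The function name `open_migration_pair` and its arguments are hypothetical, and plain POSIX descriptors stand in for Gluster fd objects.

```python
import os

def open_migration_pair(src_path, dst_path):
    """Open the source read-only and create the destination.

    Hypothetical helper: if either step fails, every descriptor that
    was already opened is closed before the error is re-raised, so a
    failed migration attempt does not leak descriptors.
    """
    src_fd = None
    dst_fd = None
    try:
        src_fd = os.open(src_path, os.O_RDONLY)
        # O_EXCL makes the create fail instead of reusing an existing file.
        dst_fd = os.open(dst_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
        return src_fd, dst_fd
    except OSError:
        # Close whatever was opened before the failure, then propagate.
        for fd in (src_fd, dst_fd):
            if fd is not None:
                os.close(fd)
        raise
```

On success the caller owns both descriptors and must close them once the data copy is finished.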
Comment 7 Vivek Agarwal 2015-12-23 01:46:32 EST
*** Bug 1289163 has been marked as a duplicate of this bug. ***
Comment 8 nchilaka 2015-12-24 08:16:44 EST
Following are my findings on the latest build (where the fix is supposed to be available):
[root@zod dummy]# rpm -qa|grep gluster
glusterfs-api-3.7.5-13.el7rhgs.x86_64
glusterfs-client-xlators-3.7.5-13.el7rhgs.x86_64
glusterfs-server-3.7.5-13.el7rhgs.x86_64
glusterfs-3.7.5-13.el7rhgs.x86_64
glusterfs-cli-3.7.5-13.el7rhgs.x86_64
glusterfs-debuginfo-3.7.5-12.el7rhgs.x86_64
glusterfs-libs-3.7.5-13.el7rhgs.x86_64
glusterfs-fuse-3.7.5-13.el7rhgs.x86_64
[root@zod dummy]# 


I used the same steps to validate and observed the following issues
(refer to the steps mentioned at the beginning):
1) At steps 3 and 4, when I create a file that exceeds the disk capacity, the write previously failed with an input/output error once the disk limit was hit. That no longer happens; instead, the mount point hangs. A CLEAR CASE OF REGRESSION

2) Keeping the state as it is, I opened another terminal on the same client and tried step 5. This is what I see:
  a) A database entry is made for the file even though the file create fails, as shown below:
[root@rhs-client1 bztest]# touch x1
touch: cannot touch ‘x1’: No space left on device
[root@rhs-client1 bztest]# ls
gogy5  gogy7  gogy8  gony  leg1  leg2  leg3  leg4  leg5  new1  new2  new3  new4  tile1  tile2  tile3  tile4  x10  x2  x3  x4  x5  x6  x7  x8  x9

====>database entry is there <========================

>>>>>>>>>>>> HOTBRICK#2 <<<<<<<<==
cd1833b2-abfe-446a-8090-87abba9a7a6c|1450961884|272248|0|0|0|0|0|0|0|0
de8c1aad-8b0f-46dd-8890-9c6af4588b5e|1450962132|339551|0|0|0|0|0|0|0|0
5ba281ad-e194-4d75-a0b3-f68bd409020a|1450962204|62682|0|0|0|0|0|0|0|0
cd1833b2-abfe-446a-8090-87abba9a7a6c|00000000-0000-0000-0000-000000000001|new1|0|0
de8c1aad-8b0f-46dd-8890-9c6af4588b5e|00000000-0000-0000-0000-000000000001|x2|0|0
5ba281ad-e194-4d75-a0b3-f68bd409020a|00000000-0000-0000-0000-000000000001|x1|0|0
###############################
Thu Dec 24 18:37:48 IST 2015
/dummy/brick101/bztest_hot:



  b) However, I don't see the file getting created.


Conclusion: this is a partial fix.
Comment 9 nchilaka 2015-12-24 08:23 EST
Created attachment 1109206
qe validation log
Comment 10 nchilaka 2015-12-28 00:32:44 EST
Moving the BZ to failed_qa because the fix is only partial.
Comment 11 Joseph Elwin Fernandes 2015-12-30 01:52:46 EST
(In reply to nchilaka from comment #8)
> Following is my finding on the latest build(where the fix is supposed to be
> available):[root@zod dummy]# rpm -qa|grep gluster
> glusterfs-api-3.7.5-13.el7rhgs.x86_64
> glusterfs-client-xlators-3.7.5-13.el7rhgs.x86_64
> glusterfs-server-3.7.5-13.el7rhgs.x86_64
> glusterfs-3.7.5-13.el7rhgs.x86_64
> glusterfs-cli-3.7.5-13.el7rhgs.x86_64
> glusterfs-debuginfo-3.7.5-12.el7rhgs.x86_64
> glusterfs-libs-3.7.5-13.el7rhgs.x86_64
> glusterfs-fuse-3.7.5-13.el7rhgs.x86_64
> [root@zod dummy]# 
> 
> 
> I used the same steps to validate and found the following
> observations/issues:
> (refer the steps mentioned at the beginning) 
> 1) at step 3 and 4, now when I create a file and it exceeds the disk
> capacity previously it used to fail out after the disk limit is hit saying
> input/output error, But now I don't see that happening, instead the mount
> point hangs there. A CLEAR CASE OF REGRESSION

  I suggest you file a new bug for this.

> 
> 2)Keeping the state as it is, I opened another terminal of same client and
> now tried step 5  and following is what i see:
>   a) the database entry is made for the file even though file create fails
> as below:
> [root@rhs-client1 bztest]# touch x1
> touch: cannot touch ‘x1’: No space left on device
> [root@rhs-client1 bztest]# ls
> gogy5  gogy7  gogy8  gony  leg1  leg2  leg3  leg4  leg5  new1  new2  new3 
> new4  tile1  tile2  tile3  tile4  x10  x2  x3  x4  x5  x6  x7  x8  x9
> 
> ====>database entry is there <========================
> 
> >>>>>>>>>>>> HOTBRICK#2 <<<<<<<<==
> cd1833b2-abfe-446a-8090-87abba9a7a6c|1450961884|272248|0|0|0|0|0|0|0|0
> de8c1aad-8b0f-46dd-8890-9c6af4588b5e|1450962132|339551|0|0|0|0|0|0|0|0
> 5ba281ad-e194-4d75-a0b3-f68bd409020a|1450962204|62682|0|0|0|0|0|0|0|0
> cd1833b2-abfe-446a-8090-87abba9a7a6c|00000000-0000-0000-0000-000000000001|new1|0|0
> de8c1aad-8b0f-46dd-8890-9c6af4588b5e|00000000-0000-0000-0000-000000000001|x2|0|0
> 5ba281ad-e194-4d75-a0b3-f68bd409020a|00000000-0000-0000-0000-000000000001|x1|0|0
> ###############################
> Thu Dec 24 18:37:48 IST 2015
> /dummy/brick101/bztest_hot:
> 
> 
> 
> b)however, I don't see the file getting created.
> 
> 
> Conclusion: It is a partial fix

 This is clearly a case of the entry being recorded in the wind path rather than the unwind path. It cannot be fixed right now, as it would require substantial code changes in CTR. Bug https://bugzilla.redhat.com/show_bug.cgi?id=1289118 was deferred for the same reason.
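The difference between recording on the wind path and on the unwind path can be shown with a toy model. This is only a sketch and not CTR code: the set `db`, the helper `failing_create`, and both `create_recording_on_*` functions are hypothetical stand-ins, and the sketch assumes "wind" means the request path before the outcome is known and "unwind" means the response path after it.

```python
import errno

def failing_create(name):
    """Stand-in for a create that fails because the hot tier is full."""
    raise OSError(errno.ENOSPC, "No space left on device")

def create_recording_on_wind(db, name, do_create):
    """Record first, then attempt the create (request path)."""
    db.add(name)       # the entry exists before the outcome is known
    do_create(name)    # if this raises, the entry above is left behind

def create_recording_on_unwind(db, name, do_create):
    """Attempt the create first, record only after it succeeded (response path)."""
    do_create(name)    # a failure propagates before anything is recorded
    db.add(name)

def try_create(create_fn, name):
    """Run one create against a fresh record set and return that set."""
    db = set()
    try:
        create_fn(db, name, failing_create)
    except OSError as exc:
        print(f"{create_fn.__name__}: {exc.strerror}; recorded = {sorted(db)}")
    return db
```

Calling `try_create(create_recording_on_wind, "x1")` leaves `x1` in the record set even though the create failed, which mirrors the stale database row reported in comment 8. The unwind variant leaves the set empty.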
Comment 12 Joseph Elwin Fernandes 2015-12-31 05:11:30 EST
(In reply to Joseph Elwin Fernandes from comment #11)
> (In reply to nchilaka from comment #8)
> > Following is my finding on the latest build(where the fix is supposed to be
> > available):[root@zod dummy]# rpm -qa|grep gluster
> > glusterfs-api-3.7.5-13.el7rhgs.x86_64
> > glusterfs-client-xlators-3.7.5-13.el7rhgs.x86_64
> > glusterfs-server-3.7.5-13.el7rhgs.x86_64
> > glusterfs-3.7.5-13.el7rhgs.x86_64
> > glusterfs-cli-3.7.5-13.el7rhgs.x86_64
> > glusterfs-debuginfo-3.7.5-12.el7rhgs.x86_64
> > glusterfs-libs-3.7.5-13.el7rhgs.x86_64
> > glusterfs-fuse-3.7.5-13.el7rhgs.x86_64
> > [root@zod dummy]# 
> > 
> > 
> > I used the same steps to validate and found the following
> > observations/issues:
> > (refer the steps mentioned at the beginning) 
> > 1) at step 3 and 4, now when I create a file and it exceeds the disk
> > capacity previously it used to fail out after the disk limit is hit saying
> > input/output error, But now I don't see that happening, instead the mount
> > point hangs there. A CLEAR CASE OF REGRESSION
> 
>   Suggest you to create a new bug for this.

I was able to reproduce this issue and am looking into it. Please file a new bug for it.
Comment 13 nchilaka 2016-01-04 01:30:44 EST
Raised a new bug, "1295293 - first file created after hot tier full fails to create, but gets database entry", for the former part, which was not fixed.

As the latter part was fixed and a new bug has been raised for the former part of the problem, moving the BZ to verified.
Comment 14 nchilaka 2016-01-04 01:31:44 EST
Changing the title according to my previous comment.
Comment 16 errata-xmlrpc 2016-03-01 01:03:59 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-0193.html
Comment 17 Niels de Vos 2016-06-16 09:50:45 EDT
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.8.0, please open a new bug report.

glusterfs-3.8.0 has been announced on the Gluster mailing lists [1]. Packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://blog.gluster.org/2016/06/glusterfs-3-8-released/
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user
