Bug 1259079 - Data Tiering:3.7.0:data loss:detach-tier not flushing data to cold-tier
Status: CLOSED CURRENTRELEASE
Product: GlusterFS
Classification: Community
Component: tiering
Version: 3.7.5
Hardware: Unspecified Linux
Priority: urgent Severity: urgent
Assigned To: Dan Lambright
bugs@gluster.org
Keywords: Triaged
Depends On: 1220047 1222088
Blocks: qe_tracker_everglades 1219513 1227485 1260923
Reported: 2015-09-01 19:07 EDT by Dan Lambright
Modified: 2015-10-30 13:32 EDT (History)

See Also:
Fixed In Version: glusterfs-3.7.5
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 1222088
Environment:
Last Closed: 2015-10-14 06:30:01 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Description Dan Lambright 2015-09-01 19:07:34 EDT
+++ This bug was initially created as a clone of Bug #1222088 +++

+++ This bug was initially created as a clone of Bug #1219513 +++

+++ This bug was initially created as a clone of Bug #1205540 +++

Description of problem:
=======================
In a tiered volume, detaching a tier completes successfully but doesn't flush data to the cold tier.
This leads to data loss.


Version-Release number of selected component (if applicable):
============================================================
3.7 upstream nightlies build http://download.gluster.org/pub/gluster/glusterfs/nightly/glusterfs/epel-6-x86_64/glusterfs-3.7dev-0.777.git2308c07.autobuild/


How reproducible:
=================
Easy to reproduce


Steps to Reproduce:
==================
1. Create a gluster volume (I created a distribute volume) and start it.
2. Attach a tier to the volume using attach-tier.
3. Write some files to the volume. All files (if sufficient space is available) will be written to the hot tier.
4. Detach the tier using the detach-tier command.
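
The steps above can be sketched as a script. This is only an illustration, not the exact commands the reporter ran: the brick paths are copied from the CLI log further down in this report, the `attach-tier`/`detach-tier` syntax is the one shown in that log (it may differ in other releases), and the `RUN`, `VOL`, `COLD_BRICKS` and `HOT_BRICKS` variables are names made up for this sketch. By default it is a dry run that only prints the commands.

```shell
#!/bin/sh
# Reproduction sketch. RUN defaults to "echo" (dry run: commands are printed,
# not executed). Set RUN="" on a disposable test cluster to really run them.
RUN="${RUN-echo}"
VOL="${VOL:-vol1}"

# Brick lists copied from the CLI log in this report (assumed test hosts).
COLD_BRICKS="rhs-client44:/pavanbrick1/vol1/b1 rhs-client38:/pavanbrick1/vol1/b1 rhs-client37:/pavanbrick1/vol1/b1"
HOT_BRICKS="rhs-client44:/pavanbrick2/vol1_hot/hb1 rhs-client37:/pavanbrick2/vol1_hot/hb1"

repro() {
    # Step 1: create and start a distribute volume.
    $RUN gluster volume create "$VOL" $COLD_BRICKS
    $RUN gluster volume start "$VOL"
    # Step 2: attach the hot tier.
    $RUN gluster volume attach-tier "$VOL" $HOT_BRICKS
    # Step 3: write files through a client mount (left out of this sketch).
    # Step 4: detach the tier, then inspect the volume.
    $RUN gluster volume detach-tier "$VOL"
    $RUN gluster volume info "$VOL"
}

repro
```

In dry-run mode the script prints five `gluster volume ...` lines, one per command, in the order of the steps.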


Actual results:
===============
When we detach the tier, it is detached without flushing the data in the hot tier to the cold tier. This causes data loss.

Expected results:
================
Detach tier should succeed only after all data is flushed to cold tier.
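
One rough way to check this expectation is to count the user files still sitting on a hot-tier brick before and after the detach. The sketch below is an assumption-laden helper, not part of GlusterFS: `count_user_files` and `HOT_BRICK` are hypothetical names, the default path is the hot brick from the log in this report, and the count skips the brick's internal `.glusterfs` directory. It may overcount, since it does not try to tell DHT link files from real data, and it miscounts file names that contain newlines.

```shell
#!/bin/sh
# Count regular files under a brick directory, skipping any directory named
# .glusterfs (GlusterFS's internal metadata tree on the brick).
count_user_files() {
    find "$1" -name .glusterfs -prune -o -type f -print | wc -l | tr -d ' '
}

# Example use on a hot-tier server. HOT_BRICK is a hypothetical variable;
# the default is the hot brick path that appears in this report's log.
HOT_BRICK="${HOT_BRICK:-/pavanbrick2/vol1_hot/hb1}"
if [ -d "$HOT_BRICK" ]; then
    echo "$HOT_BRICK: $(count_user_files "$HOT_BRICK") file(s) still on hot brick"
fi
```

If the count on a hot brick is non-zero when the detach reports success, and the same files are not visible from a client mount afterwards, the data was not flushed.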


Additional info(CLI logs):
===============
[root@rhs-client44 everglades]# gluster v info vol1
 
Volume Name: vol1
Type: Distribute
Volume ID: 3382e788-ee37-4d6c-b214-8469ca68e376
Status: Started
Number of Bricks: 3
Transport-type: tcp
Bricks:
Brick1: rhs-client44:/pavanbrick1/vol1/b1
Brick2: rhs-client38:/pavanbrick1/vol1/b1
Brick3: rhs-client37:/pavanbrick1/vol1/b1
[root@rhs-client44 everglades]# gluster v status vol1
Status of volume: vol1
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick rhs-client44:/pavanbrick1/vol1/b1     49152     0          Y       29969
Brick rhs-client38:/pavanbrick1/vol1/b1     49152     0          Y       30514
Brick rhs-client37:/pavanbrick1/vol1/b1     49152     0          Y       29475
NFS Server on localhost                     2049      0          Y       29993
NFS Server on rhs-client38                  2049      0          Y       30538
NFS Server on rhs-client37                  2049      0          Y       29499
 
Task Status of Volume vol1
------------------------------------------------------------------------------
There are no active volume tasks
 
[root@rhs-client44 everglades]# gluster v attach-tier vol1 rhs-client44:/pavanbrick2/vol1_hot/hb1 rhs-client37:/pavanbrick2/vol1_hot/hb1
volume add-brick: success
[root@rhs-client44 everglades]# gluster v info vol1
 
Volume Name: vol1
Type: Tier
Volume ID: 3382e788-ee37-4d6c-b214-8469ca68e376
Status: Started
Number of Bricks: 5 x 1 = 5
Transport-type: tcp
Bricks:
Brick1: rhs-client37:/pavanbrick2/vol1_hot/hb1
Brick2: rhs-client44:/pavanbrick2/vol1_hot/hb1
Brick3: rhs-client44:/pavanbrick1/vol1/b1
Brick4: rhs-client38:/pavanbrick1/vol1/b1
Brick5: rhs-client37:/pavanbrick1/vol1/b1



[root@rhs-client44 everglades]# gluster v detach-tier vol1
volume remove-brick unknown: success
[root@rhs-client44 everglades]# gluster v info vol1
 
Volume Name: vol1
Type: Distribute
Volume ID: 3382e788-ee37-4d6c-b214-8469ca68e376
Status: Started
Number of Bricks: 3
Transport-type: tcp
Bricks:
Brick1: rhs-client44:/pavanbrick1/vol1/b1
Brick2: rhs-client38:/pavanbrick1/vol1/b1
Brick3: rhs-client37:/pavanbrick1/vol1/b1

--- Additional comment from Anand Avati on 2015-04-01 18:56:08 EDT ---

REVIEW: http://review.gluster.org/10108 (glusterd: WIP support for tier volumes 'detach start' and 'detach commit') posted (#1) for review on master by Dan Lambright (dlambrig@redhat.com)

--- Additional comment from Anand Avati on 2015-04-07 07:14:48 EDT ---

REVIEW: http://review.gluster.org/10108 (glusterd: WIP support for tier volumes 'detach start' and 'detach commit') posted (#2) for review on master by Dan Lambright (dlambrig@redhat.com)

--- Additional comment from Anand Avati on 2015-04-09 06:08:37 EDT ---

REVIEW: http://review.gluster.org/10108 (glusterd: WIP support for tier volumes 'detach start' and 'detach commit') posted (#3) for review on master by Dan Lambright (dlambrig@redhat.com)

--- Additional comment from Anand Avati on 2015-04-14 00:12:50 EDT ---

REVIEW: http://review.gluster.org/10108 (glusterd: support for tier volumes 'detach start' and 'detach commit') posted (#4) for review on master by Dan Lambright (dlambrig@redhat.com)

--- Additional comment from Anand Avati on 2015-04-16 06:19:59 EDT ---

REVIEW: http://review.gluster.org/10108 (glusterd: support for tier volumes 'detach start' and 'detach commit') posted (#5) for review on master by Dan Lambright (dlambrig@redhat.com)

--- Additional comment from Anand Avati on 2015-04-18 07:59:25 EDT ---

REVIEW: http://review.gluster.org/10108 (glusterd: support for tier volumes 'detach start' and 'detach commit') posted (#6) for review on master by Dan Lambright (dlambrig@redhat.com)

--- Additional comment from Anand Avati on 2015-04-21 16:52:11 EDT ---

REVIEW: http://review.gluster.org/10108 (glusterd: support for tier volumes 'detach start' and 'detach commit') posted (#7) for review on master by Dan Lambright (dlambrig@redhat.com)

--- Additional comment from Anand Avati on 2015-04-22 06:20:24 EDT ---

REVIEW: http://review.gluster.org/10108 (glusterd: support for tier volumes 'detach start' and 'detach commit') posted (#8) for review on master by Dan Lambright (dlambrig@redhat.com)

--- Additional comment from Anand Avati on 2015-04-22 10:39:46 EDT ---

REVIEW: http://review.gluster.org/10108 (glusterd: support for tier volumes 'detach start' and 'detach commit') posted (#9) for review on master by Kaleb KEITHLEY (kkeithle@redhat.com)

--- Additional comment from Anand Avati on 2015-04-22 10:51:06 EDT ---

COMMIT: http://review.gluster.org/10108 committed in master by Kaleb KEITHLEY (kkeithle@redhat.com) 
------
commit 86b02afab780e559e82399b9e96381d8df594ed6
Author: Dan Lambright <dlambrig@redhat.com>
Date:   Mon Apr 13 02:42:12 2015 +0100

    glusterd: support for tier volumes 'detach start' and 'detach commit'
    
    These commands work in a manner analagous to rebalancing when removing a
    brick. The existing migration daemon detects "detach start" and switches
    to moving data off the hot tier. While in this state all lookups are
    directed to the cold tier.
    
    gluster v detach-tier <vol> start
    gluster v detach-tier <vol> commit
    
    The status and stop cli commands shall be submitted separately.
    
    Change-Id: I24fda5cc3ba74f5fb8aa9a3234ad51f18b80a8a0
    BUG: 1205540
    Signed-off-by: Dan Lambright <dlambrig@redhat.com>
    Signed-off-by: root <root@localhost.localdomain>
    Signed-off-by: Dan Lambright <dlambrig@redhat.com>
    Reviewed-on: http://review.gluster.org/10108
    Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
    Tested-by: NetBSD Build System

--- Additional comment from Anand Avati on 2015-05-07 10:42:16 EDT ---

REVIEW: http://review.gluster.org/10647 (glusterd: support for tier volumes 'detach start' and 'detach commit') posted (#1) for review on release-3.7 by Dan Lambright (dlambrig@redhat.com)

--- Additional comment from Anand Avati on 2015-05-08 13:26:07 EDT ---

REVIEW: http://review.gluster.org/10647 (glusterd: support for tier volumes 'detach start' and 'detach commit') posted (#2) for review on release-3.7 by Dan Lambright (dlambrig@redhat.com)

--- Additional comment from Anoop on 2015-05-13 08:36:01 EDT ---

Reproduced this on the BETA2 build too, hence moving it to ASSIGNED.

--- Additional comment from Dan Lambright on 2015-05-13 17:12:36 EDT ---

Well, the DHT rebalance code changed (parallel rebalancing performance fix). When it changed, tiering broke. I see the problem and have discussed it with Du; a fix should come in shortly.

--- Additional comment from Anand Avati on 2015-05-15 13:54:39 EDT ---

REVIEW: http://review.gluster.org/10795 (cluster/tier: make attach/detach work with new rebalance logic) posted (#1) for review on master by Dan Lambright (dlambrig@redhat.com)

--- Additional comment from Anand Avati on 2015-05-18 09:45:37 EDT ---

REVIEW: http://review.gluster.org/10795 (cluster/tier: make attach/detach work with new rebalance logic) posted (#2) for review on master by Shyamsundar Ranganathan (srangana@redhat.com)

--- Additional comment from Anand Avati on 2015-05-22 16:48:30 EDT ---

REVIEW: http://review.gluster.org/10795 (cluster/tier: make attach/detach work with new rebalance logic) posted (#3) for review on master by Dan Lambright (dlambrig@redhat.com)

--- Additional comment from Anand Avati on 2015-06-02 07:27:43 EDT ---

COMMIT: http://review.gluster.org/10795 committed in master by Vijay Bellur (vbellur@redhat.com) 
------
commit 5a66d1e6186acfb15e9957b5f196659da8f3cf6d
Author: Dan Lambright <dlambrig@redhat.com>
Date:   Fri May 15 13:37:24 2015 -0400

    cluster/tier: make attach/detach work with new rebalance logic
    
    The new rebalance performance improvements added new
    datastructures which were not initialized in the
    tier case. Function dht_find_local_subvol_cbk() needs
    to accept a list built by lower level DHT translators
    in order to build the local subvolumes list.
    
    Change-Id: Iab03fc8e7fadc22debc08cd5bc781b9e3e270497
    BUG: 1222088
    Signed-off-by: Dan Lambright <dlambrig@redhat.com>
    Reviewed-on: http://review.gluster.org/10795
    Tested-by: NetBSD Build System <jenkins@build.gluster.org>
    Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
Comment 1 Vijay Bellur 2015-09-02 12:19:26 EDT
REVIEW: http://review.gluster.org/12085 (cluster/tier: make attach/detach work with new rebalance logic) posted (#2) for review on release-3.7 by Dan Lambright (dlambrig@redhat.com)
Comment 2 Vijay Bellur 2015-09-02 13:24:39 EDT
COMMIT: http://review.gluster.org/12085 committed in release-3.7 by Dan Lambright (dlambrig@redhat.com) 
------
commit 9f27ef94827e5b73276887011153633291549cda
Author: Dan Lambright <dlambrig@redhat.com>
Date:   Tue Sep 1 20:26:15 2015 -0400

    cluster/tier: make attach/detach work with new rebalance logic
    
    This is a backport of 10795.
    
    > The new rebalance performance improvements added new
    > datastructures which were not initialized in the
    > tier case. Function dht_find_local_subvol_cbk() needs
    > to accept a list built by lower level DHT translators
    > in order to build the local subvolumes list.
    
    > Change-Id: Iab03fc8e7fadc22debc08cd5bc781b9e3e270497
    > BUG: 1222088
    > Signed-off-by: Dan Lambright <dlambrig@redhat.com>
    > Reviewed-on: http://review.gluster.org/10795
    > Tested-by: NetBSD Build System <jenkins@build.gluster.org>
    > Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
    
    Change-Id: Icbd51c96ae4d367d1edf41cdd0edb35095195699
    BUG: 1259079
    Signed-off-by: Dan Lambright <dlambrig@redhat.com>
    Reviewed-on: http://review.gluster.org/12085
    Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
Comment 3 Pranith Kumar K 2015-10-14 06:30:01 EDT
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-glusterfs-3.7.5, please open a new bug report.

glusterfs-glusterfs-3.7.5 has been announced on the Gluster mailinglists [1], packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailinglist [2] and the update infrastructure for your distribution.

[1] http://www.gluster.org/pipermail/gluster-users/2015-October/023968.html
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user
Comment 4 Pranith Kumar K 2015-10-14 06:38:41 EDT
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.7.5, please open a new bug report.

glusterfs-3.7.5 has been announced on the Gluster mailinglists [1], packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailinglist [2] and the update infrastructure for your distribution.

[1] http://www.gluster.org/pipermail/gluster-users/2015-October/023968.html
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user