Bug 1024369 - Unable to shrink volumes without dataloss
Summary: Unable to shrink volumes without dataloss
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: distribute
Version: pre-release
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Nagaprasad Sathyanarayana
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2013-10-29 14:17 UTC by Lukas Bezdicka
Modified: 2016-02-18 00:20 UTC (History)

Fixed In Version: 3.4.2
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-01-15 06:13:52 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:


Attachments
test-rebalance.log (202.38 KB, application/octet-stream)
2013-10-29 17:24 UTC, Joe Julian

Description Lukas Bezdicka 2013-10-29 14:17:50 UTC
Description of problem:
Running gluster volume remove-brick on a distributed-replicate volume results in data loss for clients.

Version-Release number of selected component (if applicable):
3.4.0
3.4.1

How reproducible:
always

Steps to Reproduce:
1. yes | gluster volume create test replica 2 servserv.generals.ea.com:/mnt/gluster/test1 servserv.generals.ea.com:/mnt/gluster/test2 servserv.generals.ea.com:/mnt/gluster/test3 servserv.generals.ea.com:/mnt/gluster/test4 servserv.generals.ea.com:/mnt/gluster/test5 servserv.generals.ea.com:/mnt/gluster/test6 ; gluster volume start test

2. mount -t glusterfs servserv.generals.ea.com:test /media/test/

3. cd /media/test ; git clone https://git.fedorahosted.org/git/freeipa.git

4. find /media/test | wc -l

5. gluster volume info
 
Volume Name: test
Type: Distributed-Replicate
Volume ID: 5467b9fe-9a3c-4850-8449-280ee9789c11
Status: Started
Number of Bricks: 3 x 2 = 6
Transport-type: tcp
Bricks:
Brick1: servserv.generals.ea.com:/mnt/gluster/test1
Brick2: servserv.generals.ea.com:/mnt/gluster/test2
Brick3: servserv.generals.ea.com:/mnt/gluster/test3
Brick4: servserv.generals.ea.com:/mnt/gluster/test4
Brick5: servserv.generals.ea.com:/mnt/gluster/test5
Brick6: servserv.generals.ea.com:/mnt/gluster/test6

6. gluster volume remove-brick test servserv.generals.ea.com:/mnt/gluster/test6  servserv.generals.ea.com:/mnt/gluster/test5 start

7. gluster volume remove-brick test servserv.generals.ea.com:/mnt/gluster/test6  servserv.generals.ea.com:/mnt/gluster/test5 status
                                    Node Rebalanced-files          size       scanned      failures       skipped         status run-time in secs
                               ---------      -----------   -----------   -----------   -----------   -----------   ------------   --------------
                               localhost              341        17.3MB          1331             0      completed           678.00

8. gluster volume remove-brick test servserv.generals.ea.com:/mnt/gluster/test6  servserv.generals.ea.com:/mnt/gluster/test5 commit
Removing brick(s) can result in data loss. Do you want to Continue? (y/n) y

9. find /media/test | wc -l
977

Actual results:
Files are missing from the mount after the commit; find now reports only 977 entries.

Expected results:
Shrinking a volume should not lose any data. As it stands, this makes gluster quite unusable for any production setup.
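
A minimal sketch of the before/after check from steps 4 and 9, so the two counts can be compared automatically instead of by eye. The function names count_entries and compare_counts are hypothetical helpers, not part of gluster; the mount point is the one used in the steps above.

```shell
#!/bin/sh
# Count every entry under a directory, the same way steps 4 and 9 do.
count_entries() {
    find "$1" | wc -l | tr -d ' '
}

# Compare two counts. Prints a warning and returns non-zero when they differ.
compare_counts() {
    before=$1
    after=$2
    if [ "$before" -ne "$after" ]; then
        echo "entry count changed: $before -> $after" >&2
        return 1
    fi
    echo "entry count unchanged: $after"
}

# Intended use (not run here, it needs the mounted volume):
#   before=$(count_entries /media/test)
#   ... remove-brick start / status / commit ...
#   compare_counts "$before" "$(count_entries /media/test)"
```

A matching count only shows that no entries vanished from the client's view; it does not verify file contents.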

Comment 1 Joe Julian 2013-10-29 17:23:57 UTC
It's migrating the wrong dht subvolume! See attachment.

Comment 2 Joe Julian 2013-10-29 17:24:34 UTC
Created attachment 817145
test-rebalance.log

Comment 3 Joe Julian 2013-10-29 17:58:33 UTC
Meh, I jumped to conclusions again. It's not. It's just allowing the migration of files TO the decommissioned subvolume.

Comment 4 Lukas Bezdicka 2013-10-30 12:24:43 UTC
Note the make-doc file: in all the tests I ran, make-doc was on the replica pair being decommissioned, and the migration didn't even mention it.

Comment 5 Brian Cipriano 2013-10-30 16:10:08 UTC
Confirming that I have the same issue. Files appear to be migrated to the decommissioned subvolume, then after I run "commit", the files are gone.
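
One way to spot this before running "commit" is to look directly at the bricks being removed and list any data that is still stored there. The sketch below is a heuristic only: leftover_files is a hypothetical helper, it skips the brick's internal .glusterfs directory, and it assumes that any non-empty regular file left on a decommissioned brick is data that was not migrated away.

```shell
#!/bin/sh
# List non-empty regular files still stored under a brick directory,
# skipping the brick's internal .glusterfs directory.
leftover_files() {
    brick=${1%/}   # drop a trailing slash so the -path pattern matches
    find "$brick" -path "$brick/.glusterfs" -prune -o -type f -size +0 -print
}

# Intended use on the server, after "status" reports completed
# (not run here, it needs the real bricks):
#   leftover_files /mnt/gluster/test5
#   leftover_files /mnt/gluster/test6
```

If either call prints anything, committing the remove-brick would likely drop those files from the volume.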

Comment 6 Lukas Bezdicka 2013-10-31 15:53:43 UTC
Note that one can also reproduce the issues from bug #966848 and bug #1025404 by running rm -rf /media/test/freeipa after step 9.

Comment 7 Vijay Bellur 2013-11-01 11:59:42 UTC
Does the same behavior exist on upstream master? There have been several related fixes in dht in master and I would like to determine if master does have the same problem.

Comment 8 Lukas Bezdicka 2013-11-05 17:24:33 UTC
I built gluster from master and I can confirm that the issue is not there. It would be nice to track down the relevant commits and backport them to the 3.4 branch, as 3.5 is far away.

Test:
[root@potwora test]# rpm -qa | grep gluster
glusterfs-fuse-3git-1.fc20.x86_64
glusterfs-cli-3git-1.fc20.x86_64
glusterfs-rdma-3git-1.fc20.x86_64
glusterfs-api-3git-1.fc20.x86_64
glusterfs-3git-1.fc20.x86_64
glusterfs-api-devel-3git-1.fc20.x86_64
glusterfs-libs-3git-1.fc20.x86_64
glusterfs-server-3git-1.fc20.x86_64
glusterfs-devel-3git-1.fc20.x86_64
glusterfs-debuginfo-3git-1.fc20.x86_64
glusterfs-geo-replication-3git-1.fc20.x86_64
glusterfs-regression-tests-3git-1.fc20.x86_64

[root@potwora ~]# yes | gluster volume stop test force ; yes | gluster volume del test ; rm -rf /mnt/gluster/test* ; yes | gluster volume create test replica 2 servserv.generals.ea.com:/mnt/gluster/test1 servserv.generals.ea.com:/mnt/gluster/test2 servserv.generals.ea.com:/mnt/gluster/test3 servserv.generals.ea.com:/mnt/gluster/test4 servserv.generals.ea.com:/mnt/gluster/test5 servserv.generals.ea.com:/mnt/gluster/test6 force ; gluster volume start test
volume stop: test: failed: Volume test does not exist
Stopping volume will make its data inaccessible. Do you want to continue? (y/n) volume delete: test: failed: Volume test does not exist
Deleting volume will erase all information about the volume. Do you want to continue? (y/n) Multiple bricks of a replicate volume are present on the same server. This setup is not optimal.
Do you still want to continue creating the volume?  (y/n) volume create: test: success: please start the volume to access data
volume start: test: success
[root@potwora ~]# gluster volume info
 
Volume Name: test
Type: Distributed-Replicate
Volume ID: a0883ff4-a6b0-4caa-9fce-928616ca362e
Status: Started
Number of Bricks: 3 x 2 = 6
Transport-type: tcp
Bricks:
Brick1: servserv.generals.ea.com:/mnt/gluster/test1
Brick2: servserv.generals.ea.com:/mnt/gluster/test2
Brick3: servserv.generals.ea.com:/mnt/gluster/test3
Brick4: servserv.generals.ea.com:/mnt/gluster/test4
Brick5: servserv.generals.ea.com:/mnt/gluster/test5
Brick6: servserv.generals.ea.com:/mnt/gluster/test6
[root@potwora ~]# gluster volume status
Status of volume: test
Gluster process						Port	Online	Pid
------------------------------------------------------------------------------
Brick servserv.generals.ea.com:/mnt/gluster/test1	49152	Y	20680
Brick servserv.generals.ea.com:/mnt/gluster/test2	49153	Y	20692
Brick servserv.generals.ea.com:/mnt/gluster/test3	49154	Y	20703
Brick servserv.generals.ea.com:/mnt/gluster/test4	49155	Y	20714
Brick servserv.generals.ea.com:/mnt/gluster/test5	49156	Y	20725
Brick servserv.generals.ea.com:/mnt/gluster/test6	49157	Y	20736
NFS Server on localhost					2049	Y	20750
Self-heal Daemon on localhost				N/A	Y	20754
 
Task Status of Volume test
------------------------------------------------------------------------------
There are no active volume tasks
 
[root@potwora ~]# mkdir /media/test
[root@potwora ~]# cd /media/test/
[root@potwora test]# git clone https://git.fedorahosted.org/git/freeipa.git
Cloning into 'freeipa'...
remote: Counting objects: 58018, done.
remote: Compressing objects: 100% (18280/18280), done.
remote: Total 58018 (delta 47755), reused 48634 (delta 39585)
Receiving objects: 100% (58018/58018), 12.92 MiB | 805.00 KiB/s, done.
Resolving deltas: 100% (47755/47755), done.
Checking connectivity... done
Checking out files: 100% (1171/1171), done.
[root@potwora test]# find ./ | wc -l
1323
[root@potwora test]# cd
[root@potwora ~]# gluster volume remove-brick test servserv.generals.ea.com:/mnt/gluster/test6  servserv.generals.ea.com:/mnt/gluster/test5 start
volume remove-brick start: success
ID: 36624241-2555-4c4a-9109-21b4ecae98fc
[root@potwora ~]# gluster volume remove-brick test servserv.generals.ea.com:/mnt/gluster/test6  servserv.generals.ea.com:/mnt/gluster/test5 status
                                    Node Rebalanced-files          size       scanned      failures       skipped         status run-time in secs
                               ---------      -----------   -----------   -----------   -----------   -----------   ------------   --------------
                               localhost              173        16.2MB           432             0             0    in progress             3.00
[root@potwora ~]# gluster volume remove-brick test servserv.generals.ea.com:/mnt/gluster/test6  servserv.generals.ea.com:/mnt/gluster/test5 status
                                    Node Rebalanced-files          size       scanned      failures       skipped         status run-time in secs
                               ---------      -----------   -----------   -----------   -----------   -----------   ------------   --------------
                               localhost              339        18.7MB           870             0             0    in progress             5.00
[root@potwora ~]# gluster volume remove-brick test servserv.generals.ea.com:/mnt/gluster/test6  servserv.generals.ea.com:/mnt/gluster/test5 status
                                    Node Rebalanced-files          size       scanned      failures       skipped         status run-time in secs
                               ---------      -----------   -----------   -----------   -----------   -----------   ------------   --------------
                               localhost              413        19.8MB          1084             0             0    in progress             6.00
[root@potwora ~]# gluster volume remove-brick test servserv.generals.ea.com:/mnt/gluster/test6  servserv.generals.ea.com:/mnt/gluster/test5 status
                                    Node Rebalanced-files          size       scanned      failures       skipped         status run-time in secs
                               ---------      -----------   -----------   -----------   -----------   -----------   ------------   --------------
                               localhost              489        20.6MB          1234             0             0      completed             7.00
[root@potwora ~]# gluster volume remove-brick test servserv.generals.ea.com:/mnt/gluster/test6  servserv.generals.ea.com:/mnt/gluster/test5 status
                                    Node Rebalanced-files          size       scanned      failures       skipped         status run-time in secs
                               ---------      -----------   -----------   -----------   -----------   -----------   ------------   --------------
                               localhost              489        20.6MB          1234             0             0      completed             7.00
[root@potwora ~]# gluster volume remove-brick test servserv.generals.ea.com:/mnt/gluster/test6  servserv.generals.ea.com:/mnt/gluster/test5 status
                                    Node Rebalanced-files          size       scanned      failures       skipped         status run-time in secs
                               ---------      -----------   -----------   -----------   -----------   -----------   ------------   --------------
                               localhost              489        20.6MB          1234             0             0      completed             7.00
[root@potwora ~]# gluster volume remove-brick test servserv.generals.ea.com:/mnt/gluster/test6  servserv.generals.ea.com:/mnt/gluster/test5 commit
Removing brick(s) can result in data loss. Do you want to Continue? (y/n) y
volume remove-brick commit: success
[root@potwora ~]# cd /media/test/
[root@potwora test]# find ./ | wc -l
1323
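
The repeated status calls in the run above can be scripted. This is a minimal sketch, assuming the table layout shown in this report: a separator row of dashes, then one row per node whose second-to-last field is the final word of the status. all_completed is a hypothetical helper that reads the status output on stdin.

```shell
#!/bin/sh
# Succeed only if every node row in "remove-brick ... status" output
# reports "completed". Reads the status table from stdin.
all_completed() {
    awk '
        $1 ~ /^-+$/ { in_rows = 1; next }      # separator line: node rows follow
        in_rows && NF >= 2 {
            rows++
            # Status is the field before run-time; "in progress" ends in "progress".
            if ($(NF - 1) != "completed") pending++
        }
        END {
            if (rows > 0 && pending == 0) exit 0
            exit 1
        }
    '
}

# Intended use (not run here, it needs a live volume):
#   until gluster volume remove-brick test <bricks> status | all_completed; do
#       sleep 5
#   done
```

As this report shows, a "completed" status on 3.4.0 and 3.4.1 did not guarantee that the data had left the bricks, so the check is a precondition for commit, not proof that commit is safe.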

Comment 10 Shishir Gowda 2013-12-10 09:42:49 UTC
I have backported the DHT/remove-brick related fixes to release-3.4 at:
http://review.gluster.org/#/c/6461/
http://review.gluster.org/#/c/6468/ <--Most likely this will fix the issue
http://review.gluster.org/#/c/6469/
http://review.gluster.org/#/c/6470/
http://review.gluster.org/#/c/6471/

Comment 11 Vijay Bellur 2013-12-16 06:18:47 UTC
Lukas,

Can you please verify if 3.4.2qa3 fixes this problem?

Comment 12 Lukas Bezdicka 2013-12-16 13:09:30 UTC
No change, the issue is still there. Should I try to bisect it?



[root@glusterkluster:~] rpm -qa | grep gluster
glusterfs-libs-3.4.2qa3-1.el6.x86_64
glusterfs-server-3.4.2qa3-1.el6.x86_64
glusterfs-fuse-3.4.2qa3-1.el6.x86_64
glusterfs-api-devel-3.4.2qa3-1.el6.x86_64
glusterfs-cli-3.4.2qa3-1.el6.x86_64
glusterfs-rdma-3.4.2qa3-1.el6.x86_64
glusterfs-devel-3.4.2qa3-1.el6.x86_64
glusterfs-debuginfo-3.4.2qa3-1.el6.x86_64
glusterfs-3.4.2qa3-1.el6.x86_64
glusterfs-geo-replication-3.4.2qa3-1.el6.x86_64
glusterfs-api-3.4.2qa3-1.el6.x86_64
 
[root@glusterkluster:~] gluster volume info test
Volume Name: test
Type: Distributed-Replicate
Volume ID: e31ec436-d6dd-4cce-ae57-54408aa1f620
Status: Started
Number of Bricks: 3 x 2 = 6
Transport-type: tcp
Bricks:
Brick1: glusterkluster:/mnt/gluster/test1
Brick2: glusterkluster:/mnt/gluster/test2
Brick3: glusterkluster:/mnt/gluster/test3
Brick4: glusterkluster:/mnt/gluster/test4
Brick5: glusterkluster:/mnt/gluster/test5
Brick6: glusterkluster:/mnt/gluster/test6

[root@glusterkluster:/media/test] find ./ | wc -l
1313

[root@glusterkluster:~] gluster volume remove-brick test glusterkluster:/mnt/gluster/test5 glusterkluster:/mnt/gluster/test6 start
volume remove-brick start: success
ID: 6cf1cac1-8469-4674-a488-277a7d611dcc

[root@glusterkluster:~] gluster volume remove-brick test glusterkluster:/mnt/gluster/test5 glusterkluster:/mnt/gluster/test6 status
                                    Node Rebalanced-files          size       scanned      failures       skipped         status run-time in secs
                               ---------      -----------   -----------   -----------   -----------   -----------   ------------   --------------
                               localhost              337        19.1MB          1319             0      completed           670.00

[root@glusterkluster:~] gluster volume remove-brick test glusterkluster:/mnt/gluster/test5 glusterkluster:/mnt/gluster/test6 commit
Removing brick(s) can result in data loss. Do you want to Continue? (y/n) y
volume remove-brick commit: success

[root@glusterkluster:~] find /media/test/ | wc -l
975

Comment 13 Lukas Bezdicka 2013-12-16 17:27:21 UTC
https://bugzilla.redhat.com/show_bug.cgi?id=966845

Please backport commit 4f63b631dce4cb97525ee13fab0b2a789bcf6b15.

Comment 14 Vijay Bellur 2013-12-16 17:39:44 UTC
Backported - http://review.gluster.org/6517.

Does this fix the issue?

Comment 15 Lukas Bezdicka 2013-12-17 14:03:27 UTC
glusterfs-3.4.2qa4-1.el6.x86_64 works fine, thank you!

