Bug 1024369 - Unable to shrink volumes without data loss
Status: CLOSED CURRENTRELEASE
Product: GlusterFS
Classification: Community
Component: distribute
Version: pre-release
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Assigned To: Nagaprasad Sathyanarayana
Reported: 2013-10-29 10:17 EDT by Lukas Bezdicka
Modified: 2016-02-17 19:20 EST

Fixed In Version: 3.4.2
Doc Type: Bug Fix
Last Closed: 2014-01-15 01:13:52 EST
Type: Bug


Attachments:
test-rebalance.log (202.38 KB, application/octet-stream)
2013-10-29 13:24 EDT, Joe Julian

Description Lukas Bezdicka 2013-10-29 10:17:50 EDT
Description of problem:
Running gluster volume remove-brick on a distributed-replicate volume results in data loss for clients.

Version-Release number of selected component (if applicable):
3.4.0
3.4.1

How reproducible:
always

Steps to Reproduce:
1. yes | gluster volume create test replica 2 servserv.generals.ea.com:/mnt/gluster/test1 servserv.generals.ea.com:/mnt/gluster/test2 servserv.generals.ea.com:/mnt/gluster/test3 servserv.generals.ea.com:/mnt/gluster/test4 servserv.generals.ea.com:/mnt/gluster/test5 servserv.generals.ea.com:/mnt/gluster/test6 ; gluster volume start test

2. mount -t glusterfs servserv.generals.ea.com:test /media/test/

3. cd /media/test ; git clone https://git.fedorahosted.org/git/freeipa.git

4. find /media/test | wc -l

5. gluster volume info
 
Volume Name: test
Type: Distributed-Replicate
Volume ID: 5467b9fe-9a3c-4850-8449-280ee9789c11
Status: Started
Number of Bricks: 3 x 2 = 6
Transport-type: tcp
Bricks:
Brick1: servserv.generals.ea.com:/mnt/gluster/test1
Brick2: servserv.generals.ea.com:/mnt/gluster/test2
Brick3: servserv.generals.ea.com:/mnt/gluster/test3
Brick4: servserv.generals.ea.com:/mnt/gluster/test4
Brick5: servserv.generals.ea.com:/mnt/gluster/test5
Brick6: servserv.generals.ea.com:/mnt/gluster/test6

6. gluster volume remove-brick test servserv.generals.ea.com:/mnt/gluster/test6  servserv.generals.ea.com:/mnt/gluster/test5 start

7. gluster volume remove-brick test servserv.generals.ea.com:/mnt/gluster/test6  servserv.generals.ea.com:/mnt/gluster/test5 status
                                    Node Rebalanced-files          size       scanned      failures       skipped         status run-time in secs
                               ---------      -----------   -----------   -----------   -----------   -----------   ------------   --------------
                               localhost              341        17.3MB          1331             0      completed           678.00

8. gluster volume remove-brick test servserv.generals.ea.com:/mnt/gluster/test6  servserv.generals.ea.com:/mnt/gluster/test5 commit
Removing brick(s) can result in data loss. Do you want to Continue? (y/n) y

9. find /media/test | wc -l
977

Actual results:
I just lost files.

Expected results:
No data loss when shrinking a Gluster volume; the current behavior makes Gluster quite unusable for any production setup.
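
To pinpoint exactly which files disappear (rather than only comparing counts), the file list can be snapshotted around the remove-brick and diffed. A minimal sketch, with illustrative /tmp paths:

# After step 4, record the full file list:
find /media/test -type f | sort > /tmp/files-before.txt
# After step 9, record it again:
find /media/test -type f | sort > /tmp/files-after.txt
# Lines only in the first list are the lost files:
comm -23 /tmp/files-before.txt /tmp/files-after.txt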
Comment 1 Joe Julian 2013-10-29 13:23:57 EDT
It's migrating the wrong dht subvolume! See attachment.
Comment 2 Joe Julian 2013-10-29 13:24:34 EDT
Created attachment 817145 [details]
test-rebalance.log
Comment 3 Joe Julian 2013-10-29 13:58:33 EDT
Meh, I jumped to conclusions again. It's not. It's just allowing the migration of files TO the decommissioned subvolume.
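
A server-side way to watch this (a sketch, assuming direct access to the brick directories; DHT linkfiles are the zero-byte files carrying only the sticky bit, which the -perm test filters out):

# While the remove-brick rebalance runs, count real data files on the
# decommissioned bricks; the count should be shrinking, not growing:
find /mnt/gluster/test5 /mnt/gluster/test6 -type f ! -perm 1000 | wc -l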
Comment 4 Lukas Bezdicka 2013-10-30 08:24:43 EDT
Notice the make-doc file: in all the tests I ran, make-doc was on the replica pair to be decommissioned, and the migration didn't even mention it.
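
This is visible in the rebalance log (a sketch; the path assumes the default log location for a volume named test):

# Every migrated file gets a log entry; make-doc never appears:
grep make-doc /var/log/glusterfs/test-rebalance.log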
Comment 5 Brian Cipriano 2013-10-30 12:10:08 EDT
Confirming that I have the same issue. Files appear to be migrated to the decommissioned subvolume; then, after I run "commit", the files are gone.
Comment 6 Lukas Bezdicka 2013-10-31 11:53:43 EDT
Note that one can reproduce the issues from bug #966848 and bug #1025404 by running rm -rf /media/test/freeipa after step 9.
Comment 7 Vijay Bellur 2013-11-01 07:59:42 EDT
Does the same behavior exist on upstream master? There have been several related DHT fixes in master, and I would like to determine whether master has the same problem.
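
For anyone repeating this test, a minimal sketch of building from upstream master (standard autotools flow in the glusterfs tree; the GitHub URL is the public mirror):

git clone https://github.com/gluster/glusterfs.git
cd glusterfs
./autogen.sh && ./configure
make -j4 && make install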
Comment 8 Lukas Bezdicka 2013-11-05 12:24:33 EST
I built Gluster from master and can confirm the issue is not there. It would be nice to track down the relevant commits and backport them to the 3.4 branch, since 3.5 is far away.

Test:
[root@potwora test]# rpm -qa | grep gluster
glusterfs-fuse-3git-1.fc20.x86_64
glusterfs-cli-3git-1.fc20.x86_64
glusterfs-rdma-3git-1.fc20.x86_64
glusterfs-api-3git-1.fc20.x86_64
glusterfs-3git-1.fc20.x86_64
glusterfs-api-devel-3git-1.fc20.x86_64
glusterfs-libs-3git-1.fc20.x86_64
glusterfs-server-3git-1.fc20.x86_64
glusterfs-devel-3git-1.fc20.x86_64
glusterfs-debuginfo-3git-1.fc20.x86_64
glusterfs-geo-replication-3git-1.fc20.x86_64
glusterfs-regression-tests-3git-1.fc20.x86_64

[root@potwora ~]# yes | gluster volume stop test force ; yes | gluster volume del test ; rm -rf /mnt/gluster/test* ; yes | gluster volume create test replica 2 servserv.generals.ea.com:/mnt/gluster/test1 servserv.generals.ea.com:/mnt/gluster/test2 servserv.generals.ea.com:/mnt/gluster/test3 servserv.generals.ea.com:/mnt/gluster/test4 servserv.generals.ea.com:/mnt/gluster/test5 servserv.generals.ea.com:/mnt/gluster/test6 force ; gluster volume start test
volume stop: test: failed: Volume test does not exist
Stopping volume will make its data inaccessible. Do you want to continue? (y/n) volume delete: test: failed: Volume test does not exist
Deleting volume will erase all information about the volume. Do you want to continue? (y/n) Multiple bricks of a replicate volume are present on the same server. This setup is not optimal.
Do you still want to continue creating the volume?  (y/n) volume create: test: success: please start the volume to access data
volume start: test: success
[root@potwora ~]# gluster volume info
 
Volume Name: test
Type: Distributed-Replicate
Volume ID: a0883ff4-a6b0-4caa-9fce-928616ca362e
Status: Started
Number of Bricks: 3 x 2 = 6
Transport-type: tcp
Bricks:
Brick1: servserv.generals.ea.com:/mnt/gluster/test1
Brick2: servserv.generals.ea.com:/mnt/gluster/test2
Brick3: servserv.generals.ea.com:/mnt/gluster/test3
Brick4: servserv.generals.ea.com:/mnt/gluster/test4
Brick5: servserv.generals.ea.com:/mnt/gluster/test5
Brick6: servserv.generals.ea.com:/mnt/gluster/test6
[root@potwora ~]# gluster volume status
Status of volume: test
Gluster process						Port	Online	Pid
------------------------------------------------------------------------------
Brick servserv.generals.ea.com:/mnt/gluster/test1	49152	Y	20680
Brick servserv.generals.ea.com:/mnt/gluster/test2	49153	Y	20692
Brick servserv.generals.ea.com:/mnt/gluster/test3	49154	Y	20703
Brick servserv.generals.ea.com:/mnt/gluster/test4	49155	Y	20714
Brick servserv.generals.ea.com:/mnt/gluster/test5	49156	Y	20725
Brick servserv.generals.ea.com:/mnt/gluster/test6	49157	Y	20736
NFS Server on localhost					2049	Y	20750
Self-heal Daemon on localhost				N/A	Y	20754
 
Task Status of Volume test
------------------------------------------------------------------------------
There are no active volume tasks
 
[root@potwora ~]# mkdir /media/test
[root@potwora ~]# cd /media/test/
[root@potwora test]# git clone https://git.fedorahosted.org/git/freeipa.git
Cloning into 'freeipa'...
remote: Counting objects: 58018, done.
remote: Compressing objects: 100% (18280/18280), done.
remote: Total 58018 (delta 47755), reused 48634 (delta 39585)
Receiving objects: 100% (58018/58018), 12.92 MiB | 805.00 KiB/s, done.
Resolving deltas: 100% (47755/47755), done.
Checking connectivity... done
Checking out files: 100% (1171/1171), done.
[root@potwora test]# find ./ | wc -l
1323
[root@potwora test]# cd
[root@potwora ~]# gluster volume remove-brick test servserv.generals.ea.com:/mnt/gluster/test6  servserv.generals.ea.com:/mnt/gluster/test5 start
volume remove-brick start: success
ID: 36624241-2555-4c4a-9109-21b4ecae98fc
[root@potwora ~]# gluster volume remove-brick test servserv.generals.ea.com:/mnt/gluster/test6  servserv.generals.ea.com:/mnt/gluster/test5 status
                                    Node Rebalanced-files          size       scanned      failures       skipped         status run-time in secs
                               ---------      -----------   -----------   -----------   -----------   -----------   ------------   --------------
                               localhost              173        16.2MB           432             0             0    in progress             3.00
[root@potwora ~]# gluster volume remove-brick test servserv.generals.ea.com:/mnt/gluster/test6  servserv.generals.ea.com:/mnt/gluster/test5 status
                                    Node Rebalanced-files          size       scanned      failures       skipped         status run-time in secs
                               ---------      -----------   -----------   -----------   -----------   -----------   ------------   --------------
                               localhost              339        18.7MB           870             0             0    in progress             5.00
[root@potwora ~]# gluster volume remove-brick test servserv.generals.ea.com:/mnt/gluster/test6  servserv.generals.ea.com:/mnt/gluster/test5 status
                                    Node Rebalanced-files          size       scanned      failures       skipped         status run-time in secs
                               ---------      -----------   -----------   -----------   -----------   -----------   ------------   --------------
                               localhost              413        19.8MB          1084             0             0    in progress             6.00
[root@potwora ~]# gluster volume remove-brick test servserv.generals.ea.com:/mnt/gluster/test6  servserv.generals.ea.com:/mnt/gluster/test5 status
                                    Node Rebalanced-files          size       scanned      failures       skipped         status run-time in secs
                               ---------      -----------   -----------   -----------   -----------   -----------   ------------   --------------
                               localhost              489        20.6MB          1234             0             0      completed             7.00
[root@potwora ~]# gluster volume remove-brick test servserv.generals.ea.com:/mnt/gluster/test6  servserv.generals.ea.com:/mnt/gluster/test5 status
                                    Node Rebalanced-files          size       scanned      failures       skipped         status run-time in secs
                               ---------      -----------   -----------   -----------   -----------   -----------   ------------   --------------
                               localhost              489        20.6MB          1234             0             0      completed             7.00
[root@potwora ~]# gluster volume remove-brick test servserv.generals.ea.com:/mnt/gluster/test6  servserv.generals.ea.com:/mnt/gluster/test5 status
                                    Node Rebalanced-files          size       scanned      failures       skipped         status run-time in secs
                               ---------      -----------   -----------   -----------   -----------   -----------   ------------   --------------
                               localhost              489        20.6MB          1234             0             0      completed             7.00
[root@potwora ~]# gluster volume remove-brick test servserv.generals.ea.com:/mnt/gluster/test6  servserv.generals.ea.com:/mnt/gluster/test5 commit
Removing brick(s) can result in data loss. Do you want to Continue? (y/n) y
volume remove-brick commit: success
[root@potwora ~]# cd /media/test/
[root@potwora test]# find ./ | wc -l
1323
Comment 10 Shishir Gowda 2013-12-10 04:42:49 EST
I have backported the DHT/remove-brick related fixes to release-3.4:
http://review.gluster.org/#/c/6461/
http://review.gluster.org/#/c/6468/ <-- most likely this one fixes the issue
http://review.gluster.org/#/c/6469/
http://review.gluster.org/#/c/6470/
http://review.gluster.org/#/c/6471/
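
To try one of these before a packaged build, a Gerrit change can be fetched and cherry-picked into a glusterfs checkout (a sketch; Gerrit publishes changes under refs/changes/<last-two-digits>/<change>/<patchset>, and patchset 1 is assumed here):

# Fetch and apply change 6468, the likeliest fix from the list above:
git fetch http://review.gluster.org/glusterfs refs/changes/68/6468/1
git cherry-pick FETCH_HEAD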
Comment 11 Vijay Bellur 2013-12-16 01:18:47 EST
Lukas,

Can you please verify if 3.4.2qa3 fixes this problem?
Comment 12 Lukas Bezdicka 2013-12-16 08:09:30 EST
No change; the issue is still there. Should I try to bisect it?

[root@glusterkluster:~] rpm -qa | grep gluster
glusterfs-libs-3.4.2qa3-1.el6.x86_64
glusterfs-server-3.4.2qa3-1.el6.x86_64
glusterfs-fuse-3.4.2qa3-1.el6.x86_64
glusterfs-api-devel-3.4.2qa3-1.el6.x86_64
glusterfs-cli-3.4.2qa3-1.el6.x86_64
glusterfs-rdma-3.4.2qa3-1.el6.x86_64
glusterfs-devel-3.4.2qa3-1.el6.x86_64
glusterfs-debuginfo-3.4.2qa3-1.el6.x86_64
glusterfs-3.4.2qa3-1.el6.x86_64
glusterfs-geo-replication-3.4.2qa3-1.el6.x86_64
glusterfs-api-3.4.2qa3-1.el6.x86_64
 
[root@glusterkluster:~] gluster volume info test
Volume Name: test
Type: Distributed-Replicate
Volume ID: e31ec436-d6dd-4cce-ae57-54408aa1f620
Status: Started
Number of Bricks: 3 x 2 = 6
Transport-type: tcp
Bricks:
Brick1: glusterkluster:/mnt/gluster/test1
Brick2: glusterkluster:/mnt/gluster/test2
Brick3: glusterkluster:/mnt/gluster/test3
Brick4: glusterkluster:/mnt/gluster/test4
Brick5: glusterkluster:/mnt/gluster/test5
Brick6: glusterkluster:/mnt/gluster/test6

[root@glusterkluster:/media/test] find ./ | wc -l
1313

[root@glusterkluster:~] gluster volume remove-brick test glusterkluster:/mnt/gluster/test5 glusterkluster:/mnt/gluster/test6 start
volume remove-brick start: success
ID: 6cf1cac1-8469-4674-a488-277a7d611dcc

 gluster volume remove-brick test glusterkluster:/mnt/gluster/test5 glusterkluster:/mnt/gluster/test6 status
                                    Node Rebalanced-files          size       scanned      failures       skipped         status run-time in secs
                               ---------      -----------   -----------   -----------   -----------   -----------   ------------   --------------
                               localhost              337        19.1MB          1319             0      completed           670.00

[root@glusterkluster:~] gluster volume remove-brick test glusterkluster:/mnt/gluster/test5 glusterkluster:/mnt/gluster/test6 commit
Removing brick(s) can result in data loss. Do you want to Continue? (y/n) y
volume remove-brick commit: success

[root@glusterkluster:~] find /media/test/ | wc -l
975
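
On the bisect question: since master is good and the 3.4 builds are bad, a fix-finding bisect with inverted labels would work (a sketch; the tag name is assumed):

# Looking for the commit that FIXED the bug, so swap the usual meanings:
# treat the fixed tree (master) as "bad" and the broken release as "good".
git bisect start master v3.4.1
# At each step, build, rerun the remove-brick reproducer, then:
#   git bisect bad    -> this revision does NOT lose files (already fixed)
#   git bisect good   -> this revision still loses files (still broken)
git bisect reset      # when done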
Comment 13 Lukas Bezdicka 2013-12-16 12:27:21 EST
https://bugzilla.redhat.com/show_bug.cgi?id=966845

Please backport commit 4f63b631dce4cb97525ee13fab0b2a789bcf6b15.
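
To confirm where that commit already exists (a sketch, run in a glusterfs checkout):

# List the remote branches that contain the fix:
git branch -r --contains 4f63b631dce4cb97525ee13fab0b2a789bcf6b15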
Comment 14 Vijay Bellur 2013-12-16 12:39:44 EST
Backported - http://review.gluster.org/6517.

Does this fix the issue?
Comment 15 Lukas Bezdicka 2013-12-17 09:03:27 EST
glusterfs-3.4.2qa4-1.el6.x86_64 works fine, thank you!
