Created attachment 663
Created attachment 664
Created attachment 665
I created a volume with 3 replicated subvolumes and filled it to 50% with 10MB files of random data. I then added another replicated subvolume and issued a rebalance. At first the df output shows the usage of the first 3 bricks going down and the fourth going up, but then something goes wrong and the usage of the first 3 bricks climbs beyond what they originally held. The volume fills with hidden files named after each real file and its size, as .REALNAME.gfsFILESIZE (see the listing below). If I delete all of those hidden files, the disk usage on each subvolume goes back to exactly what it was before the rebalance.

Logs show "Transport endpoint is not connected". It appears that one node (not always the same one) can no longer talk to itself on 127.0.0.1, and the other nodes show the same entries with the real IP of that node. I can reproduce the problem every time. If you need more verbose logs or debug info, please provide details on how to get them.

RHEL 6.1 x64, kernel 2.6.32-131.12.1.el6.x86_64. Bricks are all 5GB with xfs.
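For reference, the reproduction described above can be sketched roughly as the following shell session. This is a hypothetical reconstruction, not the reporter's exact commands: the hostnames, brick paths, volume name, and mount point are taken from this report, while the file count and loop are illustrative.

```shell
#!/bin/sh
# Sketch of the reproduction steps from this report (assumptions noted above).

# 1. Fill the FUSE-mounted volume to ~50% with 10MB files of random data.
for i in $(seq 1 250); do
    dd if=/dev/urandom of=/mnt/fuse/vol_mainhome/foo.$i bs=1M count=10
done

# 2. Add the new replicated brick pair (bricks 7 and 8 in the volume info below).
gluster volume add-brick vol_mainhome \
    gfs-dev-02b.cssd.pitt.edu:/bricks/lv_brick_02_004a \
    gfs-dev-03b.cssd.pitt.edu:/bricks/lv_brick_03_004b

# 3. Trigger the rebalance and poll its status.
gluster volume rebalance vol_mainhome start
gluster volume rebalance vol_mainhome status
```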
glusterfs 3.2.3 built on Aug 23 2011 05:40:51
# Installed via the RPMs hosted on gluster.com

Volume Name: vol_mainhome
Type: Distributed-Replicate
Status: Started
Number of Bricks: 4 x 2 = 8
Transport-type: tcp
Bricks:
Brick1: gfs-dev-01b.cssd.pitt.edu:/bricks/lv_brick_01_001a
Brick2: gfs-dev-03b.cssd.pitt.edu:/bricks/lv_brick_03_001b
Brick3: gfs-dev-01b.cssd.pitt.edu:/bricks/lv_brick_01_002a
Brick4: gfs-dev-03b.cssd.pitt.edu:/bricks/lv_brick_03_002b
Brick5: gfs-dev-02b.cssd.pitt.edu:/bricks/lv_brick_02_003a
Brick6: gfs-dev-03b.cssd.pitt.edu:/bricks/lv_brick_03_003b
Brick7: gfs-dev-02b.cssd.pitt.edu:/bricks/lv_brick_02_004a  # Was added with add-brick
Brick8: gfs-dev-03b.cssd.pitt.edu:/bricks/lv_brick_03_004b  # Was added with add-brick

[root@gfs-dev-02b ~]# gluster volume rebalance vol_mainhome start
[root@gfs-dev-02b ~]# gluster volume rebalance vol_mainhome status
rebalance completed: rebalanced 70 files of size 716800000 (total files scanned 1010)

[root@gfs-dev-03b ~]# tail /var/log/glusterfs/etc-glusterfs-glusterd.vol.log
[2011-09-16 12:15:05.86955] W [glusterd-rebalance.c:242:gf_glusterd_rebalance_move_data] 0-glusterfs: file got changed after we started copying /etc/glusterd/mount/vol_mainhome/foo.238
[2011-09-16 12:15:05.749901] W [glusterd-rebalance.c:251:gf_glusterd_rebalance_move_data] 0-glusterfs: file sizes are not same : /etc/glusterd/mount/vol_mainhome/foo.240
[2011-09-16 12:15:06.384116] W [glusterd-rebalance.c:242:gf_glusterd_rebalance_move_data] 0-glusterfs: file got changed after we started copying /etc/glusterd/mount/vol_mainhome/foo.490
[2011-09-16 12:15:07.30771] W [glusterd-rebalance.c:242:gf_glusterd_rebalance_move_data] 0-glusterfs: file got changed after we started copying /etc/glusterd/mount/vol_mainhome/foo.491
[2011-09-16 12:15:07.809444] W [glusterd-rebalance.c:242:gf_glusterd_rebalance_move_data] 0-glusterfs: file got changed after we started copying /etc/glusterd/mount/vol_mainhome/foo.492
[2011-09-16 12:15:08.404402] W [glusterd-rebalance.c:242:gf_glusterd_rebalance_move_data] 0-glusterfs: file got changed after we started copying /etc/glusterd/mount/vol_mainhome/foo.499
[2011-09-16 12:15:15.798743] I [glusterd-rebalance.c:460:glusterd_defrag_start] 0-rebalance: rebalance on /etc/glusterd/mount/vol_mainhome complete
[2011-09-16 12:15:16.273285] W [socket.c:1494:__socket_proto_state_machine] 0-socket.management: reading from socket failed. Error (Transport endpoint is not connected), peer (127.0.0.1:996)
[2011-09-16 12:27:48.287927] I [glusterd-rebalance.c:603:glusterd_rebalance_cmd_attempted_log] 0-glusterd: Received rebalance volume 3 on vol_mainhome
[2011-09-16 12:27:48.303323] W [socket.c:1494:__socket_proto_state_machine] 0-socket.management: reading from socket failed. Error (Transport endpoint is not connected), peer (127.0.0.1:1021)

Before:
Filesystem                              Size  Used Avail Use% Mounted on
/dev/mapper/vg_system-lv_brick_03_001b  5.0G  2.6G  2.5G  51% /bricks/lv_brick_03_001b
/dev/mapper/vg_system-lv_brick_03_002b  5.0G  2.4G  2.7G  47% /bricks/lv_brick_03_002b
/dev/mapper/vg_system-lv_brick_03_003b  5.0G  2.5G  2.6G  50% /bricks/lv_brick_03_003b
/dev/mapper/vg_system-lv_brick_03_004b  5.0G   33M  5.0G   1% /bricks/lv_brick_03_004b

During:
Filesystem                              Size  Used Avail Use% Mounted on
/dev/mapper/vg_system-lv_brick_03_001b  5.0G  2.4G  2.6G  49% /bricks/lv_brick_03_001b
/dev/mapper/vg_system-lv_brick_03_002b  5.0G  2.4G  2.7G  47% /bricks/lv_brick_03_002b
/dev/mapper/vg_system-lv_brick_03_003b  5.0G  2.8G  2.3G  55% /bricks/lv_brick_03_003b
/dev/mapper/vg_system-lv_brick_03_004b  5.0G  961M  4.1G  19% /bricks/lv_brick_03_004

After:
Filesystem                              Size  Used Avail Use% Mounted on
/dev/mapper/vg_system-lv_brick_03_001b  5.0G  4.1G  916M  83% /bricks/lv_brick_03_001b
/dev/mapper/vg_system-lv_brick_03_002b  5.0G  3.2G  1.9G  64% /bricks/lv_brick_03_002b
/dev/mapper/vg_system-lv_brick_03_003b  5.0G  3.0G  2.1G  60% /bricks/lv_brick_03_003b
/dev/mapper/vg_system-lv_brick_03_004b  5.0G  1.9G  3.2G  38% /bricks/lv_brick_03_004b

[root@gfs-dev-04b ~]# ls -la /mnt/fuse/vol_mainhome/
-rw-r--r-- 1 root root 10240000 Sep 16 12:56 .foo.75.gfs10240000
-rw-r--r-- 1 root root 10240000 Sep 16 10:58 foo.76
-rw-r--r-- 1 root root 10240000 Sep 16 12:13 .foo.76.gfs10240000
-rw-r--r-- 1 root root 10240000 Sep 16 11:43 foo.77
-rw-r--r-- 1 root root 10240000 Sep 16 12:07 .foo.77.gfs10240000
-rw-r--r-- 1 root root 10240000 Sep 16 10:58 foo.78
-rw-r--r-- 1 root root 10240000 Sep 16 12:56 .foo.78.gfs10240000
-rw-r--r-- 1 root root 10240000 Sep 16 10:58 foo.79
-rw-r--r-- 1 root root 10240000 Sep 16 12:56 .foo.79.gfs10240000
-rw-r--r-- 1 root root 10240000 Sep 16 12:13 .foo.7.gfs10240000
-rw-r--r-- 1 root root 10240000 Sep 16 10:52 foo.8
-rw-r--r-- 1 root root 10240000 Sep 16 10:58 foo.80
-rw-r--r-- 1 root root 10240000 Sep 16 12:56 .foo.80.gfs10240000
-rw-r--r-- 1 root root 10240000 Sep 16 10:58 foo.81
-rw-r--r-- 1 root root 10240000 Sep 16 10:58 foo.82
-rw-r--r-- 1 root root 10240000 Sep 16 10:58 foo.83
-rw-r--r-- 1 root root 10240000 Sep 16 12:56 .foo.83.gfs10240000
-rw-r--r-- 1 root root 10240000 Sep 16 10:58 foo.84
-rw-r--r-- 1 root root 10240000 Sep 16 12:56 .foo.84.gfs10240000

The log entries about the file changing after the copy started are wrong; the data did not change. I have also tried running a self-heal (using find on a FUSE mount) between adding the new bricks and starting the rebalance, but I get the same result. I'll attach the /var/log/glusterfs directory from each server. Note that I issued the rebalance on gfs-dev-03b.
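For anyone wanting to reclaim the space manually, the leftover temporary files follow the .NAME.gfs&lt;size&gt; pattern visible in the listing above. A minimal sketch, assuming GNU find and the FUSE mount point from this report (the MOUNT variable and the directory-exists guard are my additions):

```shell
#!/bin/sh
# List the leftover rebalance temp files (.NAME.gfs<size>) before deleting.
# MOUNT defaults to this report's FUSE mount point; override it as needed.
MOUNT=${MOUNT:-/mnt/fuse/vol_mainhome}
if [ -d "$MOUNT" ]; then
    find "$MOUNT" -maxdepth 1 -type f -name '.*.gfs[0-9]*' -print
    # After verifying the listing, delete them (GNU find extension):
    # find "$MOUNT" -maxdepth 1 -type f -name '.*.gfs[0-9]*' -delete
fi
```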
Hi,

Do you happen to still have the log files on the machine where you issued the rebalance operation? I want to check the log file '/var/log/glusterfs/etc-glusterd-mount-<volname>.log'. Please share the warning/error messages from that log; that would help to see what happened.

Regards,
Amar
(In reply to comment #4)
> Hi,
>
> Do you happen to still have log files on machine where you issued the
> rebalance operation?
>
> I want to check the log file
> '/var/log/glusterfs/etc-glusterd-mount-<volname>.log'. Please share the
> 'warning/error' messages in the log file please. That would help to see what
> would have happened.
>
> Regards,
> Amar

I do not have the logs anymore, but what I already uploaded should be the entire /var/log/glusterfs of each server.
CHANGE: http://review.gluster.com/552 (Change-Id: Ie13b0f2748b15c0359b7868460d4bc65ec05a217) merged in release-3.2 by Vijay Bellur (vijay)
Found the issue to be write-behind. You can disable write-behind on the volume and try the rebalance again; it should now work fine.
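The suggested workaround can be expressed with the gluster CLI as follows. This is a sketch using the volume name from this report; the performance.write-behind option toggles the write-behind translator (set it back to "on" once a fixed release is installed):

```shell
# Workaround sketch: turn off the write-behind translator, then retry.
gluster volume set vol_mainhome performance.write-behind off
gluster volume rebalance vol_mainhome start
gluster volume rebalance vol_mainhome status
```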
Bug 765476 should fix these issues in the release-3.2 branch. The fixes are already in the master branch. Patches at: http://review.gluster.com/613 and http://review.gluster.com/614
This particular bug, which is about consuming more space after a rebalance, is fixed by the commit http://review.gluster.com/552. More fixes will follow with bug 765476.
After a rebalance, no new files are created now.