Description of problem:

With a 4-node/4-brick distributed-replicated Gluster volume, I have an issue when rebalancing data (from 2 bricks to 4). When "rebalance start force" is used, files are moved from the source bricks to the destination bricks (the new nodes). Unfortunately, the migrated files remain open on both sides:
- on the source side: the file is still open, and "lsof" shows it as "(deleted)"
- on the destination side: the file is also open (without "(deleted)", since the file is new)
- this is not a socket problem

Once the rebalance is complete:
- source: files are still open
- destination: files are closed

So, on the destination nodes, the system-wide open-file limit can be reached, crashing other applications:
------------
VFS: file-max limit 64000 reached
VFS: file-max limit 64000 reached
gmond[2964]: segfault at ffffffffffffffff rip 000000000040acc4 rsp 00007fff88a7db60 error 4
VFS: file-max limit 64000 reached
VFS: file-max limit 64000 reached
------------

On the source, files are still open; lsof:
----------------
glusterfs 16840 root *258u REG 8,5 0 55644726 /mnt/data1/brick1/gluster_test2_old/<hidden>.pdf (deleted)
glusterfs 16840 root *259u REG 8,5 0 55642740 /mnt/data1/brick1/gluster_test2_old/<hidden2>.pdf (deleted)
glusterfs 16840 root *260u REG 8,5 0 55648839 /mnt/data1/brick1/gluster_test2_old/<hidden3>.pdf (deleted)
glusterfs 16840 root *261u REG 8,5 0 55644412 /mnt/data1/brick1/gluster_test2_old/<hidden4>.pdf (deleted)
glusterfs 16840 root *262u REG 8,5 0 55646524 /mnt/data1/brick1/gluster_test2_old/<hidden5>.pdf (deleted)
glusterfs 16840 root *263u REG 8,5 0 55645671 /mnt/data1/brick1/gluster_test2_old/<hidden5>.pdf (deleted)
----------------

Please note that there is another big issue: sometimes, at the end of the rebalance, CPU usage goes to 100% (or even more), because "glusterfsd" runs several processes that eat all the CPU (visible in Ganglia and htop).

I also get this error in the log (/var/log/glusterfs/gv1-rebalance.log) on the source:
--------------
[2013-04-25 17:09:30.499705] I [dht-rebalance.c:647:dht_migrate_file] 0-gv1-dht: /gluster_test2_old/<hidden>.pdf: attempting to move from gv1-replicate-0 to gv1-replicate-1
[2013-04-25 17:09:30.564466] I [dht-rebalance.c:856:dht_migrate_file] 0-gv1-dht: completed migration of /gluster_test2_old/<hidden2>.pdf from subvolume gv1-replicate-0 to gv1-replicate-1
[2013-04-25 17:09:41.476468] I [dht-common.c:2337:dht_setxattr] 0-gv1-dht: fixing the layout of /gluster_test2_old/<hidden3>
[2013-04-25 17:09:41.477772] I [dht-rebalance.c:1063:gf_defrag_migrate_data] 0-gv1-dht: migrate data called on /gluster_test2_old/<hidden4>
[2013-04-25 17:09:41.488814] I [dht-rebalance.c:1626:gf_defrag_status_get] 0-glusterfs: Rebalance is completed
[2013-04-25 17:09:41.488872] I [dht-rebalance.c:1629:gf_defrag_status_get] 0-glusterfs: Files migrated: 58478, size: 2809361015, lookups: 206117, failures: 0
[2013-04-25 17:09:41.489467] W [glusterfsd.c:831:cleanup_and_exit] (-->/lib64/libc.so.6(clone+0x6d) [0x36236d500d] (-->/lib64/libpthread.so.0 [0x3623e0683d] (-->/usr/sbin/glusterfs(glusterfs_sigwaiter+0x17c) [0x4052ec]))) 0-: received signum (15), shutting down
[2013-04-25 17:09:41.489714] E [rpcsvc.c:1155:rpcsvc_program_unregister_portmap] 0-rpc-service: Could not unregister with portmap
--------------

Note: these nodes are not clients => no FUSE.
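A quick way to quantify the leak on a source brick, from the shell (a minimal sketch; PID 16840 is the brick process from the lsof output above, substitute your own):
--------------
# count descriptors held by the brick process that point to
# already-unlinked files ("(deleted)" in lsof output)
lsof -p 16840 | grep -c '(deleted)'

# total descriptors held by the process, without lsof
ls /proc/16840/fd | wc -l
--------------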
Version-Release number of selected component (if applicable):
glusterfs-3.3.1-1.el5
glusterfs-fuse-3.3.1-1.el5
glusterfs-server-3.3.1-1.el5

Steps to Reproduce:
- create a replicated volume
- fill it with 140,000 small files (PDF)
- extend it with 2 more nodes
- run fix-layout, then "rebalance start" (which always just fails), then "rebalance start force" (a sketch of the command sequence follows this comment)
- use "lsof" to check open files (the number increases from the beginning until the end of the rebalance)
- watch the CPU (Ganglia/htop): it can go to 100% at the end of the rebalance
- for now, 20 minutes at 100% CPU with a load of 20 on both source nodes

Actual results:
- the number of open files grows on both the source and destination nodes
- at the end, files are still open on the source node, plus CPU at 100% with a high load

Expected results:
- the open-file count should NOT increase like that (these files are not even in use...)
- lsof should NOT show a crazy number of open files (thousands)
- the CPU should not go to 100% at the end

Additional info:
[0][root@serveur1 brick1]$ gluster volume info

Volume Name: gv1
Type: Distributed-Replicate
Volume ID: 7ef3f923-3dd2-4817-b5a7-5e946d8661fb
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: serveur1:/mnt/data1/brick1
Brick2: serveur2:/mnt/data1/brick2
Brick3: serveur3:/mnt/data1/brick3
Brick4: serveur4:/mnt/data1/brick4

[255][root@serveur1 brick1]$ gluster volume status gv1 detail

Status of volume: gv1
------------------------------------------------------------------------------
Brick            : Brick serveur1:/mnt/data1/brick1
Port             : 24009
Online           : Y
Pid              : 16840
File System      : ext3
Device           : /dev/sda5
Mount Options    : rw
Inode Size       : 128
Disk Space Free  : 252.0GB
Total Disk Space : 260.9GB
Inode Count      : 70615040
Free Inodes      : 70442582
------------------------------------------------------------------------------
Brick            : Brick serveur2:/mnt/data1/brick2
Port             : 24009
Online           : Y
Pid              : 29293
File System      : ext3
Device           : /dev/sda5
Mount Options    : rw
Inode Size       : 128
Disk Space Free  : 110.9GB
Total Disk Space : 122.5GB
Inode Count      : 33193984
Free Inodes      : 33019070
------------------------------------------------------------------------------
Brick            : Brick serveur3:/mnt/data1/brick3
Port             : 24009
Online           : Y
Pid              : 17189
File System      : ext3
Device           : /dev/sda5
Mount Options    : rw
Inode Size       : 128
Disk Space Free  : 49.3GB
Total Disk Space : 56.4GB
Inode Count      : 15287712
Free Inodes      : 15141537
------------------------------------------------------------------------------
Brick            : Brick serveur4:/mnt/data1/brick4
Port             : 24009
Online           : Y
Pid              : 28968
File System      : ext3
Device           : /dev/sda5
Mount Options    : rw
Inode Size       : 128
Disk Space Free  : 50.1GB
Total Disk Space : 56.3GB
Inode Count      : 15254976
Free Inodes      : 15125681
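For reference, the command sequence for the steps above looks roughly like this (a sketch, assuming the volume name and brick paths from the "Additional info" output; the data-filling step happens through a separate client mount):
--------------
# 1. create and start a 2-brick replicated volume
gluster volume create gv1 replica 2 \
    serveur1:/mnt/data1/brick1 serveur2:/mnt/data1/brick2
gluster volume start gv1

# 2. fill it with ~140,000 small PDF files through a client mount, then
#    extend the volume with two more bricks
gluster volume add-brick gv1 \
    serveur3:/mnt/data1/brick3 serveur4:/mnt/data1/brick4

# 3. rebalance: fix-layout, plain start, then start force
gluster volume rebalance gv1 fix-layout start
gluster volume rebalance gv1 start        # fails
gluster volume rebalance gv1 start force
gluster volume rebalance gv1 status
--------------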
More information on the "source" side:
- files are still open
- but, at the end, the CPU goes very high
- at the same time, the number of open files starts decreasing
- so, while the CPU is high, the open-file count decreases
- I suppose the open-file count will eventually return to a "normal" number
- and then the CPU load will go back to normal

Clients have trouble accessing the volume: it is very slow. The "gluster" command is also very slow (and can fail).

If you need more information, just ask.

Please also note that I'm running RHEL 5.9 on these 4 nodes.
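A loop like the following can be used to correlate the two curves (open descriptors vs. CPU) over time (a minimal sketch; the 60-second interval is arbitrary, and /proc/$p/fd needs root for the gluster daemons):
--------------
# log a timestamped descriptor count for every gluster process each minute
while true; do
    date
    for p in $(pidof glusterfs glusterfsd); do
        printf '  pid %s: %s open fds\n' "$p" "$(ls /proc/$p/fd | wc -l)"
    done
    sleep 60
done
--------------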
On the source nodes:
- they both hit the problem, but do not finish at quite the same time
- one is done, and its open-file count is back to normal
- but the second source node has not finished yet and still has more than 20,000 open files

For now, the second source node is still at 100% CPU (50 minutes so far) and will probably go to an hour before calming down...

How many open files does Gluster require?
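For context on that question: the "VFS: file-max limit 64000 reached" messages quoted above refer to the kernel's system-wide descriptor limit, which can be inspected and raised like this (a stop-gap only, not a fix for the leak; the 200000 value is just an illustrative example):
--------------
# allocated / free / maximum file handles, system-wide
cat /proc/sys/fs/file-nr
cat /proc/sys/fs/file-max

# raise the limit at runtime (add fs.file-max to /etc/sysctl.conf to persist)
sysctl -w fs.file-max=200000
--------------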
Thanks for reporting this. This issue has been reported earlier and is being tracked in another bug, for which we have a fix. Closing this as a duplicate. *** This bug has been marked as a duplicate of bug 928631 ***