Description of problem:
Distributed-Replicate 8x2 volume on 3.3.1. Moving a healthy brick to another server. Right after starting replace-brick, the IOPS on the source brick saturate to what the single disk can provide, causing all gluster client mounts to hang for 20 minutes. At that point the replace-brick process dies; I filed a separate bug for that, bug 950006. This bug is about the gluster volume being down.

How reproducible:
On an 8*2T volume with about 2M directories (no more than 100 entries per subdirectory) and 30M files, start replace-brick.

Actual results:
The entire gluster volume hangs on all nodes and for all gluster clients.

Expected results:
Replace-brick does not prevent the volume from working.

Additional info:
ext4 is used on all bricks; none are suffering from the 64-bit ext4 issue. This bug might relate to Bug 832609 - Glusterfsd hangs if brick filesystem becomes unresponsive, causing all clients to lock up.
Can you please provide the logs for the cluster? Also, please provide more information on the client workload that hung. Using ext4 as the backend has caused clients to hang before. Are you running a kernel version without the ext4 64-bit offset kernel patches?
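To be concrete, gathering something along these lines from each of the storage servers should cover it (the log path assumes a default installation):

  # collect the GlusterFS logs (glusterd, brick and replace-brick logs)
  tar czf gluster-logs-$(hostname -s).tar.gz /var/log/glusterfs
  # kernel version on each node, to check for the ext4 64-bit d_off changes
  uname -r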
Created attachment 733660 [details]
The -etc-glusterfs-glusterd.vol.log

Log file that goes with these commands:

Apr 8 11:11:37 stor1-idc1-lga bash[9475]: [hans->root] gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b start
Apr 8 11:13:39 stor1-idc1-lga bash[9475]: [hans->root] gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b status
Apr 8 11:14:05 stor1-idc1-lga bash[9475]: [hans->root] gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b status
Apr 8 11:15:59 stor3-idc1-lga bash[22161]: [hans->root] gluster volume status
Apr 8 11:17:25 stor1-idc1-lga bash[9475]: [hans->root] gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b status
Apr 8 11:18:52 stor1-idc1-lga bash[9475]: [hans->root] gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b status
Apr 8 11:19:58 stor1-idc1-lga bash[9475]: [hans->root] gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b status
Apr 8 11:20:48 stor1-idc1-lga bash[9475]: [hans->root] gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b status
Apr 8 11:21:07 stor1-idc1-lga bash[9475]: [hans->root] gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b pause
Apr 8 11:21:14 stor1-idc1-lga bash[9475]: [hans->root] restart glusterd
Apr 8 11:21:19 stor1-idc1-lga bash[9475]: [hans->root] gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b status
Apr 8 11:23:24 stor1-idc1-lga bash[9475]: [hans->root] gluster volume status
Apr 8 11:23:52 stor1-idc1-lga bash[9475]: [hans->root] gluster volume status
Apr 8 11:27:18 stor1-idc1-lga bash[9475]: [hans->root] # kill all glusterfs and glusterfsd
Apr 8 11:27:35 stor1-idc1-lga bash[9475]: [hans->root] stop glusterd
Apr 8 11:27:56 stor1-idc1-lga bash[9475]: [hans->root] start glusterd
Apr 8 11:28:43 stor1-idc1-lga bash[9475]: [hans->root] gluster volume status
Apr 8 11:29:11 stor1-idc1-lga bash[9475]: [hans->root] gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b start
Apr 8 11:32:30 stor1-idc1-lga bash[9475]: [hans->root] gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b status
Apr 8 11:32:36 stor1-idc1-lga bash[9475]: [hans->root] gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b status
Apr 8 11:33:58 stor1-idc1-lga bash[9475]: [hans->root] gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b status
Apr 8 11:35:02 stor1-idc1-lga bash[9475]: [hans->root] gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b status
Apr 8 11:38:48 stor1-idc1-lga bash[9475]: [hans->root] gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b status
Apr 8 11:39:33 stor1-idc1-lga bash[9475]: [hans->root] gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b status
Apr 8 11:40:07 stor3-idc1-lga bash[25373]: [hans->root] gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b status
Apr 8 11:47:49 stor3-idc1-lga bash[25373]: [hans->root] gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b status
Apr 8 11:51:37 stor3-idc1-lga bash[25373]: [hans->root] gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b status
Apr 8 12:11:44 stor3-idc1-lga bash[25373]: [hans->root] gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b status
Apr 8 14:03:30 stor3-idc1-lga bash[25373]: [hans->root] gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b status
The client workload is our regular production load. If you need more detail on it (IOPS etc.), please specify the exact command line tool and parameters whose output you want to see.

About ext4: we are not suffering from the 64-bit issue (all nodes run kernel 3.0.0-17):

> Additional info:
> ext4 being used on all bricks, none are suffering from the 64bit ext4 issue.

Thanks for looking into the issue!
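For example, if it helps, I could capture extended per-disk statistics on the source brick's disk during the saturation with something like this (sysstat's iostat, all devices, 5-second samples):

  # extended device utilization, in kB/s, with timestamps, every 5 seconds
  iostat -xkt 5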
New insights: I started the replace-brick a day after an upgrade from 3.2.5 to 3.3.1. 3.3.1 maintains the .glusterfs/ directory tree inside each brick, which 3.2.5 does not. Could building that tree be the cause of the single-brick I/O saturation? (And if so, how does one check that this tree is fully populated, so that the next replace-brick won't DOS the entire gluster volume on all nodes?)
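My best guess at such a check would be something like the following, assuming every regular file on a 3.3 brick is supposed to have a GFID hardlink under .glusterfs/, so a link count of 1 outside that tree would mean the entry is missing (directories seem to be handled differently, so this only covers files):

  # count regular files on the brick that lack a second (GFID) hardlink
  find /gluster/c -path /gluster/c/.glusterfs -prune -o -type f -links 1 -print | wc -l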
After 7 days I stopped the destination glusterfs. The source node now reports:

gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b status
Number of files migrated = 28560        Migration complete

The 'number of files migrated' should be about a factor of 100 higher.
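Rough arithmetic behind that estimate: with roughly 30M files spread over 8 distribute subvolumes, the brick being replaced should hold in the order of 3-4M files, versus the 28560 reported, i.e. more than 100x fewer.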
The version this bug was reported against no longer receives updates from the Gluster Community. Please verify whether this report is still valid against a current release (3.4, 3.5 or 3.6) and update the version, or close this bug. If there has been no update before 9 December 2014, this bug will be closed automatically.