Bug 950024

Summary: replace-brick immediately saturates IO on source brick causing the entire volume to be unavailable, then dies
Product: [Community] GlusterFS
Component: core
Version: 3.3.1
Hardware: x86_64
OS: Linux
Status: CLOSED DEFERRED
Severity: urgent
Priority: high
Reporter: hans
Assignee: bugs <bugs>
CC: bugs, gluster-bugs, nsathyan, yinyin2010
Target Milestone: ---
Target Release: ---
Doc Type: Bug Fix
Type: Bug
Last Closed: 2014-12-14 19:40:30 UTC

Attachments:
The -etc-glusterfs-glusterd.vol.log (attached in comment 2)

Description hans 2013-04-09 13:28:19 UTC
Description of problem:

Distributed-Replicate 8x2 volume on 3.3.1.
Moving a healthy brick to another server.

Right after replace-brick starts, IOPS on the source brick saturate at what the single disk can provide, causing all gluster client mounts to hang for 20 minutes. At that point the replace-brick process dies; I filed bug 950006 for that part. This bug is about the gluster volume being down in the meantime.

How reproducible:

On an 8*2T volume with about 2M directories (no more than 100 per subdirectory) and 30M files, start replace-brick.
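
For reference, the exact replace-brick invocation used (hostnames and brick paths from our setup; the full command timeline is in comment 2):

  # move the brick from stor1 to stor3, then poll migration progress
  gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b start
  gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b status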

Actual results:

The entire gluster volume hangs on all nodes and for all gluster clients.

Expected results:

Replace-brick does not prevent the volume from working.

Additional info:

ext4 is used on all bricks; none are suffering from the 64bit ext4 issue.

This bug may be related to bug 832609 (Glusterfsd hangs if brick filesystem becomes unresponsive, causing all clients to lock up).

Comment 1 shishir gowda 2013-04-10 04:14:39 UTC
Can you please provide the logs for the cluster?

Also, please provide more information on the client workload that hung.

Using ext4 as the backend has been known to cause clients to hang. Are you running a kernel version without the ext4 64-bit offset kernel patches?

Comment 2 hans 2013-04-10 12:15:53 UTC
Created attachment 733660
The -etc-glusterfs-glusterd.vol.log

Logfile that goes with these commands:

Apr  8 11:11:37 stor1-idc1-lga bash[9475]: [hans->root] gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b start
Apr  8 11:13:39 stor1-idc1-lga bash[9475]: [hans->root] gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b status
Apr  8 11:14:05 stor1-idc1-lga bash[9475]: [hans->root] gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b status
Apr  8 11:15:59 stor3-idc1-lga bash[22161]: [hans->root] gluster volume status
Apr  8 11:17:25 stor1-idc1-lga bash[9475]: [hans->root] gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b status
Apr  8 11:18:52 stor1-idc1-lga bash[9475]: [hans->root] gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b status
Apr  8 11:19:58 stor1-idc1-lga bash[9475]: [hans->root] gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b status
Apr  8 11:20:48 stor1-idc1-lga bash[9475]: [hans->root] gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b status
Apr  8 11:21:07 stor1-idc1-lga bash[9475]: [hans->root] gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b pause
Apr  8 11:21:14 stor1-idc1-lga bash[9475]: [hans->root] restart glusterd
Apr  8 11:21:19 stor1-idc1-lga bash[9475]: [hans->root] gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b status
Apr  8 11:23:24 stor1-idc1-lga bash[9475]: [hans->root] gluster volume status
Apr  8 11:23:52 stor1-idc1-lga bash[9475]: [hans->root] gluster volume status
Apr  8 11:27:18 stor1-idc1-lga bash[9475]: [hans->root] # kill all glusterfs and glusterfsd
Apr  8 11:27:35 stor1-idc1-lga bash[9475]: [hans->root] stop glusterd
Apr  8 11:27:56 stor1-idc1-lga bash[9475]: [hans->root] start glusterd
Apr  8 11:28:43 stor1-idc1-lga bash[9475]: [hans->root] gluster volume status
Apr  8 11:29:11 stor1-idc1-lga bash[9475]: [hans->root] gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b start
Apr  8 11:32:30 stor1-idc1-lga bash[9475]: [hans->root] gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b status
Apr  8 11:32:36 stor1-idc1-lga bash[9475]: [hans->root] gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b status
Apr  8 11:33:58 stor1-idc1-lga bash[9475]: [hans->root] gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b status
Apr  8 11:35:02 stor1-idc1-lga bash[9475]: [hans->root] gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b status
Apr  8 11:38:48 stor1-idc1-lga bash[9475]: [hans->root] gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b status
Apr  8 11:39:33 stor1-idc1-lga bash[9475]: [hans->root] gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b status
Apr  8 11:40:07 stor3-idc1-lga bash[25373]: [hans->root] gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b status
Apr  8 11:47:49 stor3-idc1-lga bash[25373]: [hans->root] gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b status
Apr  8 11:51:37 stor3-idc1-lga bash[25373]: [hans->root] gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b status
Apr  8 12:11:44 stor3-idc1-lga bash[25373]: [hans->root] gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b status
Apr  8 14:03:30 stor3-idc1-lga bash[25373]: [hans->root] gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b status

Comment 3 hans 2013-04-10 12:20:33 UTC
The client workload is our regular production load. If you need more detail on it, such as IOPS, please specify the exact command line tool and parameters whose output you want to see.
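
For example, something like the following on the brick servers would capture per-disk IOPS (iostat comes with the sysstat package; /dev/sdc is only a stand-in for the actual brick disk):

  # extended per-device statistics, sampled every 5 seconds
  iostat -dxk /dev/sdc 5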

About ext4: we're not suffering from the 64bit issue (all kernels are 3.0.0-17):
> Additional info:
> ext4 being used on all bricks, none are suffering from the 64bit ext4 issue.

Thanks for looking into the issue!

Comment 4 hans 2013-04-12 09:50:27 UTC
New insight: I started the replace-brick a day after an upgrade from 3.2.5 to 3.3.1. 3.3.1 maintains the bricks/.glusterfs/ directory trees, which 3.2.5 does not. Could building this tree be the cause of the single-brick IO saturation?

(And if so, how does one check whether this tree is fully populated, so that the next replace-brick won't DoS the entire gluster volume on all nodes?)
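
A rough check I can think of (my own guess, not an official procedure): in 3.3, every regular file on a brick should have a GFID hardlink under .glusterfs/, so the two counts below should roughly match if the tree is fully populated:

  # regular files on the brick, excluding gluster's internal tree
  find /gluster/c -path /gluster/c/.glusterfs -prune -o -type f -print | wc -l
  # GFID hardlinks inside the .glusterfs tree
  find /gluster/c/.glusterfs -type f | wc -l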

Comment 5 hans 2013-04-16 11:54:16 UTC
After 7 days I stopped the destination glusterfs. The source node now reports:

gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b status
Number of files migrated = 28560        Migration complete 

The 'number of files migrated' should be about a factor of 100 higher: with roughly 30M files spread over 8 replica pairs, this brick should hold on the order of 3.75M files.
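
One way to quantify the gap would be a direct count on the source brick (excluding the internal .glusterfs tree, which hardlinks every file a second time):

  find /gluster/c -path /gluster/c/.glusterfs -prune -o -type f -print | wc -l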

Comment 6 Niels de Vos 2014-11-27 14:54:20 UTC
The version that this bug was reported against no longer receives updates from the Gluster Community. Please verify whether this report is still valid against a current (3.4, 3.5 or 3.6) release and update the version, or close this bug.

If there has been no update before 9 December 2014, this bug will be closed automatically.