Bug 950024 - replace-brick immediately saturates IO on source brick causing the entire volume to be unavailable, then dies
Summary: replace-brick immediately saturates IO on source brick causing the entire volume to be unavailable, then dies
Keywords:
Status: CLOSED DEFERRED
Alias: None
Product: GlusterFS
Classification: Community
Component: core
Version: 3.3.1
Hardware: x86_64
OS: Linux
Priority: high
Severity: urgent
Target Milestone: ---
Assignee: bugs@gluster.org
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2013-04-09 13:28 UTC by hans
Modified: 2014-12-14 19:40 UTC
CC List: 4 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-12-14 19:40:30 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:


Attachments
The -etc-glusterfs-glusterd.vol.log (98.39 KB, text/plain)
2013-04-10 12:15 UTC, hans

Description hans 2013-04-09 13:28:19 UTC
Description of problem:

Distributed-Replicate 8x2 volume on 3.3.1.
Moving a healthy brick to another server.

Right after starting replace-brick, the IOPS on the source brick saturate at what the single disk can provide, causing all gluster client mounts to hang for 20 minutes. At that point the replace-brick process dies; I filed bug 950006 for that. This bug is about the entire gluster volume being unavailable.
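
For reference, the saturation is plainly visible with iostat on the source server while replace-brick runs (a minimal sketch; which device backs /gluster/c depends on the local disk layout):

# Extended device statistics every 5 seconds; during replace-brick the
# brick's disk shows %util pinned at 100 while client IO stalls.
iostat -x 5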

How reproducible:

On an 8*2T volume with about 2M directories (no more than 100 entries per subdirectory) and 30M files, start replace-brick.

Actual results:

The entire gluster volume hangs on all nodes and for all gluster clients.

Expected results:

Replace-brick does not prevent the volume from working.

Additional info:

ext4 is used on all bricks; none are suffering from the 64-bit ext4 issue.

This bug might relate to Bug 832609 - Glusterfsd hangs if brick filesystem becomes unresponsive, causing all clients to lock up.

Comment 1 shishir gowda 2013-04-10 04:14:39 UTC
Can you please provide the logs for the cluster?

Also, please provide more information on the client workload that hung.

Using ext4 as the backend has been known to cause clients to hang. Are you running a kernel version without the ext4 64-bit offset kernel patches?
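
(A quick way to answer that on each brick server; nothing gluster-specific, just the kernel version and the brick mounts. Kernels that started returning 64-bit ext4 directory offsets, roughly upstream 3.3 and some distro backports, are the ones known to trigger the hang:)

uname -r
mount | grep ext4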

Comment 2 hans 2013-04-10 12:15:53 UTC
Created attachment 733660 [details]
The -etc-glusterfs-glusterd.vol.log

Logfile that goes with these commands :

Apr  8 11:11:37 stor1-idc1-lga bash[9475]: [hans->root] gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b start
Apr  8 11:13:39 stor1-idc1-lga bash[9475]: [hans->root] gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b status
Apr  8 11:14:05 stor1-idc1-lga bash[9475]: [hans->root] gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b status
Apr  8 11:15:59 stor3-idc1-lga bash[22161]: [hans->root] gluster volume status
Apr  8 11:17:25 stor1-idc1-lga bash[9475]: [hans->root] gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b status
Apr  8 11:18:52 stor1-idc1-lga bash[9475]: [hans->root] gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b status
Apr  8 11:19:58 stor1-idc1-lga bash[9475]: [hans->root] gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b status
Apr  8 11:20:48 stor1-idc1-lga bash[9475]: [hans->root] gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b status
Apr  8 11:21:07 stor1-idc1-lga bash[9475]: [hans->root] gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b pause
Apr  8 11:21:14 stor1-idc1-lga bash[9475]: [hans->root] restart glusterd
Apr  8 11:21:19 stor1-idc1-lga bash[9475]: [hans->root] gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b status
Apr  8 11:23:24 stor1-idc1-lga bash[9475]: [hans->root] gluster volume status
Apr  8 11:23:52 stor1-idc1-lga bash[9475]: [hans->root] gluster volume status
Apr  8 11:27:18 stor1-idc1-lga bash[9475]: [hans->root] # kill all glusterfs and glusterfsd
Apr  8 11:27:35 stor1-idc1-lga bash[9475]: [hans->root] stop glusterd
Apr  8 11:27:56 stor1-idc1-lga bash[9475]: [hans->root] start glusterd
Apr  8 11:28:43 stor1-idc1-lga bash[9475]: [hans->root] gluster volume status
Apr  8 11:29:11 stor1-idc1-lga bash[9475]: [hans->root] gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b start
Apr  8 11:32:30 stor1-idc1-lga bash[9475]: [hans->root] gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b status
Apr  8 11:32:36 stor1-idc1-lga bash[9475]: [hans->root] gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b status
Apr  8 11:33:58 stor1-idc1-lga bash[9475]: [hans->root] gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b status
Apr  8 11:35:02 stor1-idc1-lga bash[9475]: [hans->root] gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b status
Apr  8 11:38:48 stor1-idc1-lga bash[9475]: [hans->root] gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b status
Apr  8 11:39:33 stor1-idc1-lga bash[9475]: [hans->root] gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b status
Apr  8 11:40:07 stor3-idc1-lga bash[25373]: [hans->root] gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b status
Apr  8 11:47:49 stor3-idc1-lga bash[25373]: [hans->root] gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b status
Apr  8 11:51:37 stor3-idc1-lga bash[25373]: [hans->root] gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b status
Apr  8 12:11:44 stor3-idc1-lga bash[25373]: [hans->root] gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b status
Apr  8 14:03:30 stor3-idc1-lga bash[25373]: [hans->root] gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b status

Comment 3 hans 2013-04-10 12:20:33 UTC
The client workload is our regular production load. If you need more detail (IOPS etc.), please specify the exact command-line tool and parameters whose output you want to see.
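
(If gluster's own per-brick profiling is enough, this is a sketch of what we could capture, using our volume name:)

# Enable io-stats counters, let production traffic run for a while,
# then dump per-brick latency and fop counts.
gluster volume profile vol01 start
gluster volume profile vol01 info
gluster volume profile vol01 stop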

About ext4: we're not suffering from the 64-bit issue (all kernels are 3.0.0-17):
> Additional info:
> ext4 being used on all bricks, none are suffering from the 64bit ext4 issue.

Thanks for looking into the issue!

Comment 4 hans 2013-04-12 09:50:27 UTC
New insight: I started the replace-brick a day after an upgrade from 3.2.5 to 3.3.1. 3.3.1 keeps a .glusterfs/ directory tree inside each brick, which 3.2.5 does not. Could building these trees be the cause of the single-brick IO saturation?

(And if so, how does one check whether this tree is fully populated, so that the next replace-brick won't DOS the entire gluster volume on all nodes? A rough count-based check is sketched below.)
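
(A minimal sketch of such a check, run on the brick root from the log; it assumes every regular file gets a hard-linked gfid entry under .glusterfs/, so the two counts should be close once the tree is fully built:)

cd /gluster/c
# Regular files on the brick, excluding the .glusterfs tree itself.
find . -path ./.glusterfs -prune -o -type f -print | wc -l
# gfid hard links inside .glusterfs; should roughly match the count above.
find .glusterfs -type f | wc -l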

Comment 5 hans 2013-04-16 11:54:16 UTC
After 7 days I stopped the destination glusterfs. The source node now says:

gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b status
Number of files migrated = 28560        Migration complete 

The 'Number of files migrated' figure should be roughly a factor of 100 higher.
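
(For scale, counting the regular files on the source brick outside .glusterfs gives the number replace-brick should have migrated; same find pattern as the check above:)

# Run on stor1: the actual file population of the source brick.
find /gluster/c -path '*/.glusterfs' -prune -o -type f -print | wc -l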

Comment 6 Niels de Vos 2014-11-27 14:54:20 UTC
The version that this bug has been reported against does not get any updates from the Gluster Community anymore. Please verify whether this report is still valid against a current (3.4, 3.5, or 3.6) release and update the version, or close this bug.

If there has been no update before 9 December 2014, this bug will get closed automatically.

