Bug 950024

Summary: replace-brick immediately saturates IO on source brick causing the entire volume to be unavailable, then dies
Product: [Community] GlusterFS
Component: core
Version: 3.3.1
Hardware: x86_64
OS: Linux
Status: CLOSED DEFERRED
Severity: urgent
Priority: high
Reporter: hans
Assignee: bugs <bugs>
CC: bugs, gluster-bugs, nsathyan, yinyin2010
Target Milestone: ---
Target Release: ---
Doc Type: Bug Fix
Type: Bug
Last Closed: 2014-12-14 19:40:30 UTC

Attachments:
The -etc-glusterfs-glusterd.vol.log (attached in comment 2)

Description hans 2013-04-09 13:28:19 UTC
Description of problem:

Distributed-Replicate 8x2 volume on 3.3.1.
Moving a healthy brick to another server.

Right after replace-brick starts, IOPS on the source brick saturate at what the single disk can provide, causing all gluster client mounts to hang for 20 minutes. At that point the replace-brick process dies; I filed bug 950006 for that part. This bug is about the gluster volume being down in the meantime.

How reproducible:

On an 8*2T volume with about 2M directories (no more than 100 per subdirectory) and 30M files, start replace-brick.
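
For reference, the exact replace-brick invocation used (hostnames and brick paths from our setup; the full command timeline is in comment 2):

  # move the brick from stor1 to stor3, then poll migration progress
  gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b start
  gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b status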

Actual results:

The entire gluster volume hangs on all nodes and for all gluster clients.

Expected results:

Replace-brick does not prevent the volume from working.

Additional info:

ext4 is used on all bricks; none are suffering from the 64bit ext4 issue.

This bug may be related to bug 832609 (Glusterfsd hangs if brick filesystem becomes unresponsive, causing all clients to lock up).

Comment 1 shishir gowda 2013-04-10 04:14:39 UTC
Can you please provide the logs for the cluster?

Also, please provide more information on the client workload that hung.

Using ext4 as the backend has been known to cause clients to hang. Are you running a kernel version without the ext4 64-bit offset kernel patches?

Comment 2 hans 2013-04-10 12:15:53 UTC
Created attachment 733660
The -etc-glusterfs-glusterd.vol.log

Logfile that goes with these commands:

Apr  8 11:11:37 stor1-idc1-lga bash[9475]: [hans->root] gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b start
Apr  8 11:13:39 stor1-idc1-lga bash[9475]: [hans->root] gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b status
Apr  8 11:14:05 stor1-idc1-lga bash[9475]: [hans->root] gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b status
Apr  8 11:15:59 stor3-idc1-lga bash[22161]: [hans->root] gluster volume status
Apr  8 11:17:25 stor1-idc1-lga bash[9475]: [hans->root] gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b status
Apr  8 11:18:52 stor1-idc1-lga bash[9475]: [hans->root] gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b status
Apr  8 11:19:58 stor1-idc1-lga bash[9475]: [hans->root] gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b status
Apr  8 11:20:48 stor1-idc1-lga bash[9475]: [hans->root] gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b status
Apr  8 11:21:07 stor1-idc1-lga bash[9475]: [hans->root] gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b pause
Apr  8 11:21:14 stor1-idc1-lga bash[9475]: [hans->root] restart glusterd
Apr  8 11:21:19 stor1-idc1-lga bash[9475]: [hans->root] gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b status
Apr  8 11:23:24 stor1-idc1-lga bash[9475]: [hans->root] gluster volume status
Apr  8 11:23:52 stor1-idc1-lga bash[9475]: [hans->root] gluster volume status
Apr  8 11:27:18 stor1-idc1-lga bash[9475]: [hans->root] # kill all glusterfs and glusterfsd
Apr  8 11:27:35 stor1-idc1-lga bash[9475]: [hans->root] stop glusterd
Apr  8 11:27:56 stor1-idc1-lga bash[9475]: [hans->root] start glusterd
Apr  8 11:28:43 stor1-idc1-lga bash[9475]: [hans->root] gluster volume status
Apr  8 11:29:11 stor1-idc1-lga bash[9475]: [hans->root] gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b start
Apr  8 11:32:30 stor1-idc1-lga bash[9475]: [hans->root] gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b status
Apr  8 11:32:36 stor1-idc1-lga bash[9475]: [hans->root] gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b status
Apr  8 11:33:58 stor1-idc1-lga bash[9475]: [hans->root] gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b status
Apr  8 11:35:02 stor1-idc1-lga bash[9475]: [hans->root] gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b status
Apr  8 11:38:48 stor1-idc1-lga bash[9475]: [hans->root] gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b status
Apr  8 11:39:33 stor1-idc1-lga bash[9475]: [hans->root] gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b status
Apr  8 11:40:07 stor3-idc1-lga bash[25373]: [hans->root] gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b status
Apr  8 11:47:49 stor3-idc1-lga bash[25373]: [hans->root] gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b status
Apr  8 11:51:37 stor3-idc1-lga bash[25373]: [hans->root] gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b status
Apr  8 12:11:44 stor3-idc1-lga bash[25373]: [hans->root] gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b status
Apr  8 14:03:30 stor3-idc1-lga bash[25373]: [hans->root] gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b status

Comment 3 hans 2013-04-10 12:20:33 UTC
The client workload is our regular production load. If you need more detail on it, such as IOPS, please specify the exact command line tool and parameters whose output you want to see.
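
For example, something like the following on the brick servers would capture per-disk IOPS (iostat comes with the sysstat package; /dev/sdc is only a stand-in for the actual brick disk):

  # extended per-device statistics, sampled every 5 seconds
  iostat -dxk /dev/sdc 5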

About ext4: we're not suffering from the 64bit issue (all kernels are 3.0.0-17):
> Additional info:
> ext4 being used on all bricks, none are suffering from the 64bit ext4 issue.

Thanks for looking into the issue!

Comment 4 hans 2013-04-12 09:50:27 UTC
New insight: I started the replace-brick a day after an upgrade from 3.2.5 to 3.3.1. 3.3.1 maintains the bricks/.glusterfs/ directory trees, which 3.2.5 does not. Could building this tree be the cause of the single-brick IO saturation?

(And if so, how does one check whether this tree is fully populated, so that the next replace-brick won't DoS the entire gluster volume on all nodes?)
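
A rough check I can think of (my own guess, not an official procedure): in 3.3, every regular file on a brick should have a GFID hardlink under .glusterfs/, so the two counts below should roughly match if the tree is fully populated:

  # regular files on the brick, excluding gluster's internal tree
  find /gluster/c -path /gluster/c/.glusterfs -prune -o -type f -print | wc -l
  # GFID hardlinks inside the .glusterfs tree
  find /gluster/c/.glusterfs -type f | wc -l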

Comment 5 hans 2013-04-16 11:54:16 UTC
After 7 days I stopped the destination glusterfs. The source node now reports:

gluster volume replace-brick vol01 stor1:/gluster/c stor3-idc1-lga:/gluster/b status
Number of files migrated = 28560        Migration complete 

The 'number of files migrated' should be about a factor of 100 higher: with roughly 30M files spread over 8 replica pairs, this brick should hold on the order of 3.75M files.
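
One way to quantify the gap would be a direct count on the source brick (excluding the internal .glusterfs tree, which hardlinks every file a second time):

  find /gluster/c -path /gluster/c/.glusterfs -prune -o -type f -print | wc -l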

Comment 6 Niels de Vos 2014-11-27 14:54:20 UTC
The version that this bug was reported against no longer receives updates from the Gluster Community. Please verify whether this report is still valid against a current (3.4, 3.5 or 3.6) release and update the version, or close this bug.

If there has been no update before 9 December 2014, this bug will be closed automatically.