Bug 848859

Summary:

glusterd crashes when performing more than one storage migration at the same time (w/ volume replace-brick)

Product:

[Community] GlusterFS

Reporter:

Paschalis Korosoglou <pkoro>

Component:

glusterd

Assignee:

krishnan parthasarathi <kparthas>

Status:

CLOSED EOL

QA Contact:

Severity:

unspecified

Docs Contact:

Priority:

medium

Version:

mainline

CC:

bugs, gluster-bugs, jbuchta, nsathyan, rwheeler

Target Milestone:

---

Target Release:

---

Hardware:

x86_64

OS:

Linux

Whiteboard:

Fixed In Version:

Doc Type:

Bug Fix

Doc Text:

Story Points:

---

Clone Of:

Environment:

Last Closed:

2015-10-22 15:46:38 UTC

Type:

Bug

Attachments:

Description	Flags
Core dumped directory	none

Description Paschalis Korosoglou 2012-08-16 15:41:53 UTC

Description of problem:

glusterd crashed on one of our storage servers (w/ signal 11 - SIGSEGV). Shortly before the crash occurred we were migrating 3 bricks at the same time to the storage server on which glusterd crashed, from two other storage servers (using the command volume replace-brick...). After checking with the status option that the migrations had completed, we decided to abort the replace-brick, as df was giving us different results when comparing the initial bricks with the resulting bricks. After the abort (which exited cleanly) we unmounted the destination bricks and wiped two of them clean to restart the migration. A short while later the crash occurred and generated a coredump. Investigating the coredump gave the following result:

http://fpaste.org/ryAe/

Please let me know if I should attach files generated by the crash (i.e. the coredump) on this ticket.

Version-Release number of selected component (if applicable):

gluster-server: 3.3.0-1.el6
OS: Centos-6.3 (x86_64)

How reproducible:

We are not sure if this is reproducible. Since several production services were affected by the crash we cannot try and reproduce it.

Steps to Reproduce:

I would propose (though I cannot say with certainty it is reproducible):

1. Multiple simultaneous brick migrations to one storage server (i.e. 2-3)
2. After completion abort the migrations
3. Umount and reformat the former "destination" bricks
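
The proposed steps could be sketched as the shell sequence below. This is only a dry-run illustration: the volume name, host names and brick paths are hypothetical placeholders (none of them come from this report), the sub-commands used are the start, status and abort options of replace-brick mentioned above, and the script only prints the commands instead of running them.

```shell
#!/bin/sh
# Dry-run sketch of the suspected reproduction sequence (GlusterFS 3.3.x CLI).
# VOL, DEST_HOST and all brick paths are hypothetical placeholders.
VOL="${VOL:-myvol}"
DEST_HOST="${DEST_HOST:-server3}"

# Print the replace-brick command for one source brick and one sub-command.
replace_brick_cmd() {
    # $1 = source brick (host:/path), $2 = destination path, $3 = start|status|abort
    echo "gluster volume replace-brick $VOL $1 $DEST_HOST:$2 $3"
}

# Print the whole sequence for three concurrent migrations to one server.
repro_plan() {
    for action in start status abort; do
        replace_brick_cmd server1:/bricks/b1 /bricks/new1 "$action"
        replace_brick_cmd server1:/bricks/b2 /bricks/new2 "$action"
        replace_brick_cmd server2:/bricks/b3 /bricks/new3 "$action"
    done
    # Step 3: unmount and reformat the former destination bricks (destructive).
    echo "umount /bricks/new1  # then reformat, on $DEST_HOST"
}

repro_plan
```

Since the original crash hit production services, a sequence like this should only ever be tried on a disposable test cluster.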

Actual results:
-

Expected results:
-

Additional info:
-

Comment 1 krishnan parthasarathi 2012-08-17 05:16:54 UTC

Paschalis,

Could you attach more frames/lines of the backtrace (if present)? Having only the topmost frame, which suggests that a pthread_mutex_t was not initialised, doesn't allow any further analysis.
Also, if the core dump file is not 'too large', could you attach it to the bug or host it via a publicly accessible URL?
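
For reference, a fuller backtrace can usually be obtained by running gdb in batch mode over the core file. The sketch below only builds and prints such a command; the glusterd binary path and the core file path are assumptions and need to be adjusted to the affected system.

```shell
#!/bin/sh
# Sketch: build a gdb batch command that dumps a full backtrace of every thread.
# The binary path and the core path are assumptions, not taken from this report.
GLUSTERD_BIN="${GLUSTERD_BIN:-/usr/sbin/glusterd}"

full_backtrace_cmd() {
    # $1 = path to the core file
    echo "gdb -batch -ex 'thread apply all bt full' $GLUSTERD_BIN $1"
}

# Print the command; run it by hand once the paths are confirmed.
full_backtrace_cmd "${1:-/path/to/core}"
```

Without the matching debuginfo packages installed, the frames will mostly show raw addresses instead of function names and line numbers.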

Comment 2 Paschalis Korosoglou 2012-08-17 08:21:26 UTC

Created attachment 605118 [details]
Core dumped directory

Comment 3 Paschalis Korosoglou 2012-08-17 08:21:52 UTC

Hi Krishnan, 

I am afraid this is as much as I got out of gdb. There is one warning in the output about some debuginfo not being found, but as it does not clearly state which file is missing I was reluctant to go ahead with the proposed yum command.

Please find attached the complete dumped directory (including the core dump and several other text files) for further investigation.

Thank you for your assistance, 
Paschalis

Comment 4 Paschalis Korosoglou 2012-09-06 07:56:14 UTC

Hi Krishnan, 

was the attachment I provided helpful in any way? Have you had time to investigate the issue?

As we have several pending migration operations to perform I would appreciate your feedback on this. 

Best,
Paschalis

Comment 5 Paschalis Korosoglou 2012-09-17 08:32:35 UTC

Hi Krishnan, 

please let me know if you have had any luck debugging this error. We are about to retry the migration (this time after unmounting the filesystem from all clients through which it is accessible), but we would really appreciate it if you could also suggest something else to try that might help.

Best regards,
Paschalis

Comment 6 Paschalis Korosoglou 2012-11-17 10:00:09 UTC

Hello, 

is there any news regarding this issue? Please let us know if you have any update. We are now facing additional problems on one of our gluster servers so we urgently need an update so that we may reduce the load on the specific server by offloading/migrating volumes to other servers. 

Thank you, 
Paschalis

Comment 7 Amar Tumballi 2012-11-19 07:19:55 UTC

Paschalis,

Can you try 'gluster volume add-brick <VOLNAME> NEW-BRICK', followed by 'gluster volume remove-brick <VOLNAME> OLD-BRICK start'?

This should work the same as replace-brick and migrate the data to the new brick.

Comment 8 Paschalis Korosoglou 2012-11-19 12:44:54 UTC

Hi Amar, 

thank you for your reply. 

Two questions on the proposed workaround:

1) Would/Will this work on community based gluster server version 3.3.0?
2) Using this approach I suppose we will have to first disconnect all clients accessing the storage on the volume/brick to be moved, right?

Best, 
Paschalis

Comment 9 Amar Tumballi 2012-11-20 06:55:26 UTC

> 1) Would/Will this work on community based gluster server version 3.3.0?

Yes, it works.

> 2) Using this approach I suppose we will have to first disconnect all
> clients accessing the storage on the volume/brick to be moved, right?

Nope, the clients will continue to access the files while they are getting migrated. No need to disconnect/umount the clients. Make sure that after remove-brick "start" completes (can check using "status"), issue a "commit" command.
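
The workaround from the comments above can be summarised as a four-step sequence. The following is a dry-run sketch that only prints the commands; the volume and brick names are hypothetical placeholders, and it assumes the add-brick and remove-brick sub-commands behave as described in this thread.

```shell
#!/bin/sh
# Dry-run sketch of the add-brick / remove-brick migration described above.
# Volume and brick names are hypothetical placeholders.
migration_plan() {
    # $1 = volume, $2 = new brick (host:/path), $3 = old brick (host:/path)
    echo "gluster volume add-brick $1 $2"
    echo "gluster volume remove-brick $1 $3 start"
    echo "gluster volume remove-brick $1 $3 status   # repeat until completed"
    echo "gluster volume remove-brick $1 $3 commit   # only after status shows completion"
}

migration_plan myvol server3:/bricks/new1 server1:/bricks/old1
```

Printing the plan first makes it easy to review the brick names before anything is run against a production volume.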

Comment 10 Paschalis Korosoglou 2013-01-15 16:48:43 UTC

Hi Amar, 

we tried your suggestion earlier today. I will let you know by tomorrow morning if everything has gone fine. So far it seems OK. The only obvious side-effect is the high number of open files on the system the data (volume) is being migrated from. See below:

# su - ansible
su: /bin/bash: Too many open files in system

(ssh as the ansible user does not work either; only root can log in).

In /var/log/messages I see the following message repeated for the last two and a half hours:

Jan 15 16:13:13 c1033 kernel: VFS: file-max limit 384166 reached
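
To see how close a system is to the limit named in that kernel message, the counters in /proc/sys/fs/file-nr can be inspected. The helper below is a minimal sketch (the function name is made up for illustration); it relies on the Linux file-nr format of three numbers: allocated handles, unused handles, and the maximum.

```shell
#!/bin/sh
# Sketch: report system-wide file-handle usage from /proc/sys/fs/file-nr.
# The file holds three numbers: allocated handles, unused handles, maximum.
file_handle_usage() {
    # $1 = a line in file-nr format; prints "used/max (percent%)"
    set -- $1
    used=$(( $1 - $2 ))
    echo "$used/$3 ($(( used * 100 / $3 ))%)"
}

if [ -r /proc/sys/fs/file-nr ]; then
    file_handle_usage "$(cat /proc/sys/fs/file-nr)"
fi
```

The maximum itself is the fs.file-max sysctl, which root can raise at runtime with sysctl -w.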

I hope nothing will break over night, but if you have any suggestions please reply promptly. 

Best, 
Paschalis

Comment 11 Paschalis Korosoglou 2013-01-16 08:14:19 UTC

Hi Amar, 

I am afraid I do not have good news on this. Using status we see that the migration status is "failed". On the gluster server where we had noticed the messages reported above, the following is now found in /var/log/messages:

Jan 16 04:10:03 c1033 kernel: Out of memory: Kill process 13487 (glusterfs) score 377 or sacrifice child
Jan 16 04:10:03 c1033 kernel: Killed process 13487, UID 0, (glusterfs) total-vm:3441796kB, anon-rss:877148kB, file-rss:12kB

This is a production file system (we are trying to migrate) so if you have any suggestions we would appreciate a prompt response. We will try to increase the max number of files to a higher value and see if starting the migration again will do any good. 

Best, 
Paschalis

Comment 12 Amar Tumballi 2013-01-17 06:06:29 UTC

Hi Paschalis,

One of the ways to migrate data is to copy the data from the brick to the glusterfs mount point using rsync, but with this approach the applications referring to files in the removed brick will not be able to see them until they have been copied over to the mount.

I wanted to check whether you are running 'gluster volume replace-brick' or 'gluster volume remove-brick' (followed by a 'gluster volume add-brick').

As I understand it, there should not be that many fds open during migration: migration handles one file after another, so at any given time it should have just 1 file fd open on a brick, plus N fds for directories (N being the directory depth, which normally won't exceed 2k under the POSIX standard).
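
One way to check this expectation against a running migration is to count the descriptors held by the process in question. This is a minimal sketch that assumes a Linux /proc filesystem; the function name is made up for illustration, and the PID of the relevant glusterfs or glusterfsd process has to be looked up separately.

```shell
#!/bin/sh
# Sketch: count the file descriptors a process currently holds open (Linux /proc).
count_open_fds() {
    # $1 = PID; prints the number of entries under /proc/PID/fd (0 if unreadable)
    ls "/proc/$1/fd" 2>/dev/null | wc -l
}

# Example: descriptors held by this shell; pass a glusterfs/glusterfsd PID instead.
count_open_fds "${1:-$$}"
```

Sampling this count periodically during a migration would show whether the number of open descriptors keeps growing towards the file-max limit.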

Comment 13 Kaleb KEITHLEY 2015-10-22 15:46:38 UTC

Because of the large number of bugs filed against it, the mainline version is ambiguous and about to be removed as a choice.

If you believe this is still a bug, please change the status back to NEW and choose the appropriate, applicable version for it.