1298076 – Gluster crashes when adding volume with wrong brick hostnames

Summary: Gluster crashes when adding volume with wrong brick hostnames

Keywords:
Status:	CLOSED INSUFFICIENT_DATA
Alias:	None
Product:	GlusterFS
Classification:	Community
Component:	glusterd
Sub Component:
Version:	3.7.6
Hardware:	x86_64
OS:	Linux
Priority:	high
Severity:	high
Target Milestone:	---
Assignee:	Atin Mukherjee
QA Contact:
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2016-01-13 08:10 UTC by Arik
Modified:	2017-07-25 03:25 UTC
CC List:	3 users
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2017-01-23 05:21:19 UTC
Regression:	---
Mount Type:	---
Documentation:	---
CRM:
Verified Versions:
Embargoed:
Dependent Products:

Attachments
glusterd logs around the crash (16.99 KB, application/x-gzip) 2016-01-19 23:01 UTC, Arik

Description Arik 2016-01-13 08:10:31 UTC

Description of problem:

I have a 4-node gluster setup for use with oVirt and was adding the ISO domain. When I did so, I accidentally used a script from a different environment with the wrong hostnames. All other parameters were correct; only the hostnames of the bricks in the new replica 2 volume I was trying to add were wrong.

I corrected the error and tried to add the volume again, but by that point the glusterd processes on two servers had crashed, and those nodes had dropped out of the cluster according to peer status. The fix was to remove /var/lib/glusterd on those nodes and re-probe them, but not before this had caused a fair amount of chaos.

Version-Release number of selected component (if applicable):

3.7.6


How reproducible:

Unfortunately I cannot tell. Our development environment has somehow become too critical to experiment on. I will try to reproduce, time permitting.

Steps to Reproduce:
1. Add a new volume but misspell the brick hostnames
2. Add the same volume again with the correct hostnames

Actual results:
gluster crashes on at least some servers

Expected results:
A client error message is returned indicating that the hosts are not peers in the cluster, and nothing else happens.
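
As a rough illustration of the expected behaviour, the check could be sketched on the client side as below. This is not glusterd's actual validation code. The function name, the brick paths, and the idea of passing in a hand-collected peer list (for example, gathered from `gluster peer status` plus the local node) are all assumptions made for the example, and it assumes plain hostnames without colons.

```python
def brick_hosts_not_in_cluster(bricks, peer_hosts):
    """Return hostnames used in brick specs that are not known cluster peers.

    bricks: iterable of "host:/path" strings, the form used on the gluster CLI.
    peer_hosts: iterable of hostnames in the cluster (hypothetical input,
    collected by the caller; this function does not query glusterd).
    """
    known = {h.lower() for h in peer_hosts}
    unknown = []
    for brick in bricks:
        host, sep, path = brick.partition(":")
        # A brick spec must look like host:/absolute/path.
        if not sep or not path.startswith("/"):
            raise ValueError(f"malformed brick spec: {brick!r}")
        # Hostname comparison is case-insensitive; report each host once.
        if host.lower() not in known and host not in unknown:
            unknown.append(host)
    return unknown
```

A wrapper script could call this before issuing any volume command and stop with an error message if the returned list is non-empty.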

Additional info:

Servers are up-to-date CentOS 7.1.1503 (kernel 3.10.0-229.20.1.el7.x86_64) on Dell R730 hardware with recent firmware updates.

Comment 1 Jiffin 2016-01-19 12:13:48 UTC

Can you attach the glusterd logs (/var/log/glusterfs/etc-glusterfs-glusterd.vol.log)?

Usually, for a wrong hostname, it should report that the host is not in the cluster instead of crashing.

Comment 2 Arik 2016-01-19 23:01:00 UTC

Created attachment 1116408
glusterd logs around the crash

I did the volume add on 2016-01-12, so I am sending all logs from that day with the lines containing "nfs" removed, since most log lines are those benign warnings.
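
For anyone preparing a similar attachment, the filtering described above could look roughly like the sketch below. It assumes each log entry starts with a bracketed timestamp of the form `[YYYY-MM-DD HH:MM:SS...]`. The function name is hypothetical, and lines without a timestamp are treated as belonging to the preceding entry so that any multi-line output is kept with it.

```python
import re

# Assumed entry prefix: "[2016-01-12 08:10:31.123456] ..."
_TIMESTAMP = re.compile(r"^\[(\d{4}-\d{2}-\d{2}) ")


def filter_log_lines(lines, day, drop_substring="nfs"):
    """Keep lines logged on `day` (YYYY-MM-DD) that lack `drop_substring`.

    Lines with no leading timestamp inherit the day of the last
    timestamped line seen before them.
    """
    kept = []
    in_day = False
    for line in lines:
        match = _TIMESTAMP.match(line)
        if match:
            in_day = match.group(1) == day
        if in_day and drop_substring not in line:
            kept.append(line)
    return kept
```

Called as `filter_log_lines(open(path), "2016-01-12")`, it returns the lines to attach; the match on "nfs" is a case-sensitive substring test, mirroring the manual filtering described here.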

Looking back, it looks like trying to start a volume that does not exist is what triggered the crash. I ran this from a script that is used to create the volume and add properties for oVirt.

Apologies, I should have checked more carefully before making the report, since it does look like gluster refused the wrong hostname, but I think this is still just as serious a bug.
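
The sequence described above (a failed create followed by a start of the volume that was never created) is what a script produces when it does not check exit codes. A minimal sketch of a wrapper that stops at the first failing step is shown below. It is not the script used in this report. The `run_steps` helper, the volume name `iso`, and the brick paths are hypothetical.

```python
import subprocess


def run_steps(steps, run=subprocess.run):
    """Run each command in order and stop at the first non-zero exit code.

    steps: list of argv lists, e.g. ["gluster", "volume", "start", "iso"].
    run: callable returning an object with a `returncode` attribute
         (defaults to subprocess.run; injectable for dry runs).
    Returns (executed_commands, failed_command_or_None).
    """
    executed = []
    for cmd in steps:
        executed.append(cmd)
        if run(cmd).returncode != 0:
            # Later steps depend on this one, so do not continue.
            return executed, cmd
    return executed, None


# Hypothetical usage: "start" is never attempted if "create" fails.
STEPS = [
    ["gluster", "volume", "create", "iso", "replica", "2",
     "node1:/bricks/iso", "node2:/bricks/iso"],
    ["gluster", "volume", "start", "iso"],
]
```

Calling `run_steps(STEPS)` on a cluster node would then leave the volume unstarted, and report the create command as the failed step, if the brick hostnames are rejected.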

Comment 3 Atin Mukherjee 2016-01-20 05:18:25 UTC

Could you also attach the core file? Without it we cannot analyse the reason for the crash.

Comment 4 Atin Mukherjee 2017-01-23 05:21:19 UTC

Given that we have not received sufficient information (especially the core file), I am closing this bug now. Please reopen it if the issue persists.
