Bug 1325768 - glusterd crashed while stopping the volumes
Summary: glusterd crashed while stopping the volumes
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: glusterd
Version: rhgs-3.1
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Atin Mukherjee
QA Contact: Byreddy
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2016-04-11 07:22 UTC by Byreddy
Modified: 2016-09-17 16:44 UTC
4 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-07-19 10:34:50 UTC
Target Upstream Version:


Attachments

Description Byreddy 2016-04-11 07:22:00 UTC
Description of problem:
=======================
glusterd crashed while stopping volumes when one of the volume's bricks was down due to an underlying XFS crash.


Version-Release number of selected component (if applicable):
=============================================================
glusterfs-3.7.9-1


How reproducible:
=================
Seen only once.


Steps to Reproduce:
===================
1. Create a 1*2 volume on a one-node cluster and start it.
2. Bring down one of the volume's bricks using the godown tool.
3. Create a new volume using bricks that are not part of the volume created in step 1.
4. Try to stop the volumes created in step 1 and step 3.
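The steps above, sketched as gluster CLI commands. Volume names, hostnames, and brick paths are illustrative, and running this requires a live gluster node; godown is the XFS filesystem-shutdown utility shipped with xfstests:

```shell
# Step 1: 1*2 replicated volume on a single node, then start it
# (force is needed because both bricks live on the same node)
gluster volume create vol1 replica 2 node1:/bricks/b1 node1:/bricks/b2 force
gluster volume start vol1

# Step 2: shut down the XFS filesystem backing one brick
./godown /bricks/b1

# Step 3: a second volume from a brick not used by vol1
gluster volume create vol2 node1:/bricks/b3 force
gluster volume start vol2

# Step 4: stop both volumes; glusterd crashed here once
gluster volume stop vol1
gluster volume stop vol2
```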


Actual results:
===============
glusterd crash happened.


Expected results:
=================
glusterd should not crash.


Additional info:
=================
will attach the core file.

Comment 3 Atin Mukherjee 2016-04-12 03:36:53 UTC
Looks like corruption in the RPC request, which started with the following log indication:

[2016-04-11 03:57:07.471752] C [mem-pool.c:571:mem_put] (-->/usr/lib64/glusterfs/3.7.9/xlator/mgmt/glusterd.so(gd_syncop_mgmt_brick_op+0x1b7) [0x7f094fed2257] -->/usr/lib64/glusterfs/3.7.9/xlator/mgmt/glusterd.so(gd_syncop_submit_request+0x2f4) [0x7f094fed0ce4] -->/lib64/libglusterfs.so.0(mem_put+0xb8) [0x7f095b30ba38] ) 0-mem-pool: mem_put called on freed ptr 0x7f0958c37718 of mem pool 0x7f095d54a960

The backtrace indicates that the request object had already been freed and put back into the mem pool, hence the crash.

We tried to reproduce the same scenario but didn't observe any crash. Given that it happened only once, lowering the priority and severity.

Comment 4 Byreddy 2016-04-12 08:39:24 UTC
core dump of this crash is placed in the location "http://rhsqe-repo.lab.eng.blr.redhat.com/coredumps/1325768/"

Comment 5 Atin Mukherjee 2016-06-09 04:45:04 UTC
Byreddy,

FWIW, can you please retry this scenario and see if it's reproducible? Otherwise we can close this bug and reopen it if we hit it again.

~Atin

Comment 7 Byreddy 2016-07-19 10:34:50 UTC
(In reply to Atin Mukherjee from comment #5)
> Byreddy,
> 
> FWIW, can you please retry this scenario and see if it's reproducible?
> Otherwise we can close this bug and reopen it if we hit it again.
> 
> ~Atin

Verified this bug using the 3.1.3 bits; the issue didn't reproduce.

So closing as working in the current release.

