Bug 1325768

Summary: glusterd crashed while stopping the volumes
Product: Red Hat Gluster Storage
Reporter: Byreddy <bsrirama>
Component: glusterd
Assignee: Atin Mukherjee <amukherj>
Status: CLOSED CURRENTRELEASE
QA Contact: Byreddy <bsrirama>
Severity: high
Docs Contact:
Priority: unspecified
Version: rhgs-3.1
CC: bsrirama, rhs-bugs, storage-qa-internal, vbellur
Target Milestone: ---
Keywords: ZStream
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-07-19 10:34:50 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:

Description Byreddy 2016-04-11 07:22:00 UTC
Description of problem:
=======================
glusterd crashed while trying to stop volumes when one of the volume's bricks was down due to an underlying XFS crash.


Version-Release number of selected component (if applicable):
=============================================================
glusterfs-3.7.9-1


How reproducible:
=================
Observed only once.


Steps to Reproduce:
===================
1. Create a 1x2 (replica 2) volume on a one-node cluster and start it.
2. Bring down one of the volume's bricks using the godown tool.
3. Create a new volume using bricks that are not part of the volume created in step 1.
4. Try to stop the volumes created in steps 1 and 3.


Actual results:
===============
glusterd crashed.


Expected results:
=================
glusterd should not crash.


Additional info:
=================
Will attach the core file.

Comment 3 Atin Mukherjee 2016-04-12 03:36:53 UTC
Looks like corruption of the RPC request, which started with the following log indication:

[2016-04-11 03:57:07.471752] C [mem-pool.c:571:mem_put] (-->/usr/lib64/glusterfs/3.7.9/xlator/mgmt/glusterd.so(gd_syncop_mgmt_brick_op+0x1b7) [0x7f094fed2257] -->/usr/lib64/glusterfs/3.7.9/xlator/mgmt/glusterd.so(gd_syncop_submit_request+0x2f4) [0x7f094fed0ce4] -->/lib64/libglusterfs.so.0(mem_put+0xb8) [0x7f095b30ba38] ) 0-mem-pool: mem_put called on freed ptr 0x7f0958c37718 of mem pool 0x7f095d54a960

The backtrace indicates that the request object had already been freed and put back into the mem pool, hence the crash.

We tried to reproduce the same scenario but didn't observe any crash. Given that it happened only once, I am lowering the priority and severity.

Comment 4 Byreddy 2016-04-12 08:39:24 UTC
The core dump of this crash is available at "http://rhsqe-repo.lab.eng.blr.redhat.com/coredumps/1325768/"

Comment 5 Atin Mukherjee 2016-06-09 04:45:04 UTC
Byreddy,

FWIW, can you please retry this scenario and see if it's reproducible? Otherwise we can close this bug and reopen it if we hit it again.

~Atin

Comment 7 Byreddy 2016-07-19 10:34:50 UTC
(In reply to Atin Mukherjee from comment #5)
> Byreddy,
> 
> FWIW, can you please retry this scenario and see if it's reproducible?
> Otherwise we can close this bug and reopen it if we hit it again.
> 
> ~Atin

Verified this bug using the 3.1.3 bits; the issue did not reproduce.

Closing as working in the current release.