Bug 852307

Summary: abort after an interrupted replace-brick operation causes glusterd to hang
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: Vidya Sakar <vinaraya>
Component: glusterd
Assignee: krishnan parthasarathi <kparthas>
Status: CLOSED DEFERRED
QA Contact: Sudhir D <sdharane>
Severity: unspecified
Docs Contact:
Priority: medium
Version: 2.0
CC: amarts, gluster-bugs, nsathyan, rabhat, rfortier, rgowdapp, rhs-bugs, vbellur
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 816915
Environment:
Last Closed: 2012-10-05 17:20:46 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 816915
Bug Blocks:

Description Vidya Sakar 2012-08-28 07:20:09 UTC
+++ This bug was initially created as a clone of Bug #816915 +++

Description of problem:

If the source brick is killed while a replace-brick operation is in progress, a subsequent replace-brick abort causes glusterd to hang. Although glusterd appears to be in interruptible sleep ('S' state in ps output), neither gdb nor strace can be attached to the glusterd process, and other gluster CLI commands fail as well. However, attaching strace to glusterd before the abort was attempted showed glusterd hung in the lsetxattr syscall. A statedump of the client (a maintenance mount) and of the source brick revealed that the setxattr call was stuck in the pump translator.
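
The 'S' state mentioned above is the one-letter process state that ps reports. As a minimal sketch, assuming a Linux host with /proc mounted, the same field can be read from /proc/<pid>/stat without attaching a debugger. The helper names `parse_proc_state` and `read_state` are hypothetical and not part of any gluster tooling.

```python
def parse_proc_state(stat_line: str) -> str:
    """Extract the one-letter process state from a /proc/<pid>/stat line."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so split on the LAST ')' rather than on whitespace.
    after_comm = stat_line.rsplit(")", 1)[1]
    # The first field after the command name is the state (e.g. S, D, R).
    return after_comm.split()[0]


def read_state(pid: int) -> str:
    """Read the current state of a live process (Linux only)."""
    with open(f"/proc/{pid}/stat") as f:
        return parse_proc_state(f.read())
```

Calling `read_state` with the glusterd PID would return the same letter that ps shows in its state column.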

Code analysis with KP pointed to the cause: the crawl operation is not started after the brick is restarted.

Version-Release number of selected component (if applicable):

8b6534031ab9b60da293e9c2ffb95141d714f973

How reproducible: Consistently



--- Additional comment from kparthas on 2012-05-03 03:17:49 EDT ---

*** Bug 787123 has been marked as a duplicate of this bug. ***

--- Additional comment from amarts on 2012-07-11 02:23:08 EDT ---

patch sent @ http://review.gluster.com/3264

--- Additional comment from kparthas on 2012-07-11 03:11:18 EDT ---

*** Bug 818519 has been marked as a duplicate of this bug. ***

--- Additional comment from kparthas on 2012-07-11 03:19:27 EDT ---

*** Bug 797729 has been marked as a duplicate of this bug. ***

Comment 2 Amar Tumballi 2012-10-05 17:20:46 UTC
replace-brick functionality can be achieved today with 'add-brick + remove-brick', so we are not planning to work on this.
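
The 'add-brick + remove-brick' sequence suggested above could look roughly like the following. This is a sketch that only builds the gluster CLI invocations in order, assuming a plain distribute volume; replicated volumes need extra care with replica counts. The volume and brick names are placeholders, and the helper `replace_brick_via_add_remove` is hypothetical.

```python
from typing import List


def replace_brick_via_add_remove(volume: str, old_brick: str,
                                 new_brick: str) -> List[List[str]]:
    """Return, in order, the gluster CLI commands that move data off old_brick."""
    base = ["gluster", "volume"]
    return [
        # 1. Add the new brick so the volume has somewhere to migrate data.
        base + ["add-brick", volume, new_brick],
        # 2. Start draining the old brick; this triggers data migration.
        base + ["remove-brick", volume, old_brick, "start"],
        # 3. Check migration progress; repeat until it reports completion.
        base + ["remove-brick", volume, old_brick, "status"],
        # 4. Commit the removal only after migration has completed.
        base + ["remove-brick", volume, old_brick, "commit"],
    ]


if __name__ == "__main__":
    # Placeholder names, for illustration only.
    for argv in replace_brick_via_add_remove(
            "testvol", "server1:/bricks/old", "server2:/bricks/new"):
        print(" ".join(argv))
```

The sketch prints the commands rather than running them, so the status step can be repeated by hand before the final commit.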