Bug 1564490 - [Remove-brick] Few files are not migrated on the decommissioned bricks when bricks are brought down while remove-brick is in-progress
Summary: [Remove-brick] Few files are not migrated on the decommissioned bricks when bricks are brought down while remove-brick is in-progress
Keywords:
Status: CLOSED UPSTREAM
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: replicate
Version: rhgs-3.4
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Ravishankar N
QA Contact: Nag Pavan Chilakam
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2018-04-06 12:36 UTC by Prasad Desala
Modified: 2018-11-23 08:48 UTC (History)
CC List: 9 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-11-23 08:48:41 UTC
Embargoed:



Description Prasad Desala 2018-04-06 12:36:45 UTC
Description of problem:
=======================
On an 11x3 volume, rebalance on a node failed when a replica pair brick was brought down while remove-brick was in progress.


Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Comment 2 Prasad Desala 2018-04-06 13:10:13 UTC
Hit enter by mistake while filing bug. Please see the complete details of the bug below.

Description of problem:
=======================
Few files are not migrated on the decommissioned bricks when bricks are brought down while remove-brick is in-progress.


Version-Release number of selected component (if applicable):
3.12.2-7.el7rhgs.x86_64

How reproducible:
1/1

Steps to Reproduce:
===================
1) Create an x3 (replica 3 distributed-replicate) volume and start it.
2) FUSE mount on multiple clients and start a Linux kernel untar and lookups from the clients.
3) Start removing a few bricks.
4) While remove-brick is in progress, kill a brick from a replica pair. As brick mux is enabled, killing a single brick on the server using kill -9 takes down all the bricks on that node.
5) Wait till the rebalance completes on the nodes.
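
A minimal shell sketch of these steps, assuming a hypothetical 2x3 volume named
testvol on servers server1-3 with bricks under /bricks (the run reported here used
an 11x3 volume; adjust names and counts to match the actual setup):

# enable brick multiplexing cluster-wide, then create and start the volume
gluster volume set all cluster.brick-multiplex on
gluster volume create testvol replica 3 \
    server1:/bricks/b1 server2:/bricks/b1 server3:/bricks/b1 \
    server1:/bricks/b2 server2:/bricks/b2 server3:/bricks/b2
gluster volume start testvol

# on each client: FUSE mount, then start the kernel untar plus lookups
mount -t glusterfs server1:/testvol /mnt/testvol
tar -C /mnt/testvol -xf linux-4.4.36.tar.xz &

# decommission one replica set; this starts migrating files off those bricks
gluster volume remove-brick testvol \
    server1:/bricks/b2 server2:/bricks/b2 server3:/bricks/b2 start

# while migration is in progress, kill one brick process; with brick mux enabled
# this takes down every brick hosted by that glusterfsd process on the node
gluster volume status testvol            # note a brick PID, e.g. on server2
kill -9 <brick-pid>

# wait for the migration to report completed before inspecting the bricks
gluster volume remove-brick testvol \
    server1:/bricks/b2 server2:/bricks/b2 server3:/bricks/b2 status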

Actual results:
===============
Few files are not migrated on the decommissioned bricks; commit results in data loss.
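
A sketch of one way to check this, assuming the same hypothetical brick paths as in
the steps above: before running commit, list the regular files still present on the
backend of a decommissioned brick; anything left there other than DHT link-to
pointers has not been migrated and would be lost on commit.

# on each server hosting a decommissioned brick, before "remove-brick ... commit"
find /bricks/b2 -path /bricks/b2/.glusterfs -prune -o -type f -print

# a sticky-bit, zero-byte file carrying this xattr is only a DHT link-to pointer,
# not unmigrated data (run as root; the xattr is in the trusted namespace)
getfattr -n trusted.glusterfs.dht.linkto -e text /bricks/b2/<path-to-file>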

Expected results:
================
Remove-brick operation should migrate all the files from the decommissioned brick.

Comment 6 Susant Kumar Palai 2018-04-27 10:07:13 UTC
From rebalance log:
[2018-04-06 11:52:54.484298] I [MSGID: 0] [dht-rebalance.c:3732:gf_defrag_fix_layout] 0-nithya: entry->name = vexpress-scc.txt
[2018-04-06 11:52:54.484568] W [MSGID: 114061] [client-common.c:1197:client_pre_readdirp] 0-pingx3-client-30:  (6072b1ff-c676-4c76-993c-cfad73e0a4f5) remote_fd is -1. EBADFD [File descriptor in bad state]
[2018-04-06 11:52:54.485145] E [MSGID: 109058] [dht-rebalance.c:3715:gf_defrag_fix_layout] 0-pingx3-dht: readdirp failed for path /linux-4.4.36/Documentation/devicetree/bindings/arm. Aborting fix-layout [File descriptor in bad state]


This is a known issue. The brick that was brought down was serving the readdirp request, and such requests are generally not transferred to other AFR children as there will be an offset mismatch between bricks.

Moving this to AFR component for clarification on the same. Please move this back to DHT if you feel otherwise.


- Susant
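
For anyone triaging a similar run, a hedged sketch of how to spot this failure on
the node that performed the migration, assuming the usual
/var/log/glusterfs/<volname>-rebalance.log location and the hypothetical volume
name used earlier:

grep -E "remote_fd is -1|readdirp failed" /var/log/glusterfs/testvol-rebalance.log

# remove-brick status also reports per-node rebalance state and failure counts
gluster volume remove-brick testvol \
    server1:/bricks/b2 server2:/bricks/b2 server3:/bricks/b2 status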

Comment 7 Ravishankar N 2018-04-27 10:27:12 UTC
(In reply to Susant Kumar Palai from comment #6)
> From rebalance log:
> [2018-04-06 11:52:54.484298] I [MSGID: 0]
> [dht-rebalance.c:3732:gf_defrag_fix_layout] 0-nithya: entry->name =
> vexpress-scc.txt
> [2018-04-06 11:52:54.484568] W [MSGID: 114061]
> [client-common.c:1197:client_pre_readdirp] 0-pingx3-client-30: 
> (6072b1ff-c676-4c76-993c-cfad73e0a4f5) remote_fd is -1. EBADFD [File
> descriptor in bad state]
> [2018-04-06 11:52:54.485145] E [MSGID: 109058]
> [dht-rebalance.c:3715:gf_defrag_fix_layout] 0-pingx3-dht: readdirp failed
> for path /linux-4.4.36/Documentation/devicetree/bindings/arm. Aborting
> fix-layout [File descriptor in bad state]
> 
> 
> This is a known issue. The brick that was brought down was serving the
> readdirp request, and such requests are generally not transferred to other
> AFR children as there will be an offset mismatch between bricks.
>
> Moving this to AFR component for clarification on the same. Please move this
> back to DHT if you feel otherwise.
> 

Yes, AFR has failover for readdirs only if the failure is at offset 0. If a readdir cbk fails in the middle, it cannot be re-tried on a different brick. Atin, is it okay to close this as WONTFIX if we feel that is the appropriate thing to do?

