Bug 1060676

Summary:

[add-brick]: I/O on NFS fails when bricks are added to a distribute-replicate volume

Product:

[Red Hat Storage] Red Hat Gluster Storage

Reporter:

Sachidananda Urs <surs>

Component:

distribute

Assignee:

Nithya Balachandran <nbalacha>

Status:

CLOSED ERRATA

QA Contact:

storage-qa-internal <storage-qa-internal>

Severity:

high

Docs Contact:

Priority:

high

Version:

rhgs-3.0

CC:

annair, byarlaga, nbalacha, nlevinki, rgowdapp, rmekala, sankarshan, smohan, srangana, vbellur

Target Milestone:

---

Keywords:

ZStream

Target Release:

RHGS 3.1.2

Hardware:

x86_64

OS:

Linux

Whiteboard:

dht-add-brick

Fixed In Version:

glusterfs-3.7.5-6

Doc Type:

Bug Fix

Doc Text:

Story Points:

---

Clone Of:

Environment:

Last Closed:

2016-03-01 05:22:22 UTC

Type:

Bug

Regression:

---

Mount Type:

---

Documentation:

---

CRM:

Verified Versions:

Category:

---

oVirt Team:

---

RHEL 7.3 requirements from Atomic Host:

Cloudforms Team:

---

Target Upstream Version:

Embargoed:

Bug Depends On:

Bug Blocks:

1260783

Attachments:

Description	Flags
NFS client logs	none

Description Sachidananda Urs 2014-02-03 10:40:18 UTC

Created attachment 858495
NFS client logs

Description of problem:

When new bricks are added to a distribute-replicate volume, I/O on the NFS mount fails with the following error messages:

dd: opening `dir.94/file.100': No such file or directory
mkdir: cannot create directory `dir.95': Invalid argument
dd: opening `dir.95/file.1': No such file or directory

Attaching NFS logs. No errors are reported in the glusterd logs.
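
For anyone re-running this, here is a minimal sketch of the kind of client workload that produces paths like the ones in the errors above (dir.N/file.M). The original test used mkdir and dd; the function name, the counts, and the 4 KiB write size here are assumptions made for illustration. It records each failure with its errno instead of stopping, so an ENOENT or EINVAL that appears during add-brick is easy to spot.

```python
import os
import tempfile


def run_workload(root, n_dirs=3, n_files=2, block=b"\0" * 4096):
    """Create dir.<i>/file.<j> under root; return a list of (op, path, errno) failures."""
    failures = []
    for i in range(1, n_dirs + 1):
        dir_path = os.path.join(root, f"dir.{i}")
        try:
            os.mkdir(dir_path)
        except OSError as exc:
            # e.g. EINVAL, as in "mkdir: cannot create directory ... Invalid argument"
            failures.append(("mkdir", dir_path, exc.errno))
        for j in range(1, n_files + 1):
            file_path = os.path.join(dir_path, f"file.{j}")
            try:
                with open(file_path, "wb") as fh:
                    fh.write(block)
            except OSError as exc:
                # e.g. ENOENT, as in "dd: opening ... No such file or directory"
                failures.append(("open", file_path, exc.errno))
    return failures


if __name__ == "__main__":
    # Demo against a local temporary directory; point root at the NFS mount
    # to exercise the volume instead.
    with tempfile.TemporaryDirectory() as root:
        print(run_workload(root))
```

On a healthy mount the returned list should be empty.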

Version-Release number of selected component (if applicable):

glusterfs 3.4.0.58rhs built on Jan 25 2014 07:04:06

How reproducible:
Always.

Steps to Reproduce:
1. Create a 2x2 volume and run some I/O on an NFS mount.
2. Peer probe two more machines.
3. Add bricks to the volume with add-brick.

Actual results:

I/O fails on the mount.
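
The steps above can be sketched as a sequence of gluster CLI invocations. This is a dry-run helper that only builds and prints the commands; it does not run them. The volume name, host names, and brick path are hypothetical, and it assumes the first four servers are already in the trusted pool.

```python
import shlex


def reproduction_commands(volume, servers, new_servers, brick_path, replica=2):
    """Build the gluster commands for: create a volume, probe new peers, add bricks."""
    bricks = [f"{host}:{brick_path}" for host in servers]
    new_bricks = [f"{host}:{brick_path}" for host in new_servers]
    commands = [
        # Four bricks with replica 2 give a 2x2 distribute-replicate volume.
        ["gluster", "volume", "create", volume, "replica", str(replica), *bricks],
        ["gluster", "volume", "start", volume],
    ]
    # Step 2: probe the additional machines into the trusted pool.
    commands += [["gluster", "peer", "probe", host] for host in new_servers]
    # Step 3: add one brick per new machine (one more replica pair).
    commands.append(["gluster", "volume", "add-brick", volume, *new_bricks])
    return commands


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    for cmd in reproduction_commands(
        "testvol",
        ["server1", "server2", "server3", "server4"],
        ["server5", "server6"],
        "/bricks/b1",
    ):
        print(shlex.join(cmd))
```

The client I/O needs to be running on the NFS mount while the add-brick command executes for the failure to show up.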

Expected results:

add-brick should be seamless.

Comment 2 santosh pradhan 2014-06-24 08:28:40 UTC

The NFS protocol has no relation to brick operations such as adding or removing bricks, or rebalance. The DHT or AFR team, most likely DHT, needs to look at why the NFS fop fails.

Comment 3 Shyamsundar 2014-08-13 13:53:41 UTC

From the logs, I see errors in the dht_access function, which in certain cases treated directories as files when the cluster is expanded (i.e., bricks are added).

This is being fixed as part of bug #1125824.

Once it is fixed there and downstream, I would like to repeat this test case to ensure that the problem is no longer present.

Comment 4 Raghavendra G 2015-11-10 05:16:33 UTC

A duplicate of:
https://bugzilla.redhat.com/show_bug.cgi?id=1278399

Fixed by:
https://code.engineering.redhat.com/gerrit/#/c/61036/2

With Patch #61036 and fixes to dht-access, this issue should be fixed.

Comment 6 RajeshReddy 2015-11-23 07:40:32 UTC

Tested with build glusterfs-server-3.7.5-6. Created a 2x2 volume, mounted it on the client using NFS, created 200 levels of nested directories, and cd'ed to the leaf directory (../dir199/dir200). Then added two new bricks to the volume. While rebalance was in progress, ls and mkdir ran successfully from the client, so marking this bug as verified.
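
A rough sketch of the client side of this verification, for anyone repeating it. The function names and the "probe" directory name are assumptions, and it works with full paths rather than cd'ing into the leaf as the test above did, so it approximates the check rather than reproducing it exactly.

```python
import os
import tempfile


def make_deep_tree(root, depth=200):
    """Create root/dir1/dir2/.../dir<depth> and return the leaf path."""
    path = root
    for level in range(1, depth + 1):
        path = os.path.join(path, f"dir{level}")
        os.mkdir(path)
    return path


def check_leaf(leaf, name="probe"):
    """List the leaf, create a subdirectory in it, and list it again (ls + mkdir)."""
    before = os.listdir(leaf)
    os.mkdir(os.path.join(leaf, name))
    return before, os.listdir(leaf)


if __name__ == "__main__":
    # Shallow local demo; use the NFS mount as root and depth=200 for the real check.
    with tempfile.TemporaryDirectory() as root:
        leaf = make_deep_tree(root, depth=5)
        print(check_leaf(leaf))
```

To mirror the verification, build the tree first, start add-brick and rebalance on the servers, and call check_leaf on the leaf while rebalance is still running. Any OSError raised there would indicate the problem is still present.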

Comment 8 errata-xmlrpc 2016-03-01 05:22:22 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-0193.html