Bug 1176062

Summary: Forced replace-brick causes an ongoing write (using dd) to return Input/output error
Product: [Community] GlusterFS
Component: disperse
Version: mainline
Status: CLOSED CURRENTRELEASE
Severity: unspecified
Priority: unspecified
Hardware: Unspecified
OS: Unspecified
Reporter: jiademing.dd <iesool>
Assignee: Xavi Hernandez <jahernan>
CC: bugs, gluster-bugs, jahernan, lidi, pkarampu
Keywords: Reopened, Triaged
Fixed In Version: glusterfs-3.8rc2
Doc Type: Bug Fix
Clones: 1183716, 1220011
Bug Blocks: 1183716, 1220011
Last Closed: 2016-06-16 12:41:05 UTC
Type: Bug

Description jiademing.dd 2014-12-19 10:28:44 UTC
Description of problem:
    I ran mkdir -p /mountpoint/a/b/c and then started dd if=/dev/zero of=/mountpoint/a/b/c/test.bak bs=1M. While dd was still writing, I ran replace-brick with commit force. The replace-brick succeeded, but the write returned Input/output error.

Version-Release number of selected component (if applicable):
 glusterfs-master or glusterfs-3.6.2beta1

How reproducible:


Steps to Reproduce:
1. Create a disperse volume with 3 bricks and redundancy 1:

Volume Name: test
Type: Disperse
Volume ID: bfdbfc8e-3dcc-4459-a1e4-9de17df03db5
Status: Started
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: node-1:/sda/
Brick2: node-1:/sdb/
Brick3: node-1:/sdc/
Options Reconfigured:
features.quota: on
performance.high-prio-threads: 64
performance.low-prio-threads: 64
performance.least-prio-threads: 64
performance.normal-prio-threads: 64
performance.io-thread-count: 64
server.allow-insecure: on
features.lock-heal: on
network.ping-timeout: 5
performance.client-io-threads: enable

2. mkdir -p /mountpoint/a/b/c

3. dd if=/dev/zero of=/mountpoint/a/b/c/test.bak bs=1M

4. While dd is still running: gluster volume replace-brick test node-1:/sda node-1:/sdd commit force
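
For anyone trying to reproduce this, steps 2 to 4 above can be sketched as a small script generator. This is only a sketch: the function name repro_plan and the 5-second pause are illustrative assumptions and not part of the report, and the volume is assumed to be already created, started and mounted. The function only prints the commands, so it is safe to run anywhere.

```shell
#!/bin/sh
# Prints the reproduction steps as a shell script (nothing is executed here).
# Assumptions, not from the report: the 5-second pause before replace-brick,
# and that the volume is already created, started and mounted at $2.
repro_plan() {
    vol=$1
    mnt=$2
    old_brick=$3
    new_brick=$4
    cat <<EOF
# Step 2: create the nested directory.
mkdir -p $mnt/a/b/c
# Step 3: start a continuous write in the background (it runs until it
# fails or the volume fills up).
dd if=/dev/zero of=$mnt/a/b/c/test.bak bs=1M &
# Assumed pause so that dd is actively writing when the brick is replaced.
sleep 5
# Step 4: replace the brick while dd is still writing.
gluster volume replace-brick $vol $old_brick $new_brick commit force
# Wait for dd to exit; in this report it stops with "Input/output error".
wait
EOF
}

repro_plan test /mountpoint node-1:/sda node-1:/sdd
```

On a disposable test setup the printed plan can be piped into sh (repro_plan test /mountpoint node-1:/sda node-1:/sdd | sh). Review the output first, because replace-brick with commit force changes the volume immediately.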

Actual results:

replace-brick succeeds, but the dd write returns Input/output error.

Expected results:

replace-brick succeeds and the ongoing write continues without errors.

Additional info:

Comment 1 jiademing.dd 2014-12-19 10:30:26 UTC
I tested a continuous read as well, and it has the same problem (glusterfs-master or glusterfs-release-3.6.2beta1).

Comment 2 Anand Avati 2015-01-07 11:51:33 UTC
REVIEW: http://review.gluster.org/9407 (ec: Fix failures with missing files) posted (#1) for review on master by Xavier Hernandez (xhernandez)

Comment 3 jiademing.dd 2015-01-09 08:08:45 UTC
(In reply to Anand Avati from comment #2)
> REVIEW: http://review.gluster.org/9407 (ec: Fix failures with missing files)
> posted (#1) for review on master by Xavier Hernandez (xhernandez)

I tested this patch. After a forced replace-brick the continuous write keeps working, but ls /mountpoint occasionally returns Input/output error. Once I stop the dd write, ls /mountpoint works again.
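
To put a number on "occasionally", the ls can be repeated while dd is running and the failures counted. A minimal sketch, assuming a POSIX shell; the function name count_ls_failures and the try count of 20 are illustrative and not from the original test.

```shell
#!/bin/sh
# Run "ls <dir>" <tries> times and print how many of the runs failed.
# Any non-zero exit status of ls counts as a failure, so this does not
# tell Input/output error apart from other errors.
count_ls_failures() {
    dir=$1
    tries=$2
    failures=0
    i=0
    while [ "$i" -lt "$tries" ]; do
        ls "$dir" >/dev/null 2>&1 || failures=$((failures + 1))
        i=$((i + 1))
    done
    echo "$failures"
}

# Example: list the mount point 20 times (the path is an assumption).
count_ls_failures "${MOUNTPOINT:-/mountpoint}" 20
```

Running it once during the dd write and once after dd has stopped gives two counts that can be compared directly.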

Comment 4 jiademing.dd 2015-01-09 09:36:54 UTC
(In reply to jiademing from comment #3)


Error logs:

[2015-01-09 17:30:04.058135] E [ec-helpers.c:410:ec_loc_setup_path] 3-test-disperse-0: Invalid path '<gfid:060bd8ef-6e58-4fcd-ac21-2c0e85b70e54>' in loc
[2015-01-09 17:30:04.058165] I [dht-layout.c:663:dht_layout_normalize] 3-test-dht: Found anomalies in <gfid:060bd8ef-6e58-4fcd-ac21-2c0e85b70e54> (gfid = 060bd8ef-6e58-4fcd-ac21-2c0e85b70e54). Holes=1 overlaps=0
[2015-01-09 17:30:04.058187] W [fuse-resolve.c:147:fuse_resolve_gfid_cbk] 0-fuse: 060bd8ef-6e58-4fcd-ac21-2c0e85b70e54: failed to resolve (Input/output error)
[2015-01-09 17:30:04.058201] E [fuse-bridge.c:808:fuse_getattr_resume] 0-digioceanfs-fuse: 47449: GETATTR 6883340 (060bd8ef-6e58-4fcd-ac21-2c0e85b70e54) resolution failed
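
To see at a glance which functions complain about which gfid, the log lines above can be reduced with sed. This is a rough sketch: the helper name summarize_gfid_errors is made up for illustration, and the line layout "[timestamp] LEVEL [file:line:function] ..." is inferred only from the lines quoted in this comment.

```shell
#!/bin/sh
# For each error (E) or warning (W) line that mentions a gfid, print
# "<severity> <function> <gfid>". Reads the named files, or stdin if none.
# Lines with other severities (for example I) are skipped.
summarize_gfid_errors() {
    sed -nE 's/^\[[^]]+\] ([EW]) \[[^:]+:[0-9]+:([A-Za-z0-9_]+)\].*([0-9a-f]{8}(-[0-9a-f]{4}){3}-[0-9a-f]{12}).*/\1 \2 \3/p' "$@"
}

# Demo on two of the lines quoted above.
summarize_gfid_errors <<'EOF'
[2015-01-09 17:30:04.058135] E [ec-helpers.c:410:ec_loc_setup_path] 3-test-disperse-0: Invalid path '<gfid:060bd8ef-6e58-4fcd-ac21-2c0e85b70e54>' in loc
[2015-01-09 17:30:04.058187] W [fuse-resolve.c:147:fuse_resolve_gfid_cbk] 0-fuse: 060bd8ef-6e58-4fcd-ac21-2c0e85b70e54: failed to resolve (Input/output error)
EOF
```

Piping the result through sort | uniq -c gives a count per function and gfid, which helps when the client log is long.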

Comment 5 Xavi Hernandez 2015-01-09 10:59:53 UTC
(In reply to jiademing from comment #3)

I've tried to do an ls of <mountpoint>, <mountpoint>/a, <mountpoint>/a/b and <mountpoint>/a/b/c while the dd was running in background and replace brick had completed. I haven't seen any Input/Output error. However I've seen that 'ls' sometimes takes more time than expected to complete. I'll try to see why.

The error logs you show seem to come from a different version of ec (the source line numbers do not match the current code). I've tried it with current master with this patch added. What version are you testing?

Comment 6 jiademing.dd 2015-01-12 06:01:00 UTC
(In reply to Xavier Hernandez from comment #5)

Sorry, I had merged this patch manually. I have now tried master plus this patch, and it works fine.

Comment 7 Anand Avati 2015-05-03 11:38:51 UTC
REVIEW: http://review.gluster.org/9407 (ec: Fix failures with missing files) posted (#2) for review on master by Pranith Kumar Karampuri (pkarampu)

Comment 8 Anand Avati 2015-05-04 02:58:27 UTC
REVIEW: http://review.gluster.org/9407 (ec: Fix failures with missing files) posted (#3) for review on master by Pranith Kumar Karampuri (pkarampu)

Comment 9 Anand Avati 2015-05-04 04:24:07 UTC
REVIEW: http://review.gluster.org/9407 (ec: Fix failures with missing files) posted (#4) for review on master by Pranith Kumar Karampuri (pkarampu)

Comment 10 Anand Avati 2015-05-06 14:30:42 UTC
REVIEW: http://review.gluster.org/9407 (ec: Fix failures with missing files) posted (#5) for review on master by Xavier Hernandez (xhernandez)

Comment 11 Anand Avati 2015-05-06 16:51:05 UTC
REVIEW: http://review.gluster.org/9407 (ec: Fix failures with missing files) posted (#6) for review on master by Xavier Hernandez (xhernandez)

Comment 12 Anand Avati 2015-05-07 07:34:33 UTC
REVIEW: http://review.gluster.org/9407 (ec: Fix failures with missing files) posted (#7) for review on master by Xavier Hernandez (xhernandez)

Comment 13 Anand Avati 2015-05-07 10:51:44 UTC
REVIEW: http://review.gluster.org/9407 (ec: Fix failures with missing files) posted (#8) for review on master by Xavier Hernandez (xhernandez)

Comment 14 Anand Avati 2015-05-08 06:51:22 UTC
REVIEW: http://review.gluster.org/9407 (ec: Fix failures with missing files) posted (#9) for review on master by Vijay Bellur (vbellur)

Comment 15 Anand Avati 2015-05-09 05:07:21 UTC
REVIEW: http://review.gluster.org/9407 (ec: Fix failures with missing files) posted (#10) for review on master by Pranith Kumar Karampuri (pkarampu)

Comment 16 Anand Avati 2015-05-09 09:13:58 UTC
REVIEW: http://review.gluster.org/9407 (ec: Fix failures with missing files) posted (#11) for review on master by Pranith Kumar Karampuri (pkarampu)

Comment 17 Anand Avati 2015-05-09 09:15:25 UTC
REVIEW: http://review.gluster.org/9407 (ec: Fix failures with missing files) posted (#12) for review on master by Pranith Kumar Karampuri (pkarampu)

Comment 18 Anand Avati 2015-05-09 14:53:00 UTC
REVIEW: http://review.gluster.org/9407 (ec: Fix failures with missing files) posted (#13) for review on master by Pranith Kumar Karampuri (pkarampu)

Comment 19 Anand Avati 2015-05-09 17:03:40 UTC
REVIEW: http://review.gluster.org/9407 (ec: Fix failures with missing files) posted (#14) for review on master by Pranith Kumar Karampuri (pkarampu)

Comment 20 Nagaprasad Sathyanarayana 2015-10-25 14:53:50 UTC
Fix for this bug is already made in a GlusterFS release. The cloned BZ has details of the fix and the release. Hence closing this mainline BZ.

Comment 21 Nagaprasad Sathyanarayana 2015-10-25 15:03:09 UTC
Fix for this BZ is already present in a GlusterFS release. You can find clone of this BZ, fixed in a GlusterFS release and closed. Hence closing this mainline BZ as well.

Comment 22 Niels de Vos 2016-06-16 12:41:05 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.8.0, please open a new bug report.

glusterfs-3.8.0 has been announced on the Gluster mailing lists [1], and packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://blog.gluster.org/2016/06/glusterfs-3-8-released/
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user