Bug 1163330 - DHT+add bricks: Intermittent IO failures upon add-bricks.
Summary: DHT+add bricks: Intermittent IO failures upon add-bricks.
Keywords:
Status: CLOSED DEFERRED
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: distribute
Version: 2.1
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Bug Updates Notification Mailing List
QA Contact: storage-qa-internal@redhat.com
URL:
Whiteboard: dht-add-brick
Depends On:
Blocks: 1286064
Reported: 2014-11-12 14:13 UTC by Amit Chaurasia
Modified: 2015-11-27 10:29 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
: 1286064 (view as bug list)
Environment:
Last Closed: 2015-11-27 10:29:31 UTC
Embargoed:



Description Amit Chaurasia 2014-11-12 14:13:52 UTC
Description of problem:
I/O fails while bricks are being added to the volume.


Version-Release number of selected component (if applicable):
3.4.0.70rhs-1.el6rhs.x86_64


How reproducible:
Intermittent. It usually occurs while files are being created in subdirectories rather than at the root of the volume.

Steps to Reproduce:
1) Create a distribute volume with 3 bricks and start it. 
 

2) FUSE-mount the volume and create directories and files on the mount point:

mount -t glusterfs <hostname or ip-addr >:/<vol_name> /<mount point>


3) While file creation is in progress on the client, add 2 bricks to the volume:

gluster v add-brick <vol_name> <hostname or ip-addr >:/<brick1>  <hostname or ip-addr >:/<brick2> 
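
The file-creation load in step 2 can be sketched as below. This is not the original create_newf.py: only its write call (ff.write("data"*500000)) is visible in the traceback, so the function name create_files, the directory and file names, and the counts are all hypothetical. The sketch creates files inside subdirectories, since that is where the failure is reported to occur, and records the errno of every failed operation instead of stopping at the first one.

```python
import os


def create_files(root, n_dirs=3, files_per_dir=5, payload="data" * 500000):
    """Create files under subdirectories of root.

    Returns a list of (path, errno) tuples, one per failed mkdir or write.
    All names and counts here are illustrative assumptions.
    """
    failures = []
    for d in range(n_dirs):
        subdir = os.path.join(root, "dir%d" % d)
        try:
            os.makedirs(subdir, exist_ok=True)
        except OSError as exc:
            failures.append((subdir, exc.errno))
            continue
        for f in range(files_per_dir):
            path = os.path.join(subdir, "file%d" % f)
            try:
                # Same payload as the write call seen in the traceback.
                with open(path, "w") as ff:
                    ff.write(payload)
            except OSError as exc:
                failures.append((path, exc.errno))
    return failures
```

To use it for reproduction, call create_files() with root set to a directory on the FUSE mount and run the add-brick command from step 3 while it is still writing. An empty return value means no create or write failed during that run.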
 

Actual results:
Creation of new files on the volume fails, either with errno 22 (EINVAL, Invalid argument) or, as in the tar run further below, with "No such file or directory".

[root@dht-rhs-20 mnt1]# python create_newf.py 
Traceback (most recent call last):
  File "create_newf.py", line 5, in <module>
    ff.write("data"*500000)
IOError: [Errno 22] Invalid argument
close failed in file object destructor:
IOError: [Errno 22] Invalid argument

OR

tar: linux-3.17.2/Documentation/devicetree/bindings/clock/fixed-clock.txt: Cannot open: No such file or directory
linux-3.17.2/Documentation/devicetree/bindings/clock/fixed-factor-clock.txt
linux-3.17.2/Documentation/devicetree/bindings/clock/hix5hd2-clock.txt
tar: linux-3.17.2/Documentation/devicetree/bindings/clock/hix5hd2-clock.txt: Cannot open: No such file or directory



Expected results:
File creation should succeed without any errors.



Additional info:
vol info: see Comment 3.


log snippet


[root@dht-rhs-20 new]# tail -f /var/log/glusterfs/mnt3-.log 
[2014-11-12 19:22:12.271069] W [client-rpc-fops.c:1983:client3_3_setattr_cbk] 3-gv1-client-27: remote operation failed: No such file or directory
[2014-11-12 19:22:12.271336] W [client-rpc-fops.c:1983:client3_3_setattr_cbk] 3-gv1-client-30: remote operation failed: No such file or directory
[2014-11-12 19:22:12.275608] W [client-rpc-fops.c:1983:client3_3_setattr_cbk] 3-gv1-client-28: remote operation failed: No such file or directory
[2014-11-12 19:22:12.275669] W [client-rpc-fops.c:1983:client3_3_setattr_cbk] 3-gv1-client-29: remote operation failed: No such file or directory
[2014-11-12 19:22:12.275908] W [client-rpc-fops.c:1983:client3_3_setattr_cbk] 3-gv1-client-30: remote operation failed: No such file or directory
[2014-11-12 19:22:12.275986] W [client-rpc-fops.c:1983:client3_3_setattr_cbk] 3-gv1-client-27: remote operation failed: No such file or directory
[2014-11-12 19:22:12.293449] W [client-rpc-fops.c:322:client3_3_mkdir_cbk] 3-gv1-client-28: remote operation failed: No such file or directory. Path: <gfid:2c118acb-cfa3-480d-ad8c-08a6b130fc13>/linux-3.17.2/Documentation/devicetree/bindings/dma
[2014-11-12 19:22:12.293990] W [client-rpc-fops.c:322:client3_3_mkdir_cbk] 3-gv1-client-30: remote operation failed: No such file or directory. Path: <gfid:2c118acb-cfa3-480d-ad8c-08a6b130fc13>/linux-3.17.2/Documentation/devicetree/bindings/dma
[2014-11-12 19:22:12.294047] W [client-rpc-fops.c:322:client3_3_mkdir_cbk] 3-gv1-client-27: remote operation failed: No such file or directory. Path: <gfid:2c118acb-cfa3-480d-ad8c-08a6b130fc13>/linux-3.17.2/Documentation/devicetree/bindings/dma
[2014-11-12 19:22:12.294264] W [client-rpc-fops.c:322:client3_3_mkdir_cbk] 3-gv1-client-29: remote operation failed: No such file or directory. Path: <gfid:2c118acb-cfa3-480d-ad8c-08a6b130fc13>/linux-3.17.2/Documentation/devicetree/bindings/dma
^C
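
To see which client subvolumes and which operations are failing, the warnings above can be tallied with a short script. This is a rough sketch that assumes every line follows the layout shown in this snippet (timestamp, level, source location with callback name, client name, message). The names LINE_RE and summarize are made up for illustration and are not part of any GlusterFS tooling.

```python
import re
from collections import Counter

# Matches lines of the form seen in the snippet above:
# [timestamp] W [file.c:line:callback] client: remote operation failed: error[. Path: path]
LINE_RE = re.compile(
    r"^\[[^\]]+\] (?P<level>[A-Z]) "
    r"\[[^:\]]+:\d+:(?P<callback>\w+)\] "
    r"(?P<client>\S+): remote operation failed: "
    r"(?P<error>.+?)(?:\. Path: (?P<path>.*))?$"
)


def summarize(lines):
    """Count failures per (callback, client, error); non-matching lines are skipped."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m:
            counts[(m.group("callback"), m.group("client"), m.group("error"))] += 1
    return counts


# Three lines copied from the log snippet in this report.
SAMPLE = [
    "[2014-11-12 19:22:12.271069] W [client-rpc-fops.c:1983:client3_3_setattr_cbk] 3-gv1-client-27: remote operation failed: No such file or directory",
    "[2014-11-12 19:22:12.275608] W [client-rpc-fops.c:1983:client3_3_setattr_cbk] 3-gv1-client-28: remote operation failed: No such file or directory",
    "[2014-11-12 19:22:12.293449] W [client-rpc-fops.c:322:client3_3_mkdir_cbk] 3-gv1-client-28: remote operation failed: No such file or directory. Path: <gfid:2c118acb-cfa3-480d-ad8c-08a6b130fc13>/linux-3.17.2/Documentation/devicetree/bindings/dma",
]

if __name__ == "__main__":
    for (callback, client, error), n in sorted(summarize(SAMPLE).items()):
        print("%s %s %s: %d" % (callback, client, error, n))
```

Feeding it the full client log (for example, the lines of /var/log/glusterfs/mnt3-.log) instead of SAMPLE gives a per-client breakdown of the setattr and mkdir failures.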

Comment 3 Amit Chaurasia 2014-11-12 14:48:10 UTC
I missed adding the volume info above, so I am adding it here.


[root@dht-rhs-20 ~]# gluster volume info
 
Volume Name: gv0
Type: Distributed-Replicate
Volume ID: 164ca1be-edfe-4b1b-8c4a-841078582078
Status: Stopped
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: 10.70.47.99:/data/newbrick1/brk1
Brick2: 10.70.47.101:/data/newbrick1/brk1
Brick3: 10.70.47.99:/data/newbrick1/brk2
Brick4: 10.70.47.101:/data/newbrick1/brk2
 
Volume Name: gv1
Type: Distribute
Volume ID: 3571aa3f-2517-441f-92f1-fbe312e9c782
Status: Started
Number of Bricks: 31
Transport-type: tcp
Bricks:
Brick1: 10.70.47.99:/data/newbrick2/brk1
Brick2: 10.70.47.101:/data/newbrick2/brk1
Brick3: 10.70.47.101:/data/newbrick2/brk2
Brick4: 10.70.47.99:/data/newbrick2/brk2
Brick5: 10.70.47.99:/data/newbrick2/brk3
Brick6: 10.70.47.101:/data/newbrick2/brk3
Brick7: 10.70.47.99:/data/newbrick2/brk4
Brick8: 10.70.47.101:/data/newbrick2/brk4
Brick9: 10.70.47.99:/data/newbrick2/brk5
Brick10: 10.70.47.101:/data/newbrick2/brk5
Brick11: 10.70.47.99:/data/newbrick2/brk6
Brick12: 10.70.47.101:/data/newbrick2/brk6
Brick13: 10.70.47.99:/data/newbrick2/brk7
Brick14: 10.70.47.101:/data/newbrick2/brk7
Brick15: 10.70.47.99:/data/newbrick2/brk8
Brick16: 10.70.47.101:/data/newbrick2/brk10
Brick17: 10.70.47.99:/data/newbrick2/brk9
Brick18: 10.70.47.101:/data/newbrick2/brk9
Brick19: 10.70.47.99:/data/newbrick2/brk10
Brick20: 10.70.47.101:/data/newbrick2/brka
Brick21: 10.70.47.99:/data/newbrick2/brka
Brick22: 10.70.47.101:/data/newbrick2/brkb
Brick23: 10.70.47.99:/data/newbrick2/brkb
Brick24: 10.70.47.101:/data/newbrick2/brkc
Brick25: 10.70.47.99:/data/newbrick2/brkc
Brick26: 10.70.47.101:/data/newbrick2/brkd
Brick27: 10.70.47.99:/data/newbrick2/brkd
Brick28: 10.70.47.101:/data/newbrick2/brke
Brick29: 10.70.47.99:/data/newbrick2/brke
Brick30: 10.70.47.101:/data/newbrick2/brkf
Brick31: 10.70.47.99:/data/newbrick2/brkf
Options Reconfigured:
features.quota: on

Comment 5 Susant Kumar Palai 2015-11-27 10:29:31 UTC
Cloning this bug for 3.1. To be fixed in a future release.

