Bug 1395699 - getting Input/output error on doing deletes simultaneously from two clients
Summary: getting Input/output error on doing deletes simultaneously from two clients
Keywords:
Status: CLOSED DUPLICATE of bug 1395161
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: disperse
Version: rhgs-3.2
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: ---
Assignee: Ashish Pandey
QA Contact: Nag Pavan Chilakam
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2016-11-16 13:21 UTC by Nag Pavan Chilakam
Modified: 2016-11-17 09:46 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-11-17 08:47:53 UTC
Embargoed:



Description Nag Pavan Chilakam 2016-11-16 13:21:31 UTC
Description of problem:
=========================
On a 2x(4+2) EC volume, when I run rm -rf on the same directory from two different clients, I see:
rm: cannot remove ‘linux-4.8.8/Documentation/arm/SA1100’: Input/output error

I see this consistently.
For an easy reproducer, try deleting an untarred Linux kernel source directory (e.g. linux-4.8.8).
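
The reproducer can be approximated with a small script. This is only a sketch under stated assumptions: two threads on one machine stand in for the two clients, and the helper names (`remove_tree`, `concurrent_delete`) are made up for illustration. In the reported setup each deleter would run on a separate client against its own FUSE mount of the volume. Entries already removed by the other deleter (ENOENT) are ignored, much as rm -f does; any other error, such as EIO or ESTALE, is recorded.

```python
import errno
import os
import threading


def _attempt(op, path, errors):
    """Run op(path); ignore ENOENT, record any other OSError by name."""
    try:
        op(path)
    except FileNotFoundError:
        pass  # the other deleter removed this entry first
    except OSError as exc:
        errors.append((path, errno.errorcode.get(exc.errno, str(exc.errno))))


def remove_tree(root, errors):
    """Delete root bottom-up, tolerating entries that vanish underneath us."""
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for name in filenames:
            _attempt(os.unlink, os.path.join(dirpath, name), errors)
        for name in dirnames:
            full = os.path.join(dirpath, name)
            # os.walk lists symlinks to directories under dirnames
            _attempt(os.unlink if os.path.islink(full) else os.rmdir, full, errors)
    _attempt(os.rmdir, root, errors)


def concurrent_delete(path_a, path_b):
    """Delete the same tree through two paths at once; return recorded errors.

    path_a and path_b are assumed to name the same directory, ideally as
    seen through two different client mounts of the volume.
    """
    errors = []
    workers = [
        threading.Thread(target=remove_tree, args=(path, errors))
        for path in (path_a, path_b)
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return errors
```

A call such as `concurrent_delete("/mnt/client1/kern/linux-4.8.8", "/mnt/client2/kern/linux-4.8.8")` (hypothetical mount points) returns a list of `(path, errno name)` pairs, which makes it easy to see whether EIO shows up alongside the ESTALE warnings.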

Also, I see the following warnings in the FUSE client log:

ct] 0-erasure-disperse-0: Executing operation with some subvolumes unavailable (8)
[2016-11-16 13:19:38.635387] W [MSGID: 122033] [ec-common.c:1466:ec_locked] 0-erasure-disperse-1: Failed to complete preop lock [Stale file handle]
[2016-11-16 13:19:38.636952] W [MSGID: 122033] [ec-common.c:1466:ec_locked] 0-erasure-disperse-0: Failed to complete preop lock [Stale file handle]
[2016-11-16 13:19:38.637006] W [MSGID: 122035] [ec-common.c:419:ec_child_select] 0-erasure-disperse-0: Executing operation with some subvolumes unavailable (8)
[2016-11-16 13:19:38.642687] W [MSGID: 109065] [dht-common.c:7826:dht_rmdir_lock_cbk] 0-erasure-dht: acquiring inodelk failed rmdir for /kern/linux-4.8.8/Documentation/devicetree/bindings/iio/frequency) [Stale file handle]
[2016-11-16 13:19:38.642773] W [fuse-bridge.c:1355:fuse_unlink_cbk] 0-glusterfs-fuse: 372781: RMDIR() /kern/linux-4.8.8/Documentation/devicetree/bindings/iio/frequency => -1 (Stale file handle)
[2016-11-16 13:19:38.643925] W [MSGID: 122035] [ec-common.c:419:ec_child_select] 0-erasure-disperse-0: Executing operation with some subvolumes unavailable (8)
[2016-11-16 13:19:38.648189] W [MSGID: 122033] [ec-common.c:1466:ec_locked] 0-erasure-disperse-0: Failed to complete preop lock [Stale file handle]
[2016-11-16 13:19:38.648562] W [MSGID: 122033] [ec-common.c:1466:ec_locked] 0-erasure-disperse-1: Failed to complete preop lock [Stale file handle]
[2016-11-16 13:19:38.648628] W [fuse-bridge.c:989:fuse_fd_cbk] 0-glusterfs-fuse: 372784: OPENDIR() /kern/linux-4.8.8/Documentation/devicetree/bindings/iio/light => -1 (Stale file handle)
[2016-11-16 13:19:38.648871] W [MSGID: 122035] [ec-common.c:419:ec_child_select] 0-erasure-disperse-0: Executing operation with some subvolumes unavailable (8)
[2016-11-16 13:19:38.651638] W [MSGID: 122033] [ec-common.c:1466:ec_locked] 0-erasure-disperse-0: Failed to complete preop lock [Stale file handle]
[2016-11-16 13:19:38.653222] W [MSGID: 122033] [ec-common.c:1466:ec_locked] 0-erasure-disperse-1: Failed to complete preop lock [Stale file handle]
[2016-11-16 13:19:38.653301] W [fuse-bridge.c:989:fuse_fd_cbk] 0-glusterfs-fuse: 372785: OPENDIR() /kern/linux-4.8.8/Documentation/devicetree/bindings/iio/light => -1 (Stale file handle)
[2016-11-16 13:19:38.653685] W [MSGID: 122035] [ec-common.c:419:ec_child_select] 0-erasure-disperse-0: Executing operation with some subvolumes unavailable (8)
[2016-11-16 13:19:38.657179] W [MSGID: 122033] [ec-common.c:1466:ec_locked] 0-erasure-disperse-0: Failed to complete preop lock [Stale file handle]
[2016-11-16 13:19:38.662996] W [MSGID: 122033] [ec-common.c:1466:ec_locked] 0-erasure-disperse-1: Failed to complete preop lock [Stale file handle]
[2016-11-16 13:19:38.663091] W [MSGID: 122035] [ec-common.c:419:ec_child_select] 0-erasure-disperse-0: Executing operation with some subvolumes unavailable (8)
[2016-11-16 13:19:38.666281] W [MSGID: 109065] [dht-common.c:7826:dht_rmdir_lock_cbk] 0-erasure-dht: acquiring inodelk failed rmdir for /kern/linux-4.8.8/Documentation/devicetree/bindings/iio/light) [Stale file handle]
[2016-11-16 13:19:38.666377] W [fuse-bridge.c:1355:fuse_unlink_cbk] 0-glusterfs-fuse: 372787: RMDIR() /kern/linux-4.8.8/Documentation/devicetree/bindings/iio/light => -1 (Stale file handle)
[2016-11-16 13:19:38.666911] W [MSGID: 122035] [ec-common.c:419:ec_child_select] 0-erasure-disperse-0: Executing operation with some subvolumes unavailable (8)
[2016-11-16 13:19:38.670483] W [MSGID: 122033] [ec-common.c:1466:ec_locked] 0-erasure-disperse-0: Failed to complete preop lock [Stale file handle]
[2016-11-16 13:19:38.670495] W [MSGID: 122033] [ec-common.c:1466:ec_locked] 0-erasure-disperse-1: Failed to complete preop lock [Stale file handle]
[2016-11-16 13:19:38.670595] W [fuse-bridge.c:989:fuse_fd_cbk] 0-glusterfs-fuse: 372790: OPENDIR() /kern/linux-4.8.8/Documentation/devicetree/bindings/iio/proximity => -1 (Stale file handle)
[2016-11-16 13:19:38.670801] W [MSGID: 122035] [ec-common.c:419:ec_child_select] 0-erasure-disperse-0: Executing operation with some subvolumes unavailable (8)
[2016-11-16 13:19:38.674498] W [MSGID: 122033] [ec-common.c:1466:ec_locked] 0-erasure-disperse-1: Failed to complete preop lock [Stale file handle]
[2016-11-16 13:19:38.674502] W [MSGID: 122033] [ec-common.c:1466:ec_locked] 0-erasure-disperse-0: Failed to complete preop lock [Stale file handle]
[2016-11-16 13:19:38.675700] W [fuse-bridge.c:989:fuse_fd_cbk] 0-glusterfs-fuse: 372791: OPENDIR() /kern/linux-4.8.8/Documentation/devicetree/bindings/iio/proximity => -1 (Stale file handle)
[2016-11-16 13:19:38.676947] W [MSGID: 122035] [ec-common.c:419:ec_child_select] 0-erasure-disperse-0: Executing operation with some subvolumes unavailable (8)
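
To see at a glance which translator and function produce most of these warnings, the log can be tallied with a short script. This is a sketch that assumes every line follows the layout seen above (timestamp, level, optional MSGID, `file:line:function`, translator name, message); the regular expression and the `tally` helper are illustrative and not part of any Gluster tool. Lines that do not match, such as the truncated first line, are skipped.

```python
import re
from collections import Counter

# Layout assumed from the excerpt above:
# [timestamp] LEVEL [MSGID: n] [file:line:function] translator: message
LOG_LINE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\s+(?P<level>[A-Z])\s+"
    r"(?:\[MSGID:\s*(?P<msgid>\d+)\]\s+)?"
    r"\[(?P<file>[^:\]]+):(?P<line>\d+):(?P<func>[^\]]+)\]\s+"
    r"(?P<xlator>\S+):\s+(?P<msg>.*)$"
)


def tally(lines):
    """Count log lines per (translator, function); skip unparseable lines."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match is None:
            continue
        counts[(match.group("xlator"), match.group("func"))] += 1
    return counts
```

Feeding the client log through `tally(open(path))` (with `path` pointing at the mount log) gives a `Counter` whose `most_common()` output ranks the warning sources.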


[root@dhcp35-126 glusterfs]# rpm -qa|grep gluster
glusterfs-fuse-3.8.4-3.el7rhgs.x86_64
glusterfs-rdma-3.8.4-3.el7rhgs.x86_64
glusterfs-libs-3.8.4-3.el7rhgs.x86_64
glusterfs-client-xlators-3.8.4-3.el7rhgs.x86_64
glusterfs-api-3.8.4-3.el7rhgs.x86_64
glusterfs-server-3.8.4-3.el7rhgs.x86_64
glusterfs-debuginfo-3.8.4-3.el7rhgs.x86_64
glusterfs-3.8.4-3.el7rhgs.x86_64
glusterfs-cli-3.8.4-3.el7rhgs.x86_64
[root@dhcp35-126 glusterfs]#

Comment 2 Nag Pavan Chilakam 2016-11-16 13:22:36 UTC
Note that one brick was down:
[root@dhcp35-37 ~]# gluster v status erasure
Status of volume: erasure
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.35.37:/rhs/brick1/erasure       49152     0          Y       31636
Brick 10.70.35.116:/rhs/brick1/erasure      49152     0          Y       32577
Brick 10.70.35.239:/rhs/brick1/erasure      49152     0          Y       13279
Brick 10.70.35.135:/rhs/brick1/erasure      N/A       N/A        N       N/A  
Brick 10.70.35.8:/rhs/brick1/erasure        49152     0          Y       29016
Brick 10.70.35.196:/rhs/brick1/erasure      49152     0          Y       30329
Brick 10.70.35.37:/rhs/brick2/erasure       49153     0          Y       26096
Brick 10.70.35.116:/rhs/brick2/erasure      49153     0          Y       32596
Brick 10.70.35.239:/rhs/brick2/erasure      49153     0          Y       13298
Brick 10.70.35.135:/rhs/brick2/erasure      49153     0          Y       16724
Brick 10.70.35.8:/rhs/brick2/erasure        49153     0          Y       29024
Brick 10.70.35.196:/rhs/brick2/erasure      49153     0          Y       30321
Snapshot Daemon on localhost                49154     0          Y       26281
Self-heal Daemon on localhost               N/A       N/A        Y       12130
Quota Daemon on localhost                   N/A       N/A        Y       31664
Snapshot Daemon on 10.70.35.196             49154     0          Y       30336
Self-heal Daemon on 10.70.35.196            N/A       N/A        Y       2749 
Quota Daemon on 10.70.35.196                N/A       N/A        Y       22663
Snapshot Daemon on 10.70.35.135             49154     0          Y       16837
Self-heal Daemon on 10.70.35.135            N/A       N/A        Y       1328 
Quota Daemon on 10.70.35.135                N/A       N/A        Y       21241
Snapshot Daemon on 10.70.35.116             49154     0          Y       32710
Self-heal Daemon on 10.70.35.116            N/A       N/A        Y       17057
Quota Daemon on 10.70.35.116                N/A       N/A        Y       4667 
Snapshot Daemon on 10.70.35.8               49154     0          Y       29030
Self-heal Daemon on 10.70.35.8              N/A       N/A        Y       1217 
Quota Daemon on 10.70.35.8                  N/A       N/A        Y       21068
Snapshot Daemon on 10.70.35.239             49154     0          Y       13426
Self-heal Daemon on 10.70.35.239            N/A       N/A        Y       30091
Quota Daemon on 10.70.35.239                N/A       N/A        Y       17733
 
Task Status of Volume erasure
------------------------------------------------------------------------------
There are no active volume tasks
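
The offline brick can be picked out of the status table programmatically. The helper below is a minimal sketch, assuming each brick row has exactly the six whitespace-separated fields shown above (`Brick`, `host:path`, TCP port, RDMA port, Online, Pid); `offline_bricks` is a made-up name, and rows that wrap because of long brick paths would need extra handling.

```python
def offline_bricks(status_text):
    """Return the host:path of every brick row whose Online column is 'N'."""
    down = []
    for line in status_text.splitlines():
        fields = line.split()
        # Brick rows: Brick <host:path> <tcp port> <rdma port> <online> <pid>
        if len(fields) == 6 and fields[0] == "Brick" and fields[4] == "N":
            down.append(fields[1])
    return down
```

Applied to the output above, this reports only 10.70.35.135:/rhs/brick1/erasure.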
 
[root@dhcp35-37 ~]# gluster  v info erasure
 
Volume Name: erasure
Type: Distributed-Disperse
Volume ID: 95cd2d01-3452-46c3-9edc-946470738052
Status: Started
Snapshot Count: 0
Number of Bricks: 2 x (4 + 2) = 12
Transport-type: tcp
Bricks:
Brick1: 10.70.35.37:/rhs/brick1/erasure
Brick2: 10.70.35.116:/rhs/brick1/erasure
Brick3: 10.70.35.239:/rhs/brick1/erasure
Brick4: 10.70.35.135:/rhs/brick1/erasure
Brick5: 10.70.35.8:/rhs/brick1/erasure
Brick6: 10.70.35.196:/rhs/brick1/erasure
Brick7: 10.70.35.37:/rhs/brick2/erasure
Brick8: 10.70.35.116:/rhs/brick2/erasure
Brick9: 10.70.35.239:/rhs/brick2/erasure
Brick10: 10.70.35.135:/rhs/brick2/erasure
Brick11: 10.70.35.8:/rhs/brick2/erasure
Brick12: 10.70.35.196:/rhs/brick2/erasure
Options Reconfigured:
features.cache-invalidation-timeout: 600
performance.stat-prefetch: on
performance.cache-invalidation: on
performance.md-cache-timeout: 300
disperse.shd-max-threads: 4
features.uss: on
features.quota-deem-statfs: on
features.inode-quota: on
features.quota: on
transport.address-family: inet
performance.readdir-ahead: on
nfs.disable: on
[root@dhcp35-37 ~]#
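
The brick list and the "(8)" in the client warnings can be related with a little arithmetic. The sketch below rests on two assumptions that are not stated in this report: bricks are grouped into disperse subvolumes in the order listed (Brick1-6 and Brick7-12), and the parenthesised value in "some subvolumes unavailable (8)" is a hexadecimal bitmask in which bit i stands for the i-th brick of that subvolume. The function names are illustrative only.

```python
HOSTS = [
    "10.70.35.37", "10.70.35.116", "10.70.35.239",
    "10.70.35.135", "10.70.35.8", "10.70.35.196",
]
# Same order as Brick1..Brick12 in the volume info above
BRICKS = ["%s:/rhs/brick%d/erasure" % (host, n) for n in (1, 2) for host in HOSTS]

DATA, REDUNDANCY = 4, 2  # 2 x (4 + 2) = 12


def subvolumes(bricks, size):
    """Split the brick list into consecutive groups of `size` (assumed grouping)."""
    return [bricks[i:i + size] for i in range(0, len(bricks), size)]


def unavailable(subvol_bricks, mask_hex):
    """Decode a hex bitmask into brick names (bit i -> i-th brick; an assumption)."""
    mask = int(mask_hex, 16)
    return [brick for i, brick in enumerate(subvol_bricks) if mask & (1 << i)]


groups = subvolumes(BRICKS, DATA + REDUNDANCY)
down = unavailable(groups[0], "8")
print(down)                     # ['10.70.35.135:/rhs/brick1/erasure']
print(len(down) <= REDUNDANCY)  # True
```

Under these assumptions the mask points at Brick4 of the first subvolume, which is the brick shown offline in the status output. A 4+2 subvolume tolerates up to two unavailable bricks, so one brick down should not by itself make deletes fail.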

Comment 3 Ashish Pandey 2016-11-17 05:40:17 UTC
Nag,

This is exactly the same bug which has already been raised by Ambarish.
https://bugzilla.redhat.com/show_bug.cgi?id=1395161

I would like to mark this as duplicate and close it.
Your thoughts?

Comment 4 Ashish Pandey 2016-11-17 06:10:51 UTC
Although it is a duplicate, it would be helpful if you could collect an sosreport.

Comment 5 Pranith Kumar K 2016-11-17 08:47:53 UTC

*** This bug has been marked as a duplicate of bug 1395161 ***

