Bug 1128593

Summary: lookup referring to null GFID's
Product: Red Hat Gluster Storage
Reporter: spandura
Component: replicate
Assignee: Pranith Kumar K <pkarampu>
Status: CLOSED NOTABUG
QA Contact:
Severity: medium
Docs Contact:
Priority: unspecified
Version: rhgs-3.0
CC: rhs-bugs, storage-qa-internal
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2014-12-10 07:04:06 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:

Description spandura 2014-08-11 07:37:01 UTC
Description of problem:
===========================
In a distribute-replicate volume, when a brick device crashes, the fuse mount log shows many warning messages like the following:

[2014-08-08 12:10:36.262063] W [client-rpc-fops.c:2761:client3_3_lookup_cbk] 0-vol1-client-2: remote operation failed: Input/output error. Path: /test_dir.23185 (00000000-0000-0000-0000-000000000000)
[2014-08-08 12:10:36.263401] W [client-rpc-fops.c:2761:client3_3_lookup_cbk] 0-vol1-client-2: remote operation failed: Input/output error. Path: /test_dir.23185 (00000000-0000-0000-0000-000000000000)

The lookups refer to a null GFID for the given path.
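
To gauge how often this happens, the mount log can be filtered for lookup warnings that carry the all-zero GFID. The following is a minimal sketch, not part of the original report: the function name `count_null_gfid_lookups` is hypothetical, and the log file is passed as an argument because the report does not name its location.

```shell
# Count lookup warnings with a null GFID, grouped by path.
# Usage: count_null_gfid_lookups /path/to/fuse-mount.log
count_null_gfid_lookups() {
    # Keep only lookup callbacks, then only those with the all-zero GFID.
    grep 'client3_3_lookup_cbk' "$1" |
        grep -F '(00000000-0000-0000-0000-000000000000)' |
        # Reduce each line to the path that follows "Path: ".
        sed 's/.*Path: \([^ ]*\) (.*/\1/' |
        sort | uniq -c
}
```

Run against the two warning lines quoted above, this reports a count of 2 for /test_dir.23185.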

Version-Release number of selected component (if applicable):
================================================================
glusterfs 3.6.0.27 built on Aug  4 2014 11:49:58

How reproducible:
====================
Often

Steps to Reproduce:
=======================
1. Create a 2 x 2 distribute-replicate volume with 4 bricks, 1 brick per storage node. Start the volume.

2. Create a fuse mount. Start creating files and directories from the fuse mount.

3. While the creates are in progress, crash the disks backing brick1 and brick3.
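
The steps above could be scripted roughly as follows. This is a sketch under stated assumptions, not the reporter's actual procedure: the volume name and brick paths are taken from the `gluster v info` output under "Additional info", while the mount point `/mnt/vol1`, the directory count, and the `run` helper are hypothetical. The report does not say how the disks were crashed, so step 3 stays a manual action. By default the functions only print the commands; set DRY_RUN=0 to execute them.

```shell
#!/bin/sh
VOL=vol1
MNT=/mnt/vol1   # hypothetical mount point
BRICKS="rhs-client11:/rhs/device0/b1 rhs-client12:/rhs/device0/b2 rhs-client13:/rhs/device0/b3 rhs-client14:/rhs/device0/b4"

# Print each command; execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

# Step 1: replica 2 across four bricks gives a 2 x 2 layout.
setup_volume() {
    # $BRICKS is left unquoted on purpose so it splits into four arguments.
    run gluster volume create "$VOL" replica 2 $BRICKS
    run gluster volume start "$VOL"
}

# Step 2: fuse-mount the volume and create directories in a loop.
mount_and_create() {
    run mkdir -p "$MNT"
    run mount -t glusterfs "rhs-client11:/$VOL" "$MNT"
    i=0
    while [ "$i" -lt "${COUNT:-1000}" ]; do
        run mkdir "$MNT/test_dir.$i"
        i=$((i + 1))
    done
}

# Step 3 (manual): while mount_and_create is running, crash the disks
# backing brick1 and brick3.
```

Calling `setup_volume` followed by `mount_and_create` prints the command sequence for review before anything is run against a real cluster.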


Actual results:
==================
[2014-08-08 12:10:36.281559] E [afr-self-heal-entry.c:2348:afr_sh_post_nonblocking_entry_cbk] 0-vol1-replicate-1: Non Blocking entrylks failed for /.
[2014-08-08 12:10:36.281604] E [afr-self-heal-common.c:2869:afr_log_self_heal_completion_status] 0-vol1-replicate-1:  entry self heal  failed,   on /
[2014-08-08 12:10:36.282385] W [client-rpc-fops.c:2761:client3_3_lookup_cbk] 0-vol1-client-2: remote operation failed: Input/output error. Path: /test_dir.23186 (00000000-0000-0000-0000-000000000000)
[2014-08-08 12:10:36.283113] W [client-rpc-fops.c:2761:client3_3_lookup_cbk] 0-vol1-client-2: remote operation failed: Input/output error. Path: /test_dir.23186 (00000000-0000-0000-0000-000000000000)
[2014-08-08 12:10:36.283793] W [client-rpc-fops.c:2761:client3_3_lookup_cbk] 0-vol1-client-2: remote operation failed: Input/output error. Path: /test_dir.23186 (00000000-0000-0000-0000-000000000000)
[2014-08-08 12:10:36.284334] W [client-rpc-fops.c:2761:client3_3_lookup_cbk] 0-vol1-client-2: remote operation failed: Input/output error. Path: /test_dir.23186 (00000000-0000-0000-0000-000000000000)
[2014-08-08 12:10:36.285025] W [client-rpc-fops.c:2761:client3_3_lookup_cbk] 0-vol1-client-2: remote operation failed: Input/output error. Path: /test_dir.23186 (00000000-0000-0000-0000-000000000000)
[2014-08-08 12:10:36.285542] W [client-rpc-fops.c:2761:client3_3_lookup_cbk] 0-vol1-client-2: remote operation failed: Input/output error. Path: /test_dir.23186 (00000000-0000-0000-0000-000000000000)
[2014-08-08 12:10:36.286404] W [client-rpc-fops.c:306:client3_3_mkdir_cbk] 0-vol1-client-2: remote operation failed: Input/output error. Path: /test_dir.23186


Expected results:
================
Lookups should not report a null GFID for a path.

Additional info:
====================
root@mia [Aug-11-2014-13:05:57] >gluster v info
 
Volume Name: vol1
Type: Distributed-Replicate
Volume ID: 18aee4e3-8177-4756-aebc-a759d9b32fa0
Status: Started
Snap Volume: no
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: rhs-client11:/rhs/device0/b1
Brick2: rhs-client12:/rhs/device0/b2
Brick3: rhs-client13:/rhs/device0/b3
Brick4: rhs-client14:/rhs/device0/b4
Options Reconfigured:
features.barrier: disable
performance.readdir-ahead: on
auto-delete: disable
snap-max-soft-limit: 90
snap-max-hard-limit: 256
root@mia [Aug-11-2014-13:06:01] >