Bug 830121 - Nfs mount doesn't report "I/O Error" when there is GFID mismatch for a file
Status: CLOSED DUPLICATE of bug 853682
Product: GlusterFS
Classification: Community
Component: replicate
Version: 3.3-beta
Hardware: Unspecified Unspecified
Priority: unspecified, Severity: urgent
Assigned To: Vivek Agarwal
Keywords: Reopened, Triaged
Depends On:
Blocks: 853683 858498
Reported: 2012-06-08 05:48 EDT by Shwetha Panduranga
Modified: 2016-02-17 19:02 EST

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 853683
Environment:
Last Closed: 2013-08-28 07:03:59 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---


Attachments: None
Description Shwetha Panduranga 2012-06-08 05:48:01 EDT
Description of problem:
------------------------
When a file has mismatched GFIDs on the 2 bricks of a replicate volume, "cat <file_name>" from an NFS mount doesn't report "I/O Error".

Version-Release number of selected component (if applicable):
-------------------------------------------------------------
3.3.0qa45

How reproducible:
-----------------
Often


Steps to Reproduce:
---------------------
1. Create a replicate volume (1x2: brick1 and brick2).
2. Set self-heal-daemon off for the volume.
3. Start the volume.
4. Create an NFS mount.
5. Create a directory <testdir> from the NFS mount.
6. Bring down "brick1".
7. From the NFS mount, execute: echo "Test Case: GFID Mismatch should report I/O Error" > testdir/file
8. Bring "brick1" back up.
9. Bring down "brick2".
10. From the NFS mount, execute: echo "Test Case: GFID Mismatch should report I/O Error when Brick2 is down" > testdir/file
11. Bring "brick2" back up.
12. From the mount, execute: cat testdir/file
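
The steps above can be sketched as an ordered command plan. This is only an illustration, not the reporter's script: the helper name repro_plan and the mount point /mnt/nfsc1 are hypothetical, and the report does not say how the bricks were brought down or up. The sketch assumes a brick is stopped by killing its process (pid taken from "gluster volume status") and restarted with "gluster volume start <volume> force".

```python
import shlex


def repro_plan(volume, brick1, brick2, nfs_server, mountpoint):
    """Return (where, command) pairs, one per numbered reproduction step.

    Nothing is executed here; the list is meant to be reviewed and run by hand.
    """
    msg1 = "Test Case: GFID Mismatch should report I/O Error"
    msg2 = msg1 + " when Brick2 is down"
    target = f"{mountpoint}/testdir/file"
    # Assumption: "start ... force" respawns a brick process that was killed.
    restart = f"gluster volume start {volume} force"
    return [
        ("server", f"gluster volume create {volume} replica 2 {brick1} {brick2}"),
        ("server", f"gluster volume set {volume} cluster.self-heal-daemon off"),
        ("server", f"gluster volume start {volume}"),
        ("client", f"mount -t nfs -o vers=3 {nfs_server}:/{volume} {mountpoint}"),
        ("client", f"mkdir {mountpoint}/testdir"),
        # Placeholder: the pid comes from 'gluster volume status'.
        ("server", "kill <pid of brick1>"),
        ("client", f"echo {shlex.quote(msg1)} > {target}"),
        ("server", restart),
        ("server", "kill <pid of brick2>"),
        ("client", f"echo {shlex.quote(msg2)} > {target}"),
        ("server", restart),
        ("client", f"cat {target}"),
    ]


if __name__ == "__main__":
    plan = repro_plan(
        "dstore",
        "192.168.2.35:/export_sdb/dir1",
        "192.168.2.36:/export_sdb/dir1",
        "192.168.2.35",
        "/mnt/nfsc1",
    )
    for number, (where, command) in enumerate(plan, start=1):
        print(f"{number:2d}. [{where}] {command}")
```

Because each write happens while the other brick is down, each brick creates its own copy of testdir/file, which is what leaves the two copies with different GFIDs.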

Actual results:
----------------
[06/08/12 - 20:32:19 root@APP-CLIENT1 nfsc1]# cd testdir/
[06/08/12 - 20:32:38 root@APP-CLIENT1 testdir]# ls
file
[06/08/12 - 20:32:39 root@APP-CLIENT1 testdir]# cat file 
Test Case: GFID Mismatch should report I/O Error when Brick2 is down


Expected results:
-----------------
Input/output error


Additional info:
-----------------
[06/08/12 - 20:27:23 root@APP-SERVER1 ~]# gluster v info
 
Volume Name: dstore
Type: Replicate
Volume ID: ed21634a-27c8-496a-a765-de068ce9dc8e
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 192.168.2.35:/export_sdb/dir1
Brick2: 192.168.2.36:/export_sdb/dir1
Options Reconfigured:
cluster.self-heal-daemon: off

[06/08/12 - 20:32:05 root@APP-SERVER1 ~]# gluster v status
Status of volume: dstore
Gluster process						Port	Online	Pid
------------------------------------------------------------------------------
Brick 192.168.2.35:/export_sdb/dir1			24009	Y	2830
Brick 192.168.2.36:/export_sdb/dir1			24009	Y	9539
NFS Server on localhost					38467	Y	2889
NFS Server on 192.168.2.36				38467	Y	9546

Data from Brick1 
------------------
[06/08/12 - 20:32:07 root@APP-SERVER1 ~]# 
[06/08/12 - 20:33:05 root@APP-SERVER1 ~]# 
[06/08/12 - 20:33:05 root@APP-SERVER1 ~]# getfattr -d -m . -e hex /export_sdb/dir1/testdir/file
getfattr: Removing leading '/' from absolute path names
# file: export_sdb/dir1/testdir/file
trusted.afr.dstore-client-0=0x000000000000000000000000
trusted.afr.dstore-client-1=0x000000010000000000000000
trusted.gfid=0x2dc5c2a385984b98828b9393fd4873db

[06/08/12 - 20:33:07 root@APP-SERVER1 ~]# getfattr -d -m . -e hex /export_sdb/dir1/testdir/
getfattr: Removing leading '/' from absolute path names
# file: export_sdb/dir1/testdir/
trusted.afr.dstore-client-0=0x000000000000000000000000
trusted.afr.dstore-client-1=0x000000000000000000000001
trusted.gfid=0xc20d53aa528644a8a6d74cd155413f19

[06/08/12 - 20:33:09 root@APP-SERVER1 ~]# 
[06/08/12 - 20:33:38 root@APP-SERVER1 ~]# cat /export_sdb/dir1/testdir/file
Test Case: GFID Mismatch should report I/O Error when all bricks are up

Data from Brick2
------------------

[06/08/12 - 20:31:47 root@APP-SERVER2 ~]# getfattr -d -m . -e hex /export_sdb/dir1/testdir/file 
getfattr: Removing leading '/' from absolute path names
# file: export_sdb/dir1/testdir/file
trusted.afr.dstore-client-0=0x000000010000000000000000
trusted.afr.dstore-client-1=0x000000000000000000000000
trusted.gfid=0x9f334349bcac445a9c8479a629068da1

[06/08/12 - 20:33:12 root@APP-SERVER2 ~]# getfattr -d -m . -e hex /export_sdb/dir1/testdir/
getfattr: Removing leading '/' from absolute path names
# file: export_sdb/dir1/testdir/
trusted.afr.dstore-client-0=0x000000000000000000000001
trusted.afr.dstore-client-1=0x000000000000000000000000
trusted.gfid=0xc20d53aa528644a8a6d74cd155413f19

[06/08/12 - 20:33:14 root@APP-SERVER2 ~]# cat /export_sdb/dir1/testdir/file
Test Case: GFID Mismatch should report I/O Error
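
The getfattr output above can be read more easily with a small decoder. The sketch below assumes the usual AFR changelog layout, in which a trusted.afr.* value is three big-endian 32-bit pending counters in the order data, metadata, entry. The helper names decode_afr_changelog and gfid_to_uuid are hypothetical and not part of GlusterFS; the hex values are copied from the brick output above.

```python
import struct
import uuid


def _strip_0x(hex_value):
    """Drop the '0x' prefix that 'getfattr -e hex' puts on values."""
    return hex_value[2:] if hex_value.startswith("0x") else hex_value


def decode_afr_changelog(hex_value):
    """Split a trusted.afr.* value into its pending counters.

    Assumed layout: 12 bytes = three big-endian uint32 (data, metadata, entry).
    """
    raw = bytes.fromhex(_strip_0x(hex_value))
    if len(raw) != 12:
        raise ValueError("expected a 12-byte changelog value")
    data, metadata, entry = struct.unpack(">III", raw)
    return {"data": data, "metadata": metadata, "entry": entry}


def gfid_to_uuid(hex_value):
    """Render a trusted.gfid value in the familiar UUID form."""
    return uuid.UUID(hex=_strip_0x(hex_value))


if __name__ == "__main__":
    # File GFIDs as reported by each brick.
    brick1_gfid = gfid_to_uuid("0x2dc5c2a385984b98828b9393fd4873db")
    brick2_gfid = gfid_to_uuid("0x9f334349bcac445a9c8479a629068da1")
    print("GFID mismatch:", brick1_gfid != brick2_gfid)

    # Brick1's view of client-1, for the file and for its parent directory.
    print("file:", decode_afr_changelog("0x000000010000000000000000"))
    print("dir: ", decode_afr_changelog("0x000000000000000000000001"))
```

Read this way, each brick records a pending entry operation on testdir against the other brick, and the two copies of the file carry different GFIDs, while the directory GFID is the same on both bricks.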
Comment 1 Shwetha Panduranga 2012-06-08 05:53:33 EDT
[06/08/12 - 20:49:16 root@APP-CLIENT1 nfsc1]# rm testdir/file 
rm: remove regular file `testdir/file'? y
rm: cannot remove `testdir/file': Input/output error
Comment 2 Shwetha Panduranga 2012-06-08 05:55:06 EDT
[06/08/12 - 20:48:59 root@APP-CLIENT1 nfsc1]# rm testdir/file 
rm: remove regular file `testdir/file'? y
rm: cannot remove `testdir/file': Input/output error

[06/08/12 - 20:49:04 root@APP-CLIENT1 nfsc1]# cat testdir/file 
Test Case: GFID Mismatch should report I/O Error when all bricks are up

[06/08/12 - 20:49:16 root@APP-CLIENT1 nfsc1]# rm testdir/file 
rm: remove regular file `testdir/file'? y
rm: cannot remove `testdir/file': Input/output error
Comment 3 Jeff Darcy 2012-10-26 17:10:23 EDT
The symptom's not quite the same, but it's very close and the underlying cause is identical.

*** This bug has been marked as a duplicate of bug 830134 ***
Comment 4 spandura 2013-07-19 06:25:37 EDT
Reopening this bug because it is not a duplicate of bug 830134.

In bug 830134, EIO is not reported on the NFS mount even when files are in data split-brain.

This bug is about entry split-brain, not data split-brain.
Comment 5 Vivek Agarwal 2013-08-28 07:03:59 EDT
The root cause looks similar to 853682.

*** This bug has been marked as a duplicate of bug 853682 ***
Comment 6 Vivek Agarwal 2013-08-28 07:04:45 EDT
The NFS client caches lookups, so not every lookup call reaches the server.

To test this behavior, mount NFS with the lookupcache=none option, which disables the client-side lookup cache.

The FUSE mount does not seem to cache lookups, and the split-brain check is done only at lookup time, so the FUSE mount appears to work.

I think AFR should handle this scenario (cached lookups) as well.
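
A minimal sketch of the suggested test mount, for illustration only. The helper name nfs_mount_command and the mount point /mnt/nfsc1 are hypothetical; the server address and volume name come from the report. vers=3 is included on the assumption that the Gluster NFS server is being used over NFSv3, and lookupcache=none is the Linux NFS client mount option named above.

```python
def nfs_mount_command(server, volume, mountpoint, lookup_cache=True):
    """Build the argument list for an NFSv3 mount of a Gluster volume.

    With lookup_cache=False the client is asked not to cache lookups,
    so each lookup should be sent to the server.
    """
    options = ["vers=3"]
    if not lookup_cache:
        options.append("lookupcache=none")
    return [
        "mount", "-t", "nfs",
        "-o", ",".join(options),
        f"{server}:/{volume}",
        mountpoint,
    ]


if __name__ == "__main__":
    # Print the command rather than running it; mounting needs root.
    print(" ".join(nfs_mount_command("192.168.2.35", "dstore", "/mnt/nfsc1",
                                     lookup_cache=False)))
```

Repeating "cat testdir/file" on a mount made this way shows whether the missing error comes from the client-side lookup cache or from the server.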
