Bug 764365 (GLUSTER-2633)

Summary: [4597929cc527f8abaf9ef9e1d5499ea416e5c7ff] Data on nfs mount not consistent
Product: [Community] GlusterFS Reporter: Rahul C S <rahulcs>
Component: nfs Assignee: shishir gowda <sgowda>
Status: CLOSED CURRENTRELEASE QA Contact:
Severity: high Docs Contact:
Priority: medium    
Version: mainline CC: gluster-bugs, nsathyan, saurabh, sgowda
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: Type: ---
Regression: RTNR Mount Type: nfs
Documentation: DNR CRM:
Verified Versions: Category: ---

Description Rahul C S 2011-03-30 07:36:22 UTC
Distributed-replicate volume. Reproducible with the steps shown in comment 1 below.

Volume Name: test
Type: Distributed-Replicate
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: centos-qa-sanity:/export/brick1
Brick2: centos-qa-sanity:/export/brick2
Brick3: centos-qa-sanity:/export/brick3
Brick4: centos-qa-sanity:/export/brick4
Options Reconfigured:
diagnostics.client-log-level: TRACE
diagnostics.brick-log-level: TRACE

Comment 1 Rahul C S 2011-03-30 10:33:22 UTC
[root@centos-qa-sanity nfs]# cp -r /etc/ .
[root@centos-qa-sanity nfs]# cd etc/
[root@centos-qa-sanity etc]# ls -l | wc -l
4
[root@centos-qa-sanity etc]# cd ..
[root@centos-qa-sanity nfs]# ls
etc
[root@centos-qa-sanity nfs]# rm -rf etc/
rm: cannot remove directory `etc//audisp/plugins.d': Directory not empty
rm: cannot remove directory `etc//vmware-tools': Directory not empty
rm: cannot remove directory `etc//sound/events': Directory not empty
[root@centos-qa-sanity nfs]#

[root@centos-qa-sanity fuse]# cd etc/
[root@centos-qa-sanity etc]# ls -l| wc -l
235
[root@centos-qa-sanity etc]#

Comment 2 Shehjar Tikoo 2011-03-31 03:09:03 UTC
*** Bug 2605 has been marked as a duplicate of this bug. ***

Comment 3 Shehjar Tikoo 2011-03-31 05:02:18 UTC
The first problem here is:

10.1.10.176:/dr on /home/shehjart/mount type nfs (rw,soft,intr,nolock,addr=10.1.10.176)
[root@FC11-5 shehjart]# ls /home/shehjart/mount/
[root@FC11-5 shehjart]# cp -R /etc /home/shehjart/mount
[root@FC11-5 shehjart]# ls -l /etc|wc -l
251
[root@FC11-5 shehjart]# ls -l /home/shehjart/mount/etc/|wc -l
4

The directory listing for the newly copied /etc dir on the mount point shows only 4 lines (3 entries plus the "total" line printed by ls -l), against 251 lines for the source /etc.
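Counting lines with ls -l | wc -l shows that entries are missing but not which ones. A minimal sketch of a name-by-name comparison follows, assuming Python 3 is available on the client. The helper names list_names and compare_listings are hypothetical and are not part of GlusterFS or of the original reproduction steps.

```python
import os
import sys


def list_names(path):
    """Return the set of entry names directly under path (no recursion)."""
    return {entry.name for entry in os.scandir(path)}


def compare_listings(source, copy):
    """Return (missing, extra): names only in source, names only in copy."""
    src_names = list_names(source)
    copy_names = list_names(copy)
    return sorted(src_names - copy_names), sorted(copy_names - src_names)


if __name__ == "__main__" and len(sys.argv) == 3:
    # Example: python compare_listings.py /etc /home/shehjart/mount/etc
    source, copy = sys.argv[1], sys.argv[2]
    missing, extra = compare_listings(source, copy)
    print(f"{len(missing)} entries missing from {copy}")
    for name in missing:
        print("  missing:", name)
    for name in extra:
        print("  unexpected:", name)
    sys.exit(1 if missing or extra else 0)
```

Running it once against the NFS mount and once against the FUSE mount of the same volume would show whether the entries are absent from the bricks or only from the NFS directory listing.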

This is a regression caused by the following commit:

commit cd3d977b10e24c4b46e55f9831113aba3a241583
Author: shishir gowda <shishirng>
Date:   Tue Mar 22 04:43:56 2011 +0000

    Process dir/link from other subvol if error in dht_readdir
    
    Signed-off-by: shishir gowda <shishirng>
    Signed-off-by: Vijay Bellur <vijay.com>
    
    BUG: 2137 (dhtafr - self heal after renaming directory)
    URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=2137


Problem exists in mainline and 3.2.1qa5.

Shishir, can you help fix this?

https://github.com/gluster/glusterfs/commit/cd3d977b10e24c4b46e55f9831113aba3a241583

Comment 4 Shehjar Tikoo 2011-03-31 05:15:22 UTC
The trace log is not helpful: I can only see NFS receiving 3 entries from DHT.

[2011-03-31 06:43:04.477816] D [nfs3-helpers.c:2498:nfs3_log_readdirp_res] 0-nfs-nfsv3: XID: acb63e52, READDIRPLUS: NFS: 0(Call completed successfully.), POSIX: 117(Structure needs cleaning), dircount: 512, maxcount: 4096, cverf: 140172809177636, is_eof: 0
[2011-03-31 06:43:04.477841] T [nfs3-helpers.c:894:nfs3_fill_entryp3] 0-nfs-nfsv3: Entry: updatedb.conf, ino: 6681120423915091114
[2011-03-31 06:43:04.477855] T [nfs3-helpers.c:894:nfs3_fill_entryp3] 0-nfs-nfsv3: Entry: PackageKit, ino: 17650619994134751931
[2011-03-31 06:43:04.477866] T [nfs3-helpers.c:894:nfs3_fill_entryp3] 0-nfs-nfsv3: Entry: fonts, ino: 14026039477647892365
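For longer trace logs, the READDIRPLUS replies and the entries handed to the client can be pulled out mechanically. The sketch below is illustrative only: the regular expressions are derived solely from the four log lines quoted above, so other GlusterFS versions or log formats may need different patterns, and the function name summarize is hypothetical.

```python
import re
import sys

# Patterns inferred from the quoted log lines only (an assumption, not a spec).
READDIRP_RE = re.compile(
    r"READDIRPLUS: NFS: (?P<nfs_status>\d+)\((?P<nfs_msg>[^)]*)\), "
    r"POSIX: (?P<posix_errno>\d+)\((?P<posix_msg>[^)]*)\)"
    r".*?is_eof: (?P<is_eof>\d+)"
)
ENTRY_RE = re.compile(
    r"nfs3_fill_entryp3\].*?Entry: (?P<name>[^,]+), ino: (?P<ino>\d+)"
)

SAMPLE = """\
[2011-03-31 06:43:04.477816] D [nfs3-helpers.c:2498:nfs3_log_readdirp_res] 0-nfs-nfsv3: XID: acb63e52, READDIRPLUS: NFS: 0(Call completed successfully.), POSIX: 117(Structure needs cleaning), dircount: 512, maxcount: 4096, cverf: 140172809177636, is_eof: 0
[2011-03-31 06:43:04.477841] T [nfs3-helpers.c:894:nfs3_fill_entryp3] 0-nfs-nfsv3: Entry: updatedb.conf, ino: 6681120423915091114
[2011-03-31 06:43:04.477855] T [nfs3-helpers.c:894:nfs3_fill_entryp3] 0-nfs-nfsv3: Entry: PackageKit, ino: 17650619994134751931
[2011-03-31 06:43:04.477866] T [nfs3-helpers.c:894:nfs3_fill_entryp3] 0-nfs-nfsv3: Entry: fonts, ino: 14026039477647892365
"""


def summarize(lines):
    """Return (replies, entries) extracted from NFS trace log lines."""
    replies, entries = [], []
    for line in lines:
        m = READDIRP_RE.search(line)
        if m:
            replies.append({
                "nfs_status": int(m["nfs_status"]),
                "posix_errno": int(m["posix_errno"]),
                "posix_msg": m["posix_msg"],
                "is_eof": m["is_eof"] == "1",
            })
            continue
        m = ENTRY_RE.search(line)
        if m:
            entries.append(m["name"])
    return replies, entries


if __name__ == "__main__" and len(sys.argv) == 2:
    # Example: python summarize_readdirp.py /path/to/nfs.log
    with open(sys.argv[1], errors="replace") as log:
        replies, entries = summarize(log)
    print(f"{len(replies)} READDIRPLUS replies, {len(entries)} entries")
    for reply in replies:
        print(reply)
```

Applied to the excerpt above (SAMPLE), this yields one reply with NFS status 0, POSIX errno 117 and is_eof unset, followed by the three entries updatedb.conf, PackageKit and fonts.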

Comment 5 Shehjar Tikoo 2011-04-05 02:55:37 UTC
Re-assigning to Shishir. He is aware of the patch that caused the regression and will be taking the necessary steps to fix the problem.

Comment 6 shishir gowda 2011-04-05 03:47:34 UTC
The patch which caused the regression has been reverted in this commit:
https://github.com/gluster/glusterfs/commit/05daec675f1716554864e34e0a3c9c71423b6594

The original issue has been fixed as part of this commit: http://patches.gluster.com/patch/6385/