Bug 1344841

Summary:	[nfs]: cp fails with error "Too many levels of symbolic links"
Product:	[Red Hat Storage] Red Hat Gluster Storage	Reporter:	Rahul Hinduja <rhinduja>
Component:	gluster-nfs	Assignee:	Niels de Vos <ndevos>
Status:	CLOSED CURRENTRELEASE	QA Contact:	storage-qa-internal <storage-qa-internal>
Severity:	medium	Docs Contact:
Priority:	unspecified
Version:	rhgs-3.1	CC:	rhs-bugs, skoduri, storage-qa-internal
Target Milestone:	---	Keywords:	ZStream
Target Release:	---
Hardware:	x86_64
OS:	Linux
Whiteboard:
Fixed In Version:		Doc Type:	If docs needed, set a value
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2017-02-14 07:13:49 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:

Description Rahul Hinduja 2016-06-11 15:56:30 UTC

Description of problem:
=======================

I had a 4x2 volume mounted on the client (FUSE and NFS). While copying /etc recursively via the NFS client, I hit the following error:

[root@dj n]# for i in {1..5} ; do cp -rf /etc etc.$i ; done
cp: failed to access ‘etc.2’: Too many levels of symbolic links
[root@dj n]# 

As a result of the lookup failure, the directory is not copied and does not exist at the target, causing data loss.

[root@dj n]# ll
total 16
drwxr-xr-x. 85 root root 4096 Jun 11  2016 etc.1
drwxr-xr-x. 85 root root 4096 Jun 11  2016 etc.3
drwxr-xr-x. 85 root root 4096 Jun 11  2016 etc.4
drwxr-xr-x. 85 root root 4096 Jun 11  2016 etc.5
[root@dj n]# 


Dmesg log says:
===============

[540414.781146] VFS: Lookup of 'etc.2' in nfs 0:38 would have caused loop

I tried reproducing this multiple times afterwards and did not see the issue again. Raising this bug as a tracker in case we hit it again.


Version-Release number of selected component (if applicable):
=============================================================
glusterfs-3.7.9-10


How reproducible:
=================
Tried multiple times but did not see the issue again.


Steps to Reproduce:
===================

Will try to reproduce; if it happens again, I will collect a packet trace and update the bug with the results.
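
A repeated reproduction attempt could be scripted roughly as follows. This is only a sketch, not a known reproducer: it repeats the cp loop from the description but stops at the first failing copy so that the mount can be inspected right away. The function name copy_until_failure and the mount point in the usage note are hypothetical.

```python
import os
import subprocess


def copy_until_failure(src, dest_dir, attempts):
    """Run `cp -rf src dest_dir/<name>.<i>` for i = 1..attempts.

    Returns None if every copy succeeds, otherwise a tuple
    (attempt_number, stderr_text) for the first failing copy.
    """
    name = os.path.basename(os.path.normpath(src))
    for i in range(1, attempts + 1):
        target = os.path.join(dest_dir, f"{name}.{i}")
        result = subprocess.run(
            ["cp", "-rf", src, target],
            capture_output=True,
            text=True,
        )
        if result.returncode != 0:
            # Stop here so the state at the time of failure is preserved.
            return i, result.stderr.strip()
    return None
```

For example, copy_until_failure("/etc", "/mnt/nfs", 5) would mirror the loop above, assuming the NFS mount is at /mnt/nfs. On a failure, dmesg and a packet trace can be collected before anything else touches the directory.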

Comment 3 Soumya Koduri 2016-06-11 16:55:25 UTC

We once encountered a similar issue while using nfs-ganesha. The cause was that the volume bricks already contained a few sub-directories which had themselves been used as bricks at some point. So both the root directory and those sub-directories had the root gfid, resulting in ELOOP during inode_link. But Rahul confirmed that the bricks in this setup were created afresh and do not contain any stale data.

The next time this issue is hit, please collect the packet trace and also check the gfid of the directory/file that got this error, as well as the gfids of all its parent directories.
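
The gfid check could be scripted along these lines. This is a minimal sketch, assuming it is run as root directly on a brick server (the trusted.gfid xattr is only visible on the brick backend, not through the mount) and on Linux, where os.getxattr is available. The helper names and the brick path in the usage note are hypothetical. It prints the gfid of a path and of every parent up to the brick root, and flags any directory other than the brick root that carries the root gfid.

```python
import os
import uuid

# The gfid GlusterFS assigns to the root directory of a volume.
ROOT_GFID = "00000000-0000-0000-0000-000000000001"


def format_gfid(raw):
    """Convert the 16-byte trusted.gfid xattr value to canonical UUID text."""
    return str(uuid.UUID(bytes=raw))


def paths_up_to_root(brick_root, path):
    """Return path and all its parents, ending with brick_root."""
    brick_root = os.path.normpath(brick_root)
    path = os.path.normpath(path)
    if os.path.commonpath([brick_root, path]) != brick_root:
        raise ValueError(f"{path} is not inside {brick_root}")
    chain = [path]
    while path != brick_root:
        path = os.path.dirname(path)
        chain.append(path)
    return chain


def find_root_gfid_conflicts(gfids, brick_root):
    """Return paths other than brick_root that carry the root gfid."""
    brick_root = os.path.normpath(brick_root)
    return [
        p for p, g in gfids.items()
        if g == ROOT_GFID and os.path.normpath(p) != brick_root
    ]


def report(brick_root, path):
    """Print the gfid of each path component; return conflicting paths."""
    gfids = {
        p: format_gfid(os.getxattr(p, "trusted.gfid"))
        for p in paths_up_to_root(brick_root, path)
    }
    for p, g in gfids.items():
        print(g, p)
    return find_root_gfid_conflicts(gfids, brick_root)
```

For example, report("/bricks/b1", "/bricks/b1/etc.2") on each brick would list the gfids for etc.2 and its parents, assuming the brick is at /bricks/b1. A non-empty return value would point to the stale-brick situation described above. The same values can be read by hand with getfattr -n trusted.gfid -e hex on each path.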