Bug 1374720 - Simultaneous add-brick and copy files on NFS mount results in intermittent "File exists" error
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: gluster-nfs
Version: rhgs-3.1
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Assignee: Jiffin
QA Contact: storage-qa-internal@redhat.com
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2016-09-09 13:20 UTC by Prasad Desala
Modified: 2018-11-19 06:05 UTC (History)

Fixed In Version:
Doc Type: Known Issue
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-11-19 06:05:13 UTC
Target Upstream Version:


Attachments:
strace of "cp -r /etc/ /mnt/nfs" (411.37 KB, application/x-gzip)
2016-10-18 07:24 UTC, Raghavendra G

Description Prasad Desala 2016-09-09 13:20:00 UTC
Description:
=============================================================================
While a copy to the NFS mount point was in progress, new bricks were added using the add-brick command. The following errors were seen during the copy operation:

cp: cannot create directory ‘/mnt/nfs/test/dir/A1/B1/C1/D5/E4’: File exists
cp: cannot create directory ‘/mnt/nfs/test/dir/A2/B2/C2/D5/E3’: File exists
cp: cannot create directory ‘/mnt/nfs/test/dir/A4/B4/C1/D5/E2’: File exists
cp: cannot create directory ‘/mnt/nfs/test/dir/A5/B3/C5/D1/E4’: File exists
cp: cannot create directory ‘/mnt/nfs/test/dir/A6/B5/C5’: File exists
cp: cannot create directory ‘/mnt/nfs/test/dir/A7/B3/C3/D5/E2’: File exists
cp: cannot create directory ‘/mnt/nfs/test/dir/A20/B5/C1/D4/E1’: File exists

Version-Release number of selected component (if applicable):
3.7.9-10.el7rhgs.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Create a distributed-replicate volume and start the volume.
2. On a client, mount the volume through NFS.
3. Copy some files/directories to the NFS mount point while adding a few bricks.

Actual results:

Intermittent "File exists" errors during copy.

Expected results:

There should not be any errors.

Additional info:

Volume name: distrep
mount point: /mnt/nfs

Comment 3 Prasad Desala 2016-09-15 05:39:31 UTC
This "File exists" error during the copy operation does not affect basic DHT functionality here: after the copy operation completes, the directories and files are present on both the mount point and the sub-vols.

Comment 4 Raghavendra G 2016-10-18 05:56:40 UTC
The nfs server is restarted on each add-brick. So a mkdir could complete, but the nfs server process could die before it sends the response back to the nfs client. If the nfs client then retries the mkdir, the retried mkdir gets EEXIST from GNFS. How does the nfs client handle this scenario?
1. Does it retry mkdir?
2. Does it ignore EEXIST?
3. Does it send back EEXIST to application?

Comment 5 Raghavendra G 2016-10-18 06:20:56 UTC
Just to point out that EEXIST is seen not just for directories, but also while creating regular files.

[root@unused glusterfs]# strace -o /tmp/cp-etc-strace.log cp -rf /etc /usr .
cp: cannot create regular file `./etc/lvm/archive/patchy_snap_vg_4_00012-1776335244.vg': File exists
cp: cannot create regular file `./etc/lvm/archive/patchy_snap_vg_4_00004-102188235.vg': File exists
cp: cannot create regular file `./etc/lvm/archive/patchy_00093-1192875909.vg': File exists
cp: cannot create directory `./etc/lvm/cache': File exists

Comment 6 Raghavendra G 2016-10-18 07:22:17 UTC
[root@unused glusterfs]# strace -TCrttt -o /tmp/cp-etc-strace-2.log cp -rf /etc /usr .
cp: cannot create directory `./etc/xpdf': File exists
cp: cannot create directory `./etc/polkit-1/localauthority/50-local.d': File exists
cp: cannot create directory `./etc/cron.d': File exists
cp: cannot create directory `./etc/prelink.conf.d': File exists
cp: cannot create directory `./usr/share/evince/icons/hicolor/scalable/mimetypes': File exists
cp: cannot create directory `./usr/share/gimp/2.0/gradients': File exists


To gather more evidence for my hypothesis, I added the timing options to strace. Every mkdir that failed took on the order of 10s, while a successful mkdir took on the order of 0.02s. This strongly suggests that the mkdir was in progress when the nfs server was restarted, and that the extra time is due to retries by the nfs client.


[root@unused ~]# grep -i exist /tmp/cp-etc-strace-2.log 
     0.000227 mkdir("./etc/xpdf", 0755) = -1 EEXIST (File exists) <10.147695>
     0.000106 write(2, ": File exists", 13) = 13 <0.000031>
     0.000037 mkdir("./etc/polkit-1/localauthority/50-local.d", 0755) = -1 EEXIST (File exists) <10.201716>
     0.000036 write(2, ": File exists", 13) = 13 <0.000009>
     0.000141 mkdir("./etc/cron.d", 0755) = -1 EEXIST (File exists) <10.204024>
     0.000037 write(2, ": File exists", 13) = 13 <0.000009>
     0.000318 mkdir("./etc/prelink.conf.d", 0755) = -1 EEXIST (File exists) <10.410888>
     0.000034 write(2, ": File exists", 13) = 13 <0.000008>
     0.000465 mkdir("./usr/share/evince/icons/hicolor/scalable/mimetypes", 0755) = -1 EEXIST (File exists) <10.398526>
     0.000034 write(2, ": File exists", 13) = 13 <0.000008>
     0.000077 mkdir("./usr/share/gimp/2.0/gradients", 0755) = -1 EEXIST (File exists) <10.459501>
     0.000033 write(2, ": File exists", 13) = 13 <0.000008>

So, I think the problem is caused by the retry logic of the nfs client.
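
The same check can be run over the whole strace log instead of reading the grep output by eye. The following is a minimal sketch, not part of the original analysis: the function name slow_eexist_calls and the 1 second threshold are assumptions chosen for illustration, and the regular expression only targets the line shape shown above (one or more leading timestamps, the call, "-1 EEXIST", and the `<seconds>` suffix added by strace -T).

```python
import re

# Matches lines such as:
#   0.000227 mkdir("./etc/xpdf", 0755) = -1 EEXIST (File exists) <10.147695>
LINE_RE = re.compile(
    r'^\s*(?:\d+(?:\.\d+)?\s+)*'                 # leading timestamp column(s)
    r'(?P<syscall>\w+)\((?P<args>.*)\)'          # syscall name and arguments
    r'\s*=\s*-1\s+EEXIST\b.*'                    # failed with EEXIST
    r'<(?P<secs>\d+\.\d+)>\s*$'                  # time spent in the call (-T)
)


def slow_eexist_calls(lines, threshold=1.0):
    """Return (syscall, path, seconds) for EEXIST failures slower than threshold.

    threshold is a hypothetical cut-off in seconds; the log above shows
    roughly 0.02s for a normal mkdir and roughly 10s for a retried one.
    """
    hits = []
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue
        seconds = float(match.group("secs"))
        if seconds >= threshold:
            # First argument is the quoted path.
            path = match.group("args").split(",")[0].strip().strip('"')
            hits.append((match.group("syscall"), path, seconds))
    return hits


# Two failing lines and one unrelated line taken from the log above.
sample = [
    '     0.000227 mkdir("./etc/xpdf", 0755) = -1 EEXIST (File exists) <10.147695>',
    '     0.000106 write(2, ": File exists", 13) = 13 <0.000031>',
    '     0.000141 mkdir("./etc/cron.d", 0755) = -1 EEXIST (File exists) <10.204024>',
]

for syscall, path, seconds in slow_eexist_calls(sample):
    print(f"{syscall} {path} took {seconds:.3f}s")
```

An EEXIST that returns quickly would more likely be a genuine "already exists" case. Only the slow ones fit the retry hypothesis.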

@Soumya/Niels,

Can I move this bug to GNFS? I don't see a problem with DHT here.

Comment 7 Raghavendra G 2016-10-18 07:24:09 UTC
Created attachment 1211621 [details]
strace of "cp -r /etc/ /mnt/nfs"

Comment 8 Niels de Vos 2016-10-18 07:38:50 UTC
Possible sequence that is problematic:

1. initiate recursive copy/mkdir (with 'cp')
2. the client sends a MKDIR
3. the nfs-server receives the MKDIR
4. the nfs-server passes the MKDIR on to a brick
5. before the brick replies, the nfs-server restarts (replies get lost)
6. the client is still waiting for a reply
7. the client (kernel NFS) resends the MKDIR
8. the reply from the brick, relayed by the nfs-server, contains EEXIST
9. cp gets confused, because the directory did not exist before

To prevent this, the nfs-server needs a persistent duplicate-request-cache (DRC) and its own retry logic for cached requests that have not received a reply yet.
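
The sequence above can be modelled with a small toy simulation. This is only a sketch under stated assumptions, not Gluster code: Brick, NfsServer and client_mkdir are hypothetical names, the "persistent" cache is a plain dict that outlives the server object, and the result is assumed to be recorded in the cache before the server dies. The harder case, where the server restarts while the brick has not replied yet, is not modelled.

```python
import errno


class Brick:
    """Stands in for the backend; its state survives an nfs-server restart."""

    def __init__(self):
        self.dirs = set()

    def mkdir(self, path):
        if path in self.dirs:
            return errno.EEXIST
        self.dirs.add(path)
        return 0


class NfsServer:
    """Toy nfs-server. drc maps an RPC transaction id (xid) to its reply."""

    def __init__(self, brick, drc=None):
        self.brick = brick
        self.drc = drc  # None models a server without a duplicate-request-cache

    def mkdir(self, xid, path, crash_before_reply=False):
        if self.drc is not None and xid in self.drc:
            return self.drc[xid]        # duplicate request: replay the first reply
        result = self.brick.mkdir(path)
        if self.drc is not None:
            self.drc[xid] = result      # assumption: stored before the reply is sent
        if crash_before_reply:
            return None                 # reply lost, the client only sees a timeout
        return result


def client_mkdir(brick, path, persistent_drc):
    """Send one MKDIR, lose the first reply to a restart, resend with the same xid."""
    store = {} if persistent_drc else None          # outlives each server "process"
    server = NfsServer(brick, store)
    reply = server.mkdir(xid=1, path=path, crash_before_reply=True)
    while reply is None:
        server = NfsServer(brick, store)            # restarted nfs-server
        reply = server.mkdir(xid=1, path=path)      # kernel NFS client resends
    return reply


without_drc = client_mkdir(Brick(), "/test/dir/A1", persistent_drc=False)
with_drc = client_mkdir(Brick(), "/test/dir/A1", persistent_drc=True)
print("without DRC:", errno.errorcode.get(without_drc, "OK"))
print("with DRC:   ", errno.errorcode.get(with_drc, "OK"))
```

Without the cache the resent MKDIR reaches the brick a second time and fails with EEXIST, as in steps 7 to 9. With the cache the restarted server replays the original success.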

