Description:
When a copy onto an NFS mount point is in progress and new bricks are added at the same time with the add-brick command, the following errors are seen during the copy operation:

cp: cannot create directory ‘/mnt/nfs/test/dir/A1/B1/C1/D5/E4’: File exists
cp: cannot create directory ‘/mnt/nfs/test/dir/A2/B2/C2/D5/E3’: File exists
cp: cannot create directory ‘/mnt/nfs/test/dir/A4/B4/C1/D5/E2’: File exists
cp: cannot create directory ‘/mnt/nfs/test/dir/A5/B3/C5/D1/E4’: File exists
cp: cannot create directory ‘/mnt/nfs/test/dir/A6/B5/C5’: File exists
cp: cannot create directory ‘/mnt/nfs/test/dir/A7/B3/C3/D5/E2’: File exists
cp: cannot create directory ‘/mnt/nfs/test/dir/A20/B5/C1/D4/E1’: File exists

Version-Release number of selected component (if applicable):
3.7.9-10.el7rhgs.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Create a distributed-replicate volume and start it.
2. On a client, mount the volume through NFS.
3. Copy some files/directories to the NFS mount point while adding a few bricks.

Actual results:
Intermittent "File exists" errors during the copy.

Expected results:
There should not be any errors.

Additional info:
Volume name: distrep
Mount point: /mnt/nfs
This "File exists" error during the copy operation does not affect basic DHT functionality here: after the copy completes, the directories and files are present both on the mount point and on the sub-vols.
The nfs server is restarted on each add-brick. So it might happen that a mkdir completes, but the nfs server process dies before it can send the response back to the nfs client. If the nfs client retries the mkdir, the retried mkdir would get an EEXIST from GNFS. How does the nfs client handle this scenario?
1. Does it retry mkdir?
2. Does it ignore EEXIST?
3. Does it send EEXIST back to the application?
Just to point out that EEXIST is seen not just for directories, but also while creating regular files.

[root@unused glusterfs]# strace -o /tmp/cp-etc-strace.log cp -rf /etc /usr .
cp: cannot create regular file `./etc/lvm/archive/patchy_snap_vg_4_00012-1776335244.vg': File exists
cp: cannot create regular file `./etc/lvm/archive/patchy_snap_vg_4_00004-102188235.vg': File exists
cp: cannot create regular file `./etc/lvm/archive/patchy_00093-1192875909.vg': File exists
cp: cannot create directory `./etc/lvm/cache': File exists
[root@unused glusterfs]# strace -TCrttt -o /tmp/cp-etc-strace-2.log cp -rf /etc /usr .
cp: cannot create directory `./etc/xpdf': File exists
cp: cannot create directory `./etc/polkit-1/localauthority/50-local.d': File exists
cp: cannot create directory `./etc/cron.d': File exists
cp: cannot create directory `./etc/prelink.conf.d': File exists
cp: cannot create directory `./usr/share/evince/icons/hicolor/scalable/mimetypes': File exists
cp: cannot create directory `./usr/share/gimp/2.0/gradients': File exists

To gather more evidence for my hypothesis, I added timing parameters to strace. Every mkdir that failed took on the order of 10s, whereas a successful mkdir took on the order of 0.02s. This strongly suggests that the mkdir was in progress when the nfs server was restarted, and that the extra time is due to the nfs-client retrying.

[root@unused ~]# grep -i exist /tmp/cp-etc-strace-2.log
0.000227 mkdir("./etc/xpdf", 0755) = -1 EEXIST (File exists) <10.147695>
0.000106 write(2, ": File exists", 13) = 13 <0.000031>
0.000037 mkdir("./etc/polkit-1/localauthority/50-local.d", 0755) = -1 EEXIST (File exists) <10.201716>
0.000036 write(2, ": File exists", 13) = 13 <0.000009>
0.000141 mkdir("./etc/cron.d", 0755) = -1 EEXIST (File exists) <10.204024>
0.000037 write(2, ": File exists", 13) = 13 <0.000009>
0.000318 mkdir("./etc/prelink.conf.d", 0755) = -1 EEXIST (File exists) <10.410888>
0.000034 write(2, ": File exists", 13) = 13 <0.000008>
0.000465 mkdir("./usr/share/evince/icons/hicolor/scalable/mimetypes", 0755) = -1 EEXIST (File exists) <10.398526>
0.000034 write(2, ": File exists", 13) = 13 <0.000008>
0.000077 mkdir("./usr/share/gimp/2.0/gradients", 0755) = -1 EEXIST (File exists) <10.459501>
0.000033 write(2, ": File exists", 13) = 13 <0.000008>

So I think the problem is caused by the retry logic of the nfs-client.

@Soumya/Niels, can I move this bug to GNFS? I don't see a problem with DHT here.
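For anyone repeating this check on a larger log, the timing filter can be scripted. The following is a minimal sketch, assuming log lines shaped like the grep output in this comment (a relative timestamp, the syscall, its return value and an elapsed time in angle brackets). The function name `slow_eexist_calls` and the 1-second threshold are hypothetical choices for illustration, not part of strace or of any Gluster tooling.

```python
import re

# Matches lines such as:
#   0.000227 mkdir("./etc/xpdf", 0755) = -1 EEXIST (File exists) <10.147695>
# Assumed layout: relative timestamp, syscall(args), return value,
# optional errno name with description, elapsed time in <...>.
LINE_RE = re.compile(
    r'^\s*(?P<rel>\d+\.\d+)\s+(?P<call>\w+)\((?P<args>.*)\)\s+=\s+(?P<ret>-?\d+)'
    r'(?:\s+(?P<errno>[A-Z]+)\s+\([^)]*\))?\s+<(?P<dur>\d+\.\d+)>\s*$'
)


def slow_eexist_calls(lines, threshold=1.0):
    """Return (syscall, args, seconds) for EEXIST failures slower than threshold.

    The threshold is an arbitrary cut-off between a normal call
    (around 0.02s in this report) and a retried one (around 10s).
    """
    hits = []
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue  # not a syscall line in the assumed layout
        duration = float(match.group("dur"))
        if match.group("errno") == "EEXIST" and duration >= threshold:
            hits.append((match.group("call"), match.group("args"), duration))
    return hits


if __name__ == "__main__":
    sample = [
        '0.000227 mkdir("./etc/xpdf", 0755) = -1 EEXIST (File exists) <10.147695>',
        '0.000106 write(2, ": File exists", 13) = 13 <0.000031>',
    ]
    for call, args, seconds in slow_eexist_calls(sample):
        print(f"{call}({args}) failed with EEXIST after {seconds:.3f}s")
```

If every EEXIST failure in the log shows up in this list, that is consistent with the retransmission hypothesis. A fast EEXIST would instead point to a path that really did exist before the call.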
Created attachment 1211621 [details] strace of "cp -r /etc/ /mnt/nfs"
Possible sequence that is problematic:
1. initiate recursive copy/mkdir (with 'cp')
2. the client sends a MKDIR
3. the nfs-server receives the MKDIR
4. the nfs-server passes the MKDIR on to a brick
5. before the brick replies, the nfs-server restarts (replies get lost)
6. the client is still waiting for a reply
7. the client (kernel NFS) resends the MKDIR
8. the reply from the brick and nfs-server contains EEXIST
9. cp gets confused, because the directory did not exist before

To prevent this, the nfs-server needs a persistent duplicate-request-cache and its own retry logic for the case where a (DRC-d) request has not received a reply yet.
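The sequence above, and the effect of a duplicate-request-cache that survives a restart, can be illustrated with a toy model. This is only a sketch under stated assumptions and not how GNFS is implemented: `Brick`, `NfsServer` and `run_scenario` are hypothetical names, the cache is a plain dict standing in for persistent storage, requests are identified by a (client, xid) pair, and an EEXIST seen while replaying a pending request is assumed to come from that request's own earlier attempt.

```python
import errno

PENDING = object()  # marker: request was forwarded, reply not yet recorded


class Brick:
    """Stand-in for a brick: remembers which directories exist."""

    def __init__(self):
        self.dirs = set()

    def mkdir(self, path):
        if path in self.dirs:
            return errno.EEXIST
        self.dirs.add(path)
        return 0


class NfsServer:
    """Toy server process; `drc` is None (no cache) or a dict that outlives it."""

    def __init__(self, brick, drc=None):
        self.brick = brick
        self.drc = drc

    def mkdir(self, client, xid, path, crash_after_apply=False):
        key = (client, xid)
        retransmit = False
        if self.drc is not None:
            cached = self.drc.get(key)
            if cached is not None and cached is not PENDING:
                return cached  # replay the recorded reply
            retransmit = cached is PENDING
            self.drc[key] = PENDING  # record the request before forwarding it
        result = self.brick.mkdir(path)
        if crash_after_apply:
            # Steps 4-5: the brick applied the MKDIR, the reply is lost.
            raise ConnectionError("nfs-server restarted before replying")
        if retransmit and result == errno.EEXIST:
            # Assumption: the earlier attempt of this same request created it.
            result = 0
        if self.drc is not None:
            self.drc[key] = result
        return result


def run_scenario(persistent_drc):
    """Return what the client sees after resending the MKDIR (step 7)."""
    brick = Brick()
    drc = {} if persistent_drc else None
    try:
        NfsServer(brick, drc).mkdir("client-1", 42, "/dir", crash_after_apply=True)
    except ConnectionError:
        pass  # the client keeps waiting, then retransmits
    restarted = NfsServer(brick, drc)  # new process, same brick, same cache
    return restarted.mkdir("client-1", 42, "/dir")


if __name__ == "__main__":
    print("without DRC:", errno.errorcode.get(run_scenario(False), "OK"))
    print("with persistent DRC:", errno.errorcode.get(run_scenario(True), "OK"))
```

Without the cache, the restarted server has no way to tell the retransmission from a fresh request, so the client gets EEXIST. With the cache, the pending entry tells the server that the directory was most likely created by the same request. A real implementation would need a stronger check than this before turning EEXIST into success.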