Bug 1374720

Summary:

Simultaneous add-brick and copy files on NFS mount results in intermittent "File exists" error

Product:

[Red Hat Storage] Red Hat Gluster Storage

Reporter:

Prasad Desala <tdesala>

Component:

gluster-nfs

Assignee:

Jiffin <jthottan>

Status:

CLOSED WONTFIX

QA Contact:

storage-qa-internal <storage-qa-internal>

Severity:

medium

Docs Contact:

Priority:

unspecified

Version:

rhgs-3.1

CC:

jthottan, ndevos, rhs-bugs, skoduri, storage-qa-internal

Target Milestone:

---

Keywords:

FutureFeature

Target Release:

---

Hardware:

Unspecified

OS:

Unspecified

Whiteboard:

Fixed In Version:

Doc Type:

Known Issue

Doc Text:

Story Points:

---

Clone Of:

Environment:

Last Closed:

2018-11-19 06:05:13 UTC

Type:

Bug

Regression:

---

Mount Type:

---

Documentation:

---

CRM:

Verified Versions:

Category:

---

oVirt Team:

---

RHEL 7.3 requirements from Atomic Host:

Cloudforms Team:

---

Target Upstream Version:

Embargoed:

Attachments:

Description	Flags
strace of "cp -r /etc/ /mnt/nfs"	none

Description Prasad Desala 2016-09-09 13:20:00 UTC

Description:
=============================================================================
While a copy to the NFS mount point was in progress, new bricks were added with the add-brick command. The following errors were seen during the copy operation:

cp: cannot create directory ‘/mnt/nfs/test/dir/A1/B1/C1/D5/E4’: File exists
cp: cannot create directory ‘/mnt/nfs/test/dir/A2/B2/C2/D5/E3’: File exists
cp: cannot create directory ‘/mnt/nfs/test/dir/A4/B4/C1/D5/E2’: File exists
cp: cannot create directory ‘/mnt/nfs/test/dir/A5/B3/C5/D1/E4’: File exists
cp: cannot create directory ‘/mnt/nfs/test/dir/A6/B5/C5’: File exists
cp: cannot create directory ‘/mnt/nfs/test/dir/A7/B3/C3/D5/E2’: File exists
cp: cannot create directory ‘/mnt/nfs/test/dir/A20/B5/C1/D4/E1’: File exists

Version-Release number of selected component (if applicable):
3.7.9-10.el7rhgs.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Create a distributed-replicate volume and start the volume.
2. On a client, mount the volume through NFS.
3. Copy some files/directories to the NFS mount point and, while the copy is running, add a few bricks.

Actual results:

Intermittent "File exists" errors during copy.

Expected results:

There should not be any errors.

Additional info:

Volume name: distrep
mount point: /mnt/nfs

Comment 3 Prasad Desala 2016-09-15 05:39:31 UTC

This "File exists" error during the copy operation does not impact basic DHT functionality here. After the copy operation completes, the directories/files are present on both the mount point and the sub-vols.

Comment 4 Raghavendra G 2016-10-18 05:56:40 UTC

The nfs server is restarted on each add-brick. So, it can happen that the mkdir completes, but the nfs server process dies before it can send the response back to the nfs client. If the nfs client retries the mkdir, the retried mkdir gets an EEXIST from GNFS. How does the nfs client handle this scenario?
1. Does it retry mkdir?
2. Does it ignore EEXIST?
3. Does it send back EEXIST to application?

Comment 5 Raghavendra G 2016-10-18 06:20:56 UTC

Just to point out that EEXIST is seen not just for directories, but also while creating regular files.

[root@unused glusterfs]# strace -o /tmp/cp-etc-strace.log cp -rf /etc /usr .
cp: cannot create regular file `./etc/lvm/archive/patchy_snap_vg_4_00012-1776335244.vg': File exists
cp: cannot create regular file `./etc/lvm/archive/patchy_snap_vg_4_00004-102188235.vg': File exists
cp: cannot create regular file `./etc/lvm/archive/patchy_00093-1192875909.vg': File exists
cp: cannot create directory `./etc/lvm/cache': File exists

Comment 6 Raghavendra G 2016-10-18 07:22:17 UTC

[root@unused glusterfs]# strace -TCrttt -o /tmp/cp-etc-strace-2.log cp -rf /etc /usr .
cp: cannot create directory `./etc/xpdf': File exists
cp: cannot create directory `./etc/polkit-1/localauthority/50-local.d': File exists
cp: cannot create directory `./etc/cron.d': File exists
cp: cannot create directory `./etc/prelink.conf.d': File exists
cp: cannot create directory `./usr/share/evince/icons/hicolor/scalable/mimetypes': File exists
cp: cannot create directory `./usr/share/gimp/2.0/gradients': File exists


To gather more evidence for my hypothesis, I added the timing option to strace. All the mkdirs that failed took on the order of 10s (a successful mkdir took on the order of 0.02s). This strongly suggests that the mkdir was in progress when the nfs server was restarted, and that the increased time is due to retries by the nfs-client.


[root@unused ~]# grep -i exist /tmp/cp-etc-strace-2.log 
     0.000227 mkdir("./etc/xpdf", 0755) = -1 EEXIST (File exists) <10.147695>
     0.000106 write(2, ": File exists", 13) = 13 <0.000031>
     0.000037 mkdir("./etc/polkit-1/localauthority/50-local.d", 0755) = -1 EEXIST (File exists) <10.201716>
     0.000036 write(2, ": File exists", 13) = 13 <0.000009>
     0.000141 mkdir("./etc/cron.d", 0755) = -1 EEXIST (File exists) <10.204024>
     0.000037 write(2, ": File exists", 13) = 13 <0.000009>
     0.000318 mkdir("./etc/prelink.conf.d", 0755) = -1 EEXIST (File exists) <10.410888>
     0.000034 write(2, ": File exists", 13) = 13 <0.000008>
     0.000465 mkdir("./usr/share/evince/icons/hicolor/scalable/mimetypes", 0755) = -1 EEXIST (File exists) <10.398526>
     0.000034 write(2, ": File exists", 13) = 13 <0.000008>
     0.000077 mkdir("./usr/share/gimp/2.0/gradients", 0755) = -1 EEXIST (File exists) <10.459501>
     0.000033 write(2, ": File exists", 13) = 13 <0.000008>

So, I think the problem is caused by the retry logic of the nfs-client.
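
The same check can be scripted rather than done with grep. The following is a minimal Python sketch, not part of the original analysis, that scans strace output in the format shown above and reports calls that failed with EEXIST after taking unusually long. The function name slow_eexist_calls and the 1-second threshold are assumptions chosen for illustration; the sample lines are copied from the log above.

```python
import re

# Matches strace lines written with -r and -T, for example:
#   0.000227 mkdir("./etc/xpdf", 0755) = -1 EEXIST (File exists) <10.147695>
LINE_RE = re.compile(
    r'^\s*(?P<rel>\d+\.\d+)\s+'
    r'(?P<call>\w+)\("(?P<path>[^"]*)".*\)\s+=\s+-1\s+'
    r'(?P<errno>E[A-Z]+)\s+\(.*\)\s+<(?P<dur>\d+\.\d+)>\s*$'
)


def slow_eexist_calls(lines, threshold=1.0):
    """Return (syscall, path, seconds) for EEXIST failures slower than threshold.

    threshold is a hypothetical cut-off in seconds; a retried request shows up
    as a call that spent far longer in the kernel than a normal one.
    """
    hits = []
    for line in lines:
        m = LINE_RE.match(line)
        if m is None or m.group("errno") != "EEXIST":
            continue
        duration = float(m.group("dur"))
        if duration >= threshold:
            hits.append((m.group("call"), m.group("path"), duration))
    return hits


if __name__ == "__main__":
    sample = [
        '     0.000227 mkdir("./etc/xpdf", 0755) = -1 EEXIST (File exists) <10.147695>',
        '     0.000106 write(2, ": File exists", 13) = 13 <0.000031>',
        '     0.000141 mkdir("./etc/cron.d", 0755) = -1 EEXIST (File exists) <10.204024>',
    ]
    for call, path, seconds in slow_eexist_calls(sample):
        print(call, path, seconds)
```

To run it against a full log, pass the open file object (for instance the /tmp/cp-etc-strace-2.log file from the command above) as the lines argument.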

@Soumya/Niels,

Can I move this bug to GNFS? I don't see a problem with DHT here.

Comment 7 Raghavendra G 2016-10-18 07:24:09 UTC

Created attachment 1211621
strace of "cp -r /etc/ /mnt/nfs"

Comment 8 Niels de Vos 2016-10-18 07:38:50 UTC

Possible sequence that is problematic:

1. initiate recursive copy/mkdir (with 'cp')
2. the client sends a MKDIR
3. the nfs-server receives the MKDIR
4. the nfs-server passes the MKDIR on to a brick
5. before the brick replies, the nfs-server restarts (replies get lost)
6. the client is still waiting for a reply
7. the client (kernel NFS) resends the MKDIR
8. the replies from the brick and the nfs-server contain EEXIST
9. cp gets confused, because the directory did not exist before

To prevent this, the nfs-server needs a persistent duplicate-request-cache and its own retry logic when a (DRC-d) request did not receive a reply yet.
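
As a rough illustration of why a duplicate-request cache that survives restarts helps, here is a toy Python simulation. It is a sketch under stated assumptions and not how GNFS is implemented: the classes Brick and NfsServer, the function client_mkdir, the client id and the xid value are all hypothetical. It also models only the simpler case from comment 4, where the mkdir completes on the brick but the reply to the client is lost; the case where the server restarts before the brick replies would additionally need the server-side retry logic mentioned above.

```python
import errno


class Brick:
    """Stands in for the backing store; its directories survive server restarts."""

    def __init__(self):
        self.dirs = set()

    def mkdir(self, path):
        if path in self.dirs:
            return errno.EEXIST
        self.dirs.add(path)
        return 0


class NfsServer:
    """Toy server process. drc maps (client_id, xid) to the cached result."""

    def __init__(self, brick, drc=None):
        self.brick = brick
        self.drc = drc  # None means no duplicate-request cache

    def handle_mkdir(self, client_id, xid, path, crash_before_reply=False):
        key = (client_id, xid)
        if self.drc is not None and key in self.drc:
            return self.drc[key]  # retransmission: replay the original result
        result = self.brick.mkdir(path)
        if self.drc is not None:
            self.drc[key] = result  # record the result before replying
        if crash_before_reply:
            return None  # the process dies here, so the reply is lost
        return result


def client_mkdir(brick, path, persistent_drc):
    """Send one MKDIR, lose the first reply to a restart, then retransmit."""
    # The dict outlives each NfsServer instance, which models persistent storage.
    drc = {} if persistent_drc else None
    server = NfsServer(brick, drc)
    reply = server.handle_mkdir("client-1", 42, path, crash_before_reply=True)
    while reply is None:
        server = NfsServer(brick, drc)  # the restarted nfs-server process
        reply = server.handle_mkdir("client-1", 42, path)  # same xid as before
    return reply


if __name__ == "__main__":
    print("without DRC:", client_mkdir(Brick(), "/test/dir/A1", False))
    print("with persistent DRC:", client_mkdir(Brick(), "/test/dir/A1", True))
```

Without the cache the retransmitted request is executed a second time and returns EEXIST, which matches what cp reports. With the cache the retransmission is recognised by its (client, xid) pair and the original success is replayed.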