Bug 625078 - [NFS]: silly renamed .nfs0000* files can be left on fs forever
Keywords:
Status: CLOSED DUPLICATE of bug 653066
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: kernel
Version: 6.0
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Jeff Layton
QA Contact: Red Hat Kernel QE team
URL:
Whiteboard:
Depends On: 511901
Blocks:
Reported: 2010-08-18 14:34 UTC by Fabio Olive Leite
Modified: 2010-11-15 16:50 UTC
CC List: 9 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of: 511901
Environment:
Last Closed: 2010-11-15 14:03:49 UTC
Target Upstream Version:
Embargoed:


Attachments
patch -- make sillyrename an async operation (23.39 KB, patch)
2010-09-10 14:03 UTC, Jeff Layton

Description Fabio Olive Leite 2010-08-18 14:34:43 UTC
+++ This bug was initially created as a clone of Bug #511901 +++

Description of problem:

Parallels Virtuozzo Containers/OpenVZ linux kernel team found that sometimes NFS temporary (silly-rename: .nfs0000*) files can be left on a filesystem when no process has them open.
Kernel versions checked: 2.6.18-8.el5 and 2.6.18-128.2.1.el5 (x86_64); both are affected.

The reproducer source (test_nfs_exit_thr4.c) is attached.
Brief description: create a thread that creates and unlinks files in a loop, then exit the main thread without waiting for the child thread.
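
The attached reproducer is not included in this report. As a rough illustration only, the behaviour described above might be sketched in C as follows. The function names (create_unlink_once, churn, start_churn) and the use of mkstemp() with an "ashfileXXXXXX" template are assumptions inferred from the file names in the transcript; they are not taken from test_nfs_exit_thr4.c.

```c
/* Hypothetical sketch of the reproducer's idea. This is NOT the attached
 * test_nfs_exit_thr4.c, whose source is not shown in this report. */
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Create a temp file in dir, unlink it while it is still open, then close it.
 * On an NFS client, unlinking an open file makes the client rename it to a
 * .nfsXXXX name ("silly rename") and remove it on the last close.
 * Returns 0 on success, -1 on failure. */
int create_unlink_once(const char *dir)
{
    char path[4096];
    int n = snprintf(path, sizeof(path), "%s/ashfileXXXXXX", dir);
    if (n < 0 || (size_t)n >= sizeof(path))
        return -1;

    int fd = mkstemp(path);      /* assumed naming scheme, see lead-in */
    if (fd < 0)
        return -1;

    int rc = unlink(path);       /* the file is still open at this point */
    close(fd);                   /* last close: the .nfs* file should go away */
    return rc;
}

/* Thread body: create and unlink files until the process dies. */
void *churn(void *arg)
{
    const char *dir = arg;
    for (;;)
        create_unlink_once(dir);
    return NULL;
}

/* Start the churn thread. The caller is expected to return from main()
 * shortly afterwards WITHOUT joining it, so the thread is killed at an
 * arbitrary point in its loop. Returns 0 on success. */
int start_churn(const char *dir)
{
    pthread_t tid;
    return pthread_create(&tid, NULL, churn, (void *)dir);
}
```

A driver would call start_churn() on the NFS mount point, sleep briefly, and then return from main() without joining the thread. Build with -pthread, as in the transcript that follows.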

How to reproduce:

[root@tom nfs]# uname -a
Linux HOSTNAME 2.6.18-128.2.1.el5 #1 SMP Wed Jul 8 11:54:47 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux
[root@tom ~]# mount NFSSERVERNAME:/vz/export/tom /mnt/nfs/
[root@tom ~]# cd /mnt/nfs/
[root@tom nfs]# mount |grep nfs
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
NFSSERVERNAME:/vz/export/tom on /mnt/nfs type nfs (rw,addr=NFSSERVERIP)
[root@tom nfs]# gcc /tmp/test_nfs_exit_thr4.c -pthread -o /tmp/test
[root@tom nfs]# ls -Flaio
total 12
13403720 drwxr-xr-x  2 root 4096 Jul 15 18:00 ./
 1498497 drwxr-xr-x  3 root 4096 Jul 15 16:40 ../
[root@tom nfs]# /tmp/test
main: exit ok!
[root@tom nfs]# /tmp/test
main: exit ok!
[root@tom nfs]# ls -Flaio
total 12
13403720 drwxr-xr-x  2 root 4096 Jul 15 17:55 ./
 1498497 drwxr-xr-x  3 root 4096 Jul 15 16:40 ../
 13403721 ----------  1 root    0 Jan  5  1970 ashfilecs2QiM
 13403722 -rw-------  1 root    0 Jul 15 17:55 .nfs0000000000cc864a000005a8
[root@tom nfs]# ps axf |grep test
 3147 pts/0    S+     0:00          \_ grep test

Note 1: the leftover ashfile* files are OK; they were created but not yet unlinked when the thread was killed. The leftover .nfs* files, however, seem to be wrong.
Note 2: the reproducer does not leave .nfs* files behind on every run, but it does so very often.

--- Additional comment from jlayton on 2010-06-09 10:37:44 EDT ---

FWIW, I've been able to reproduce this on 2.6.34-ish kernels too. The race window there seems to be slightly smaller for reasons that aren't exactly clear to me, but it's still present there.

--> CLONING FOR RHEL-6

--- Additional comment from jlayton on 2010-06-28 14:26:46 EDT ---

cc'ing Trond in case he has thoughts on this...

We can't easily make this function use an uninterruptible sleep. What we may actually need to do is make the rename asynchronous, and have the sillyrename thread wait on its completion. That way, if the thread is killed, everything still should proceed to completion.

Comment 2 Jeff Layton 2010-08-18 15:15:14 UTC
I own the RHEL-5 one, so I'll grab the RHEL6 one as well. We'll definitely want to fix this in RHEL6, but a fix will need to go upstream first. I sent a note about this problem to the upstream ML along with what I think is the best way to fix it:

    http://www.spinics.net/lists/linux-nfs/msg15082.html

I'll look at this as soon as I have some time to spend on it.

Comment 3 Jeff Layton 2010-09-10 14:03:01 UTC
Created attachment 446520
patch -- make sillyrename an async operation

This patchset seems to fix the problem for me. The basic approach is to make sillyrename an asynchronous operation. The caller just waits for the task to complete. If a task is interrupted via SIGKILL, the sillyrename operation will still continue in the background.
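
To illustrate the idea in the previous paragraph, here is a userspace analogy in C using pthreads. It is only a sketch of the pattern (start the operation in the background, let the caller wait, and let the operation finish even if the caller goes away); it is not the kernel patch, and every name in it (struct async_rename, rename_async, rename_wait, rename_abandon) is hypothetical.

```c
/* Userspace analogy only; the actual patch changes the kernel NFS client. */
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct async_rename {
    pthread_mutex_t lock;
    pthread_cond_t  done_cv;
    int   done;          /* set once rename() has returned */
    int   result;        /* rename() return value */
    int   refs;          /* one reference for the caller, one for the worker */
    char *from, *to;
};

/* Drop one reference; whoever drops the last one frees the operation. */
static void put_op(struct async_rename *op)
{
    pthread_mutex_lock(&op->lock);
    int last = (--op->refs == 0);
    pthread_mutex_unlock(&op->lock);
    if (last) {
        pthread_mutex_destroy(&op->lock);
        pthread_cond_destroy(&op->done_cv);
        free(op->from);
        free(op->to);
        free(op);
    }
}

/* Background worker: performs the rename regardless of what the caller does. */
static void *rename_worker(void *arg)
{
    struct async_rename *op = arg;
    int rc = rename(op->from, op->to);

    pthread_mutex_lock(&op->lock);
    op->result = rc;
    op->done = 1;
    pthread_cond_broadcast(&op->done_cv);
    pthread_mutex_unlock(&op->lock);
    put_op(op);
    return NULL;
}

/* Start the rename in the background. Returns a handle, or NULL on failure. */
struct async_rename *rename_async(const char *from, const char *to)
{
    struct async_rename *op = calloc(1, sizeof(*op));
    if (!op)
        return NULL;
    op->from = strdup(from);
    op->to = strdup(to);
    if (!op->from || !op->to) {
        free(op->from);
        free(op->to);
        free(op);
        return NULL;
    }
    pthread_mutex_init(&op->lock, NULL);
    pthread_cond_init(&op->done_cv, NULL);
    op->refs = 2;

    pthread_t tid;
    if (pthread_create(&tid, NULL, rename_worker, op) != 0) {
        op->refs = 1;            /* no worker exists, so only our reference */
        put_op(op);
        return NULL;
    }
    pthread_detach(tid);
    return op;
}

/* Normal path: wait for completion, then drop the caller's reference. */
int rename_wait(struct async_rename *op)
{
    pthread_mutex_lock(&op->lock);
    while (!op->done)
        pthread_cond_wait(&op->done_cv, &op->lock);
    int rc = op->result;
    pthread_mutex_unlock(&op->lock);
    put_op(op);
    return rc;
}

/* Stand-in for a caller that is interrupted: it stops waiting, but the
 * worker still holds its own reference and runs to completion. */
void rename_abandon(struct async_rename *op)
{
    put_op(op);
}
```

The reference count is what lets the caller disappear safely in this sketch: the worker keeps the operation's state alive until the rename has finished.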

This still needs more testing and I need to clean up the commitlog contents, but I'll probably post this upstream within the next week or so. Any testing feedback would be appreciated...

Comment 4 Jeff Layton 2010-09-10 17:24:06 UTC
Ran reproducer in a loop over several hours and ended up with 0 leftover .nfs* files. Seems to work as expected. I'll clean up the set a bit and plan to send it upstream soon.
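
For anyone repeating this kind of soak test, a small helper along the following lines could count leftover files after each run. It is a sketch only: the function names are hypothetical, and it assumes that matching the ".nfs" name prefix is a sufficient check, based on the .nfs0000* names shown in the description.

```c
#include <dirent.h>
#include <stdio.h>
#include <string.h>

/* Return 1 if name looks like a silly-renamed file (starts with ".nfs"). */
int is_silly_name(const char *name)
{
    return strncmp(name, ".nfs", 4) == 0;
}

/* Count entries in dir whose names start with ".nfs".
 * If print_names is non-zero, each match is printed to stdout.
 * Returns the count, or -1 if the directory cannot be opened. */
int count_silly_files(const char *dir, int print_names)
{
    DIR *d = opendir(dir);
    if (!d)
        return -1;

    int count = 0;
    struct dirent *de;
    while ((de = readdir(d)) != NULL) {
        if (is_silly_name(de->d_name)) {
            if (print_names)
                printf("%s/%s\n", dir, de->d_name);
            count++;
        }
    }
    closedir(d);
    return count;
}
```

Calling count_silly_files("/mnt/nfs", 1) after each reproducer run, once no process has files open there, should return 0 on a fixed kernel.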

Comment 5 Jeff Layton 2010-09-19 22:28:46 UTC
Trond is planning to push the patchset for 2.6.37. If all goes well, we should be able to make 6.1 with this.

Comment 6 RHEL Program Management 2010-10-05 01:57:06 UTC
This request was evaluated by Red Hat Product Management for inclusion
in a Red Hat Enterprise Linux maintenance release. Product Management has 
requested further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed 
products. This request is not yet committed for inclusion in an Update release.

