1208065 – O_TRUNC ignored on NFS file with invalid cache entry



Summary: O_TRUNC ignored on NFS file with invalid cache entry

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Enterprise Linux 6
Classification:	Red Hat
Component:	kernel
Sub Component:
Version:	6.6
Hardware:	Unspecified
OS:	Linux
Priority:	unspecified
Severity:	medium
Target Milestone:	rc
Target Release:	---
Assignee:	Benjamin Coddington
QA Contact:	JianHong Yin
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2015-04-01 10:03 UTC by Brano Zarnovican
Modified:	2015-07-22 08:46 UTC (History)
CC List:	5 users
Fixed In Version:	kernel-2.6.32-569.el6
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2015-07-22 08:46:34 UTC
Target Upstream Version:
Embargoed:

Attachments
Node2 NFS calls (4.93 KB, application/octet-stream) 2015-04-01 10:03 UTC, Brano Zarnovican
Node2 NFS calls in GOOD case (5.30 KB, application/octet-stream) 2015-04-01 10:06 UTC, Brano Zarnovican

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHSA-2015:1272	0	normal	SHIPPED_LIVE	Moderate: kernel security, bug fix, and enhancement update	2015-07-22 11:56:25 UTC

Description Brano Zarnovican 2015-04-01 10:03:47 UTC

Created attachment 1009603
Node2 NFS calls

Description of problem:
If you open an existing file for writing with truncation (O_WRONLY|O_CREAT|O_TRUNC) while this host holds an invalid negative cache entry for it, the file is NOT truncated. Instead, the new data simply overwrites the beginning of the existing content.

The problem is only reproducible if you pre-cache the non-existence of the file first. Without this first step, the file is truncated as expected.

Version-Release number of selected component (if applicable):
kernel 2.6.32-504.8.1.el6.x86_64
nfs-utils-1.2.3-54.el6.x86_64

How reproducible:
consistently

Steps to Reproduce:
1. Node2: do stat() on non-existing file "test_file1" on NFS

ls -l test_file1

2. Node1: create and populate "test_file1" with "AAAAAAAA". Close the file.

echo -n AAAAAAAA > test_file1

3. Node2: open("test_file1", O_WRONLY|O_CREAT|O_TRUNC, ..), write new content "BBBB" and close the file.

echo -n BBBB > test_file1

4. Node1/2: view file's content

cat test_file1

Actual results:
File's content is "BBBBAAAA"


Expected results:
File's content is "BBBB"
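
The steps above can be sketched in C roughly as follows. This is not the attached test program, only a minimal illustration; the helper names probe_missing(), truncating_write(), and read_content() are hypothetical, and the file mode 0644 is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Step 1 (Node2): stat() a path that does not exist yet.
 * Returns 1 if the path is missing (ENOENT), 0 if it exists,
 * and -1 on any other error. */
int probe_missing(const char *path)
{
    struct stat st;

    assert(path != NULL);
    if (stat(path, &st) == 0)
        return 0;
    return errno == ENOENT ? 1 : -1;
}

/* Steps 2 and 3: open with O_WRONLY|O_CREAT|O_TRUNC, write the data,
 * and close the file. Returns 0 on success, -1 on failure. */
int truncating_write(const char *path, const char *data)
{
    assert(path != NULL && data != NULL);

    size_t len = strlen(data);
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    ssize_t n = write(fd, data, len);
    int rc = close(fd);
    return (n == (ssize_t)len && rc == 0) ? 0 : -1;
}

/* Step 4: read up to cap-1 bytes into buf and NUL-terminate it.
 * Returns the number of bytes read, or -1 on failure. */
ssize_t read_content(const char *path, char *buf, size_t cap)
{
    assert(path != NULL && buf != NULL && cap > 0);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    ssize_t n = read(fd, buf, cap - 1);
    close(fd);
    if (n < 0)
        return -1;
    buf[n] = '\0';
    return n;
}
```

To follow the report, Node2 would call probe_missing("test_file1"), Node1 would call truncating_write("test_file1", "AAAAAAAA"), and Node2 would then call truncating_write("test_file1", "BBBB"). On an affected client, read_content() would be expected to show "BBBBAAAA" rather than "BBBB". On a local filesystem the same sequence yields "BBBB".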


Additional info:
Note that the content created by Node1 is written and committed to the server before Node2 calls open(). This is not a case of two concurrent writers.

I was able to reproduce the problem on NFSv3 and NFSv4.
I was able to reproduce it on 2.6.32 and 3.12.33-1.el6.x86_64 kernels.
I have tested it against a NetApp NFS server. Looking at the tcpdump, the problem seems to be on the client, so it should be independent of the NFS server.

I'm attaching a tcpdump for the BAD case, as well as for the GOOD case, where the first step was skipped. The tcpdump is from a C program that uses only the minimal number of syscalls needed to reproduce the problem.

If this is expected behavior, I apologize for your wasted time ;)

Brano Zarnovican

Comment 1 Brano Zarnovican 2015-04-01 10:06:28 UTC

Created attachment 1009604
Node2 NFS calls in GOOD case

Comment 3 Brano Zarnovican 2015-04-28 13:26:53 UTC

More info on the problem..

* I'm able to reproduce it against a Linux NFS server => it's a client-specific problem
* I'm able to reproduce it even if the client mounts the volume with "sync,noac,lookupcache=none". I had been convinced that attribute caching contributed to the problem. What is weird is that the problem is reproducible even if you leave a 5-minute delay between steps 2) and 3).
* You can work around the problem by explicitly calling ftruncate() between open() and write() in step 3)
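
A minimal sketch of the ftruncate() workaround from the last point, for step 3 on Node2. The function names write_with_explicit_truncate() and file_size() are hypothetical, and the file mode 0644 is an assumption.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Open with the same flags as in the report, then truncate explicitly
 * before writing. Returns 0 on success, -1 on failure. */
int write_with_explicit_truncate(const char *path, const char *data)
{
    assert(path != NULL && data != NULL);

    size_t len = strlen(data);
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    /* Workaround: do not rely on O_TRUNC alone; set the size to 0
     * explicitly so stale content cannot survive past the new data. */
    if (ftruncate(fd, 0) != 0) {
        close(fd);
        return -1;
    }

    ssize_t n = write(fd, data, len);
    int rc = close(fd);
    return (n == (ssize_t)len && rc == 0) ? 0 : -1;
}

/* Helper for checking the result: file size in bytes, or -1 on error. */
long long file_size(const char *path)
{
    struct stat st;

    assert(path != NULL);
    return stat(path, &st) == 0 ? (long long)st.st_size : -1;
}
```

With this in place of the plain open()/write()/close() sequence in step 3, writing "BBBB" should leave a 4-byte file, so file_size("test_file1") would be expected to return 4.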

This issue was created as a Private Bug by mistake. If someone reading it has permission to make it public, please do so. Apparently I cannot.

Regards,

Brano Zarnovican

Comment 4 J. Bruce Fields 2015-04-28 14:04:39 UTC

That doesn't look like expected behavior to me.  I'm surprised we haven't run across this before, but bugzilla searches aren't turning up a relevant bug.

Comment 5 Benjamin Coddington 2015-05-05 10:13:03 UTC

We're not setting the size attribute in nfs_open_create() because it's expected to be done in nfs_atomic_lookup().  But in this case we do ->lookup without an open intent, which creates the dentry; then lookup is skipped when creating the file, so the size attribute is not set.

This was fixed upstream a long time ago by moving to atomic_open().  Probably what needs to be done is to check the intent and set attributes appropriately in nfs_open_create(), just as in nfs_atomic_lookup().

Comment 6 RHEL Program Management 2015-05-05 10:19:42 UTC

This request was evaluated by Red Hat Product Management for
inclusion in a Red Hat Enterprise Linux release.  Product
Management has requested further review of this request by
Red Hat Engineering, for potential inclusion in a Red Hat
Enterprise Linux release for currently deployed products.
This request is not yet committed for inclusion in a release.

Comment 7 Benjamin Coddington 2015-05-05 13:59:52 UTC

Well, lookup is not skipped, but setting the size attribute, which would normally happen in ->lookup if we had an open intent, is skipped in nfs_open_revalidate(), since nfs_neg_need_reval() is optimizing away revalidation of negative dentries on create.

I'd prefer to fix this in nfs_open_create() rather than nfs_neg_need_reval().  Probably all that's needed here is something small and targeted to RHEL6, such as:

diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
index a7592b4..fb39e53 100644
--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -1717,6 +1717,11 @@ static int nfs_open_create(struct inode *dir, struct dentry *dentry, int mode,
        if (IS_ERR(ctx))
                goto out_err_drop;

+       if (open_flags & O_TRUNC) {
+               attr.ia_valid |= ATTR_SIZE;
+               attr.ia_size = 0;
+       }
+
        error = NFS_PROTO(dir)->create(dir, dentry, &attr, open_flags, ctx);
        if (error != 0)
                goto out_put_ctx;

Comment 13 Kurt Stutsman 2015-06-16 18:06:57 UTC

Patch(es) available on kernel-2.6.32-569.el6

Comment 18 errata-xmlrpc 2015-07-22 08:46:34 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2015-1272.html
