Bug 869942 - Kernel crashes on reading an ACL containing 190 ACEs over NFSv4
Kernel crashes on reading an ACL containing 190 ACEs over NFSv4
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: kernel (Show other bugs)
6.3
x86_64 Linux
urgent Severity high
: rc
: ---
Assigned To: nfs-maint
Yongcheng Yang
: Reopened, ZStream
Depends On:
Blocks: 1427974 1449096
  Show dependency treegraph
 
Reported: 2012-10-25 03:52 EDT by Albert Flügel
Modified: 2018-06-19 00:52 EDT (History)
16 users (show)

See Also:
Fixed In Version: kernel-2.6.32-703.el6
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1427974 1449096 (view as bug list)
Environment:
Last Closed: 2018-06-19 00:50:10 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)
Network trace until the crash (23.71 KB, application/octet-stream)
2012-10-25 03:52 EDT, Albert Flügel
no flags Details
Stack trace when crashing (14.34 KB, image/png)
2012-10-25 03:53 EDT, Albert Flügel
no flags Details
Patch for fs/nfs/nfs4proc.c which fixes the issue here (2.11 KB, patch)
2012-10-25 11:16 EDT, Albert Flügel
no flags Details | Diff
Patch to really fix the NFSv4 ACL problem (3.02 KB, patch)
2012-10-29 06:07 EDT, Albert Flügel
no flags Details | Diff
Patch to fix NFS4 ACL reading for kernel 2.6.32-279.19.1 (1.49 KB, patch)
2013-01-17 03:54 EST, Albert Flügel
no flags Details | Diff


External Trackers
Tracker ID Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2018:1854 None None None 2018-06-19 00:52 EDT

  None (edit)
Description Albert Flügel 2012-10-25 03:52:17 EDT
Created attachment 633216 [details]
Network trace until the crash

Description of problem:
kernel crashes when reading an ACL containing 190 ACES over NFSv4

Version-Release number of selected component (if applicable):
2.6.32-279.9.1 or
2.6.32-279.5.2

How reproducible:
mount a directory using NFS version 4, set an ACL containing 190 ACEs
on a directory and read the ACL using e.g. nfs4_getfacl

Steps to Reproduce:
1. e.g. mount -o vers=4 fileserver:/whatever/directory
2. use nfs4_setfacl to set 190 ACEs or do the same from Solaris-10 or another NFS4 capable OS on the fileserver (a NetApp here)
3. use nfs4_getfacl to read the ACL
  
Actual results:
Kernel crashes

Expected results:
ACL is listed like with more or less ACEs

Additional info:
Tried with RHE6.2 and RHE6.3 (kernel versions see above)
It happens with 190 ACEs, not with more (up to 400 tested)
and not with less.
The Fedora-17 kernel (e.g. 3.6.2-4) does not crash, but i get a
strange error message:
Failed getxattr operation: Numerical result out of range 
On this Fedora installation, with more than 190 ACEs a different
problem appears: the ACL is listed only every 2nd call of
nfs4_getfacl, but i think, this is a different issue.
The network trace with the reply package is attached and looks ok.
When the retransmits occur, the machine has crashed.
Solaris-10 does not crash and has no problem with this 190 ACEs.
A stack trace of the crash situation is also attached.
Comment 1 Albert Flügel 2012-10-25 03:53:54 EDT
Created attachment 633217 [details]
Stack trace when crashing
Comment 3 Albert Flügel 2012-10-25 11:16:21 EDT
Created attachment 633405 [details]
Patch for fs/nfs/nfs4proc.c which fixes the issue here

It is a partial backport from the 3.6.2-4.fc17 kernel of Fedora 17.
As RPC handling has changed a bit there i only extracted the parts
i think are relevant. They affect mainly the (as far as i have
understood) incorrect page memory handling.
Comment 4 Albert Flügel 2012-10-29 06:07:28 EDT
Created attachment 634959 [details]
Patch to really fix the NFSv4 ACL problem

The previous attachment had the effect, that the problem was not visible anymore, but did not fix the real cause.
As far as i see the problem is as follows:
When an ACL covers exactly the multiple of a page size, not enough pages are allocated for calling the nfs4_call_sync function. This function adds an offset in the beginning (res.data_offset). Frankly i didn't examine, what it is gogd for (alignment ?). However, it is greater than zero and the function seems not to check, if the passed page list is still large enough to hold the ACL plus offset.
So if the ACL is e.g. 8192 bytes in size (what i've seen here), the function writes somewhere onto a third page. The crash does not happen immediately, but later in __copy_from_pages . Seems, some pointer have been corrupted by nfs4_call_sync .
So a solution is to allocate another page, just in case. Probably it is sufficient to add some amount when calculating the required pages. But then it must be known, how large the data_offset can become (looking at nfs4xdr.c it might be 32 bytes minus 1 for alignment ?). However i chose the easy way, what might be a little bit suboptimal for performance.
Other things taken from the Fedora kernel 3.6.2-4 is the capability to copy from more than one page to the cache in nfs4_write_cached_acl and always allocating the scratch page, what seems to make sense for the same reason like outlined above.

BTW i'm not sure whether this problem is really fixed in the 3.6.2-4 or whether it's just good luck, that things are working there due to other small changes. Will try to examine, when i find the time.
Comment 5 Sachin Prabhu 2012-10-29 08:15:09 EDT
Thanks for the bug report Albert. This problem is caused when the size of the ACL is such that the size is less than a page size but ACL size + bitmap size is greater than a page.

This has been fixed in the devel version of the RHEL 6.4 kernel and the patches are also being backported to an errata release for 6.3.

Given below is the changelog for the RHEL kernel which pertains to the fix.

* Wed Oct 17 2012 Jarod Wilson <jarod@redhat.com> [2.6.32-333.el6]
..
- [net] nfs: Fix buffer overflow checking in __nfs4_get_acl_uncached (Sachin Prabhu) [822871] {CVE-2012-2375}
- [net] nfs: Fix the acl cache size calculation (Sachin Prabhu) [822871] {CVE-2012-2375}
- [net] nfs: Fix range checking in __nfs4_get_acl_uncached and __nfs4_proc_set_acl (Sachin Prabhu) [822871] {CVE-2012-2375}
- [net] nfs: nfs_getaclargs.acl_len is a size_t (Sachin Prabhu) [822871] {CVE-2012-2375}
- [net] nfs: Don't use private xdr_stream fields in decode_getacl (Sachin Prabhu) [822871] {CVE-2012-2375}
- [net] nfs: Fix pointer arithmetic in decode_getacl (Sachin Prabhu) [822871] {CVE-2012-2375}
- [net] nfs: Simplify the GETATTR attribute length calculation (Sachin Prabhu) [822871] {CVE-2012-2375}
- [net] sunrpc: Add the helper xdr_stream_pos (Sachin Prabhu) [822871] {CVE-2012-2375}
- [net] sunrpc: Don't decode beyond the end of the RPC reply message (Sachin Prabhu) [822871] {CVE-2012-2375}
- [net] sunrpc: Clean up xdr_set_iov() (Sachin Prabhu) [822871] {CVE-2012-2375}
- [net] sunrpc: xdr_read_pages needs to clear xdr->page_ptr (Sachin Prabhu) [822871] {CVE-2012-2375}
- [fs] nfs: Avoid beyond bounds copy while caching ACL (Sachin Prabhu) [822871] {CVE-2012-2375}
- [fs] nfs: Avoid reading past buffer when calling GETACL (Sachin Prabhu) [822871] {CVE-2012-2375}


I am closing this bz as a dup. Please reopen if you have any concerns about the fix.

*** This bug has been marked as a duplicate of bug 822871 ***
Comment 6 Albert Flügel 2013-01-16 08:57:01 EST
Unfortunately there's a different problem now:

Doing this in userland makes the kernel 2.6.32-279.19.1.el6 crash:

  char	buf[8192];

    r = getxattr(argv[i], "system.nfs4_acl", buf, 128);
    r = getxattr(argv[i], "system.nfs4_acl", buf, 0);
    r = getxattr(argv[i], "system.nfs4_acl", buf, 4096);

They all return 420 in my testcase, especially the 1st one, too,
writing 420 bytes to buf. Really ! See the strace of the 1st call:

getxattr("setfacl", "system.nfs4_acl", "\x00\x00\x00\x0c\x00\x00\x00\x00\x00\x00\x00\x00\x00\x16\x01\xe7\x00\x00\x00\x06OWNER@v\x00\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x06OWNER@fi\x00\x00\x00\x01\x00\x00\x00@\x00\x04\x01F\x00\x00\x00\x06GROUP@71\x00\x00\x00\x00\x00\x00\x00@\x00\x12\x00\xa1\x00\x00\x00\x06GROUP@\x00\xa0\x00\x00\x00\x01\x00\x00\x00@\x00\x04\x01F\x00\x00\x00\x1somegroup@my.domain.com/z\x00\x00\x00\x00\x00\x00\x00@\x00\x12\x00\x81\x00\x00\x00\x1somegroup2@my.domain.com\x00\x00\x00\x00\x00\x01\x00\x00\x00@\x00\x04\x01F\x00\x00\x00\x1somegroup3@my.domain.com\x00\x00\x00\x00\x00\x00\x00\x00\x00@\x00\x12\x00\x81\x00\x00\x00\x1somegroup4@my.domain.com\xbb\xbd\x00\x00\x00\x01\x00\x00\x00@\x00\x04\x01f\x00\x00\x00\x1somegroup5@my.domain.com2l\x00\x00\x00\x01\x00\x00\x00@\x00\x04\x01f\x00\x00\x00\x1somegroup6@my.domain.com\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x12\x00\xa1\x00\x00\x00\x09EVERYONE@\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00\x00\x00\x04\x01F\x00\x00\x00\x09EVERYONE@GB ", 128) = 420

This should not happen and causes also kernel internal problems making the system crash.
Comment 7 Albert Flügel 2013-01-16 09:24:23 EST
The problem seems to me to be in this section starting with line 3492 in fs/nfs/nfs4proc.c:

	/* Handle the case where the passed-in buffer is too short */
	if (res.acl_flags & NFS4_ACL_TRUNC) {
		/* Did the user only issue a request for the acl length? */
		if (buf == NULL)
			goto out_ok;
		ret = -ERANGE;
		goto out_free;
	}
	nfs4_write_cached_acl(inode, pages, res.acl_data_offset, res.acl_len);
	if (buf)
		_copy_from_pages(buf, pages, res.acl_data_offset, res.acl_len);
out_ok:
	ret = res.acl_len;

If buf is not NULL, but too short, the _copy_from_pages is called nonetheless.
NFS4_ACL_TRUNC is in res.acl_flags, if the "passed-in buffer" is too short.
But this buffer is a multiple of page sizes, not of the buflen coming from
userland as 4th argument of getxattr. So the check for too small buf must
be done nonetheless.

From the man-page of getxattr:
       The  interface  is  designed to allow guessing of initial buffer sizes,
       and to enlarge buffers when the return value indicates that the  buffer
       provided was too small.

this must be obeyed.
Comment 8 Albert Flügel 2013-01-17 03:52:42 EST
And:
There's still an issue, when acl-size + bitmap-size requires an additional
page compared to acl-size alone. In this case, the ACL is not completely
decoded in nfs4xdr.c and leads e.g. to 'Invalid argument' in user space of
the nfs4_getfacl program, though enough bytes are written back to user
space, but, as stated the last ones are random bytes, because
nfs4_call_sync(NFS_SERVER(inode), &msg, &args, &res, 0);
(line 3488 in my version of nfs4proc.c) did not decode the entire acl data
received from the server. Still, an additional page is required.

The lack of checking the buffer size and overwriting other data like outlined
in the previous comment, when the buffer is too short, but not NULL, might
have security relevance.

I suggest the attached patch kernel-2.6.32-279.19_fs_nfs_nfs4proc.c.patch
Comment 9 Albert Flügel 2013-01-17 03:54:30 EST
Created attachment 680079 [details]
Patch to fix NFS4 ACL reading for kernel 2.6.32-279.19.1
Comment 10 Sachin Prabhu 2013-01-18 05:21:38 EST
Created attachment 682296 [details]
Reproducer

Mount share on /mnt and touch file /mnt/test. 

The reproducer attempts to fetch nfs4 acls for the file /mnt/test. It sends a buffer of just 10 bytes to the getxattr command. This buffer size was selected because a) It is smaller than the actual acl length returned. b) It will result in a minimum of a page allocated in __nfs4_get_acl_uncached().

On the first call, it fails in the following manner
# ./mytest 
Size = 80
Segmentation fault

Any subsequent call will result in a ERANGE error being returned as the getxattr command attempts to service the request from cache.
The machine crashes when we attempt to unmount the share.

I can reproduce this on RHEL 6 and on F18(3.6.10-4.fc18.x86_64). 

This has been fixed upstream by the commit
7d3e91a89b7adbc2831334def9e494dd9892f9af
Comment 11 Sachin Prabhu 2013-01-18 06:05:24 EST
Few corrections and additions to the note above.

The server may not crash on unmount. In which case we may need to do this multiple times. The memory corruption will eventually result in OOps messages.
This can also be trigerred by an unpriviledged user.
Comment 12 Sachin Prabhu 2013-01-18 09:02:47 EST
Albert, 

About patch in c#9, you shouldn't actually need the first part of the patch for versions 2.6.32-279.19.1.el6 since it contains the patches mentioned in c#5. The only patch required is the one provided by upstream commit 
7d3e91a89b7adbc2831334def9e494dd9892f9af

I've created a test package and tested it locally and can confirm that it works fine.

Sachin Prabhu
Comment 14 Albert Flügel 2013-01-21 02:57:00 EST
Sorry, the first part of the patch IS needed. When you test only with small simple examples like acl sizes of 80 bytes, you surely do not see the problem, because you never reach the point, where several pages are required and the additional page would be necessary to store the bitmap. Please test setting as many ACEs as possible using nfs4_setfacl with account and group names of varying length and you will get "invalid argument errors" on some ACE number. Especially try to set an ACL with a size: N * 1024 - 4 <= size <= N * 1024

And: sorry, i have no access to "commit 7d3e91a89b7adbc2831334def9e494dd9892f9af" (at least i cannot find it) and no access to the duplicate bug 822871 and to many other patches. The current kernel SRPMs do not contain patches, that are applied when building the RPM. So it's a bit hard for me to judge what you are talking about.
Comment 15 Sachin Prabhu 2013-01-21 08:41:59 EST
Albert,

What I mean by the upstream commit 7d3e91a89b7adbc2831334def9e494dd9892f9af is the commit which went into Linus's linux kernel tree.
ie. 
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=7d3e91a89b7adbc2831334def9e494dd9892f9af;hp=4c1002100898d03c5c9142ffaf58351c841ab94a

This is the same upstream commit which part of your patch is based on.

As for the remaining part of the patch, We have had a number of patches in the kernel to fix the problem when the ACL size + bitmap size is greater than the buffer allocated. These patches are listed in c#5 and are contained in the 2.6.32-279.19.el6 kernel. 
The corresponding upstream commits are
5a00689930ab975fdd1b37b034475017e460cf2a
5794d21ef4639f0e33440927bb903f9598c21e92
76cacaabf15a593833d96a65a1a251002bd88178
1537693ceaa8d6484f1ce631bec85658bfa9816c
bfeea1dc1c9238f9e5a17a265489ecd0d19f66bb
4517d526c8aa31b5c14165ef180cc19518ff0a35
256e48bb473b631fbb5aa03d6ed38c652ad3caa7
519d3959e30a98f8e135e7a16647c10af5ad63d5
cff298c721099c9ac4cea7196a37097ba2847946
56d08fef2369d5ca9ad2e1fc697f5379fd8af751
21f498c2f73bd6150d82931f09965826dca0b5f2
b291f1b1c86aa0c7bc3df2994e6a1a4e53f1fde0
1f1ea6c2d9d8c0be9ec56454b05315273b5de8ce

You can access these individual upstream patches using the git show <commit message> on the upstream kernel git repo.

Can you please test your reproducer on the 2.6.32-279.19.el6 kernel with just the single patch from upstream commit
7d3e91a89b7adbc2831334def9e494dd9892f9af

BZ 822871 contains a RHEL 6.4 only bz for this problem. The corresponding public bz is bz 822869.

Sachin Prabhu
Comment 16 Sachin Prabhu 2013-01-21 09:48:13 EST
Created attachment 684351 [details]
Proposed patch
Comment 17 Sachin Prabhu 2013-01-21 10:19:51 EST
Albert,

Can you please contact Red Hat support for a test kernel based on the patch above. Or alternatively, you can build a test kernel using the 2.6.32-279.19.1.el6 srpm and test for both issues.

I will push this for inclusion in the Red Hat kernels once you have successfully tested. If you still think there is an issue with certain sizes of ACLs, can you please provide me with a reproducer.

Sachin Prabhu
Comment 18 Albert Flügel 2013-01-21 10:38:24 EST
Hi,

this patch from #16 added to 2.6.32-279.19.1 was exactly, what i did first
to fix the crashing. Then i found,
that i still have a functional problem and increased the number of pages
by 1. Ok, if this helps, i'll build a kernel with only this fix and try to
construct a case, where the ACL gets written back to userspace incompletely.
Around 100 ACEs are needed to see the problem. But it will take a bit of time
as i have also other things to do and i have a working machine behaving
correctly.

Albert Flügel
Comment 19 Albert Flügel 2013-01-22 08:12:11 EST
The reproducer is to try to set ACEs with a total ACL size between 4084 and 4096. As the size depends also on the NFS4 domain name, please try yourself how to reproduce. The size of the base ACL is 156 bytes. What i do:
add 2..5 ACLs of size 40, then 100 ACLs of size 44 using
nfs4_setfacl -a A::account@nfs4domain:R testdirectory

The ACE size is always a multiple of 4 bytes. Please check yourself
using strace, what size getxattr returns to obtain a size increase of 40 and 44.
When the ACL has a size starting with 4084 up to 4096, you will see the
error message
Failed getxattr operation: Numerical result out of range
Failed to instantiate ACL.

and in strace one of those:
getxattr("/home/NFSv4_ACL_Test/acltest/test", "system.nfs4_acl", 0x0, 0) = 4084
getxattr("/home/NFSv4_ACL_Test/acltest/test", "system.nfs4_acl", 0x7f0e25f6f040, 4084) = -1 ERANGE (Numerical result out of range)

getxattr("/home/NFSv4_ACL_Test/acltest/test", "system.nfs4_acl", 0x0, 0) = 4088
getxattr("/home/NFSv4_ACL_Test/acltest/test", "system.nfs4_acl", 0x7fd8717d4040, 4088) = -1 ERANGE (Numerical result out of range)

getxattr("/home/NFSv4_ACL_Test/acltest/test", "system.nfs4_acl", 0x0, 0) = 4092
getxattr("/home/NFSv4_ACL_Test/acltest/test", "system.nfs4_acl", 0x7f5524b1b040, 4092) = -1 ERANGE (Numerical result out of range)

getxattr("/home/NFSv4_ACL_Test/acltest/test", "system.nfs4_acl", 0x0, 0) = 4096
getxattr("/home/NFSv4_ACL_Test/acltest/test", "system.nfs4_acl", 0x7fa51d44b040, 4096) = -1 ERANGE (Numerical result out of range)
Comment 20 Sachin Prabhu 2013-01-30 13:20:53 EST
Hello Albert,

I have created bz 906056 to fix the security issue related to the short buffer lengths being passed so that it can be independently fixed and released quickly. You have been cc-ed on that issue and should receive the public updates related to that patch.

As for the case where the ACL length approaches a page cache size, I was attempting to reproduce the problem and faced a problem with the linux NFS server which returns NFS4ERR_RESOURCE when the acl size approaches a PAGESIZE. I am still trying to work around this problem and hopefully have a working reproducer which I can use to debug the problem. I was curious to know what NFS server are you using to reproduce this issue.

Sachin Prabhu
Comment 21 Albert Flügel 2013-02-08 05:25:41 EST
The servers are Netapps with ONTAP 8.1P3 and 8.1.2P2.
I tcpdumped the traffic and according to wireshark it's all consistent.
I could also verify the ACL lengths in the packages.
Comment 22 Albert Flügel 2013-11-26 09:56:51 EST
Today i tried with kernel-2.6.32-358.23.2.el6.x86_64.rpm .
It does not crash. However, the ERANGE is still there:
# strace nfs4_getfacl test
...
stat("test", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
getxattr("test", "system.nfs4_acl", 0x0, 0) = 4092
brk(0)                                  = 0x7fbe59149000
brk(0x7fbe5916b000)                     = 0x7fbe5916b000
getxattr("test", "system.nfs4_acl", 0x7fbe59149010, 4092) = -1 ERANGE (Numerical result out of range)
dup(2)                                  = 3
fcntl(3, F_GETFL)                       = 0x8002 (flags O_RDWR|O_LARGEFILE)
fstat(3, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 2), ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fbe56f56000
lseek(3, 0, SEEK_CUR)                   = -1 ESPIPE (Illegal seek)
write(3, "Failed getxattr operation: Numer"..., 57
Failed getxattr operation: Numerical result out of range
close(3)                                = 0
munmap(0x7fbe56f56000, 4096)            = 0
brk(0x7fbe5916a000)                     = 0x7fbe5916a000
exit_group(1)                           = ?

It should do (on a kernel with my patch it does):
...
stat("test", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
getxattr("test", "system.nfs4_acl", 0x0, 0) = 4092
brk(0)                                  = 0x7f4904091000
brk(0x7f49040b3000)                     = 0x7f49040b3000
getxattr("test", "system.nfs4_acl", "\x00\x00\x00h\x00\x00\x00\x01\x00\x00\x00\x00\x00\x12\x00\x81\x00\x00\x00\ blah blah very long string ... \x09EVERYONE@nfi", 4092) = 4092
fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 2), ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f4902ff4000
write(1, "D::name@domain.org:rtcy"..., 33D::name@domain.org:rtcy
) = 33
...

This is still not fixed. What do you propose ?
Comment 23 Sachin Prabhu 2013-11-26 10:15:32 EST
Hello Albert,

This issue had fallen out of my radar. I will take another look at this case. The fix needs to made upstream before we will be able to backport it to RHEL 6.

Sachin Prabhu
Comment 24 Albert Flügel 2013-11-27 03:33:12 EST
Hi,

please have a look here:
http://www.spinics.net/lists/linux-nfs/msg40622.html
this is more or less the same approach like my patch to fix that.

Albert
Comment 25 Peter Merhaut 2014-06-06 08:57:15 EDT
Hi,

and News on that bug?
Is it fixed in the latest kernel version?

cheers, Peter
Comment 26 Albert Flügel 2015-02-06 03:40:53 EST
This report seems somewhat orphaned ?
Seems fixed with https://rhn.redhat.com/errata/RHSA-2013-1645.html
and duplicate reports in Bug 1031678 and 906056.

Please close, if not making sense anymore.
Comment 27 Romeo Theriault 2015-06-10 17:34:13 EDT
I would like to report that this issue still exists in RHEL 6.6. I tried (in Ubuntu 12 and 14) 3.x kernels and the issue exists there as well, so it doesn't appear to be fixed upstream. The patch that Albert attached to this bug report fixed the problem for me.

Initially in RHEL 6.3, the system would kernel panic when setting some arbitrary # of nfsv4 acl's. I updated the system to RHEL 6.6 and started getting this error (instead of the kernel panic):

[root@njtest171 bin]# nfs4_getfacl /mnt/ifs/storage/dl/users
Failed getxattr operation: Numerical result out of range

This is a strace of the issue:
getxattr("/mnt/ifs/storage/dl/users", "system.nfs4_acl", 0x7f59fccf7010, 8188) = -1 ERANGE (Numerical result out of range)
write(3, "Failed getxattr operation: Numer"..., 57Failed getxattr operation: Numerical result out of range

These errors led me to this bug report. I applied Albert's patch and we can now use large amounts of nfsv4 acl's. (which we need to use).

I would love to see this patch included in a future release and the upstream kernel.
Comment 28 J. Bruce Fields 2015-06-11 09:55:42 EDT
(In reply to Sachin Prabhu from comment #20)
> I have created bz 906056 to fix the security issue related to the short
> buffer lengths being passed so that it can be independently fixed and
> released quickly. You have been cc-ed on that issue and should receive the
> public updates related to that patch.
> 
> As for the case where the ACL length approaches a page cache size, I was
> attempting to reproduce the problem and faced a problem with the linux NFS
> server which returns NFS4ERR_RESOURCE when the acl size approaches a
> PAGESIZE. I am still trying to work around this problem and hopefully have a
> working reproducer which I can use to debug the problem. I was curious to
> know what NFS server are you using to reproduce this issue.

By the way, in case it helps reproduce: upstream linux nfsd as of 3.16 should accept ACLs up to the maximum rsize (negotiated by client and server, typically about a megabyte), and up to 204 entries (on x86_64).
Comment 31 J. Bruce Fields 2015-12-11 10:46:30 EST
The crashes all appear to have been fixed, so at this point the one remaining problem is a spurious ERANGE addressed by the patch referenced in comment 24.  As far as I can tell that patch never got any feedback upstream, and never went into upstream or RHEL.

So, the bug may still be there; I'll start by trying to reproduce upstream.
Comment 32 J. Bruce Fields 2015-12-14 17:23:25 EST
Updated patch and new test case posted upstream:

http://lkml.kernel.org/r/20151214222154.GB7342@fieldses.org
Comment 36 J. Bruce Fields 2017-01-20 17:28:07 EST
Just getting back to this bug now: my earlier reproducer still reproduces the bug upstream, though I had to request sec=sys on the mount.  (With sec=krb5p, the initial setxattr returns an -EINVAL, I'm not sure why--that may be a separate bug.)
Comment 37 J. Bruce Fields 2017-01-27 20:53:26 EST
The patch linked in comment 32 changes the getacl call so that instead of preallocating the pages required to hold the resulting ACL, it leaves it to the rpc layer to allocate pages at the time it receives the call.  The problem is that after the rpc layer receives the call, the receive xdr_buf is left in a confusing state: the page_len is set to the maximum we expect might be received, not to the amount that was actually received.  This can lead to NULL dereferences when page pointers expected to be found in the xdr_buf's pages array aren't there.  I can reproduce that bug now.

For some reason the xprt-level code that allocates pages avoids touching the fields of the xdr_buf;  I'll ask why that is, for now I'll keep to that pattern.  I think the buffer should be fixed up before passing to the krb5i/krb5p unwrapping, so I'm currently experimenting with a fix in net/sunrpc/clnt.c:call_decode().  But don't have that working yet.

(I'm also curious how the NFSv3 ACL code is handling this; it looks to me like it should have a similar problem.  I wonder if it actually works reliably over krb5i or krb5p.)
Comment 38 J. Bruce Fields 2017-02-03 17:17:00 EST
(In reply to J. Bruce Fields from comment #37)
> For some reason the xprt-level code that allocates pages avoids touching the
> fields of the xdr_buf;  I'll ask why that is, for now I'll keep to that
> pattern.  I think the buffer should be fixed up before passing to the
> krb5i/krb5p unwrapping, so I'm currently experimenting with a fix in
> net/sunrpc/clnt.c:call_decode().  But don't have that working yet.

On its own, that's not quite enough: the problems are that:

- if we allocate just enough pages to hold the received data, and if xdr_shrink_bufhead() then has to be called, it may need to shift some data into the tail, but the proc code isn't prepared to copy data out of the tail.  We can fix that by using read_bytes_from_xdr_buf, as the NFSv3 ACL code does--though the reason it has direct access to the xdr_buf is that the corresponding code there is in the xdr layer instead.  Probably we should do that in the v4 case too.
- that problem was compounded by the fact that the estimate of the reply size was off by one, requiring xdr_shrink_bufhead() to be called on every getacl.  That's a one-line patch that I can send upstream now

I suspect the v3 ACL code may also have a bug in the krb5i/p case in the (possibly very unlikely) case that xdr_shrink_bufhead() has to be called.

Anyway, this is all very doable upstream.  And there shouldn't be major backporting issues.
Comment 41 J. Bruce Fields 2017-02-10 09:12:57 EST
(In reply to J. Bruce Fields from comment #38)
> the v3 ACL code may also have a bug in the krb5i/p case in the
> (possibly very unlikely) case that xdr_shrink_bufhead() has to be called.

Oh, but on another look, the v3 code doesn't actually bother with enter_pages or xdr_shrink_bufhead--it just copies data directly out of the xdr_buf.

Which makes a lot of sense, there's no point trying to do all those zero-copy tricks here.  So, I think we should do something like v3 does and just allocate a buffer in the xdr code and copy everything out into that.  That should be simpler and fix this problem.
Comment 43 Yongcheng Yang 2017-03-15 04:39:05 EDT
(In reply to Sachin Prabhu from comment #10)

> On the first call, it fails in the following manner
> # ./mytest 
> Size = 80
> Segmentation fault
> 
> Any subsequent call will result in a ERANGE error being returned as the
> getxattr command attempts to service the request from cache.
> The machine crashes when we attempt to unmount the share.
> 
> I can reproduce this on RHEL 6 and on F18(3.6.10-4.fc18.x86_64). 
> 

Hi Sachin, the reproducer seems has been obsoleted. So I just did a bit modification (others it returns error "getxattr: Numerical result out of range"). Seems Comment #19 has already given some update, sorry for don't understand it much.

Please help to have a look at the following logs. As there's no crash occurs now (maybe it's the reproducer need be updated), is this issue already fixed? If not, then what are the "Expected results" and "Actual results" now.

# cat mytest.c 
#include <sys/types.h>
#include <attr/xattr.h>
#include <stdio.h>

int main(void)
{
    char val[10];
    ssize_t size;
    ssize_t bufsize;
    int i;

    /* An empty buffer of size zero to return the current size */
    bufsize = getxattr ("/mnt/mnt_test/testfile", "system.nfs4_acl", (void *)(&val), 0);

    size = getxattr ("/mnt/mnt_test/testfile", "system.nfs4_acl", (void *)(&val), bufsize);
    if (size > 0) {
        printf("Size = %i\n", (int) size);
        for (i = 0; i < size; i++)
            printf("%c", val[i]);
    }
    else
        perror("getxattr");
}
# gcc -o mytest mytest.c 
# mount -t nfs4 $HOSTNAME:/export_test/ /mnt/mnt_test/
# date > /mnt/mnt_test/testfile
# ./mytest 
Size = 80
Segmentation fault (core dumped)
# ./mytest 
Size = 80
Segmentation fault (core dumped)
# ./mytest 
Size = 80
Segmentation fault (core dumped)
# umount /mnt/mnt_test/
# 
# dmesg | grep "general protection"
mytest[10238] general protection ip:400656 sp:7ffe9f9c5098 error:0 in mytest[400000+1000]
mytest[10242] general protection ip:400656 sp:7ffdae9eaff8 error:0 in mytest[400000+1000]
mytest[10244] general protection ip:400656 sp:7ffc262b79b8 error:0 in mytest[400000+1000]
mytest[10279] general protection ip:400656 sp:7ffeded441e8 error:0 in mytest[400000+1000]
mytest[10283] general protection ip:400656 sp:7ffda2f91ff8 error:0 in mytest[400000+1000]
mytest[10286] general protection ip:400656 sp:7ffee80e6668 error:0 in mytest[400000+1000]
# 
# rpm -q kernel
kernel-2.6.32-696.el6.x86_64
Comment 45 Mark Thacker 2017-05-04 21:34:24 EDT
pm_ack provided. though I am a bit surprised that this wasn't auto-granted for a bug fix.
Comment 51 Mark Thacker 2017-05-05 07:25:39 EDT
Added PMapproved
Comment 52 Phillip Lougher 2017-05-06 00:19:32 EDT
Patch(es) committed on kernel repository and kernel is undergoing testing
Comment 54 Phillip Lougher 2017-05-08 12:08:51 EDT
Patch(es) available on kernel-2.6.32-703.el6
Comment 61 Yongcheng Yang 2017-09-27 02:35:01 EDT
Moving to VERIFIED according to test logs of comment #60.

Will include this case as regression test in the future.
Comment 64 errata-xmlrpc 2018-06-19 00:50:10 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:1854

Note You need to log in before you can comment on or make changes to this bug.