Bug 812364

Summary: rpc.mountd sends strings that are too long over RPC channels
Product: Red Hat Enterprise Linux 6
Component: nfs-utils
Version: 6.2
Status: CLOSED DUPLICATE
Severity: high
Priority: high
Reporter: Stefan Walter <walteste>
Assignee: Steve Dickson <steved>
QA Contact: yanfu,wang <yanwang>
CC: baumanmo, ndevos, rrajaram, yanwang
Target Milestone: rc
Target Release: ---
Hardware: All
OS: Linux
Doc Type: Bug Fix
Type: Bug
Last Closed: 2012-04-25 14:48:17 UTC
Bug Blocks: 727267
Attachments: patch for utils/mountd/cache.c (flags: none)

Description Stefan Walter 2012-04-13 13:57:41 UTC
Created attachment 577350
patch for utils/mountd/cache.c

Description of problem:

Our file server cluster has many exports, each with its own netgroups. Some clients cannot mount some of the shares, and the server logs the following error messages:

Apr 12 12:04:22 srv rpc.mountd[12143]: qword_printhex: fwrite failed: errno 22 (Invalid argument)
Apr 12 12:04:22 srv rpc.mountd[12143]: qword_eol: fflush failed: errno 22 (Invalid argument)
Apr 12 12:04:22 srv rpc.mountd[12143]: Cannot export /export/..., possibly unsupported filesystem or fsid= required

We have tracked the problem down to the following commit, which is missing
from the nfs-utils package in RHEL 6.2:

  http://git.linux-nfs.org/?p=steved/nfs-utils.git;a=commitdiff;h=5604b35a61e22930873ffc4e9971002f578e7978

This patch alone does not solve the problem, though. The two functions
cache_export() and cache_export_ent() in utils/mountd/cache.c must also
be modified to use a larger buffer via setvbuf(). A patch is attached to
this report.
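
For illustration, the kind of change described above can be sketched as follows. This is neither the attached patch nor the upstream commit: open_channel(), close_channel() and the 32768-byte CHANNEL_BUF_SIZE are hypothetical names and values chosen for the example. The buffer is passed to setvbuf() explicitly, because an implementation may ignore the size argument when the buffer pointer is NULL.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical size: meant to hold one complete record. */
#define CHANNEL_BUF_SIZE 32768

/*
 * Open `path` for writing and attach a fully buffered stdio buffer of
 * CHANNEL_BUF_SIZE bytes, so that stdio does not flush a record before
 * the caller asks for it (as long as the record fits in the buffer).
 * On success, *bufp receives the buffer; release it with close_channel().
 */
FILE *open_channel(const char *path, char **bufp)
{
    assert(path != NULL && bufp != NULL);

    FILE *f = fopen(path, "w");
    if (f == NULL)
        return NULL;

    char *buf = malloc(CHANNEL_BUF_SIZE);
    /* setvbuf() must be called before any other operation on the stream. */
    if (buf == NULL || setvbuf(f, buf, _IOFBF, CHANNEL_BUF_SIZE) != 0) {
        fclose(f);
        free(buf);
        return NULL;
    }

    *bufp = buf;
    return f;
}

/* Close the stream first, then free the buffer it was using. */
void close_channel(FILE *f, char *buf)
{
    fclose(f);
    free(buf);
}
```

With such a stream, a record shorter than the buffer should reach the file in a single flush when the caller invokes fflush(), rather than in pieces chosen by stdio.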

Version-Release number of selected component (if applicable):

nfs-utils-1.2.3-15.el6.x86_64

The additional patch is not present upstream and should probably be added there too.

Additional info:

The trigger at our site is similar to that of a bug I reported against RHEL4 long ago:

   https://lkml.org/lkml/2007/7/30/116

Basically, rpc.mountd builds a long string of netgroups, and stdio flushes
the first 1024 bytes of the still-incomplete string:

12295 open("/proc/net/rpc/auth.unix.ip/channel", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 12
12295 fstat(12, {st_mode=S_IFREG|0600, st_size=0, ...}) = 0
12295 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7ffabf567000
12295 write(12, "nfsd x.x.x.x 1334318187 *,@aadiya,@akourtis,@akreutz,@aligma,@alonso,@amamin,@amarfurt,@amoga,@antoinek,@asinghan,@azachary,@azala,@baslergi,@baumanan,@baumanmo,@bcagri,@blukas,@brams,@briangol,@"..., 1024) = -1 EINVAL (Invalid argument)
12295 sendto(11, "<28>Apr 13 13:26:27 rpc.mountd[12295]: qword_print: fwrite failed: errno 22 (Invalid argument)", 94, MSG_NOSIGNAL, NULL, 0) = 94
12295 write(12, "\n", 1)                = -1 EINVAL (Invalid argument)
12295 sendto(11, "<28>Apr 13 13:26:27 rpc.mountd[12295]: qword_eol: fflush failed: errno 22 (Invalid argument)", 92, MSG_NOSIGNAL, NULL, 0) = 92
12295 close(12)                         = 0
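
The premature flush seen in the trace can be reproduced outside rpc.mountd with plain stdio. The following is a minimal sketch under stated assumptions: bytes_flushed_early() is a hypothetical helper, it writes to an ordinary file instead of the RPC channel, and it only measures how many bytes stdio pushed out before the record was complete.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Write `record_len` bytes to `path` one character at a time through a
 * fully buffered stream with a `buf_size`-byte buffer, without flushing.
 * Return the number of bytes that had already reached the file at that
 * point, or -1 on error. The file is removed before returning.
 */
long bytes_flushed_early(const char *path, size_t buf_size, size_t record_len)
{
    assert(path != NULL && buf_size > 0);

    FILE *out = fopen(path, "w");
    if (out == NULL)
        return -1;

    char *buf = malloc(buf_size);
    if (buf == NULL || setvbuf(out, buf, _IOFBF, buf_size) != 0) {
        fclose(out);
        free(buf);
        return -1;
    }

    /* Build the record piecewise; no newline, no fflush() yet. */
    for (size_t i = 0; i < record_len; i++)
        fputc('x', out);

    /* Look at the file through a second stream to see what was written. */
    long early = -1;
    FILE *in = fopen(path, "r");
    if (in != NULL) {
        if (fseek(in, 0, SEEK_END) == 0)
            early = ftell(in);
        fclose(in);
    }

    fclose(out);
    free(buf);
    remove(path);
    return early;
}
```

Calling it with a 1024-byte buffer and a 3000-byte record should return a positive value smaller than 3000, meaning part of the record was written before it was finished. With a buffer larger than the record, the result should be 0, which is the behaviour a larger setvbuf() buffer aims for.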

Comment 6 Steve Dickson 2012-04-25 14:46:13 UTC
The upstream patch:

commit 5604b35a61e22930873ffc4e9971002f578e7978
Author: Sean Finney <sean.finney>
Date:   Tue Apr 19 11:04:35 2011 -0400

    nfs-utils: Increase the stdio file buffer size for procfs files

Comment 7 Steve Dickson 2012-04-25 14:48:17 UTC

*** This bug has been marked as a duplicate of bug 736741 ***

Comment 8 Moritz Baumann 2012-04-25 15:12:48 UTC
Unfortunately, we cannot access the other bug or the KB article.
Would you mind either adding Stefan Walter and me to bug 736741, or letting us know briefly whether this is fixed or what workaround exists?

Comment 9 Moritz Baumann 2012-04-25 15:52:59 UTC
Hi,
I wanted to ask whether this also addresses the other bug we found (the patch attached to this ticket).