
Bug 697659

Summary: NFS4 problem using open() on exported urandom device
Product: Red Hat Enterprise Linux 6 Reporter: John Muir <jrmuir>
Component: kernel Assignee: J. Bruce Fields <bfields>
Status: CLOSED ERRATA QA Contact: Eryu Guan <eguan>
Severity: medium Docs Contact:
Priority: unspecified    
Version: 6.0 CC: bfields, eguan, syeghiay, yanwang
Target Milestone: rc   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: kernel-2.6.32-196.el6 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2011-12-06 13:10:36 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 743047    
Attachments:
Description Flags
Simple c app to open urandom device (outside of jail) and read a few bytes and nfs debug output from server having trouble none

Description John Muir 2011-04-18 21:03:41 UTC
Created attachment 493012 [details]
Simple c app to open urandom device (outside of jail) and read a few bytes and nfs debug output from server having trouble

Description of problem:  

This may extend to more than just the urandom device, and it appears to me to be a problem on the NFS clients.  I'm using three servers (two application servers acting as NFS clients, and one NFS server) running php-fpm with jailed instances that use an NFS-hosted /dev/urandom in the jail.  Post-reboot, the first NFS client up is able to use the NFS-exported /dev/urandom device file within php.  The second client cannot.  Changes to the device file (for example, a chmod) restore access on the second client.  Once the issue occurs, it can be reproduced from outside of the jail with a simple C app that opens and reads from the jailed device file (using the full path, not jailed).


Version-Release number of selected component (if applicable):
nfs clients kernel 2.6.32-71.24.1.el6.x86_64
nfs server kernel 2.6.32-71.14.1.el6.x86_64
nfs-utils nfs-utils-1.2.2-7.el6.x86_64 (server/client match)

How reproducible: Very


Steps to Reproduce (sorry, this isn't the most easily reproducible environment):
1. Get three servers up, set up two to mount an nfs export from the third.
2. On web servers, install php 5.3.5 or greater, set up php-fpm pool as a jailed pool inside the nfs mount.  Configure chroot jail with essential libraries, timezone data files, and openldap configuration for an external ldaps server.
3. Configure apache to use mod_fastcgi's external server feature to point to the php-fpm pool.  Provide suexec user/group entries for site as set up in the php-fpm pool.
4. Start apache and php-fpm services on both web servers.  Perform an ldaps connect to a remote ldaps server with php-ldap.
5. Additionally, attempts to access the device file from outside of the jail with a simple C app to read bytes from the urandom file will now fail.

I'm working on an easier to reproduce test case, but currently I don't have one available.
  
Actual results:

One web server will connect over ldaps to the ldaps server, the other will fail.

Expected results:

Both web servers (nfs clients) should connect to the ldaps server without issue.  Currently one of them cannot, because it is unable to read any random numbers from the urandom device file.

Additional info:

straces performed show problems opening the jailed /dev/urandom device:
open("/dev/urandom", O_RDONLY|O_NOCTTY|O_NONBLOCK) = -1 EINVAL (Invalid argument)

Changes to the device file, followed by a php-fpm restart, restore access.  For example: chmod 644 /path/to/jail/dev/urandom, then service php-fpm restart.

A simple C app (attached) that reads bytes from the jailed /dev/urandom also fails with the -1 EINVAL error:
open("/path/to/jail/dev/urandom", O_RDONLY|O_NOCTTY|O_NONBLOCK) = -1 EINVAL (Invalid argument)

and the simple c app works great immediately after a chmod of the device file.

Also attached is nfs debug output from the client not able to open the device file.  This is only from a run of the c app (filelocktester/a.out) which had a failure as noted above (EINVAL).
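
For readers without access to the attachment, the following is a minimal sketch of what such a test might look like. It is not the attached program: the function name read_device_bytes is made up for illustration, and the only details taken from this report are the open() flags shown in the strace output above.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper (not the attached app): open `path` with the flags
 * seen in the strace output and read up to `len` bytes into `buf`.
 * Returns the number of bytes read, or a negative errno value
 * (for example -EINVAL) if open() or read() fails.
 */
long read_device_bytes(const char *path, unsigned char *buf, size_t len)
{
    int fd = open(path, O_RDONLY | O_NOCTTY | O_NONBLOCK);
    if (fd < 0)
        return -errno;      /* the affected client would report EINVAL here */

    ssize_t n = read(fd, buf, len);
    int saved = errno;      /* close() may overwrite errno */
    close(fd);

    return n < 0 ? -saved : (long)n;
}
```

Calling this with the full path to the jailed device file (for example /path/to/jail/dev/urandom) and a small buffer should return -EINVAL on an affected client and a positive byte count once access is restored.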

Since this one isn't easy to reproduce without a fairly significant amount of work on the setup I'd be happy to hunt down more details or information if someone can point me in the right direction.

Comment 2 RHEL Program Management 2011-04-19 06:00:25 UTC
Since RHEL 6.1 External Beta has begun, and this bug remains
unresolved, it has been rejected as it is not proposed as
exception or blocker.

Red Hat invites you to ask your support representative to
propose this request, if appropriate and relevant, in the
next release of Red Hat Enterprise Linux.

Comment 3 Steve Dickson 2011-08-12 18:45:39 UTC
Well, the reason the v4 mount fails and the v3 does not is that
the v3 mount recognizes /dev/urandom as a special character
device and v4 does not. The v4 server treats /dev/urandom
as a normal file (S_ISREG(mode) == TRUE), which causes
nfsd_mode_check() to fail with nfserr_inval.

Still looking into why....

Comment 4 J. Bruce Fields 2011-08-15 21:33:12 UTC
The server is correct to fail an on-the-wire OPEN of something that isn't a regular file.  However, the error it's returning (ERR_INVAL) isn't correct.  Trond suggests ERR_SYMLINK instead.  The spec's a little fuzzy here, but that does appear to work.

The patch posted here, when applied to the server, appears to fix the problem:

http://marc.info/?l=linux-nfs&m=131344354314919&w=2

Comment 5 Steve Dickson 2011-08-16 11:35:40 UTC
(In reply to comment #4)
> The server is correct to fail an on-the-wire OPEN of something that isn't a
> regular file.  However, the error it's returning (ERR_INVAL) isn't correct. 
> Trond suggests ERR_SYMLINK instead.  The spec's a little fuzzy here, but that
> does appear to work.
> 
> The patch posted here, when applied to the server, appears to fix the problem:
> 
> http://marc.info/?l=linux-nfs&m=131344354314919&w=2

This patch does indeed fix the problem...

Comment 6 J. Bruce Fields 2011-08-19 19:44:10 UTC
Revised patch committed to git://linux-nfs.org/~bfields/linux.git for-3.2 as aadab6c6f4da38d639394de740602f146c88da0c

Comment 8 Aristeu Rozanski 2011-09-12 15:03:19 UTC
Patch(es) available on kernel-2.6.32-196.el6

Comment 11 Eryu Guan 2011-09-16 07:33:26 UTC
Reproduced on -191 kernel (server side)

On the client I got an Invalid argument error when opening a device special file exported by the server.
Reproducing this requires echo 3 > /proc/sys/vm/drop_caches on both the server and the client.
The devopen command is a simple C program that opens the path given as its first argument.

[root@fstest bz697659]# mount | grep nay
ibm-ls22-01.rhts.eng.nay.redhat.com:/mnt/nfsexport on /mnt/testarea/nfs type nfs (rw,mand,vers=4,proto=tcp,addr=10.66.86.24,clientaddr=10.66.13.199)
[root@fstest bz697659]# ./devopen /mnt/testarea/nfs/zero
open failed: Invalid argument
[root@fstest bz697659]# ls -l /mnt/testarea/nfs/zero
crw-r--r--. 1 root root 1, 5 Sep 16 14:49 /mnt/testarea/nfs/zero
[root@fstest bz697659]#

On the -196 kernel (server side), the open succeeds as expected, if I understand the comments in http://marc.info/?l=linux-nfs&m=131344354314919&w=2 correctly:

"NFS4ERR_SYMLINK should _always_ trigger the correct behaviour on a client: a fresh lookup of the component."

Comment 12 errata-xmlrpc 2011-12-06 13:10:36 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2011-1530.html