Bug 697659
| Summary: | NFS4 problem using open() on exported urandom device | ||||||
|---|---|---|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 6 | Reporter: | John Muir <jrmuir> | ||||
| Component: | kernel | Assignee: | J. Bruce Fields <bfields> | ||||
| Status: | CLOSED ERRATA | QA Contact: | Eryu Guan <eguan> | ||||
| Severity: | medium | Docs Contact: | |||||
| Priority: | unspecified | ||||||
| Version: | 6.0 | CC: | bfields, eguan, syeghiay, yanwang | ||||
| Target Milestone: | rc | ||||||
| Target Release: | --- | ||||||
| Hardware: | x86_64 | ||||||
| OS: | Linux | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | kernel-2.6.32-196.el6 | Doc Type: | Bug Fix | ||||
| Doc Text: | Story Points: | --- | |||||
| Clone Of: | Environment: | ||||||
| Last Closed: | 2011-12-06 13:10:36 UTC | Type: | --- | ||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Embargoed: | |||||||
| Bug Depends On: | |||||||
| Bug Blocks: | 743047 | ||||||
| Attachments: |
Since RHEL 6.1 External Beta has begun, and this bug remains unresolved, it has been rejected as it is not proposed as exception or blocker. Red Hat invites you to ask your support representative to propose this request, if appropriate and relevant, in the next release of Red Hat Enterprise Linux.

Well, the reason the v4 mount fails and the v3 mount does not is that the v3 mount recognizes /dev/urandom as a special character device and the v4 mount does not. The v4 server treats /dev/urandom as a normal file (S_ISREG(mode) == TRUE), which causes nfsd_mode_check() to fail with nfserr_inval. Still looking into why....

The server is correct to fail an on-the-wire OPEN of something that isn't a regular file. However, the error it's returning (ERR_INVAL) isn't correct. Trond suggests ERR_SYMLINK instead. The spec's a little fuzzy here, but that does appear to work.

The patch posted here, when applied to the server, appears to fix the problem:

http://marc.info/?l=linux-nfs&m=131344354314919&w=2

(In reply to comment #4)
> The server is correct to fail an on-the-wire OPEN of something that isn't a
> regular file. However, the error it's returning (ERR_INVAL) isn't correct.
> Trond suggests ERR_SYMLINK instead. The spec's a little fuzzy here, but that
> does appear to work.
>
> The patch posted here, when applied to the server, appears to fix the problem:
>
> http://marc.info/?l=linux-nfs&m=131344354314919&w=2

This patch does indeed fix the problem...

Revised patch committed to git://linux-nfs.org/~bfields/linux.git for-3.2 as aadab6c6f4da38d639394de740602f146c88da0c

Patch(es) available on kernel-2.6.32-196.el6

Reproduced on the -191 kernel (server side).

On the client I got an "Invalid argument" error when opening a device special file exported by the server.
This requires echo 3 > /proc/sys/vm/drop_caches on both the server and the client side.

The devopen command is a simple C program that opens its first argument.

[root@fstest bz697659]# mount | grep nay
ibm-ls22-01.rhts.eng.nay.redhat.com:/mnt/nfsexport on /mnt/testarea/nfs type nfs (rw,mand,vers=4,proto=tcp,addr=10.66.86.24,clientaddr=10.66.13.199)
[root@fstest bz697659]# ./devopen /mnt/testarea/nfs/zero
open failed: Invalid argument
[root@fstest bz697659]# ls -l /mnt/testarea/nfs/zero
crw-r--r--. 1 root root 1, 5 Sep 16 14:49 /mnt/testarea/nfs/zero
[root@fstest bz697659]#

On the -196 kernel (server side), the open succeeded as expected, if I understand the comments in http://marc.info/?l=linux-nfs&m=131344354314919&w=2 correctly:

"NFS4ERR_SYMLINK should _always_ trigger the correct behaviour on a client: a fresh lookup of the component."

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2011-1530.html
Created attachment 493012
Simple C app to open the urandom device (outside of the jail) and read a few bytes, plus NFS debug output from the server having trouble

Description of problem:

This may extend to more than just the urandom device, and it appears to me to be something going on with the NFS clients. I'm using three servers (two application servers acting as NFS clients, one NFS server) running php-fpm with jailed instances that use an NFS-hosted /dev/urandom in the jail. Post-reboot, the first NFS client up is able to use the NFS-exported /dev/urandom device file within PHP. The second client cannot. Changes to the device file restore access for the second client. Once the issue occurs, it can be reproduced from outside of the jail with a simple C app that opens and reads from the jailed device file (using the full path, not jailed).

Version-Release number of selected component (if applicable):

nfs clients kernel 2.6.32-71.24.1.el6.x86_64
nfs server kernel 2.6.32-71.14.1.el6.x86_64
nfs-utils nfs-utils-1.2.2-7.el6.x86_64 (server/client match)

How reproducible:

Very

Steps to Reproduce (sorry, this isn't the most easily reproducible environment):

1. Get three servers up, and set up two of them to mount an NFS export from the third.
2. On the web servers, install PHP 5.3.5 or greater and set up a php-fpm pool as a jailed pool inside the NFS mount. Configure the chroot jail with essential libraries, timezone data files, and OpenLDAP configuration for an external ldaps server.
3. Configure apache to use mod_fastcgi's external server feature to point to the php-fpm pool. Provide suexec user/group entries for the site as set up in the php-fpm pool.
4. Start the apache and php-fpm services on both web servers. Perform an ldaps connect to a remote ldaps server with php-ldap.
5. Additionally, attempts to access the device file from outside of the jail with a simple C app that reads bytes from the urandom file will now fail.

I'm working on an easier-to-reproduce test case, but currently I don't have one available.
Actual results:

One web server will connect over ldaps to the ldaps server; the other will fail.

Expected results:

Both web servers (NFS clients) should connect to the ldaps server without issue, but one cannot because it is unable to read any random numbers from the urandom device file.

Additional info:

straces performed show problems opening the jailed /dev/urandom device:

open("/dev/urandom", O_RDONLY|O_NOCTTY|O_NONBLOCK) = -1 EINVAL (Invalid argument)

Changes to the device file, along with restarting php-fpm, restore access, e.g. chmod 644 /path/to/jail/dev/urandom followed by service php-fpm restart.

A simple C app (attached) that reads bytes from the jailed /dev/urandom also fails with the -1 EINVAL error:

open("/path/to/jail/dev/urandom", O_RDONLY|O_NOCTTY|O_NONBLOCK) = -1 EINVAL (Invalid argument)

and the simple C app works great immediately after a chmod of the device file.

Also attached is NFS debug output from the client that is not able to open the device file. This is only from a run of the C app (filelocktester/a.out) which had a failure as noted above (EINVAL).

Since this one isn't easy to reproduce without a fairly significant amount of work on the setup, I'd be happy to hunt down more details or information if someone can point me in the right direction.
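For readers without access to the attachment, the kind of test program described above might look roughly like the following C sketch. It is not the attached file: the function name read_device_bytes() is hypothetical, and only the open() flags are taken from the strace output in this report.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: open `path` with the same flags seen in the
 * strace output and read up to `len` bytes into `buf`.
 * Returns the number of bytes read, or a negative errno value.
 */
ssize_t read_device_bytes(const char *path, unsigned char *buf, size_t len)
{
    int fd = open(path, O_RDONLY | O_NOCTTY | O_NONBLOCK);
    if (fd < 0)
        return -errno;  /* the affected client got EINVAL here */

    size_t total = 0;
    while (total < len) {
        ssize_t n = read(fd, buf + total, len - total);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal, retry */
            int saved = errno;  /* keep errno across close() */
            close(fd);
            return -saved;
        }
        if (n == 0)
            break;              /* end of file */
        total += (size_t)n;
    }

    close(fd);
    return (ssize_t)total;
}
```

Called from a small main() with the full path of the jailed device file (for example /path/to/jail/dev/urandom) and a buffer of a few bytes, a return value of -EINVAL would correspond to the failure reported here, and a positive count to the behaviour seen right after the chmod.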