Description of problem:
I have a setup where a number of web servers serve PHP pages from an NFS server. After we upgraded the web server kernels through yum to 2.6.18-1.2200 and 1.2239, we have had serious trouble with our Isilon file servers (a FreeBSD-based file cluster system, see http://www.isilon.com/). This is _either_ related to the Isilon NFS server itself or to the file system size (unfortunately the smallest Isilon file system I have is slightly over 2 TB).

When running <? phpinfo(); ?> through Apache httpd 2.2.2 and PHP 5.1.6, it reports the normal PHP info when serving the files from a local file system or from a file system mounted from a Linux 2.4 server (~1.3 TB). On a 2 TB and on a 7.1 TB file system served by the Isilon file cluster (OneFS 4.5), the following error is shown:

<br />
<b>Warning</b>: Unknown: failed to open stream: Value too large for defined data type in <b>Unknown</b> on line <b>0</b><br />
<br />
<b>Warning</b>: Unknown: Failed opening '/home/www/info.php' for inclusion (include_path='.:/usr/share/pear') in <b>Unknown</b> on line <b>0</b><br />

I can provide tcpdumps of the NFS communication if necessary.

Version-Release number of selected component (if applicable):
kernel-smp-2.6.17-1.2174_FC5 --- This version works ok.
kernel-smp-2.6.18-1.2200.fc5 --- Does not work
kernel-2.6.18-1.2849.fc6 --- Does not work
kernel-smp-2.6.18-1.2239.fc5 --- Does not work

How reproducible:
Always

Steps to Reproduce:
1. Get an Isilon file cluster.
2. Mount a file system on an FC5 box using a 2.6.18 kernel.
3. Serve files from this file system through Apache and PHP.
4. Use <? phpinfo(); ?> for testing.
5. Observe the error.

Actual results:
<br />
<b>Warning</b>: Unknown: failed to open stream: Value too large for defined data type in <b>Unknown</b> on line <b>0</b><br />
<br />
<b>Warning</b>: Unknown: Failed opening '/home/www/info.php' for inclusion (include_path='.:/usr/share/pear') in <b>Unknown</b> on line <b>0</b><br />

Expected results:
Should serve the phpinfo(); report.

Additional info:
cat /proc/mounts says:
fileserver:/ifs/data/files/www /home/www nfs rw,vers=3,rsize=8192,wsize=8192,hard,intr,proto=tcp,timeo=600,retrans=2,sec=sys,addr=fileserver 0 0

Changing rsize and wsize to 32768 does not change the behaviour. Running "php /home/www/info.php" on the command line works fine, so it seems to be some sort of interaction between the way Apache and the embedded PHP access the file system.
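For readers following along: "Value too large for defined data type" is the message glibc prints for the errno value EOVERFLOW. According to the stat(2) manual page, that error is returned when a file's size, inode number, or block count cannot be represented in the caller's types, which can happen to a 32-bit program built without large-file support. Whether that is what happens in this report is not established here. The sketch below is only a diagnostic aid under that assumption: it lists which stat fields of a file would not fit in 32-bit fields. The names `LIMITS_32BIT` and `oversized_fields` are hypothetical, and the limits are assumed values for a 32-bit build without large-file support.

```python
import os

# Assumed limits for a 32-bit process built without large-file support:
# unsigned 32-bit ino_t, signed 32-bit off_t and blkcnt_t.
LIMITS_32BIT = {
    "st_ino": 2**32 - 1,
    "st_size": 2**31 - 1,
    "st_blocks": 2**31 - 1,
}


def oversized_fields(st):
    """Return the names of stat fields that exceed the assumed 32-bit limits."""
    return [name for name, limit in LIMITS_32BIT.items()
            if getattr(st, name) > limit]


def check_path(path):
    """Stat a path and report which of its fields would not fit in 32 bits."""
    return oversized_fields(os.stat(path))
```

Running `check_path("/home/www/info.php")` on a web server with the affected mount would return an empty list if all three fields fit, or the names of the fields that do not.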
Would it be possible to post a bzip2-compressed binary tethereal trace of this problem? Something similar to: tethereal -w /tmp/data.pcap host <server> ; bzip2 /tmp/data.pcap
Created attachment 143214 [details] Capture with working 2.6.17 kernel (last one working) This is a capture of a successful lookup.
Created attachment 143215 [details] Capture with failing 2.6.18 kernel (since 1.2200 none worked) This is a failing lookup
So what I did was try to quiet down the NFS traffic on one of our web server boxes (I only partially succeeded; our health checker interfered with my tests, so please ignore all accesses to healthcheck.php). Then I ran /sbin/service httpd start ; sleep 1 ; wget -O - http://srv017.dc1.thomson-webcast.net/tools/info.php (that is the web box). The Isilon cluster is at sto001.dc1.thomson-webcast.lan. The local IP is 10.64.0.64; the cluster IPs are 10.64.1.56 and 10.64.1.62 (that one varies from mount to mount). I also tried the latest kernel from rawhide in a UP version (vmlinuz-2.6.18-1.2849.fc6), with no change in behaviour. If you need more info, let me know; I can also provide you with a technical contact at Isilon (only through private mail).
Any news or progress on this bug report? Are any more traces needed?
This problem still persists with 2.6.18-1.2257. I'd really appreciate *any* progress on this.
First of all, I do apologize for not being more responsive on this... Looking at both traces, I'm only seeing half of the traffic: only the requests, none of the replies. That tells me either the server is not responding or is dropping the replies (which I doubt), or the server has more than one network interface and the replies are coming back from a different IP address (which is more likely the case). So would it be possible to re-run the traces to catch both sides of the traffic? Maybe something like "tethereal -w /tmp/data.pcap host 10.64.1.56 and host <2nd if>". Also, what nfs-utils version are you using? Finally, are there any errors or warnings in the /var/log/messages file on the client?
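The one-sided symptom described above (requests present, replies missing) can be checked before a trace is uploaded. The following is a minimal sketch, assuming the capture is a classic libpcap file (not pcapng) with Ethernet framing and untagged IPv4 packets. The function name `count_directions` is hypothetical and not part of tethereal or any other tool mentioned in this report.

```python
import struct
from collections import Counter


def count_directions(pcap_bytes, client_ip):
    """Count IPv4 packets sent by client_ip versus packets sent to it."""
    magic = pcap_bytes[:4]
    if magic == b"\xd4\xc3\xb2\xa1":
        endian = "<"   # file written on a little-endian host
    elif magic == b"\xa1\xb2\xc3\xd4":
        endian = ">"   # file written on a big-endian host
    else:
        raise ValueError("not a classic pcap file")

    counts = Counter()
    offset = 24  # size of the pcap global header
    while offset + 16 <= len(pcap_bytes):
        # Per-packet record header: ts_sec, ts_usec, incl_len, orig_len
        _sec, _usec, incl_len, _orig = struct.unpack_from(
            endian + "IIII", pcap_bytes, offset)
        offset += 16
        frame = pcap_bytes[offset:offset + incl_len]
        offset += incl_len
        # 14-byte Ethernet header; EtherType 0x0800 means IPv4.
        if len(frame) < 34 or frame[12:14] != b"\x08\x00":
            continue
        # IPv4 source and destination sit at bytes 12-15 and 16-19
        # of the IP header, i.e. 26-29 and 30-33 of the frame.
        src = ".".join(str(b) for b in frame[26:30])
        dst = ".".join(str(b) for b in frame[30:34])
        if src == client_ip:
            counts["requests"] += 1
        elif dst == client_ip:
            counts["replies"] += 1
    return counts
```

For example, `count_directions(open("/tmp/data.pcap", "rb").read(), "10.64.0.64")` would show a zero `replies` count for a capture that only contains the client's outgoing packets.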
OK, first thing, for everyone who finds this bug because of the same problem: this is a problem known to Isilon, and they have Knowledge Base article #1568, which describes the fix. Contact Isilon Support for this fix.
Created attachment 144207 [details] Failed NFS lookup (both sides) This is a failed file lookup through Apache. Kernel is 2.6.18-1.2257-fc5smp. (This time I have both sides; I captured the bonding interface, not just eth0.)
Created attachment 144208 [details] Working NFS lookup (both sides) This is a working file lookup through Apache. Kernel is 2.6.17-1.2174_FC5smp. (This time I have both sides; I captured the bonding interface, not just eth0.)
Created attachment 144209 [details] Working NFS lookup with workaround (both sides) This is a working NFS lookup using 2.6.18-1.2257-fc5smp. I have the proposed workaround from Isilon in place on the cluster.
Fedora apologizes that these issues have not been resolved yet. We're sorry it's taken so long for your bug to be properly triaged and acted on. We appreciate the time you took to report this issue and want to make sure no important bugs slip through the cracks. If you're currently running a version of Fedora Core between 1 and 6, please note that Fedora no longer maintains these releases. We strongly encourage you to upgrade to a current Fedora release. In order to refocus our efforts as a project we are flagging all of the open bugs for releases which are no longer maintained and closing them. http://fedoraproject.org/wiki/LifeCycle/EOL If this bug is still open against Fedora Core 1 through 6 thirty days from now, it will be closed 'WONTFIX'. If you can reproduce this bug in the latest Fedora version, please change it to the respective version. If you are unable to do this, please add a comment to this bug requesting the change. Thanks for your help, and we apologize again that we haven't handled these issues to this point. The process we are following is outlined here: http://fedoraproject.org/wiki/BugZappers/F9CleanUp We will be following the process here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping to ensure this doesn't happen again. And if you'd like to join the bug triage team to help make things better, check out http://fedoraproject.org/wiki/BugZappers
This bug is open for a Fedora version that is no longer maintained and will not be fixed by Fedora. Therefore we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora, please feel free to reopen this bug against that version. Thank you for reporting this bug; we are sorry it could not be fixed.