Bug 155771

Summary: Random app crashes caused by "file not found" errors for already-open files/sockets
Product: [Fedora] Fedora
Reporter: Robin Green <greenrd>
Component: kernel
Assignee: Dave Jones <davej>
Status: CLOSED WORKSFORME
QA Contact: Brian Brock <bbrock>
Severity: high
Priority: medium
Version: 4
CC: charlesc-fedoracore-bugzilla, pfrields
Target Milestone: ---
Target Release: ---
Hardware: i386
OS: Linux
Doc Type: Bug Fix
Last Closed: 2005-07-15 21:29:26 UTC

Description Robin Green 2005-04-22 21:29:38 UTC
Description of problem:
Some of my applications are reporting random "file not found" or "no such file
or directory" errors, exclusively with *opened* files or sockets, since either
my latest kernel upgrade or a recent one (I don't remember exactly when it
started happening). I am not using NFS, but see also bug 144556.

A typical error is an application, such as psi or pan, crashing due to an X I/O
error, with "file not found" sometimes given as the reason.

Another symptom is this non-fatal stack trace from a Java program, Azureus:
DEBUG::Fri Apr 22 19:51:02 BST 2005
  java.io.IOException: No such file or directory
        at sun.nio.ch.IOUtil.drain(Native Method)
        at sun.nio.ch.PollSelectorImpl.doSelect(PollSelectorImpl.java:66)
        at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
        at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
        at com.aelitis.azureus.core.networkmanager.VirtualChannelSelector.select(VirtualChannelSelector.java:239)
        at com.aelitis.azureus.core.networkmanager.WriteController.writeSelectorLoop(WriteController.java:74)
        at com.aelitis.azureus.core.networkmanager.WriteController.access$0(WriteController.java:72)
        at com.aelitis.azureus.core.networkmanager.WriteController$1.runSupport(WriteController.java:52)
        at org.gudy.azureus2.core3.util.AEThread.run(AEThread.java:45)

I believe both of these errors are symptoms of the same problem, which is
that the kernel (or perhaps glibc) occasionally reports a "file not found"
error (ENOENT) for operations on already-open file descriptors.

These errors happen on the order of once or twice in an hour, or less, usually
when I'm not at the machine.
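
For anyone trying to catch this in the act, a probe along the following lines
might help. It is only a sketch, not a known reproducer: it assumes the fault
would surface as errno ENOENT from ordinary calls on a descriptor that is
already open, and the function name probe_open_fd is made up for illustration.
None of the calls take a path, so a healthy kernel should never answer them
with ENOENT.

```python
import errno
import os
import tempfile
import time


def probe_open_fd(fd, iterations, delay=0.0):
    """Repeatedly operate on an already-open descriptor and record ENOENT hits.

    Returns a list of (iteration, unix_timestamp) tuples, one per ENOENT seen.
    Any other OSError is re-raised, since it is not the symptom described here.
    """
    hits = []
    for i in range(iterations):
        try:
            os.fstat(fd)                   # metadata lookup by descriptor only
            os.lseek(fd, 0, os.SEEK_SET)   # rewind so the read below has data
            os.read(fd, 1)
        except OSError as exc:
            if exc.errno == errno.ENOENT:
                hits.append((i, time.time()))
            else:
                raise
        if delay:
            time.sleep(delay)
    return hits


if __name__ == "__main__":
    # An anonymous temporary file stands in for "an already-open file".
    with tempfile.TemporaryFile() as tmp:
        tmp.write(b"x")
        tmp.flush()
        hits = probe_open_fd(tmp.fileno(), iterations=1000)
        print("ENOENT on open descriptor:", len(hits), "time(s)")
```

Since the errors show up only once or twice an hour, a long run with a nonzero
delay is probably needed; an empty result proves nothing about the bug.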

Version-Release number of selected component (if applicable):
kernel-2.6.11-1.1253_FC4

How reproducible:
Not consistently reproducible

Additional information:
glibc-2.3.4-21

Comment 1 charlesc-fedoracore-bugzilla 2005-04-29 00:43:51 UTC
I believe this one has been around for a while; a customer of mine is seeing
processes dying on Fedora Core 2 with at least kernel-2.6.9-1.6_FC2
and kernel-2.6.10-1.771_FC2.

The system also seems to fail to open files even when plenty of fds are
available.
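
One way to back up the "plenty of fds" observation is to log descriptor usage
next to the errno of each failed open. The sketch below assumes a Linux /proc
filesystem; the helper names fd_headroom and classify_open_error are
hypothetical. Descriptor exhaustion normally shows up as EMFILE or ENFILE, so
an ENOENT with headroom to spare would point elsewhere.

```python
import errno
import os
import resource


def fd_headroom():
    """Return (open_fds, soft_limit) for the current process.

    Assumes Linux: /proc/self/fd holds one entry per open descriptor. The
    count includes the descriptor briefly used to list that directory.
    """
    open_fds = len(os.listdir("/proc/self/fd"))
    soft_limit, _hard_limit = resource.getrlimit(resource.RLIMIT_NOFILE)
    return open_fds, soft_limit


def classify_open_error(exc):
    """Roughly label an OSError raised by a failed open()."""
    if exc.errno in (errno.EMFILE, errno.ENFILE):
        return "descriptor exhaustion"
    if exc.errno == errno.ENOENT:
        return "no such file or directory"
    return "other"


if __name__ == "__main__":
    used, limit = fd_headroom()
    print("open descriptors:", used, "soft limit:", limit)
```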

Comment 2 Dave Jones 2005-06-27 23:23:38 UTC
Mass update of -test bugs to update version to fc4.
(Please retest on final release, and report results if you have not already done
so).

Thanks.

Comment 3 Dave Jones 2005-07-15 21:16:39 UTC
[This comment has been added as a mass update for all FC4 kernel bugs.
 If you have migrated this bug from an FC3 bug today, ignore this comment.]

Please retest your problem with today's 2.6.12-1.1398_FC4 update.

If your problem involved being unable to boot, or some hardware not being
detected correctly, please make sure your /etc/modprobe.conf is correct *BEFORE*
installing any kernel updates.
If in doubt, you can recreate this file using:

mv /etc/sysconfig/hwconf /etc/sysconfig/hwconf.bak
mv /etc/modprobe.conf /etc/modprobe.conf.bak
kudzu


Thank you.


Comment 4 Robin Green 2005-07-15 21:29:26 UTC
I have not seen this bug for weeks, so closing.