Description of problem:
When exporting an ext4 filesystem via NFS, then mounting it on an RH4 or RH5 system, any compile on the remote system fails badly with a "final close failed: Input/output error".

Version-Release number of selected component (if applicable):
This seems to have only started happening with 2.6.29.5-191.fc11. Doesn't happen if exporting an ext3 filesystem.

How reproducible:
Every time

Steps to Reproduce:
1. Get a nontrivial C program (I tested with the File System Exerciser I was playing with, but which program is unimportant)
2. gcc the program, i.e. gcc fsx-linux.c

Actual results:
[rh5-64bit]colin: gcc fsx-linux.c
���: final close failed: Input/output error
collect2: ld returned 1 exit status

[rh4-32bit]colin: gcc fsx-linux.c
��X: final close failed: Input/output error
collect2: ld returned 1 exit status

Strace gives:
[pid 21682] close(9) = -1 EIO (Input/output error)

Expected results:
No error, which is what I get by running this on the local machine.

Additional info:
Seems to work when mounted on another F11 machine.
A lot of ext4 patches went into 2.6.29.5: http://git.kernel.org/?p=linux/kernel/git/stable/stable-queue.git;a=tree;f=releases/2.6.29.5;h=68b4262aba4a2b189cc8cf7a39d7fc5d333ede61;hb=HEAD
Am I reading it correctly that it only fails if the client is RHEL, and F11 clients are fine? Ok, I'll look into this, thanks. -Eric
You are correct. It fails consistently when the client is a 32 or 64 bit RHEL 4.8 or 5.3 system. F11 clients seem fine. Perhaps interestingly, a compile from a Solaris 8 client to the F11 NFS server is actually fine. I've just discovered that this isn't related to ext4: it also fails from an ext3 exported file system using 2.6.29.5-191.fc11, again using a RHEL 4.8/5.3 client, but is fine from F11 clients. Sorry for that little piece of incorrect info. I assumed it was ext4 related, but our other F11 machines hadn't yet been rebooted into this newer kernel.
Fails on xfs too. Punting to Jeff! :)
This looks like a server-side problem... Looks like the server is responding with success to the write, but with a count of 0. The client then figures this to mean that the write was short and returns -EIO on the close(). I'll look over the server side changes in this area. Colin, what was the last known "good" kernel?
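For anyone who wants to check a mount without running a full compile, here is a minimal sketch of a probe that writes a file and reports whether the failure shows up at write() or at close(). It assumes the usual NFS client behaviour in which write() succeeds into the page cache and dirty data is flushed at close(), so a server-side write problem surfaces as an error from close(), matching the strace in the original report. The function name write_and_close and the 64 KiB payload are illustrative choices, not something taken from this bug.

```python
import errno
import os


def write_and_close(path, payload):
    """Write payload to path.

    Returns None if every call succeeds, otherwise a tuple
    (stage, errno_name) naming the first call that failed,
    where stage is "write" or "close".
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    stage = "write"
    try:
        view = memoryview(payload)
        while len(view) > 0:
            n = os.write(fd, view)
            if n == 0:
                # Guard against looping forever on a zero-length write.
                raise OSError(errno.EIO, "zero-length write")
            view = view[n:]
        stage = "close"
        # On NFS, buffered data is typically flushed here, so this is
        # where a failed or short server-side write gets reported.
        os.close(fd)
    except OSError as exc:
        if stage == "write":
            # Release the descriptor; ignore a secondary error from close.
            try:
                os.close(fd)
            except OSError:
                pass
        return stage, errno.errorcode.get(exc.errno, str(exc.errno))
    return None
```

Calling write_and_close("/path/on/nfs/mount/probe.out", b"x" * 65536) from an affected client would be expected to return ("close", "EIO") if the problem is the one described above, and None on a local filesystem; the path is a placeholder to be replaced with a file on the mount under test.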
The kernel 2.6.29.4-167.fc11 is fine, which I think was the last released one.
This might be a duplicate of bug 508174. Keeping an eye on koji now and when a -207 or later kernel pops out, I'll plan to test it out.
Looks like there's a -209 kernel in koji now. Colin, when you get a chance could you test that kernel and let me know if it resolves the problem? If so, then I'll close this as a duplicate of bug 508174. http://koji.fedoraproject.org/koji/buildinfo?buildID=112420
The 2.6.29.6-209.rc1.fc11 kernel does indeed appear to fix this issue. So I guess it does look like a dup. Thanks
*** This bug has been marked as a duplicate of bug 508174 ***