The listxattr syscall can corrupt user space under certain circumstances. The problem appears to be a signed/unsigned conversion error during size promotion: return_EIO returns an int, but its result is used as a ssize_t and compared against 0. The range check therefore fails, and copy_to_user copies far too much. Running "fsfuzz iso9660" reproduces this behavior easily.

The problem here is bad_inode_ops and how they're set up. isofs creates a bad inode, which is fine, but then any op called on that inode is supposed to return -EIO via this method:

    static int return_EIO(void)
    {
            return -EIO;
    }

    #define EIO_ERROR ((void *) (return_EIO))

    static struct inode_operations bad_inode_ops =
    {
            ...
            .listxattr = EIO_ERROR,
            ...
    };

but the listxattr op is declared as

    ssize_t (*listxattr) (struct dentry *, char *, size_t);

and ssize_t is 64 bits on x86_64 and others, while return_EIO returns only an int (32 bits everywhere). So, thanks to the (void *) cast, it looks like the type is not promoted correctly: our -EIO (-5, or 0xfffffffb as a 32-bit value) turns into 0x00000000fffffffb, or 4294967291, and we splat all over the user's buffer and then some.
Committed in stream U5 build 42.40. A test kernel with this patch is available from http://people.redhat.com/~jbaron/rhel4/
When running the ssize_eio test case on ia64 I'm getting:

    [root@test155 bz220677]# ./ssize_eio
    and... error is -5, should be -5
    [root@test155 bz220677]# uname -a
    Linux test155.test.redhat.com 2.6.9-42.0.6.EL #1 SMP Mon Jan 15 14:43:42 EST 2007 ia64 ia64 ia64 GNU/Linux

... which looks to be correct behavior. However, a run on x86_64 returns:

    [root@dhcp59-116 bz220677]# uname -a
    Linux dhcp59-116.rdu.redhat.com 2.6.9-42.0.6.EL #1 Mon Jan 15 14:43:55 EST 2007 x86_64 x86_64 x86_64 GNU/Linux
    [root@dhcp59-116 bz220677]# ./ssize_eio
    and... error is 4294967291, should be -5

It looks like we still have issues, at least on x86_64, although I have gotten a couple of runs of fsfuzz iso9660 to work without apparent issue. I'll try this on a couple of other arches and see what I get.
Results for ppc and i386. Note that there is no change in the test case output between -42 and -42.0.7.

ppc:

    [root@ibm-js20-04 bz220677]# ./ssize_eio
    and... error is -5, should be -5
    [root@ibm-js20-04 bz220677]# uname -a
    Linux ibm-js20-04.lab.boston.redhat.com 2.6.9-42.EL #1 SMP Wed Jul 12 23:22:51 EDT 2006 ppc64 ppc64 ppc64 GNU/Linux
    [root@ibm-js20-04 bz220677]# ./ssize_eio
    and... error is -5, should be -5
    [root@ibm-js20-04 bz220677]# uname -r
    2.6.9-42.EL

i386:

    [root@dl585-02 bz220677]# ./ssize_eio
    and... error is -5, should be -5
    [root@dl585-02 bz220677]# uname -a
    Linux dl585-02.rhts.boston.redhat.com 2.6.9-42.0.7.EL #1 Wed Jan 17 16:33:08 EST 2007 i686 athlon i386 GNU/Linux
    [root@dl585-02 bz220677]# ./ssize_eio
    and... error is -5, should be -5
    [root@dl585-02 bz220677]# uname -a
    Linux dl585-02.rhts.boston.redhat.com 2.6.9-42.ELsmp #1 SMP Wed Jul 12 23:27:17 EDT 2006 i686 athlon i386 GNU/Linux
Mike, that ssize_eio program doesn't actually test the kernel; it's just a userspace demonstration of what was -wrong- with the kernel code as it had been written. It will -always- fail on x86_64, because that's how it's written.

I don't have a very directed testcase for this bug, other than running the fuzzer on iso9660. To get more targeted, maybe you could save off an iso9660 image that breaks in this special way on the old kernel, then re-test on an updated kernel.

Thanks,
-Eric
So far I've been unable to get an iso9660 image out of fsfuzzer that shows this problem; however, I think we have beaten on this one enough times. The fix for this is in linux-2.6.9-ext3-sub-second-timestamp.patch, which is in both 42.0.8 and -43.el.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2007-0014.html