There appear to be failures in the assembly bitops that ext4 uses on s390. (I tested on 2.6.27-rc*)

Simply making a filesystem with mkfs.ext4dev (from e2fsprogs-1.41.0), mounting it, and attempting to copy a few files to the fs will show the problem; the cp never completes, and the thread is either locked or spinning in the ext4 bg initialization code, which uses these bitops (via inlines, so not seen in the following trace):

Call Trace:
([<0000000000016472>] show_trace+0xb2/0x130)
 [<00000000001793f4>] showacpu+0x48/0x68
 [<0000000000024816>] do_ext_call_interrupt+0x8a/0xc4
 [<000000000001c2b6>] do_extint+0xe2/0xfc
 [<0000000000020ea8>] ext_no_vtime+0x16/0x1a
 [<0000000020cd99f2>] ext4_mb_init_cache+0x9fe/0x1020 [ext4dev]
([<0000000020cd91a4>] ext4_mb_init_cache+0x1b0/0x1020 [ext4dev])
 [<0000000020cdba9c>] ext4_mb_load_buddy+0x264/0x36c [ext4dev]
 [<0000000020cdc446>] ext4_mb_regular_allocator+0x53e/0x1218 [ext4dev]
 [<0000000020ce0d06>] ext4_mb_new_blocks+0x1a2/0x7e8 [ext4dev]
 [<0000000020cd6538>] ext4_ext_get_blocks+0xe3c/0x1074 [ext4dev]
 [<0000000020cc32be>] ext4_get_blocks_wrap+0x132/0x190 [ext4dev]
 [<0000000020cc40aa>] ext4_getblk+0x8a/0x26c [ext4dev]
 [<0000000020cc4b12>] ext4_bread+0x26/0xd8 [ext4dev]
 [<0000000020cc89fa>] ext4_mkdir+0x18e/0x3c8 [ext4dev]
 [<00000000000be54c>] vfs_mkdir+0x10c/0x1a8
 [<00000000000c204e>] sys_mkdirat+0xca/0x114
 [<00000000000208c0>] sysc_tracego+0xe/0x14
 [<00000200001345e6>] 0x200001345e6

I've also pinged Martin (schwidefsky.com) and he said he'd look into it, but I've not heard back after about a week.

We'd like to ship ext4 as tech preview in RHEL5.3, and it'd be... best... if it worked on s390 too. I'd appreciate any help in getting this tracked down, and I can backport the fix to the RHEL5.3 kernel.

Thanks,
-Eric
Eric, the first version of ext4 that compiled on s390 after we had the bitops support implemented should have been working. Maybe you could try that and do a bisect search for the change that broke ext4?
Jan, which version was that, out of curiosity?

FWIW, this sort of change:

Index: linux-2.6/arch/s390/include/asm/bitops.h
===================================================================
--- linux-2.6.orig/arch/s390/include/asm/bitops.h 2008-08-11 16:23:58.000000000 -0500
+++ linux-2.6/arch/s390/include/asm/bitops.h 2008-08-20 22:43:55.516165589 -0500
@@ -865,7 +865,7 @@ static inline int ext2_find_next_bit(voi
 		 * s390 version of ffz returns __BITOPS_WORDSIZE
 		 * if no zero bit is present in the word.
 		 */
-		set = ffs(__load_ulong_le(p, 0) >> bit) + bit;
+		set = __ffs(__load_ulong_le(p, 0) >> bit) + bit;
 		if (set >= size)
 			return size + offset;
 		if (set < __BITOPS_WORDSIZE)

at least gets the "copy /lib/modules to an ext4 filesystem" test working; however, when I run fsstress I'm running into other trouble.

The above changes the semantics of counting bits from starting at 1 to starting at 0; IOW, for a bitmap of all 1's, the original code did this:

find next set bit starting at 0: 0
find next set bit starting at 1: 2

with the change, it's (properly, I think):

find next set bit starting at 0: 0
find next set bit starting at 1: 1

-Eric
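To illustrate the off-by-one described above, here is a minimal user-space sketch. It is not the kernel code: the helper names (ffs_one_based, ffs_zero_based, old_first_word_search, new_first_word_search) are made up for this example, and returning the word size for an all-zero word is an assumption taken from the comment in the patch. It only models the `if (bit)` path of ext2_find_next_bit, i.e. a search starting at a nonzero bit inside the first word.

```c
#include <assert.h>
#include <limits.h>

/* 1-based index of the lowest set bit, like the libc-style ffs();
 * returns 0 when no bit is set. */
static int ffs_one_based(unsigned long w)
{
    int n = 1;

    if (w == 0)
        return 0;
    while (!(w & 1UL)) {
        w >>= 1;
        n++;
    }
    return n;
}

/* 0-based index of the lowest set bit, in the spirit of __ffs().
 * Assumption: returns the word size when no bit is set, as the
 * comment in the patch describes for s390. */
static int ffs_zero_based(unsigned long w)
{
    int n = 0;

    if (w == 0)
        return (int)(sizeof(unsigned long) * CHAR_BIT);
    while (!(w & 1UL)) {
        w >>= 1;
        n++;
    }
    return n;
}

/* Expression used before the change: 1-based result plus bit. */
static int old_first_word_search(unsigned long word, int bit)
{
    return ffs_one_based(word >> bit) + bit;
}

/* Expression used after the first patch: 0-based result plus bit. */
static int new_first_word_search(unsigned long word, int bit)
{
    return ffs_zero_based(word >> bit) + bit;
}

/* For an all-ones word, searching from bit 1 should find bit 1. */
void demo_off_by_one(void)
{
    assert(old_first_word_search(~0UL, 1) == 2); /* skips bit 1 */
    assert(new_first_word_search(~0UL, 1) == 1); /* finds bit 1 */
}
```

With the 1-based helper, a set bit sitting exactly at the starting offset is reported one position too high, which matches the "starting at 1: 2" result quoted above.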
Ok, posted that a bit too soon. I think this gets it going:

Index: linux-2.6/arch/s390/include/asm/bitops.h
===================================================================
--- linux-2.6.orig/arch/s390/include/asm/bitops.h 2008-08-11 16:23:58.000000000 -0500
+++ linux-2.6/arch/s390/include/asm/bitops.h 2008-08-21 00:49:40.950176518 -0500
@@ -862,10 +862,10 @@ static inline int ext2_find_next_bit(voi
 	p = addr + offset / __BITOPS_WORDSIZE;
 	if (bit) {
 		/*
-		 * s390 version of ffz returns __BITOPS_WORDSIZE
-		 * if no zero bit is present in the word.
+		 * s390 version of ffs returns __BITOPS_WORDSIZE
+		 * if no set bit is present in the word.
 		 */
-		set = ffs(__load_ulong_le(p, 0) >> bit) + bit;
+		set = __ffs(__load_ulong_le(p, 0) & (~0UL << bit));
 		if (set >= size)
 			return size + offset;
 		if (set < __BITOPS_WORDSIZE)

-Eric
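The thread does not spell out why the shift-and-add form was replaced by a mask, so the following is only a plausible reading, sketched in user-space C. The names WORD_BITS, lowest_set_or_wordsize, search_shift and search_mask are hypothetical, and the "return the word size when no bit is set" behaviour is assumed from the comment in the patch. Both forms agree whenever a bit is set at or above the starting offset; they differ when none is.

```c
#include <assert.h>
#include <limits.h>

#define WORD_BITS ((int)(sizeof(unsigned long) * CHAR_BIT))

/* 0-based index of the lowest set bit; WORD_BITS if the word is zero
 * (assumed behaviour, modelled on the comment in the patch). */
static int lowest_set_or_wordsize(unsigned long w)
{
    int n = 0;

    if (w == 0)
        return WORD_BITS;
    while (!(w & 1UL)) {
        w >>= 1;
        n++;
    }
    return n;
}

/* First patch: shift the low bits away, then add the offset back.
 * Valid for 0 < bit < WORD_BITS. */
static int search_shift(unsigned long word, int bit)
{
    return lowest_set_or_wordsize(word >> bit) + bit;
}

/* Second patch: clear the bits below the offset and search in place.
 * Valid for 0 < bit < WORD_BITS. */
static int search_mask(unsigned long word, int bit)
{
    return lowest_set_or_wordsize(word & (~0UL << bit));
}

void demo_shift_vs_mask(void)
{
    /* A bit is set at or above the offset: both forms agree. */
    assert(search_shift(~0UL, 1) == 1);
    assert(search_mask(~0UL, 1) == 1);

    /* Only bit 0 is set and the search starts at bit 1: nothing left. */
    assert(search_shift(1UL, 1) == WORD_BITS + 1); /* overshoots by bit */
    assert(search_mask(1UL, 1) == WORD_BITS);      /* exactly the word size */
}
```

In the "nothing found" case the masked form yields exactly the word size, while the shifted form yields the word size plus the starting offset, which would then feed into the `set >= size` comparison that follows in the patched function.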
Committed upstream: http://git390.osdl.marist.edu/cgi-bin/gitweb.cgi?p=linux-2.6.git;a=commitdiff;h=152382af4056aadc0c2ea2e8e8258b277be085bf
Patch is in Linus' git tree now. http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=152382af4056aadc0c2ea2e8e8258b277be085bf
in kernel-2.6.18-110.el5

You can download this test kernel from http://people.redhat.com/dzickus/el5
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2009-0225.html