Bug 459436 - ext4 assembly bitops failures on s390
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.3
Hardware: All
OS: Linux
Priority: medium
Severity: high
Target Milestone: rc
Target Release: ---
Assigned To: Eric Sandeen
QA Contact: Martin Jenner
Depends On:
Blocks: 447797
Reported: 2008-08-18 16:41 EDT by Eric Sandeen
Modified: 2009-01-20 15:10 EST

Doc Type: Bug Fix
Last Closed: 2009-01-20 15:10:49 EST
Description Eric Sandeen 2008-08-18 16:41:39 EDT
There appear to be failures in the assembly bitops that ext4 uses on s390.  (I tested on 2.6.27-rc*)

Simply making a filesystem with mkfs.ext4dev (from e2fsprogs-1.41.0), mounting it, and attempting to copy a few files to the fs shows the problem: the cp never completes, and the thread is either locked up or spinning in the ext4 block group (bg) initialization code, which uses these bitops (via inlines, so they do not appear in the following trace):

Call Trace:
([<0000000000016472>] show_trace+0xb2/0x130)
 [<00000000001793f4>] showacpu+0x48/0x68
 [<0000000000024816>] do_ext_call_interrupt+0x8a/0xc4
 [<000000000001c2b6>] do_extint+0xe2/0xfc
 [<0000000000020ea8>] ext_no_vtime+0x16/0x1a
 [<0000000020cd99f2>] ext4_mb_init_cache+0x9fe/0x1020 [ext4dev]
([<0000000020cd91a4>] ext4_mb_init_cache+0x1b0/0x1020 [ext4dev])
 [<0000000020cdba9c>] ext4_mb_load_buddy+0x264/0x36c [ext4dev]
 [<0000000020cdc446>] ext4_mb_regular_allocator+0x53e/0x1218 [ext4dev]
 [<0000000020ce0d06>] ext4_mb_new_blocks+0x1a2/0x7e8 [ext4dev]
 [<0000000020cd6538>] ext4_ext_get_blocks+0xe3c/0x1074 [ext4dev]
 [<0000000020cc32be>] ext4_get_blocks_wrap+0x132/0x190 [ext4dev]
 [<0000000020cc40aa>] ext4_getblk+0x8a/0x26c [ext4dev]
 [<0000000020cc4b12>] ext4_bread+0x26/0xd8 [ext4dev]
 [<0000000020cc89fa>] ext4_mkdir+0x18e/0x3c8 [ext4dev]
 [<00000000000be54c>] vfs_mkdir+0x10c/0x1a8
 [<00000000000c204e>] sys_mkdirat+0xca/0x114
 [<00000000000208c0>] sysc_tracego+0xe/0x14
 [<00000200001345e6>] 0x200001345e6

I've also pinged Martin (schwidefsky@de.ibm.com) and he said he'd look into it but I've not heard back after about a week.

We'd like to ship ext4 as tech preview in RHEL5.3, and it'd be... best... if it worked on s390 too.

I'd appreciate any help in getting this tracked down, and I can backport the fix to the RHEL5.3 kernel.

Thanks,
-Eric
Comment 1 Jan Glauber 2008-08-20 07:34:47 EDT
Eric,
the first version of ext4 that compiled on s390 after we had the bitops support
implemented should have been working. Maybe you could try that and do a bisect search for the change that broke ext4?
Comment 2 Eric Sandeen 2008-08-21 01:39:12 EDT
Jan, which version was that, out of curiosity?

FWIW, this sort of change:

Index: linux-2.6/arch/s390/include/asm/bitops.h
===================================================================
--- linux-2.6.orig/arch/s390/include/asm/bitops.h	2008-08-11 16:23:58.000000000 -0500
+++ linux-2.6/arch/s390/include/asm/bitops.h	2008-08-20 22:43:55.516165589 -0500
@@ -865,7 +865,7 @@ static inline int ext2_find_next_bit(voi
 		 * s390 version of ffz returns __BITOPS_WORDSIZE
 		 * if no zero bit is present in the word.
 		 */
-		set = ffs(__load_ulong_le(p, 0) >> bit) + bit;
+		set = __ffs(__load_ulong_le(p, 0) >> bit) + bit;
 		if (set >= size)
 			return size + offset;
 		if (set < __BITOPS_WORDSIZE)

at least gets the "copy /lib/modules to an ext4 filesystem" test working; however, when I run fsstress I'm running into other trouble.

The above changes the semantics of counting bits from starting at 1 to starting at 0; IOW, for a bitmap of all 1's, the original code did this:

find next set bit starting at 0: 0
find next set bit starting at 1: 2

with the change, it's (properly, I think):

find next set bit starting at 0: 0
find next set bit starting at 1: 1
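
To make the off-by-one concrete, here is a minimal userspace sketch of the `if (bit)` branch only. It is not the kernel code: `ffs_one_based`, `ffs_zero_based`, `next_set_old` and `next_set_new` are hypothetical stand-ins that only mimic the 1-based (ffs-style) and 0-based (__ffs-style) numbering discussed above, and the zero-word return values are assumptions.

```c
#include <limits.h>

/* Number of bits in an unsigned long on the build machine. */
#define WORD_BITS ((unsigned long)(sizeof(unsigned long) * CHAR_BIT))

/*
 * Hypothetical stand-in for a 0-based "find first set":
 * returns the index of the lowest set bit, or WORD_BITS if word == 0
 * (assumed sentinel for this sketch).
 */
static unsigned long ffs_zero_based(unsigned long word)
{
	unsigned long i;

	for (i = 0; i < WORD_BITS; i++)
		if (word & (1UL << i))
			return i;
	return WORD_BITS;
}

/*
 * Hypothetical stand-in for a 1-based "find first set":
 * returns index + 1 of the lowest set bit, or 0 if word == 0.
 */
static unsigned long ffs_one_based(unsigned long word)
{
	unsigned long i = ffs_zero_based(word);

	return (i == WORD_BITS) ? 0 : i + 1;
}

/* Original expression: 1-based result plus the start bit (off by one). */
static unsigned long next_set_old(unsigned long word, unsigned long bit)
{
	return ffs_one_based(word >> bit) + bit;
}

/* Patched expression: 0-based result plus the start bit. */
static unsigned long next_set_new(unsigned long word, unsigned long bit)
{
	return ffs_zero_based(word >> bit) + bit;
}
```

For a word of all 1's and a start bit of 1, `next_set_old(~0UL, 1)` yields 2 while `next_set_new(~0UL, 1)` yields 1, matching the two listings above.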

-Eric
Comment 3 Eric Sandeen 2008-08-21 02:10:43 EDT
Ok, posted that a bit too soon.  I think this gets it going:

Index: linux-2.6/arch/s390/include/asm/bitops.h
===================================================================
--- linux-2.6.orig/arch/s390/include/asm/bitops.h	2008-08-11 16:23:58.000000000 -0500
+++ linux-2.6/arch/s390/include/asm/bitops.h	2008-08-21 00:49:40.950176518 -0500
@@ -862,10 +862,10 @@ static inline int ext2_find_next_bit(voi
 	p = addr + offset / __BITOPS_WORDSIZE;
 	if (bit) {
 		/*
-		 * s390 version of ffz returns __BITOPS_WORDSIZE
-		 * if no zero bit is present in the word.
+		 * s390 version of ffs returns __BITOPS_WORDSIZE
+		 * if no set bit is present in the word.
 		 */
-		set = ffs(__load_ulong_le(p, 0) >> bit) + bit;
+		set = __ffs(__load_ulong_le(p, 0) & (~0UL << bit));
 		if (set >= size)
 			return size + offset;
 		if (set < __BITOPS_WORDSIZE)
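
As a rough userspace illustration of the difference between shifting and masking, the sketch below compares the two forms. It assumes, as the patched comment states, that the 0-based find-first-set returns the word size when no bit is set; `ffs_zero_based`, `next_set_shift` and `next_set_mask` are hypothetical helpers, not kernel functions, and both are only meaningful for 0 < bit < WORD_BITS.

```c
#include <limits.h>

/* Number of bits in an unsigned long on the build machine. */
#define WORD_BITS ((unsigned long)(sizeof(unsigned long) * CHAR_BIT))

/*
 * Hypothetical 0-based "find first set": index of the lowest set bit,
 * or WORD_BITS when word == 0 (the behaviour the patched comment
 * describes for s390).
 */
static unsigned long ffs_zero_based(unsigned long word)
{
	unsigned long i;

	for (i = 0; i < WORD_BITS; i++)
		if (word & (1UL << i))
			return i;
	return WORD_BITS;
}

/* Shift form: search the shifted word, then add the start bit back. */
static unsigned long next_set_shift(unsigned long word, unsigned long bit)
{
	return ffs_zero_based(word >> bit) + bit;
}

/* Mask form: clear the bits below the start bit and search in place. */
static unsigned long next_set_mask(unsigned long word, unsigned long bit)
{
	return ffs_zero_based(word & (~0UL << bit));
}
```

When a set bit exists at or above the start bit, both forms return the same index. When none exists, the mask form returns exactly `WORD_BITS`, whereas the shift form returns `WORD_BITS + bit`; the mask form therefore keeps the "nothing found in this word" value independent of the start bit.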

-Eric
Comment 7 Don Zickus 2008-09-10 16:14:54 EDT
in kernel-2.6.18-110.el5
You can download this test kernel from http://people.redhat.com/dzickus/el5
Comment 11 errata-xmlrpc 2009-01-20 15:10:49 EST
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2009-0225.html
