Bug 220053 - sbcl: ppc issues
Product: Fedora
Classification: Fedora
Component: sbcl (Show other bugs)
Hardware: All Linux
Priority: medium, Severity: medium
Assigned To: Rex Dieter
QA Contact: Fedora Extras Quality Assurance
Blocks: F-ExcludeArch-ppc
Reported: 2006-12-18 13:10 EST by Rex Dieter
Modified: 2007-11-30 17:11 EST
Last Closed: 2007-01-07 13:57:31 EST

Attachments
Always assume page size is 64KiB on PowerPC. (571 bytes, patch)
2006-12-28 14:02 EST, David Woodhouse
Updated patch for 64KiB page size. (2.45 KB, patch)
2006-12-28 18:58 EST, David Woodhouse
Fix handling of the page-table core entry with large page sizes (768 bytes, patch)
2006-12-29 08:52 EST, Juho Snellman
working (possibly) patch (3.22 KB, patch)
2006-12-29 17:50 EST, David Woodhouse
patch fixing the test failures + cleaning up the other changes (9.12 KB, patch)
2006-12-30 11:59 EST, Juho Snellman

Comment 1 Rex Dieter 2006-12-22 08:02:01 EST
Crud, now seeing similar stuff on devel/fc7 trying to (re)build maxima:

Maybe the latest sbcl/ppc is simply borked.  ):  
Comment 2 David Woodhouse 2006-12-22 13:04:00 EST
Let me know if you need access to a PowerPC machine for debugging.
Comment 4 David Woodhouse 2006-12-28 06:12:32 EST
What's the most recent release that worked? What changed? Again, let me know if
you need access to a PowerPC machine to debug this.
Comment 5 David Woodhouse 2006-12-28 13:38:54 EST
I haven't seen a segfault but I've seen this:

This is SBCL 1.0, an implementation of ANSI Common Lisp.
More information about SBCL is available at <http://www.sbcl.org/>.

SBCL is free software, provided as is, with absolutely no warranty.
It is mostly in the public domain; some portions are provided under
BSD-style licenses.  See the CREDITS and COPYING files in the
distribution for more information.
in core: 0x40000000 - in runtime: 0x4000000
fatal error encountered in SBCL pid 29213:
core/runtime address mismatch: READ_ONLY_SPACE_START

This is because sbcl assumes that the page size will always match the result
of getpagesize() on the build host -- but the PPC ABI says it can be 64KiB and
(stupidly, IMO) that's what we did in FC-6.

I think that setting it to 64KiB unconditionally on PPC instead of using
getpagesize() probably ought to work. Testing that hypothesis now...
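
As a rough illustration of why a fixed 64KiB value could work on both kinds of host: a compiled-in granule is usable whenever it is a whole multiple of the real page size, because every granule-aligned address is then also page-aligned. The sketch below is not taken from the SBCL sources; COMPILED_GRANULE and the function names are hypothetical, and it only assumes the POSIX sysconf(_SC_PAGESIZE) call.

```c
#include <assert.h>
#include <stdbool.h>
#include <unistd.h>

/* Hypothetical constant: the page granule baked in at build time. */
#define COMPILED_GRANULE 65536L

/* Page size of the kernel we are actually running on. */
long os_page_size(void)
{
    return sysconf(_SC_PAGESIZE);
}

/* A granule is usable if it is a whole multiple of the real page size:
 * any granule-aligned address is then page-aligned as well. */
bool granule_is_usable(long granule, long os_page)
{
    return granule > 0 && os_page > 0 && granule % os_page == 0;
}

/* True when the compiled-in granule works on the current host. */
bool granule_ok_on_this_host(void)
{
    long os_page = os_page_size();
    assert(os_page > 0);            /* sysconf returns -1 on failure */
    return granule_is_usable(COMPILED_GRANULE, os_page);
}
```

With these definitions a 64KiB granule passes on hosts with 4KiB or 64KiB pages, while a 4KiB granule is rejected on a 64KiB-page host, which is the direction of mismatch described above.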
Comment 6 David Woodhouse 2006-12-28 14:02:23 EST
Created attachment 144482
Always assume page size is 64KiB on PowerPC.
Comment 7 Rex Dieter 2006-12-28 15:14:29 EST
Thanks David, I'll send the patch upstream asap.

After which, I guess we'll have to wait until upstream produces a fixed/patched
sbcl binary for bootstrapping.
Comment 8 David Woodhouse 2006-12-28 18:58:15 EST
Created attachment 144512
Updated patch for 64KiB page size.

Building shouldn't be a problem -- we have machines with 4KiB pages on which we
can build.

The main problem is that the patch isn't sufficient -- there are a few more
places we make assumptions about the page size. I think that in _principle_ it
ought to work if we iron out the details, but we may need help from upstream to
do that. Here's a current patch (which will break non-PPC builds but it should
be simple enough to fix that if you know the language/environment).

Unfortunately it still doesn't work -- when I install it on a 64KiB-page host
and ask it to rebuild itself, it does this...

//entering make-host-1.sh
//building cross-compiler, and doing first genesis
This is SBCL 1.0.1, an implementation of ANSI Common Lisp.
More information about SBCL is available at <http://www.sbcl.org/>.

SBCL is free software, provided as is, with absolutely no warranty.
It is mostly in the public domain; some portions are provided under
BSD-style licenses.  See the CREDITS and COPYING files in the
distribution for more information.
* *** glibc detected *** sbcl: free(): invalid next size (normal): 0x10054248
======= Backtrace: =========
======= Memory map: ========
00100000-00120000 r-xp 00100000 00:00 0 				 [vdso]

04000000-04010000 rwxp 00010000 08:03 15835950				
04010000-08000000 rwxp 04010000 00:00 0 
08000000-08010000 rwxp 00020000 08:03 15835950				
08010000-09800000 rwxp 08010000 00:00 0 
0a000000-0b000000 rwxp 0a000000 00:00 0 
0f340000-0f350000 r-xp 00000000 08:03 47417259				
0f350000-0f360000 r-xp 00000000 08:03 47417259				
0f360000-0f370000 rwxp 00010000 08:03 47417259				
0f390000-0f450000 r-xp 00000000 08:03 47417258				
0f450000-0f460000 r-xp 000b0000 08:03 47417258				
0f460000-0f470000 rwxp 000c0000 08:03 47417258				
0f470000-0f5d0000 r-xp 00000000 08:03 47417253				
0f5d0000-0f5e0000 r-xp 00160000 08:03 47417253				
0f5e0000-0f5f0000 rwxp 00170000 08:03 47417253				
0ffc0000-0ffe0000 r-xp 00000000 08:03 47417252				
0ffe0000-0fff0000 r-xp 00010000 08:03 47417252				
0fff0000-10000000 rwxp 00020000 08:03 47417252				
10000000-10020000 r-xp 00000000 08:03 15836373				
10020000-10030000 rwxp 00020000 08:03 15836373				
10030000-10080000 rwxp 10030000 00:00 0 				 [heap]

40000000-40030000 rwxp 40000000 00:00 0 
40030000-40040000 ---p 40030000 00:00 0 
40040000-40050000 rwxp 40040000 00:00 0 
40050000-40240000 rwxp 40050000 00:00 0 
40240000-40250000 r-xp 40240000 00:00 0 
40250000-404e0000 rwxp 40250000 00:00 0 
4f000000-4f020000 rwxp 00030000 08:03 15835950				
4f020000-4f040000 r-xp 00050000 08:03 15835950				
4f040000-4f050000 rwxp 00070000 08:03 15835950				
4f050000-4f060000 r-xp 00080000 08:03 15835950				
4f060000-4f070000 r-xp 00090000 08:03 15835950				
4f070000-4f080000 rwxp 000a0000 08:03 15835950				
4f080000-4f090000 r-xp 000b0000 08:03 15835950				
4f090000-4f0a0000 rwxp 000c0000 08:03 15835950				
4f0a0000-4f0e0000 r-xp 000d0000 08:03 15835950				
4f0e0000-4f130000 rwxp 00110000 08:03 15835950				
4f130000-4f170000 r-xp 00160000 08:03 15835950				
4f170000-4f180000 r-xp 001a0000 08:03 15835950				
4f180000-4f1a0000 rwxp 001b0000 08:03 15835950				
4f1a0000-4f1d0000 r-xp 001d0000 08:03 15835950				
4f1d0000-4f1e0000 rwxp 00200000 08:03 15835950				
4f1e0000-4f270000 r-xp 00210000 08:03 15835950				
4f270000-4f290000 rwxp 002a0000 08:03 15835950
fatal error encountered in SBCL pid 2436:
%primitive halt called; the party is over.

error: Bad exit status from /var/tmp/rpm-tmp.80549 (%build)

I can provide access to hosts with both 64KiB and 4KiB pages.
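
Every boundary in the memory map above falls on a multiple of 0x10000 (64KiB). Code that carves up address space on such a host has to round to that granule rather than to 4KiB. A minimal sketch of the usual power-of-two rounding helpers follows; the function names are hypothetical and nothing here is taken from the SBCL runtime.

```c
#include <assert.h>
#include <stdint.h>

/* Round addr down to a multiple of granule (granule must be a power of two). */
uintptr_t align_down(uintptr_t addr, uintptr_t granule)
{
    assert(granule != 0 && (granule & (granule - 1)) == 0);
    return addr & ~(granule - 1);
}

/* Round addr up to the next multiple of granule (a power of two). */
uintptr_t align_up(uintptr_t addr, uintptr_t granule)
{
    assert(granule != 0 && (granule & (granule - 1)) == 0);
    return (addr + granule - 1) & ~(granule - 1);
}

/* Non-zero if addr already sits on a granule boundary. */
int is_aligned(uintptr_t addr, uintptr_t granule)
{
    return align_down(addr, granule) == addr;
}
```

For example, is_aligned(0x4f270000, 0x10000) holds for the last mapping in the dump, whereas an address such as 0x4f271000 is 4KiB-aligned but not 64KiB-aligned.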
Comment 9 David Woodhouse 2006-12-28 21:10:57 EST
The same build (with my patch) also fails on the 4KiB-page kernel, this time
with a SEGV:

Program received signal SIGSEGV, Segmentation fault.
0x1000802c in load_core_file (file=0x10026120 "", file_offset=0)
    at coreparse.c:353
353                         page_table[offset++].first_object_offset = data[i++];

It seems that page_table[] isn't large enough for what we're copying into it.
Fixing that (don't write if !data[i]) leads to another segfault in a function
called from funcall0() from initial_thread_trampoline() -- although that may
just be GC. Run it normally instead of under gdb and it just seems to go into an
endless loop eating CPU time.

Without my patch it all works fine on a 4KiB-page host, and it does _build_ OK
with my patch. I suspect it's just creating and parsing the core file which
isn't working correctly. Could do with more help from someone who's familiar
with the language and the runtime environment.
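
For illustration only, the kind of overrun seen at coreparse.c:353 can be guarded against by checking the destination size before copying. The sketch below is not the actual fix from any of the attached patches; struct page_entry and load_page_entries are hypothetical stand-ins for the real page table and loader.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one page-table slot. */
struct page_entry {
    long first_object_offset;
};

/* Copy data_len entries from the core data into table, starting at slot
 * `start`. Returns the number of entries copied, or -1 if they would not
 * fit, so the caller can report a size mismatch instead of crashing. */
long load_page_entries(struct page_entry *table, size_t table_len,
                       size_t start, const long *data, size_t data_len)
{
    assert(table != NULL);
    if (start > table_len || data_len > table_len - start)
        return -1;                  /* refuse to write past the end */
    for (size_t i = 0; i < data_len; i++)
        table[start + i].first_object_offset = data[i];
    return (long)data_len;
}
```

A loader written this way would fail with an explicit error when the core file describes more pages than the table was sized for, for instance when the two were computed with different page sizes.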
Comment 10 Juho Snellman 2006-12-29 08:52:23 EST
Created attachment 144540
Fix handling of the page-table core entry with large page sizes

Hmm. Is it possible that there's a bug in the changes you made to coreparse.c?
Your patch + the attached one for coreparse works for me.
Comment 11 David Woodhouse 2006-12-29 15:41:28 EST
Hm, I'm rarely incompetent enough to screw something as simple as that up, but
it's possible. Building with your version now to confirm, along with the other
changes in my patch of comment #8.

What platform did you try this on? I only changed page size definitions on
PowerPC, so if you want to try using 64KiB "pages" on other platforms you'll
need to change compiler/$CPU/{backend-,}parms.lisp accordingly.

If you mail me a SSH public key, I'll give you an account on PowerPC machines
with both 4KiB and 64KiB pages. Getting it to work with 64KiB pages on a host
which really only has 4KiB pages would be a good start though.
Comment 12 David Woodhouse 2006-12-29 17:50:55 EST
Created attachment 144568
working (possibly) patch

OK, with this patch it does at least seem to build, although the results of the
self-tests look scary -- see http://david.woodhou.se/sbcl-64KiB-build.log

Someone more clueful than I would need to fix my hard-coded '65536' in
src/code/linux-os.lisp; it should be 65536 only for PowerPC. Then, if the build
output I linked above looks OK, we can apply this patch to the package?
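
The real change belongs on the Lisp side (src/code/linux-os.lisp and the per-CPU parms files), but the idea of "65536 only for PowerPC" can be sketched in C with the compiler's predefined architecture macros. ASSUMED_PAGE_BYTES and assumed_page_bytes are hypothetical names, and the 4096 fallback is an assumption for the example rather than a statement about every other platform.

```c
#include <assert.h>

/* Hypothetical macro: the page size the runtime assumes at build time.
 * PowerPC gets 64KiB; everything else falls back to 4KiB in this sketch. */
#if defined(__powerpc__) || defined(__powerpc64__) || defined(__ppc__)
#  define ASSUMED_PAGE_BYTES 65536L
#else
#  define ASSUMED_PAGE_BYTES 4096L
#endif

/* Catch a bad edit at compile time: the value must be a power of two. */
static_assert((ASSUMED_PAGE_BYTES & (ASSUMED_PAGE_BYTES - 1)) == 0,
              "assumed page size must be a power of two");

/* Accessor so callers never hard-code the number themselves. */
long assumed_page_bytes(void)
{
    return ASSUMED_PAGE_BYTES;
}
```

Keeping the number behind one per-architecture definition means a later change for another CPU touches a single place instead of every caller.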
Comment 13 Juho Snellman 2006-12-30 11:59:40 EST
Created attachment 144583
patch fixing the test failures + cleaning up the other changes

The attached patch fixes the new test failures on ppc and cleans up the
Lisp-side changes.

I haven't been able to test it on the 64KiB-page ppc yet due to network problems, but
assuming it works there, I think this could be applied to the package. (And
something similar to this will probably be committed to the upstream SBCL for
the next release).
Comment 14 David Woodhouse 2006-12-30 14:45:19 EST
Looks much better; thanks. Having built with that patch I then rebuilt on a
machine with 64KiB pages; results at

I can reboot net2-101.woodhou.se onto a 64KiB page kernel if you'd like to do
more tests.
Comment 15 Juho Snellman 2006-12-30 15:43:05 EST
No, that looks good enough. Thanks for looking at this, and for the ppc access.
Comment 16 Rex Dieter 2006-12-30 21:56:48 EST
Excellent.  If David (or any other trusted Fedora contributor, for that matter)
can build and create a binary ppc bootstrap (run binary-distribution.sh) and
make it available to me/fedora-buildsystem, we should(!) be good to go.
Comment 17 David Woodhouse 2006-12-31 06:57:24 EST

4b8d12b891a6bb9b49aa1ca8262a5a75  sbcl-1.0.1-binary.tar.bz2


Comment 18 David Woodhouse 2006-12-31 06:58:24 EST
I wonder if we should do this for x86_64 too -- doesn't the ABI allow 16KiB
pages there?
Comment 19 Rex Dieter 2006-12-31 22:18:45 EST
Looks like we have a winner!
Comment 20 Rex Dieter 2006-12-31 23:08:35 EST
Good news:
FC-6 build humming along nicely:

Bad news:
FC-5 build failed (same ppc segfault as before) ??

EL-4 build failed due to GLIBC incompatibility with provided bootstrap
Comment 21 David Woodhouse 2007-01-01 12:28:16 EST
(In reply to comment #20)
> Bad news:
> FC-5 build failed (same ppc segfault as before) ??
> http://buildsys.fedoraproject.org/build-status/job.psp?uid=24820

Nah, that's glibc incompatibility too. No binaries built on FC-6 will run on
FC-5 or below due to the ld.so hash changes. I'll rebuild the bootstrap tarball
on RHEL4.

Comment 22 David Woodhouse 2007-01-01 17:05:14 EST

88995b87e548be7b850cd3600cb5628c  sbcl-1.0.1-binary-RHEL4-ppc.tar.bz2


Comment 23 Rex Dieter 2007-01-05 07:57:59 EST
Thanks (again) David. (I should have asked/realized about the fc6 binary
compatibility issues with previous releases)
Comment 24 David Woodhouse 2007-01-06 21:38:57 EST
Did you get it built ok for FC5/RHEL4?
Comment 25 Rex Dieter 2007-01-07 13:57:31 EST
yup, we are now good to go, closing.

No peep from upstream yet regarding this bug (and patch). ): 
See thread starting at:
Comment 26 David Woodhouse 2007-01-07 23:00:36 EST

Regarding upstream; I was sort of assuming that Juho, who fixed up my initial
half-baked patch, was involved with upstream and would take care of that angle.

There remains the possibility that we should do something similar on x86_64,
because I have a feeling that the ELF ABI there _also_ allows for pages larger
than 4KiB.
Comment 27 Juho Snellman 2007-01-07 23:47:33 EST
Yes, to all of that :-)

I'm part of upstream, will eventually take care of getting this merged, and
something will still need to be done for x86-64, which allows for page sizes
between 4kB and 64kB. But for the latter I want to first figure out which of the
page-size dependencies actually serve some useful purpose, and which ones don't.
Comment 28 Rex Dieter 2007-01-07 23:50:42 EST
Juho, thanks.
