Can't get any ppc bootstraps to function on EL-4. ): See: http://buildsys.fedoraproject.org/logs/fedora-4-epel/24078-sbcl-1.0-1.el4/ http://buildsys.fedoraproject.org/logs/fedora-4-epel/24082-sbcl-1.0-1.el4.1/ http://buildsys.fedoraproject.org/logs/fedora-4-epel/24084-sbcl-1.0-1.el4.2/
Crud, now seeing similar stuff on devel/fc7 trying to (re)build maxima: http://buildsys.fedoraproject.org/build-status/job.psp?uid=24353 Maybe the latest sbcl/ppc is simply borked. ):
Let me know if you need access to a PowerPC machine for debugging.
http://buildsys.fedoraproject.org/logs/fedora-development-extras/24638-sbcl-1.0.1-1.fc7/
What's the most recent release that worked? What changed? Again, let me know if you need access to a PowerPC machine to debug this.
I haven't seen a segfault but I've seen this:

This is SBCL 1.0, an implementation of ANSI Common Lisp.
More information about SBCL is available at <http://www.sbcl.org/>.

SBCL is free software, provided as is, with absolutely no warranty.
It is mostly in the public domain; some portions are provided under
BSD-style licenses. See the CREDITS and COPYING files in the
distribution for more information.
in core: 0x40000000 - in runtime: 0x4000000
fatal error encountered in SBCL pid 29213:
core/runtime address mismatch: READ_ONLY_SPACE_START

This is because sbcl assumes that the page size will always match the result of getpagesize() on the build host -- but the PPC ABI says it can be 64KiB and (stupidly, IMO) that's what we did in FC-6. I think that setting it to 64KiB unconditionally on PPC instead of using getpagesize() probably ought to work. Testing that hypothesis now...
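To make the failure mode above concrete, here is a minimal sketch (not SBCL code) of comparing a page size recorded at build time with the one the running kernel reports. `BUILD_PAGE_SIZE` and `page_size_matches` are hypothetical names invented for illustration; the only real API used is `os.sysconf`.

```python
import os

# Hypothetical: the page size baked in when the binary was built
# on a 4KiB-page build host.
BUILD_PAGE_SIZE = 4096


def runtime_page_size() -> int:
    """Ask the running kernel for its page size (POSIX sysconf)."""
    return os.sysconf("SC_PAGE_SIZE")


def page_size_matches(build_size: int, runtime_size: int) -> bool:
    """True if a layout aligned to build_size is still page-aligned at runtime.

    Addresses that are multiples of build_size are guaranteed to be
    multiples of runtime_size only when runtime_size divides build_size.
    """
    return build_size % runtime_size == 0


if __name__ == "__main__":
    actual = runtime_page_size()
    verdict = "ok" if page_size_matches(BUILD_PAGE_SIZE, actual) else "MISMATCH"
    print(f"built for {BUILD_PAGE_SIZE}, running on {actual}: {verdict}")
```

Under this check a 64KiB build-time assumption is still valid on a 4KiB-page kernel, while a 4KiB assumption is not valid on a 64KiB-page kernel, which is the asymmetry the proposed fix relies on.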
Created attachment 144482 [details] Always assume page size is 64KiB on PowerPC.
Thanks David, I'll send the patch upstream asap. After which, I guess we'll have to wait until upstream produces a fixed/patched sbcl binary for bootstrapping.
Created attachment 144512 [details] Updated patch for 64KiB page size.

Building shouldn't be a problem -- we have machines with 4KiB pages on which we can build. The main problem is that the patch isn't sufficient -- there are a few more places we make assumptions about the page size. I think that in _principle_ it ought to work if we iron out the details, but we may need help from upstream to do that.

Here's a current patch (which will break non-PPC builds but it should be simple enough to fix that if you know the language/environment). Unfortunately it still doesn't work -- when I install it on a 64KiB-page host and ask it to rebuild itself, it does this...

//entering make-host-1.sh
//building cross-compiler, and doing first genesis
This is SBCL 1.0.1, an implementation of ANSI Common Lisp.
More information about SBCL is available at <http://www.sbcl.org/>.

SBCL is free software, provided as is, with absolutely no warranty.
It is mostly in the public domain; some portions are provided under
BSD-style licenses. See the CREDITS and COPYING files in the
distribution for more information.
* 5
* *** glibc detected *** sbcl: free(): invalid next size (normal): 0x10054248 ***
======= Backtrace: =========
/lib/libc.so.6[0xf4ee394]
/lib/libc.so.6(cfree+0xc8)[0xf4ee5e8]
sbcl(wrapped_readlink+0x90)[0x100127b4]
sbcl(call_into_c+0x70)[0x10018c58]
[0x81a4]
[0x1]
sbcl(funcall0+0x1c)[0x1001321c]
sbcl[0x10011a24]
sbcl(main+0x218)[0x10010bd4]
/lib/libc.so.6[0xf48dd4c]
/lib/libc.so.6(__libc_start_main+0x144)[0xf48df74]
======= Memory map: ========
00100000-00120000 r-xp 00100000 00:00 0 [vdso]
04000000-04010000 rwxp 00010000 08:03 15835950 /usr/lib/sbcl/sbcl.core
04010000-08000000 rwxp 04010000 00:00 0
08000000-08010000 rwxp 00020000 08:03 15835950 /usr/lib/sbcl/sbcl.core
08010000-09800000 rwxp 08010000 00:00 0
0a000000-0b000000 rwxp 0a000000 00:00 0
0f340000-0f350000 r-xp 00000000 08:03 47417259 /lib/libdl-2.5.so
0f350000-0f360000 r-xp 00000000 08:03 47417259 /lib/libdl-2.5.so
0f360000-0f370000 rwxp 00010000 08:03 47417259 /lib/libdl-2.5.so
0f390000-0f450000 r-xp 00000000 08:03 47417258 /lib/libm-2.5.so
0f450000-0f460000 r-xp 000b0000 08:03 47417258 /lib/libm-2.5.so
0f460000-0f470000 rwxp 000c0000 08:03 47417258 /lib/libm-2.5.so
0f470000-0f5d0000 r-xp 00000000 08:03 47417253 /lib/libc-2.5.so
0f5d0000-0f5e0000 r-xp 00160000 08:03 47417253 /lib/libc-2.5.so
0f5e0000-0f5f0000 rwxp 00170000 08:03 47417253 /lib/libc-2.5.so
0ffc0000-0ffe0000 r-xp 00000000 08:03 47417252 /lib/ld-2.5.so
0ffe0000-0fff0000 r-xp 00010000 08:03 47417252 /lib/ld-2.5.so
0fff0000-10000000 rwxp 00020000 08:03 47417252 /lib/ld-2.5.so
10000000-10020000 r-xp 00000000 08:03 15836373 /usr/bin/sbcl
10020000-10030000 rwxp 00020000 08:03 15836373 /usr/bin/sbcl
10030000-10080000 rwxp 10030000 00:00 0 [heap]
40000000-40030000 rwxp 40000000 00:00 0
40030000-40040000 ---p 40030000 00:00 0
40040000-40050000 rwxp 40040000 00:00 0
40050000-40240000 rwxp 40050000 00:00 0
40240000-40250000 r-xp 40240000 00:00 0
40250000-404e0000 rwxp 40250000 00:00 0
4f000000-4f020000 rwxp 00030000 08:03 15835950 /usr/lib/sbcl/sbcl.core
4f020000-4f040000 r-xp 00050000 08:03 15835950 /usr/lib/sbcl/sbcl.core
4f040000-4f050000 rwxp 00070000 08:03 15835950 /usr/lib/sbcl/sbcl.core
4f050000-4f060000 r-xp 00080000 08:03 15835950 /usr/lib/sbcl/sbcl.core
4f060000-4f070000 r-xp 00090000 08:03 15835950 /usr/lib/sbcl/sbcl.core
4f070000-4f080000 rwxp 000a0000 08:03 15835950 /usr/lib/sbcl/sbcl.core
4f080000-4f090000 r-xp 000b0000 08:03 15835950 /usr/lib/sbcl/sbcl.core
4f090000-4f0a0000 rwxp 000c0000 08:03 15835950 /usr/lib/sbcl/sbcl.core
4f0a0000-4f0e0000 r-xp 000d0000 08:03 15835950 /usr/lib/sbcl/sbcl.core
4f0e0000-4f130000 rwxp 00110000 08:03 15835950 /usr/lib/sbcl/sbcl.core
4f130000-4f170000 r-xp 00160000 08:03 15835950 /usr/lib/sbcl/sbcl.core
4f170000-4f180000 r-xp 001a0000 08:03 15835950 /usr/lib/sbcl/sbcl.core
4f180000-4f1a0000 rwxp 001b0000 08:03 15835950 /usr/lib/sbcl/sbcl.core
4f1a0000-4f1d0000 r-xp 001d0000 08:03 15835950 /usr/lib/sbcl/sbcl.core
4f1d0000-4f1e0000 rwxp 00200000 08:03 15835950 /usr/lib/sbcl/sbcl.core
4f1e0000-4f270000 r-xp 00210000 08:03 15835950 /usr/lib/sbcl/sbcl.core
4f270000-4f290000 rwxp 002a0000 08:03 15835950
fatal error encountered in SBCL pid 2436:
%primitive halt called; the party is over.

error: Bad exit status from /var/tmp/rpm-tmp.80549 (%build)

I can provide access to hosts with both 64KiB and 4KiB pages.
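When reading a memory map like the one above, it can help to check mechanically whether every mapping boundary falls on a 64KiB boundary. The following is a small sketch for that, assuming the usual `/proc/<pid>/maps` field layout (range, permissions, offset, device, inode, optional path). The helper names and the four sample lines (copied from the map above) are only for illustration.

```python
from typing import List, NamedTuple

PAGE_64K = 0x10000


class Mapping(NamedTuple):
    start: int
    end: int
    perms: str
    path: str


def parse_maps_line(line: str) -> Mapping:
    """Parse one '/proc/<pid>/maps'-style line into a Mapping."""
    # Split at most 5 times so a path containing spaces stays intact.
    fields = line.split(None, 5)
    start_hex, end_hex = fields[0].split("-")
    path = fields[5].strip() if len(fields) > 5 else ""
    return Mapping(int(start_hex, 16), int(end_hex, 16), fields[1], path)


def unaligned(mappings: List[Mapping], page_size: int = PAGE_64K) -> List[Mapping]:
    """Return the mappings whose start or end is not a multiple of page_size."""
    return [m for m in mappings if m.start % page_size or m.end % page_size]


SAMPLE = """\
04000000-04010000 rwxp 00010000 08:03 15835950 /usr/lib/sbcl/sbcl.core
04010000-08000000 rwxp 04010000 00:00 0
10030000-10080000 rwxp 10030000 00:00 0 [heap]
40030000-40040000 ---p 40030000 00:00 0
"""

if __name__ == "__main__":
    maps = [parse_maps_line(line) for line in SAMPLE.splitlines()]
    for m in maps:
        print(f"{m.start:#010x}-{m.end:#010x} {m.perms} {m.path}")
    print("unaligned mappings:", unaligned(maps))
```

The sketch only reports alignment; it says nothing about why the `free()` corruption in the backtrace occurs.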
The same build (with my patch) also fails on the 4KiB-page kernel, this time with a SEGV:

Program received signal SIGSEGV, Segmentation fault.
0x1000802c in load_core_file (file=0x10026120 "", file_offset=0) at coreparse.c:353
353 page_table[offset++].first_object_offset = data[i++];

It seems that page_table[] isn't large enough for what we're copying into it. Fixing that (don't write if !data[i]) leads to another segfault in a function called from funcall0() from initial_thread_trampoline() -- although that may just be GC. Run it normally instead of under gdb and it just seems to go into an endless loop eating CPU time.

Without my patch it all works fine on a 4KiB-page host, and it does _build_ OK with my patch. I suspect it's just creating and parsing the core file which isn't working correctly. Could do with more help from someone who's familiar with the language and the runtime environment.
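One way to picture the overrun described above is a table sized from the current page size being filled with more saved entries than it has slots. This is a guess at the shape of the problem, not the actual coreparse.c logic, and `page_count` and `load_page_table` are hypothetical helpers.

```python
from typing import List


def page_count(heap_bytes: int, page_size: int) -> int:
    """Number of page-table entries needed to cover heap_bytes (rounded up)."""
    return (heap_bytes + page_size - 1) // page_size


def load_page_table(entries: List[int], heap_bytes: int, page_size: int) -> List[int]:
    """Copy saved entries into a table sized for the current page size.

    Raises ValueError rather than writing past the end of the table, which
    is what an unchecked copy loop in C would silently do.
    """
    table = [0] * page_count(heap_bytes, page_size)
    if len(entries) > len(table):
        raise ValueError(
            f"{len(entries)} saved entries do not fit in a table of {len(table)}"
        )
    table[: len(entries)] = entries
    return table


if __name__ == "__main__":
    heap = 1 << 20  # 1 MiB, an arbitrary size chosen for the example
    print("entries at 4KiB pages: ", page_count(heap, 4096))
    print("entries at 64KiB pages:", page_count(heap, 65536))
```

With the same heap size, a table built for 64KiB pages has a sixteenth of the entries of one built for 4KiB pages, so any code that mixes the two assumptions can run off the end.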
Created attachment 144540 [details] Fix handling of the page-table core entry with large page sizes

Hmm. Is it possible that there's a bug in the changes you made to coreparse.c? Your patch + the attached one for coreparse works for me.
Hm, I'm rarely incompetent enough to screw something as simple as that up, but it's possible. Building with your version now to confirm, along with the other changes in my patch of comment #8. What platform did you try this on? I only changed page size definitions on PowerPC, so if you want to try using 64KiB "pages" on other platforms you'll need to change compiler/$CPU/{backend-,}parms.lisp accordingly. If you mail me an SSH public key, I'll give you an account on PowerPC machines with both 4KiB and 64KiB pages. Getting it to work with 64KiB pages on a host which really only has 4KiB pages would be a good start though.
Created attachment 144568 [details] working (possibly) patch

OK, with this patch it does at least seem to build, although the results of the self-tests look scary -- see http://david.woodhou.se/sbcl-64KiB-build.log

Someone more clueful than I would need to fix my hard-coded '65536' in src/code/linux-os.lisp; it should be 65536 only for PowerPC. Then, if the build output I linked above looks OK, we can apply this patch to the package?
Created attachment 144583 [details] patch fixing the test failures + cleaning up the other changes

The attached patch fixes the new test failures on ppc and cleans up the Lisp-side changes. I haven't been able to test it on the 64KiB-page ppc machine yet due to network problems, but assuming it works there, I think this could be applied to the package. (And something similar to this will probably be committed to upstream SBCL for the next release.)
Looks much better; thanks. Having built with that patch I then rebuilt on a machine with 64KiB pages; results at http://david.woodhou.se/sbcl-build-on-64KiB-page.log I can reboot net2-101.woodhou.se onto a 64KiB page kernel if you'd like to do more tests.
No, that looks good enough. Thanks for looking at this, and for the ppc access.
Excellent. If David (or any other trusted Fedora Contributor for that matter) can build and create a binary ppc bootstrap (run binary-distribution.sh) and make it available to me/fedora-buildsystem, we should(!) be good to go.
http://david.woodhou.se/sbcl-1.0.1-binary.tar.bz2 4b8d12b891a6bb9b49aa1ca8262a5a75 sbcl-1.0.1-binary.tar.bz2 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) iQDVAwUARZelTcKjXUokOhMpAQLKJAX/RowyKIKJOzOulhzgzgG2pkUL6XTKYcNt znhxHc6A9Wk7AQyW3F4enoFJNi0qqX1QFUXXSPJiDudVowyI+eI3+T7gtokLBQxT J9wlxwdwu5B4UrVpCa7TpMDIEX2AtqVlKwyAOa4bG0bsJOagxj8XAt9P14qupUvN Fpz8t3YsNU9Mb3qBIouYpXoyOkd5tIlkow/iy1SekNCBJRG2y9TqSnNKju+eYnCV vD3sr2h6onScIs5EIQy73FBmy08O0Oio =QteW -----END PGP SIGNATURE-----
I wonder if we should do this for x86_64 too -- doesn't the ABI allow 16KiB pages there?
Looks like we have a winner! http://buildsys.fedoraproject.org/build-status/job.psp?uid=24817
Good news:
FC-6 build humming along nicely: http://buildsys.fedoraproject.org/build-status/job.psp?uid=24821

Bad news:
FC-5 build failed (same ppc segfault as before) ?? http://buildsys.fedoraproject.org/build-status/job.psp?uid=24820
EL-4 build failed due to GLIBC incompatibility with provided bootstrap: http://buildsys.fedoraproject.org/build-status/job.psp?uid=24819
(In reply to comment #20)
> Bad news:
> FC-5 build failed (same ppc segfault as before) ??
> http://buildsys.fedoraproject.org/build-status/job.psp?uid=24820

Nah, that's glibc incompatibility too. No binaries built on FC-6 will run on FC-5 or below due to the ld.so hash changes. I'll rebuild the bootstrap tarball on RHEL4.
http://david.woodhou.se/sbcl-1.0.1-binary-RHEL4-ppc.tar.bz2 88995b87e548be7b850cd3600cb5628c sbcl-1.0.1-binary-RHEL4-ppc.tar.bz2 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) iQDVAwUARZmFUcKjXUokOhMpAQLuPAYAk9xCmq2qwI9EYPRfom5VUcxQ2rk5iRA4 hlzaMo+ZKgtxvl4cQIHMFPGsnJkiNXIm2suwgN8n3aG4MTJqtJf369LPeX9H05Wf nj2JgiuAFY83OTO4mAsS3igu14Igm5vSTBp82RxobqCR2jhr8f4sCdUDdKz/QAHz FCMB4mBhkWcgoY4WptJ1nnnJ7hz8NZfPt8RiDTaNtlr31nm8ktR7d2YsMErRDfLK SsYiI01FLwUCnOLOgQvctQFGGztTWDai =jn97 -----END PGP SIGNATURE-----
Thanks (again) David. (I should have asked/realized about the fc6 binary compatibility issues with previous releases)
Did you get it built ok for FC5/RHEL4?
yup, we are now good to go, closing. No peep from upstream yet regarding this bug (and patch). ): See thread starting at: http://sourceforge.net/mailarchive/message.php?msg_id=37803381
Thanks. Regarding upstream: I was sort of assuming that Juho, who fixed up my initial half-baked patch, was involved with upstream and would take care of that angle. There remains the possibility that we should do something similar on x86_64, because I have a feeling that the ELF ABI there _also_ allows for pages larger than 4KiB.
Yes, to all of that :-) I'm part of upstream, will eventually take care of getting this merged, and something will still need to be done for x86-64, which allows for page sizes between 4kB-64kB. But for the latter I want to first figure out which of the page-size dependencies actually serve some useful purpose, and which ones don't.
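For a range of possible page sizes like the one mentioned above, one conservative option is to align to the largest size in the range. The sketch below is purely illustrative and assumes the candidates are the powers of two from 4KiB to 64KiB; the thread does not say this is what upstream will do.

```python
from typing import Iterable


def align_up(value: int, alignment: int) -> int:
    """Round value up to the next multiple of alignment (a power of two)."""
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    return (value + alignment - 1) & ~(alignment - 1)


# Assumption: candidate page sizes are powers of two from 4KiB to 64KiB.
CANDIDATE_PAGE_SIZES = [4096 << n for n in range(5)]


def aligned_for_all(addr: int, page_sizes: Iterable[int]) -> bool:
    """True if addr is a multiple of every page size in page_sizes."""
    return all(addr % size == 0 for size in page_sizes)


if __name__ == "__main__":
    addr = align_up(0x12345, max(CANDIDATE_PAGE_SIZES))
    print(hex(addr), aligned_for_all(addr, CANDIDATE_PAGE_SIZES))
```

Because each smaller power of two divides the largest one, an address aligned to 64KiB is aligned for every candidate, at the cost of some wasted address space on hosts with smaller pages.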
Juho, thanks.