Bug 562226

Summary: Review Request: ccl - Free Common Lisp implementation
Product: [Fedora] Fedora Reporter: Alexander Kahl <e-user>
Component: Package ReviewAssignee: Nobody's working on this, feel free to take it <nobody>
Status: CLOSED NOTABUG QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium Docs Contact:
Priority: medium    
Version: rawhideCC: fedora-package-review, gb, loganjerry, mattia.verga, notting, r.landmann
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2012-02-04 14:50:55 EST Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Bug Depends On:    
Bug Blocks: 201449    

Description Alexander Kahl 2010-02-05 11:53:20 EST
Spec URL: http://akahl.fedorapeople.org/ccl/ccl.spec
SRPM URL: http://akahl.fedorapeople.org/ccl/ccl-1.4-1.fc12.src.rpm
Description:
Clozure CL is a fast, mature, open source Common Lisp implementation.
It compiles to native code and supports multithreading using native OS
threads. It includes a foreign-function interface, and supports both Lisp code
that calls external code, and external code that calls Lisp code. Clozure CL
can create standalone executables on all supported platforms.

Special notes:
The x86_64 kernel build creates a borked ELF header through a corrupted .stab section, sh_size 0xfd80 vs. sh_entsize 0x14 (the latter is correct for x86_64) which doesn't divide to an integer, resulting in rpm's debugedit failing during debuginfo creation ("invalid section entry size"), see libelf/elf32_updatenull.c:394 in elfutils-0.144.  Altering gcc, gas or ld options doesn't change anything - even the distributed pre-built kernel is affected.
This is also noted in the spec - my newly acquired low-level knowledge about ELF and ccl's involved Assembler code ends here however.
As a result, the kernel has to be pre-stripped so debuginfo will miss the stab (symbol table) for the kernel but the binary remains fully intact. I don't know whether this is ccl's, gas', gcc's or ld's fault.

Furthermore, the PPC and PPC64 kernels won't run and I don't own the appropriate hardware to test what's going on (gcc or gas flags?), please see scratch build at:
http://koji.fedoraproject.org/koji/taskinfo?taskID=1965080
If no one can help, I guess ExcludeArch/ExclusiveArch will do for now.

It would be nice to fix everything and get this into Fedora despite these problems.

As a last note, ccl will need two tagged builds in koji before going into any repository in order to resolve the circular dependency for bootstrapping, so the very first builds depends on the pre-built distributed heap images.
Comment 1 Jerry James 2010-02-24 17:35:29 EST
Hi Alexander,

The C code is being compiled to object files that contain valid DWARF-2 debug information.  The problem is with the assembler sources.  They are being processed with m4, which is adding explicit stabs information to the files, even though as would normally produce DWARF-2 debug information as well.  Search for ".stabs" in lisp-kernel/m4macros.m4.

One (untested) workaround is to strip only these files, prior to linking the executable:

imports.o
x86-asmutils64.o
x86-spentry64.o
x86-spjump64.o
x86-subprims64.o

That way you'd still get debug information for the C sources.  Long term, we should get upstream to stop inserting .stab sections into the output, or at least to insert correct .stab sections into the output if they really can't be bothered to move up to DWARF.  I wonder if passing -g to as and removing all the .stab junk from m4macros.m4 would work....

Incidentally, I figured out which files are the culprits like this, should it prove useful:

for f in *.o; do
  readelf -S $f | grep -F stab > /dev/null
  if [ $? = 0 ]; then echo $f; fi
done
Comment 2 Alexander Kahl 2010-02-25 12:12:19 EST
Hi Jerry,

actually I was also able to figure out the part about m4 and the affected files myself but how do you know the debugging format is DWARF-2? At least gcc and gas claim to use STABS natively if I get the manuals right so please enlighten me :)
Anyway, patching away the .stab* junk from m4macros.m4 worked flawlessly and now we have two properly supported architectures.

Can you help with ppc and ppc64?

Updated SPEC: http://akahl.fedorapeople.org/ccl/ccl.spec
Updated SRPM: http://akahl.fedorapeople.org/ccl/ccl-1.4-2.fc12.src.rpm

* Thu Feb 25 2010 Alexander Kahl <e-user@fsfe.org> - 1.4-2
- added patch0, removes borked DWARF-2 stab information
- added -g to ASFLAGS to have correct debugging information created
  automatically
- removed early stripping
Comment 3 Jerry James 2010-02-25 12:27:11 EST
Compile any C source with -g, and then do this to the executable:

readelf --debug-dump <name_of_executable> | grep -F DWARF

You should see one or more lines like this:

  DWARF Version:               2

The DWARF debug format is much more expressive than the old stabs format.  I assume we'll move up to DWARF-3 at some point.  I don't know why we haven't yet.  Perhaps some tools need to be ported to the new standard.

I don't have physical access to any PPC machines.  When I took over maintainership of GCL about a year and a half ago, someone on -devel gave me an account on his PPC machine so that I could debug some issues with it.  I seem to have misplaced the login information, though.   Hmm.  I'll try to dig it up.
Comment 4 Jerry James 2010-02-25 12:44:47 EST
It was Josh Boyer who gave me an account.  Unfortunately, I can't reach his machine right now.  I'll try again later, and contact him for instructions if I still can't get through.
Comment 5 Alexander Kahl 2010-03-09 04:41:09 EST
About #3/DWARF: Thank you for the insight!
About #4: Any update on this? I'm trying to get access to a PPC box myself right now.
Comment 6 Alexander Kahl 2010-03-09 11:20:24 EST
Bingo: I've managed to acquire an old PowerBook G4 at work and figured out it's SELinux ruining ccl's day on PPC machines ever since the default policy was changed to disallow "mmap_zero" by default.
Either we wait for PPC support to vanish in Fedora Land, exclude PPC or I'll have to create ccl-selinux with a new type for the ccl kernel on PPC systems that allows for this specific access.. your opinion?
Comment 7 Jerry James 2010-03-09 11:41:40 EST
I sent a query off to Josh Boyer and then forgot about it.  If it will help, I'll ping him again.  It sounds like you have what you need, though.

Do you know why ccl needs mmap_zero on PPC?  That's kind of an odd thing to do.  Since it doesn't seem to be needed on x86 platforms, perhaps it can be coded around somehow.
Comment 8 Alexander Kahl 2010-07-21 15:09:09 EDT
Spec URL: http://akahl.fedorapeople.org/ccl/ccl.spec
SRPM URL: http://akahl.fedorapeople.org/ccl/ccl-1.5-1.fc12.src.rpm

Changes:
- update to 1.5
- fixed install-clc path
- fixed install-clc script

After months of packaging abstinence but doing a lot of spare time CL coding somehow I've decided to get back on this. I'm going to do a test run on that spare PPC machine tomorrow and see if I can fix the mmap_zero issue.

Jerry, you still interested in founding the SIG..?
Comment 9 Alexander Kahl 2010-07-22 11:48:35 EDT
Oh.. I just realized PPC isn't supported as a primary architecture anymore. I'll remove the PPC part.
Comment 10 Jerry James 2010-08-04 14:57:13 EDT
I am still interested in the SIG.  Sorry, between vacation and a heavy work load (mostly due to having been on vacation :-) I haven't had much time for my hobbies recently.  I would like to get moving on that again, though.
Comment 11 Alexander Kahl 2010-08-05 04:15:17 EDT
Spec URL: http://akahl.fedorapeople.org/ccl/ccl.spec
SRPM URL: http://akahl.fedorapeople.org/ccl/ccl-1.5-2.fc13.src.rpm

Changes:
- removed/commented all ppc portions
- added --no-init to all kernel calls to prevent user init side-effects
Comment 12 Jason Tibbitts 2010-11-17 10:00:10 EST
I don't know if Jerry intends to get back to this at some point or not, but here are a few comments.  First, rpmlint:

  ccl.x86_64: W: executable-stack /usr/lib64/ccl/lx86cl64
This implies that there will be selinux problems.

  ccl.x86_64: E: non-standard-executable-perm /usr/lib64/ccl/lx86cl64 0775L
I think this is due to a mock bug and is specific to some buildsystems; if it doesn't happen for you then it should probably be ignored.

  ccl.x86_64: W: no-manual-page-for-binary ccl
Nice to have a manual page for command line applications.  Not essential, though.

  ccl-contrib.x86_64: W: only-non-binary-in-usr-lib
  ccl-examples.x86_64: W: only-non-binary-in-usr-lib
Why aren't these noarch, with the content in /usr/share?  Or better yet, why are they not documentation instead?  Also, are the examples actually useful?  The cocoa stuff is apple-specific, is it not?

  ccl-contrib.x86_64: W: devel-file-in-non-devel-package
   /usr/lib64/ccl/contrib/krueger/InterfaceProjects/Lisp
   IB Plugins/LispControllerPlugin/LispControllerPlugin.h
  ccl-contrib.x86_64: W: devel-file-in-non-devel-package
   /usr/lib64/ccl/contrib/krueger/InterfaceProjects/Lisp
   IB Plugins/LispControllerPlugin/LispControllerInspector.h
  ccl-contrib.x86_64: W: devel-file-in-non-devel-package
   /usr/lib64/ccl/contrib/krueger/InterfaceProjects/Lisp
   IB Plugins/LispControllerPlugin/LispController.h
  ccl-examples.x86_64: W: devel-file-in-non-devel-package
   /usr/lib64/ccl/examples/FFI/Allocating-foreign-data-on-the-lisp-heap/ptrtest.c
  ccl-examples.x86_64: W: devel-file-in-non-devel-package
   /usr/lib64/ccl/examples/FFI/Using-basic-calls-and-types/typetest.c
These are examples, so I expect this is OK, but see the previous complaint.

  ccl-contrib.x86_64: E: script-without-shebang
   /usr/lib64/ccl/contrib/krueger/InterfaceProjects/Utilities/date.lisp
Why is this executable?  How could it be run?

Regarding PPC, it's still supported as a secondary arch, although I don't know how much is happening with it currently.  As a courtesy you can ExcludeArch: the ppc stuff so they won't have to do it for you.
Comment 13 Alexander Kahl 2010-11-25 05:03:27 EST
Don't worry, I'm not "stalled" - just trying to allocate time for this. Sorry for letting you wait!
Comment 14 Alexander Kahl 2010-11-26 11:11:24 EST
Hey Jason, let's go:

(In reply to comment #12)
>   ccl.x86_64: W: executable-stack /usr/lib64/ccl/lx86cl64
> This implies that there will be selinux problems.
Sure? I can execute the kernel in enforcing mode without any prior policy tampering :)

>   ccl.x86_64: E: non-standard-executable-perm /usr/lib64/ccl/lx86cl64 0775L
> I think this is due to a mock bug and is specific to some buildsystems; if it
> doesn't happen for you then it should probably be ignored.
Yeah, so it seems - local building doesn't seem to cause this.

>   ccl.x86_64: W: no-manual-page-for-binary ccl
> Nice to have a manual page for command line applications.  Not essential,
> though.
True.

>   ccl-contrib.x86_64: W: only-non-binary-in-usr-lib
>   ccl-examples.x86_64: W: only-non-binary-in-usr-lib
> Why aren't these noarch, with the content in /usr/share?  
Because it's loadable examples / contributed code with hardwired load paths in the heap image, please see `ccl-1.5/linuxx86/ccl/l1-pathnames.lisp', global variable *module-search-path*. 

> Or better yet, why
> are they not documentation instead?  
I could inject some code into the image that modifies the paths.. but loading files from %_docdir feels wrong somehow.

%{_datadir}/ccl should be the way to go as %{_datadir}/common-lisp/source is CL implementation-independent. 
Please share your thoughts - is using ccl's home in %_libdir really that problematic?

> Also, are the examples actually useful? 
The FFI examples, definitely. The gtk clock looks ugly but works... what do you actually mean by "useful"? :)

> The cocoa stuff is apple-specific, is it not?
Hmm right, I could leave that out if you want - have a note about this in >> README.Fedora?

>   ccl-contrib.x86_64: W: devel-file-in-non-devel-package
>    /usr/lib64/ccl/contrib/krueger/InterfaceProjects/Lisp
>    IB Plugins/LispControllerPlugin/LispControllerPlugin.h
>   ccl-contrib.x86_64: W: devel-file-in-non-devel-package
>    /usr/lib64/ccl/contrib/krueger/InterfaceProjects/Lisp
>    IB Plugins/LispControllerPlugin/LispControllerInspector.h
>   ccl-contrib.x86_64: W: devel-file-in-non-devel-package
>    /usr/lib64/ccl/contrib/krueger/InterfaceProjects/Lisp
>    IB Plugins/LispControllerPlugin/LispController.h
>   ccl-examples.x86_64: W: devel-file-in-non-devel-package
>   
> /usr/lib64/ccl/examples/FFI/Allocating-foreign-data-on-the-lisp-heap/ptrtest.c
>   ccl-examples.x86_64: W: devel-file-in-non-devel-package
>    /usr/lib64/ccl/examples/FFI/Using-basic-calls-and-types/typetest.c
> These are examples, so I expect this is OK, but see the previous complaint.
The C source files get compiled and their results directly accessed from the lisp ones (-> FFI = Foreign Function Interface), moving them where they probably belong (%_datadir) won't make the warnings go away anyway.

>   ccl-contrib.x86_64: E: script-without-shebang
>    /usr/lib64/ccl/contrib/krueger/InterfaceProjects/Utilities/date.lisp
> Why is this executable?  How could it be run?
The executable bit is spurious, fixed in next release.

> Regarding PPC, it's still supported as a secondary arch, although I don't know
> how much is happening with it currently.  As a courtesy you can ExcludeArch:
> the ppc stuff so they won't have to do it for you.
You're right, added in next release.
Comment 15 Alexander Kahl 2010-11-26 11:20:10 EST
Spec URL: http://akahl.fedorapeople.org/ccl/ccl.spec
SRPM URL: http://akahl.fedorapeople.org/ccl/ccl-1.5-3.fc14.src.rpm

Changes:
- removed %%clean section
- removed explicit BuildRoot
- removed spurious executable bit from
  contrib/krueger/InterfaceProjects/Utilities/date.lisp
- added ExcludeArch ppc, ppc64
Comment 16 Jason Tibbitts 2010-12-02 13:30:43 EST
I talked to the selinux folks who are quite surprised that any binary in that path works with executable stack.  Are you certain that you are exercising the portion of the code that requires it?  It really should fail unless you set the allow_execstack boolean.

BTW, by "useful" I meant remotely relevant as examples.  A bunch of cocoa stuff that you can't even run because you're on the wrong operating system doesn't sound all that useful to me.
Comment 17 Jason Tibbitts 2010-12-02 13:41:20 EST
OK, after more checking, the package requires executable heap, not executable stack.  Setting allow_execmem off causes this when running the kernel:

> /usr/lib64/ccl/lx86cl64
remap spjump: Permission denied

That means that for most installs it will be OK, but for some it won't.  I can't really say how much of an issue this is; you probably want to talk to the selinux folks.
Comment 18 Alexander Kahl 2010-12-12 08:40:47 EST
Spec URL: http://akahl.fedorapeople.org/ccl/ccl.spec
SRPM URL: http://akahl.fedorapeople.org/ccl/ccl-1.6-1.fc14.src.rpm

Changes:
- update to 1.6


Still have to figure out the executable heap stuff, trying to contact upstream.
Comment 19 Gary Byers 2010-12-13 04:27:28 EST
Like many CL implementations, CCL compiles and loads executable code into its heap.  Functions in CL are first-class objects; they can be GCed (just like strings/arrays/other data structures) or moved around in memory as the GC sees fit.  The heap (whatever heap functions are allocated in) pretty much has to be executable for this to work.

Some kinds of functions (lexical closures, see http://ccl.clozure.com/blog/?p=53 for some of the gory details) can be declared to have DYNAMIC-EXTENT, which means that their extent (lifetime) is coterminous with their scope.  Such functions are good candidates for stack-allocation, which means that whatever stack dynamic-extent closures are allocated on pretty much has to be executable as well.

Hand-coded functions which provide runtime support for compiled CCL code (something vaguely analogous to some flavors of libgcc) are accessed via a jump table at a fixed address.  (Since the functions that call these support routines - aka "subprimitives" - can move around in memory and can be far from their targets, PC-relative addressing isn't practical.)

On some platforms, it's possible to use linker scripts (or similar means) to ensure that the jump table winds up at the expected address; on Linux (at least), linker scripts tend to break whenever some obscure thing in the C library changes, and it's therefore necessary to copy the "subprimitive jump table" from wherever the linker put it to the fixed address that it's expected to reside at at runtime.  On some architectures, that jump table contains "branch" or "jump" instructions (that may require their targets to be adjusted when the instructions are copied) and the copied jump table needs execute permission.   On x86-64, the jump table just contains absolute 64-bit addresses, and there's no reason for the copy to have execute permissions; the error in comment #17 above can be avoided.

If selinux can't deal with the other issues ... well, I was always taught "if you can't say anything nice about a piece of software, you shouldn't say anything at all", so I have nothing more to say here.
Comment 20 Ruediger Landmann 2011-03-20 20:11:35 EDT
(In reply to comment #18)

> Still have to figure out the executable heap stuff, trying to contact upstream.

Alexander, is this package currently ready for review? If not, could you please set the Whiteboard to "NotReady" (and clear that once you've clarified what you need to with upstream).

If so, I'm happy to undertake a review. However, because I'm not confident that I'm understanding all the issues that you and Jason have discussed, I would need to defer final approval of the package to Jason or another reviewer more experienced than me. 

Cheers
Rudi
Comment 21 Mattia Verga 2012-01-21 14:46:23 EST
Alexander, is this review request still valid? Can you provide working URLs for spec and src.rpm?
Comment 22 Mattia Verga 2012-02-04 14:50:55 EST
Closing as DEADREVIEW