Bug 845652

Summary: Frama-c segfaults on startup
Product: [Fedora] Fedora Reporter: Jerry James <loganjerry>
Component: frama-cAssignee: Jerry James <loganjerry>
Status: CLOSED RAWHIDE QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high Docs Contact:
Priority: high    
Version: rawhideCC: c.david86, fedora-ocaml-list, loganjerry, msrader, rjones
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2012-10-10 16:45:39 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Jerry James 2012-08-03 18:11:52 UTC
Description of problem:
The new build of frama-c with OCaml 4.00.0 segfaults on startup.  I don't think this is related to the i386 segfault bug, because this segfault happens on both i386 and x86_64, and is easily reproducible in a Rawhide VM.  It looks like this:

$ frama-c -v
[ 9014.843130] frama-c[28856]: segfault at 10008 ip 08510824 spbf99cc10 error 4 in frama-c[8048000+5b4000]
$ gdb /usr/bin/frama-c
...
(gdb) run -v
Starting program: /usr/bin/frama-c -v

Program received signal SIGSEGV, Segmentation fault.
0x08510824 in camlHashtbl__replace_bucket_1257 ()
Missing separate debuginfo, use: debuginfo-install frama-c-1.7-5.fc18.i686
(gdb) bt
#0  0x08510824 in camlHashtbl__replace_bucket_1257 ()
#1  0x08511385 in camlHashtbl__replace_1253 ()
#2  0x08288612 in camlSlicing__Register__entry ()
#3  0x0826bdc1 in caml_program ()
#4  0x08548caa in caml_start_program ()
#5  0x08535f62 in caml_main ()
#6  0x0826aca5 in main ()

This could be a frama-c bug, but the version built with ocaml 3.12.1 worked, and the bytecode version built with ocaml 4.00.0 also works.  It's just the native version built with ocaml 4.00.0 that is crashing.

Since this is blocking rebuild of the why package, I'm going to temporarily ship the bytecode version, get why rebuilt, then we can revert to the native compiled version for analysis.

Version-Release number of selected component (if applicable):
ocaml-4.00.0-1.fc18.x86_64
frama-c-1.7-5.fc18.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Install frama-c (or build it locally without the hack to ship the bytecode)
2. Run "frama-c -v"
  
Actual results:
Segfault.

Expected results:
Version information on stdout.

Additional info:

Comment 1 Richard W.M. Jones 2012-08-03 18:22:34 UTC
The only known problem (possibly with the compiler,
possibly not) is this one:

http://lists.diku.dk/pipermail/cocci/2012-July/003158.html

The i386 segfault thing only happens when using Rawhide
userspace on RHEL 6 kernel, so I'm mostly ignoring that
for now.

Comment 2 Jerry James 2012-08-03 21:54:13 UTC
I had to apply a patch to frama-c due to the changed Hashtbl signature in OCaml 4.00.0, so my patch is immediately suspect.  Of course, the bytecode version works, so I'm a bit puzzled....

With the bytecode version of frama-c installed, the why build gets farther, but stops upon attempting to invoke coq because of problems with text relocations in the coq plugins.  Hmmmm.  The cmxs files ought to be compiled with -fPIC.  I see it in %{_libdir}/ocaml/Makefile.config:

SHAREDCCCOMPOPTS=-fPIC

just like with ocaml 3.12.1.  But the 3.12.1-compiled coq plugins load successfully, and the new ones don't.  Is there something else I'm supposed to do to make shared OCaml libraries be compiled with -fPIC?

Well, that's a different problem.  Sorry for cluttering up this bug.

Comment 3 Jerry James 2012-09-07 23:01:21 UTC
The problem with the coq plugins went away on its own.  I have no idea what triggered it or what solved it.

I've tried 3 completely different approaches to patching frama-c for OCaml 4's Hashtbl changes.  All 3 segfaulted in the same way.  I figured out where the segfaulting statement is and inserted some debug statements, to show the key (a string) and the value (a list of Parameters, containing the strings that are printed).  The segfault occurs while the security slicing plugin is assembling its command-line arguments.  In fact, it's on the very last one:

Hashtbl.replace "" -> -slice-force -slice-print -slicing-exported-project-postfix -slicing-project-name -slicing-keep-annotations -slice-undef-functions -slicing-level -slice-callers -slice-value -slice-wr -slice-rd -slice-pragma -slice-loop-var -slice-loop-inv -slice-assert -slice-threat -slice-return -slice-calls
Segmentation fault (core dumped)

where the bytecode version survives that call and goes on to work on the aorai plugin:

Hashtbl.replace "Output Messages" -> -aorai-verbose
Hashtbl.replace "Output Messages" -> -aorai-debug -aorai-verbose
...

(Yes, that's an empty string as the key in the faulting call.)  The code repeatedly looks up the list associated with a key, adds a new Parameter object to the front of the list, then calls Hashtbl.replace with the key and the modified list.  The faulting call passes the longest list of any passed to this call (18 elements); I can't imagine why that would matter, but thought I ought to mention it.

I'm open to any suggestions for how to debug this further.

Comment 4 Jerry James 2012-09-11 20:22:20 UTC
Hey, GDB can read the debug output from OCaml 4.00.0!  Wow, that made debugging this a *whole* lot easier!  I wish I'd known that sooner.  Are there any plans to migrate Fedora ocaml packages over to producing debuginfo packages?  It's not like GDB is fully functional with OCaml sources, but it can do some basic stuff like displaying a stacktrace.

This turns out to be a frama-c bug after all.  There is code in src/type/type.ml that computes a hash value thusly:

          let tag = Obj.tag x in
          if tag = 0 then
            0
          else if tag = Obj.closure_tag then
            (* assumes that the first word of a closure does not change in
               any way (even by Gc.compact invokation). *)
            Obj.magic (Obj.field x 0)
          else
            Hashtbl.hash x

The part that operates on the closure is leading to this segfault, although I still don't know exactly why.  But if I comment out that part, frama-c runs instead of crashing on startup.  I have sent email to upstream about this.

Comment 5 Richard W.M. Jones 2012-09-11 20:39:40 UTC
(In reply to comment #4)
> Hey, GDB can read the debug output from OCaml 4.00.0!  Wow, that made
> debugging this a *whole* lot easier!  I wish I'd known that sooner.  Are
> there any plans to migrate Fedora ocaml packages over to producing debuginfo
> packages?  It's not like GDB is fully functional with OCaml sources, but it
> can do some basic stuff like displaying a stacktrace.

Yup, we should enable debuginfo.  However last time I looked,
the RPM macros wouldn't believe that '*.ml' was a source file
(and I didn't look any deeper -- possibly a simple change).

Comment 6 Jerry James 2012-10-10 16:45:39 UTC
Upstream confirmed that commenting out the offending code is correct.  Fixed in Rawhide and F-18.