Red Hat Bugzilla – Full Text Bug Listing
|Summary:||ocamlopt.opt (from Rawhide) segfaults when run on a RHEL 6 32 bit kernel|
|Product:||[Fedora] Fedora||Reporter:||Richard W.M. Jones <rjones>|
|Component:||ocaml||Assignee:||Richard W.M. Jones <rjones>|
|Status:||CLOSED RAWHIDE||QA Contact:||Fedora Extras Quality Assurance <extras-qa>|
|Version:||19||CC:||acathrow, c.david86, fedora-ocaml-list, mbooth, rjones, tcallawa|
|Fixed In Version:||Doc Type:||Bug Fix|
|Doc Text:||Story Points:||---|
|Last Closed:||2014-01-06 14:40:17 EST||Type:||Bug|
Description Richard W.M. Jones 2012-07-27 04:10:51 EDT
Description of problem:

ocamlopt.opt got signal and exited
make: *** [inspect_vm] Error 2
make: *** Waiting for unfinished jobs....

(probably when building ocaml/examples)

Version-Release number of selected component (if applicable):

libguestfs 1.19.26, Rawhide when building in Koji

How reproducible:

Happened twice (three times??)
Comment 1 Richard W.M. Jones 2012-07-27 05:29:05 EDT
Cannot reproduce on Rawhide (64 bit) even with latest OCaml and glibc.

Builds which failed:

http://koji.fedoraproject.org/koji/taskinfo?taskID=4332739 (i686)
http://koji.fedoraproject.org/koji/taskinfo?taskID=4332668 (i686)
Comment 2 Richard W.M. Jones 2012-07-27 05:50:27 EDT
http://koji.fedoraproject.org/koji/taskinfo?taskID=4335505 (also 32 bit)

All 3 failures were on 32 bit only, so I'm installing a 32 bit Rawhide VM for testing.
Comment 3 Richard W.M. Jones 2012-07-27 12:22:52 EDT
I cannot reproduce this in a VM.
Comment 4 seth vidal 2012-07-27 14:08:58 EDT
Can you try reproducing it in a RHEL 6 VM?

Also, this is what I see in dmesg when I do the build using mock directly on one of the builders:

ocamlopt.opt: segfault at 55b67514 ip 0000000008196fe3 sp 00000000ffb3823c error 4 in ocamlopt.opt[8048000+163000]
conftest: segfault at 1 ip 00007f9c4d49df05 sp 00007fff66d87540 error 4 in libc-client.so.2007[7f9c4d45c000+105000]
conftest: segfault at 1 ip 00007ffe966d5f05 sp 00007fffbad02040 error 4 in libc-client.so.2007[7ffe96694000+105000]
php: segfault at 0 ip 000000000044ff0c sp 00007fff297faa70 error 4 in php[400000+309000]
ocamlopt.opt: segfault at 55b67514 ip 0000000008196fe3 sp 00000000ff826bec error 4 in ocamlopt.opt[8048000+163000]

So this isn't Koji-specific at least.
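For readers decoding kernel log lines like the ocamlopt.opt ones in this comment, here is a minimal sketch in Python. It is not something used in this bug. It assumes the usual x86 "segfault at ADDR ip IP sp SP error CODE in OBJECT[BASE+SIZE]" layout and the standard x86 page-fault error bits (bit 0: page was present, bit 1: write access, bit 2: user mode). The function name decode_segfault is made up for illustration.

```python
import re

# Matches the x86 kernel message:
#   PROG: segfault at ADDR ip IP sp SP error CODE in OBJECT[BASE+SIZE]
# All numeric fields, including the error code, are printed in hex.
SEGFAULT_RE = re.compile(
    r"(?P<prog>\S+): segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) "
    r"error (?P<err>[0-9a-f]+) "
    r"in (?P<obj>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_segfault(line):
    """Return a dict describing one segfault line, or None if it does not match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    ip = int(m.group("ip"), 16)
    base = int(m.group("base"), 16)
    err = int(m.group("err"), 16)
    return {
        "prog": m.group("prog"),
        "fault_addr": int(m.group("addr"), 16),  # address that was accessed
        "ip_offset": ip - base,                  # offset of ip inside the mapping
        "page_present": bool(err & 1),           # 0 means no page was mapped there
        "write": bool(err & 2),                  # 0 means the access was a read
        "user_mode": bool(err & 4),              # 1 means the fault came from user space
    }


if __name__ == "__main__":
    line = ("ocamlopt.opt: segfault at 55b67514 ip 0000000008196fe3 "
            "sp 00000000ffb3823c error 4 in ocamlopt.opt[8048000+163000]")
    info = decode_segfault(line)
    print(hex(info["ip_offset"]), info)
```

For the ocamlopt.opt lines, "error 4" decodes as a user-mode read of an unmapped page, and the instruction pointer lies inside the ocamlopt.opt mapping itself rather than in a shared library.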
Comment 5 Richard W.M. Jones 2012-07-27 14:22:41 EDT
Thanks for testing. Could be the same thing as the bug that stops coq from building, which we think is a bug in the i686 code generator in OCaml 4.00.0.
Comment 6 Richard W.M. Jones 2012-07-27 17:59:48 EDT
The location of the segfault is in _C_ code (not generated OCaml code).

asmrun/compact.c: invert_pointer_at line 80:

while (Ecolor (*hp) == 0) hp = (word *) *hp;

(specifically it happens while dereferencing *hp).

However this is the garbage collector 'compact' module, so this probably just indicates that some OCaml code corrupted the OCaml heap and we don't find out until the GC runs.
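To illustrate why a loop of that shape only crashes long after the heap was damaged, here is a toy model in Python. It is not the OCaml runtime code. It assumes that the two low bits of a word hold the color, so any word whose low bits are zero is treated as a pointer and followed. The dict standing in for memory, the addresses and the header value are all invented. The garbage value 0x55b67514 is borrowed from the dmesg output only because its low bits happen to be zero.

```python
def ecolor(word):
    """Color bits of a word (assumed here to be the two lowest bits)."""
    return word & 3


def follow_chain(memory, addr):
    """Follow pointer-like words until a word with nonzero color bits is found.

    `memory` maps addresses to words. A lookup of an unmapped address raises
    KeyError, which plays the role of the segfault in this model.
    """
    steps = 0
    while ecolor(memory[addr]) == 0:
        addr = memory[addr]  # the word looks like a pointer, so follow it
        steps += 1
    return addr, steps


if __name__ == "__main__":
    # Intact chain: two pointer words, then a header-like word (low bits = 3).
    memory = {0x1000: 0x1004, 0x1004: 0x1008, 0x1008: 0x0403}
    print(follow_chain(memory, 0x1000))

    # Corrupt one link with a value that still looks like a pointer.
    memory[0x1004] = 0x55B67514
    try:
        follow_chain(memory, 0x1000)
    except KeyError as exc:
        print("model 'segfault' at", hex(exc.args[0]))
```

In the model, the bad word is written at one point and the failure only shows up when the chain is next walked, which matches the suspicion that the corruption happened earlier and the GC merely discovered it.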
Comment 7 Richard W.M. Jones 2012-07-28 11:41:54 EDT
I updated Rawhide to OCaml 4.00.0 official release, but the bug still manifests itself exactly the same way.
Comment 8 Richard W.M. Jones 2012-07-30 11:32:42 EDT
(In reply to comment #4)
> Can you try reproducing it in an rhel 6 vm?

I took the F18 32 bit guest over and booted it on a RHEL 6.3 host. libguestfs builds correctly (ie. the bug is not exhibited).

However I'm wondering how closely my environment matches the Koji environment:

(1) What host kernel is used?
    => in my case: 2.6.32-279.el6.x86_64

(2) What guest kernel is used (ie. the environment where mock runs)?
    => in my case: 3.3.4-5.fc17.i686.PAE (Rawhide kernel doesn't boot for unrelated reasons)

I suspect that on the real Koji, (2) is different because what Koji does is to boot a RHEL 6 guest with a mock chroot containing F18 packages, whereas I've got a real F17/18 guest.
Comment 9 Richard W.M. Jones 2012-07-31 14:56:10 EDT
I managed to reproduce this. I used a RHEL 6, 32 bit VM. I installed mock and built libguestfs-1.19.26-2.fc18.src.rpm in Rawhide, ie:

$ ls -l /etc/mock/default.cfg
lrwxrwxrwx. 1 root root 23 Jul 31 13:16 /etc/mock/default.cfg -> fedora-rawhide-i386.cfg
$ mock -D '%libguestfs_buildnet 1' -D '%libguestfs_runtests 0' --rebuild libguestfs-1.19.26-2.fc18.src.rpm

So the bug has something to do with ocamlopt.opt from Rawhide when run on a RHEL 6 32 bit kernel.
Comment 10 Richard W.M. Jones 2012-08-01 06:00:10 EDT
Needless to say, going into the mock chroot and building by hand does not exhibit the bug. Gahhhhhh ....
Comment 11 Fedora End Of Life 2013-04-03 13:23:03 EDT
This bug appears to have been reported against 'rawhide' during the Fedora 19 development cycle. Changing version to '19'. (As we did not run this process for some time, it could affect also pre-Fedora 19 development cycle bugs. We are very sorry. It will help us with cleanup during Fedora 19 End Of Life. Thank you.) More information and reason for this action is here: https://fedoraproject.org/wiki/BugZappers/HouseKeeping/Fedora19
Comment 12 Richard W.M. Jones 2014-01-06 14:40:17 EST
This is a stack alignment problem.

Worked around in Rawhide:
http://pkgs.fedoraproject.org/cgit/ocaml.git/commit/?id=179ac32d01818da5252cc100e9b97f347568727d

Upstream is working on a fix:
http://caml.inria.fr/mantis/view.php?id=6038