Description of problem:
bin/coqtop -boot -native-compiler -compile theories/Init/Datatypes.v -noinit -R theories Coq
rm -f theories/Init/Logic_Type.glob
bin/coqtop -boot -native-compiler -compile theories/Init/Logic_Type.v -noinit -R theories Coq
rm -f theories/Init/Nat.glob
bin/coqtop -boot -native-compiler -compile theories/Init/Nat.v -noinit -R theories Coq
Makefile.build:590: recipe for target 'theories/Init/Nat.vo' failed
make: *** [theories/Init/Nat.vo] Segmentation fault (core dumped)
make: *** [theories/Init/Nat.vo] Deleting file 'theories/Init/Nat.glob'
make: Leaving directory '/builddir/build/BUILD/coq-8.5pl1'
Makefile:158: recipe for target 'submake' failed
make: *** [submake] Error 2
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. ppc-koj ibuild --scratch f24 coq-8.5pl1-1.fc24.src.rpm
This is probably related to the new native compilation feature of coq. I'll try to track it down but, if unsuccessful, may have to turn that feature off for ppc64le.
Since I already had a ppc64 VM that I used to track down a gap bug, I tried building coq there, but it succeeded. So this bug appears on ppc64le, but not ppc64, unless my VM is somehow significantly different from real hardware. I'll try creating a ppc64le VM next. (Although access to real hardware would be very much appreciated, as these emulated CPUs are really slooooooow.)
Oh, when you spoke about VM I thought you already have access to real HW VM. There are OpenPOWER hubs across the world that provide VMs to developers, like the one we manage at Brno Technical University. You can sign up at https://docs.google.com/forms/d/1py0Fg2x9xoQP9yYzOm4DSnJ1buyQBBCWN9Sj_OZuXek/viewform
I suspect an ocaml code generation bug on ppc64le. This is not related to the new native compilation feature of coq. The segfault happens with that turned off. The segfault occurs in the function nat_of_int (see plugins/syntax/nat_syntax.ml).
On ppc64, which works, that function starts like this:
1d40: 7c 08 02 a6 mflr r0
1d44: f8 01 00 10 std r0,16(r1)
1d48: f8 21 ff 51 stdu r1,-176(r1)
1d4c: f8 81 00 88 std r4,136(r1)
1d50: f8 61 00 80 std r3,128(r1)
1d54: 7c 83 23 78 mr r3,r4
1d58: e9 62 81 08 ld r11,-32504(r2)
1d5c: f8 41 00 28 std r2,40(r1)
1d60: e8 4b 00 08 ld r2,8(r11)
1d64: e9 6b 00 00 ld r11,0(r11)
1d68: 7d 69 03 a6 mtctr r11
1d6c: 4e 80 04 21 bctrl
On ppc64le, which segfaults, that function starts like this:
1aa0: 02 00 4c 3c addis r2,r12,2
1aa4: 60 63 42 38 addi r2,r2,25440
1aa8: a6 02 08 7c mflr r0
1aac: 10 00 01 f8 std r0,16(r1)
1ab0: 51 ff 21 f8 stdu r1,-176(r1)
1ab4: 88 00 81 f8 std r4,136(r1)
1ab8: 80 00 61 f8 std r3,128(r1)
1abc: 78 23 83 7c mr r3,r4
1ac0: 08 81 82 e9 ld r12,-32504(r2)
1ac4: 18 00 41 f8 std r2,24(r1)
1ac8: a6 03 89 7d mtctr r12
1acc: 21 04 80 4e bctrl
The bctrl instruction is the one that faults, because a bogus address has been loaded into the ctr register. I notice that on ppc64, after loading r11 with -32504(r2), an additional load into r11 of 0(r11) is done, but there is no such corresponding additional load of r12 in the ppc64le code. I also notice that on ppc, the value of r2 appears to come from the caller, where on ppc64le, there are 2 additional instructions at the top of the function that set r2 from the value of r12.
Of course, I basically know squat about ppc assembly, so all of that might be utter nonsense.
I'm adding Richard Jones to the CC list. Richard, are you aware of any ocaml code generation bugs on ppc64le? Perhaps involving recursive local functions?
It wouldn't surprise me if there are still lurking ppc64le codegen
bugs. Note that Fedora 24 is still using the out-of-tree code
generator written by IBM. New in OCaml 4.03 is an in-tree code
generator for POWER written from scratch by Xavier Leroy et al.
Therefore I'm not inclined to fix code gen bugs in this version
of OCaml unless they are simple/obvious ones. We will get the
new code generator in F25.
(In reply to Jerry James from comment #4)
> Perhaps involving recursive local functions?
BTW don't overlook simple stack overflows. These are much more
common on POWER because the stack frames are larger. Try doing:
ulimit -s 65536
before the build.
why-2.36-1.fc25 frama-c-1.13-7.fc25 why3-0.87.3-1.fc25 gappalib-coq-1.3.2-2.fc25 zenon-0.8.2-5.fc25 flocq-2.5.2-4.fc25 coq-8.6-1.fc25 has been submitted as an update to Fedora 25. https://bodhi.fedoraproject.org/updates/FEDORA-2017-033c8000df
coq-8.6-1.fc25, flocq-2.5.2-4.fc25, frama-c-1.13-7.fc25, gappalib-coq-1.3.2-2.fc25, why-2.36-1.fc25, why3-0.87.3-1.fc25, zenon-0.8.2-5.fc25 has been pushed to the Fedora 25 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2017-033c8000df
coq-8.6-1.fc25, flocq-2.5.2-4.fc25, frama-c-1.13-7.fc25, gappalib-coq-1.3.2-2.fc25, why-2.36-1.fc25, why3-0.87.3-1.fc25, zenon-0.8.2-5.fc25 has been pushed to the Fedora 25 stable repository. If problems still persist, please make note of it in this bug report.