Red Hat Bugzilla – Bug 494026
mono build is blocked by ppc-build.
Last modified: 2010-01-25 11:14:44 EST
As seen here http://koji.fedoraproject.org/koji/taskinfo?taskID=1260389
packaging mono is currently blocked by the ppc-build.
This has also been discussed here http://fcp.surfsite.org/modules/newbb/viewtopic.php?topic_id=68952&viewmode=flat&order=ASC&start=0
I opened this bug report to track the issue, because until it is fixed none of the new features of mono 2.4 can make it into Fedora (e.g. the ASP.NET MVC addin for MonoDevelop).
I've spoken to some of the devs at Novell who cannot reproduce the problem on any of their systems (including whatever they use for opensuse). They have asked if we can run the build through gdb and report back the problems.
I've asked spot, but not had any response.
Blocking ppc and F11 trackers
The last successful build was RC1. The first failed build was RC2. RC2 did change ppc specific bits - ppc64 TLS was added, plus support for NPTL on both ppc32 and ppc64.
I pulled the mono-2.4 release and ran into a problem where gcc-4.4 is more pedantic than gcc-4.2/3.
For example, very early in the build we see:
In file included from ./include/private/gc_priv.h:95,
./include/private/gc_locks.h: In function ‘GC_test_and_set’:
./include/private/gc_locks.h:162: error: ‘asm’ operand has impossible constraints
make: *** [alloc.lo] Error 1
"1:\tlwarx %0,0,%3\n" /* load and reserve */
"\tcmpwi %0, 0\n" /* if load is */
"\tbne 2f\n" /* non-zero, return already set */
"\tstwcx. %2,0,%1\n" /* else store conditional */
"\tbne- 1b\n" /* retry if lost reservation */
"\tsync\n" /* import barrier */
"2:\t\n" /* oldval is zero if we set */
: "=&r"(oldval), "=p"(addr)
: "r"(temp), "1"(addr)
The latest from bdwgc is a little different:
__asm__ __volatile__(
"1:lwarx %0,0,%1\n" /* load and reserve */
"cmpwi %0, 0\n" /* if load is */
"bne 2f\n" /* non-zero, return already set */
"stwcx. %2,0,%1\n" /* else store conditional */
"bne- 1b\n" /* retry if lost reservation */
"2:\n" /* oldval is zero if we set */
: "=&r"(oldval)
: "r"(addr), "r"(temp)
: "memory", "cr0");
Note that the "=p"(addr) output is gone, so the back reference "1"(addr) is no longer needed.
I would guess gcc-4.4 is choking on the "=p"(addr) constraint.
Does mono need to update to a newer version of Boehm GC?
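For anyone following along who has not looked at GC_test_and_set before, here is a minimal portable sketch of what the routine is meant to do. It is not the attached patch and not bdwgc code: the name sketch_test_and_set is hypothetical, it assumes the lock word only ever holds 0 or 1, and it uses GCC's __sync_lock_test_and_set builtin rather than hand-written lwarx/stwcx. assembly.

```c
/* Hypothetical helper for illustration; not part of Mono or bdwgc.
 * Atomically writes 1 to *addr and returns the value that was there before.
 * A return of 0 means the caller acquired the lock; non-zero means it was
 * already held. Assumes the lock word only ever holds 0 or 1. */
int sketch_test_and_set(volatile unsigned int *addr)
{
    /* GCC builtin: atomic exchange with acquire semantics. */
    return (int)__sync_lock_test_and_set(addr, 1u);
}
```

Unlike the assembly above, the builtin stores unconditionally instead of skipping the store when the word is already non-zero; with a 0/1 lock word the caller sees the same result. Calling it twice on a word that starts at 0 returns 0 the first time and 1 the second.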
Created attachment 338995 [details]
patch to GC_test_and_set
This patch fixes the compile error with gcc-4.4
The above patch allows the make for mono-2.4 to complete given the configure options:
./configure --prefix=/usr/local --with-moonlight=no
Now I see regression errors in the make check.
Test run: image=/home/ppcteam/mono-ppc/mono-131222/mono/mini/basic-long.exe, opts=
Test 'test_2_neg' failed result (got 3, expected 2).
Test 'test_0_neg_large' failed result (got 1, expected 0).
Test 'test_1_simple_neg' failed result (got 0, expected 1).
Results: total tests: 88, failed: 3, cfailed: 0 (pass: 96.59%)
Elapsed time: 0.010395 secs (0.001288, 0.009107), Code size: 26908
We get different results with older versions of GCC (4.1/4.3); with those we see:
365 test(s) passed. 2 test(s) did not pass.
So there could be a latent PPC bug in GCC 4.4, but it could also be due to other system differences.
I realized that I was using the trunk instead of the 2.4 branch, so I am trying again with
I also updated the gcc packages from yum.
I've alerted the Novell people (via the mono developers list) of this problem as well as posting this BZ url.
Hopefully, we'll be able to get a ppc build running soon.
/me wishes someone would donate a PPC box to him so he could help get this fixed....
We can build mono-2.4 from the tar file on F11 with the attached patch. Working
with vargaz from monodev, this patch has been applied to trunk and mono-2-4.
But I am still seeing the basic-long failure in the make check. This is not
the case with other distros (with older gcc).
So please verify that mono-2-4 builds in the Fedora build environment; then
we can close this bug and open a new one for these make check failures.
Nope. Still not building, same problem.
Did you verify that the attached patch is applied and that your gcc is up to date?
Also, this may be a clue:
"make: execvp: mcs: Permission denied"
Please check your security settings!
Link to koji build page:
From the build.log:
Patch #8 (mono-24-ppc-glocks.patch):
+ /bin/cat /builddir/build/SOURCES/mono-24-ppc-glocks.patch
+ /usr/bin/patch -s -p1 -b --suffix .glocks-ppc --fuzz=0
Checked cvs to ensure this is the patch provided in this bug report.
From the root.log:
DEBUG util.py:256: gcc.ppc 0:4.4.0-0.32 gcc-c++.ppc 0:4.4.0-0.32
I note that the spec currently has moonlight enabled:
%configure --with-ikvm=yes --with-jit=yes --with-xen_opt=yes \
--with-moonlight=yes --disable-static --with-preview=yes \
I'm not 100% sure, but I think that "make: execvp: mcs: Permission denied" is because Paul has changed the spec file to rebootstrap the package, so /usr/bin/mcs is not being found. After that, the code tries to use the mcslite bootstrapping binaries and fails. We shouldn't have to rebootstrap, though, because releng put the old package, which had ppc builds, back into the buildroot.
Okay, bootstrap code turned off, confirmed patch has been applied. Latest build gets farther but still fails with a stack overflow on ppc:
Build task is:
OK, this is still some goofy problem specific to your build environment, because I can build mono-2.4 from the svn branch (and from the tar file with the patch). I have verified that I can compile Mono.Xml.Xsl/PatternTokenizer.cs within my F11 mono-2.4 build.
One difference is that I build with --with-moonlight=no.
Can you try with moonlight=yes please? At least we can factor that one out then. That said, when I've pushed 2.4 release and RC2 + RC3 through they had moonlight=no as well (actually they didn't have any of the moonlight options taken as the default is no).
I've tried with --with-moonlight=no and also with just::
%configure --with-moonlight=no --disable-static
In each of those cases it stops at the same point in the build.
Steven, is this error coming from mcs or some other tool?
One thing to remember is that this is being built with the bootstrapping code disabled, so we're using mcs from mono-core-2.4-RC1 in this build. (Although Paul's build.log with bootstrapping enabled showed that the bootstrapping mcs will error at a different point in the build).
Can you also confirm that the tarball before patch has md5sum:
(I noticed that the RC's and final all have the same name. :-(
Finally, if you think this is something to do with the buildsystem's environment, can you try rpm --rebuild of the source rpm on your F11 system? It's available here:
curl 'http://koji.fedoraproject.org/koji/getfile?taskID=1295214&name=mono-2.4-14.fc11.src.rpm' > mono-2.4-14.fc11.src.rpm
rpm --rebuild mono-2.4-14.fc11.src.rpm
When I add the --disable-static I see the following failure:
echo "#define XSLT_PATTERN" > Mono.Xml.Xsl/PatternParser.cs
./../../jay/jay -ct Mono.Xml.Xsl/PatternParser.jay < ./../../jay/skeleton.cs >>Mono.Xml.Xsl/PatternParser.cs
./../../jay/jay: 3 rules never reduced
./../../jay/jay: 1 shift/reduce conflict, 46 reduce/reduce conflicts.
echo "#define XSLT_PATTERN" > Mono.Xml.Xsl/PatternTokenizer.cs
cat System.Xml.XPath/Tokenizer.cs >>Mono.Xml.Xsl/PatternTokenizer.cs
MCS [basic] System.Xml.dll
Stack overflow in unmanaged: IP: 0xf8b4b54, fault addr: 0xff18bda0
Stack overflow in unmanaged: IP: 0xf8b4b54, fault addr: 0xff18abd0
Stack overflow in unmanaged: IP: 0xf8b4b54, fault addr: 0xff189ff0
Stack overflow in unmanaged: IP: 0xf8b4b54, fault addr: 0xff188e20
Stack overflow in unmanaged: IP: 0xf8b4b54, fault addr: 0xff187c50
Stack overflow in unmanaged: IP: 0xfd98510, fault addr: 0xff186ef0
Stack overflow in unmanaged: IP: 0xf8b4b54, fault addr: 0xff185ea0
Stack overflow in unmanaged: IP: 0xf8b4b54, fault addr: 0xff184cd0
Stack overflow: IP: 0xfd98510, fault addr: 0xff183f70
So the --disable-static seems to be part of the issue.
Also verified the checksum:
mono-ppc]$ md5sum mono-2.4.tar.bz2
Following onto the previous comment:
If I don't have --disable-static but do have --with-static_mono=no, I get the Stack overflow. --disable-static implies --with-static_mono=no
And if we link statically against libmono, we are able to build on ppc:
Apparently, --disable-static is not supported or tested upstream and future releases may rely on the static libs (makes for a quicker runtime as well).
Can we allow mono to have static libs in?
The --disable-static option is also not used in the OpenSUSE build, which builds fine (according to some #mono-devel irc people).
It is also not present in their spec file.
We won't ship the static libs. It would be okay to build mono itself against a static libmono.a *as a temporary workaround*. We'd definitely want to fix this by F12.
The big issue I'd want an answer to is whether this is specific to mono dynamically linking to libmono. If it's going to happen with other things that link against libmono to embed the runtime, then this is a much larger issue than if it just affects mono.
By the looks, we build against libmono.a but then don't need to bundle it. It is also only an issue with mono and nothing else - yet.
(In reply to comment #27)
> By the looks, we build against libmono.a but then don't need to bundle it.
Yep. We can rm -rf the static libraries.
> It is also only an issue with mono and nothing else - yet.
This is an oversimplification. Nothing we ship uses libmono ATM. But that doesn't mean that people using Fedora aren't embedding libmono into their applications. Since we don't know precisely what the trigger is other than linking to libmono dynamically, we don't know how far reaching this is.
Okay, new mono build is in the buildsystem:
This will be mono-2.4 final with the patch from Steven (Thanks!) and building against a static libmono.
Removing this from the F11Blocker bug and adding to the F12Blocker. Things we need to do as soon as possible and definitely before the F12 release:
* Get mono linking dynamically against libmono again.
* Try to bootstrap mono onto ppc64.
that will be a trick!
The cross build (compiling 64-bit on a 32-bit default system) requires lots of 64-bit packages that may or may not exist, plus some hacking around brain-dead pkgconfig/libtool-isms. Starting with glib-2.0 and pcre, I have had to hack pkgconfig and libtool la files for 64-bit packages that did not bother to provide them. In some cases I resorted to running configure and then hacking config.status to replace lib with lib64.
Also, you will need a 64-bit clean version of glibconfig.h, as mono depends on it to get its int/pointer casts right. Without this fix any 64-bit mono build is doomed. It seems that glib is incapable of providing a biarch clean devel package.
Finally, mkbundle is brain-dead, as it blindly execs as and ld without passing the appropriate -m64/-m32 switches. I have patches to hack around that as well.
It is obviously simpler to build 64-bit on a 64-bit primary system. I suspect it is just as hard to build a 32-bit mono on a 64-bit primary system, but I have not tried that yet.
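To illustrate why the glibconfig.h point above matters, here is a minimal sketch. It is not glib or mono code, and the helper names are hypothetical; it only assumes that the header fixes integer widths per architecture. If a 64-bit build picks up definitions sized for a 32-bit target, any pointer-sized value that passes through a 32-bit integer loses its upper half.

```c
#include <stdint.h>

/* Hypothetical helpers for illustration only; not from glib or mono. */

/* Round trip through a 32-bit integer, as definitions sized for a
 * 32-bit target would imply. */
uint64_t roundtrip_via_32bit(uint64_t value)
{
    uint32_t narrowed = (uint32_t)value;  /* upper 32 bits are discarded here */
    return (uint64_t)narrowed;
}

/* Round trip through an integer as wide as a 64-bit pointer. */
uint64_t roundtrip_via_64bit(uint64_t value)
{
    uint64_t same_width = value;          /* no bits are lost */
    return same_width;
}
```

A value that fits in 32 bits survives both round trips, which is why a mismatched header can appear to work until an address above 4 GB shows up.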
Luckily we build a ppc64 primary system, we just don't ship it by default :-)
Assuming that my next build with bootstrapping off works_, we'll just have to figure out what's going wrong with dynamic linking to libmono.
.. _: http://koji.fedoraproject.org/koji/taskinfo?taskID=1300753
Steven, I *think* I have a simpler test case.
Once you have mono-2.4 final built and installed:
gcc -o teste teste.c `pkg-config --cflags --libs mono` -lm
gcc -Wall -o test-invoke test-invoke.c `pkg-config --cflags --libs mono` -lm
These work when run on an F10 i386 box with mono-2.4 final rpms so it seems like it's related... although it could be I'm looking at a different bug now.
Do you get this if you use gmcs instead of mcs?
Just checked and this has not been fixed:
As discussed at today's blocker bug review meeting, since we have a current 'workaround' (actually blessed by upstream) with no catastrophic consequences, can't consider this blocking F12 release. Dropping to F12Target.
Fedora Bugzappers volunteer triage team
Steven, do you have the time to look at this at some point or are you terribly busy? By F13, ppc is going to be a secondary arch and we'll probably go to dynamic linking for the F13 rawhide cycle whether or not this bug is fixed.