Bug 1658940 - Firefox fails to build on arm - /usr/bin/ld: final link failed: memory exhausted
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Fedora
Classification: Fedora
Component: firefox
Version: rawhide
Hardware: armv7hl
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Peter Robinson
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Duplicates: 1641623
Depends On:
Blocks: ARMTracker
 
Reported: 2018-12-13 08:39 UTC by Martin Stransky
Modified: 2024-01-22 11:09 UTC (History)
CC List: 16 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2024-01-22 11:09:13 UTC
Type: Bug
Embargoed:


Attachments
failed build log with cairo failure (3.56 MB, text/plain)
2019-01-10 09:49 UTC, Peter Robinson
Fedora 29 build failure (3.56 MB, text/plain)
2019-01-11 00:00 UTC, Peter Robinson

Description Martin Stransky 2018-12-13 08:39:19 UTC
Description of problem:
https://kojipkgs.fedoraproject.org//work/tasks/122/31420122/build.log

422:20.46 toolkit/library/symverscript.stub
422:20.84 toolkit/library/libxul.so
447:08.94 /usr/bin/ld: final link failed: memory exhausted


Version-Release number of selected component (if applicable):
firefox-64.0

Comment 1 Peter Robinson 2019-01-08 11:28:50 UTC
Please at least add it to the ARMTracker so ARM people are aware of the issue :)

Comment 2 Peter Robinson 2019-01-10 04:28:16 UTC
Martin: I've been trying to fix/reproduce this. I'm currently seeing a build issue in the bundled cairo, which looks like it's trying to use NEON again. Fedora ARMv7 doesn't enable NEON by default; libraries are expected to use runtime detection/fast paths rather than explicitly enable it. It looks like we need to explicitly pass -DHAVE_ARM_NEON=0, but none of the several ways I've tried (below) appears to work. Can you provide some direction/assistance here?


@@ -435,6 +432,10 @@ echo "ac_add_options --with-system-libvpx" >> .mozconfig
 echo "ac_add_options --without-system-libvpx" >> .mozconfig
 %endif
 
+%ifarch {arm}
+echo "export HAVE_ARM_NEON=0" >> .mozconfig
+%endif
+
 %ifarch s390 s390x
 echo "ac_add_options --disable-ion" >> .mozconfig
 %endif
@@ -511,7 +512,12 @@ echo "ac_add_options --enable-linker=gold" >> .mozconfig
 export RUSTFLAGS="-Cdebuginfo=0"
 %endif
 export CFLAGS=$MOZ_OPT_FLAGS
+%ifarch %{arm}
+export CXXFLAGS="$MOZ_OPT_FLAGS -DHAVE_ARM_NEON=0"
+%endif
+%ifnarch %{arm}
 export CXXFLAGS=$MOZ_OPT_FLAGS
+%endif
 export LDFLAGS=$MOZ_LINK_FLAGS
 
 export PREFIX='%{_prefix}'

Comment 3 Martin Stransky 2019-01-10 09:11:36 UTC
Can you provide me a build log? I didn't see that on our koji builds; they failed at the linker.

Comment 4 Peter Robinson 2019-01-10 09:47:52 UTC
(In reply to Martin Stransky from comment #3)
> Can you provide me a build log? I didn't see that on our koji builds; they
> failed at the linker.

https://koji.fedoraproject.org/koji/taskinfo?taskID=31909938

Will attach the build.log too as the failed scratch builds get cleaned up quickly.

Comment 5 Peter Robinson 2019-01-10 09:49:27 UTC
Created attachment 1519714 [details]
failed build log with cairo failure

Comment 6 Martin Stransky 2019-01-10 10:08:29 UTC
I see, that's rawhide. I prefer to fix F29/28 first and then look at rawhide, as it brings new failures.
Let's concentrate on the actual showstopper, which is the memory exhaustion - we can't do any builds with that.

Comment 7 Peter Robinson 2019-01-10 10:19:04 UTC
(In reply to Martin Stransky from comment #6)
> I see, that's rawhide. I prefer to fix F29/28 first and then look at rawhide,
> as it brings new failures.
> Let's concentrate on the actual showstopper, which is the memory exhaustion -
> we can't do any builds with that.

Sure, but you hadn't previously mentioned that's what you preferred, and I normally work on rawhide first and then roll backwards.

I'll take a look at f29 then, but either way the build is explicitly enabling NEON, which it should not, so we need to deal with that separately too. My question above still stands and is relevant on all branches.

Comment 8 Peter Robinson 2019-01-11 00:00:52 UTC
Created attachment 1519959 [details]
Fedora 29 build failure

https://koji.fedoraproject.org/koji/taskinfo?taskID=31939286

Same Cairo failure as per rawhide

Comment 9 Martin Stransky 2019-01-11 08:44:52 UTC
I don't understand why the same package (Firefox 64) compiled fine before and is broken now. I haven't made any arm-related changes to the package, so there's no reason for that unless the build system was changed somehow.

Can you please try to build the firefox-64.0-2.fc29 version (build task was https://koji.fedoraproject.org/koji/taskinfo?taskID=31420059)? That's the package that failed with memory exhaustion and led to this bug report.

If firefox-64.0-2.fc29 now fails to build because of cairo/NEON, there may be something wrong with the build system.

Comment 10 Peter Robinson 2019-01-11 12:42:55 UTC
I've submitted a scratch build as follows:

koji build --scratch --arch-override=armv7hl f29 git+https://src.fedoraproject.org/rpms/firefox.git#3336f2b99462836caf87ad455525be1a20b05809
Created task: 31956796
Task info: https://koji.fedoraproject.org/koji/taskinfo?taskID=31956796

That's based on the build above. No doubt a number of things could have changed in the last month (the gcc changes/improvements for one), but at this point it's not even getting to the linking stage.

Comment 11 Martin Stransky 2019-01-11 15:19:36 UTC
I see your build is still running, so I hope the cairo failure you see is a regression introduced in the 64.0-7 package - it will be easy to find then.

Comment 12 Peter Robinson 2019-01-12 00:06:16 UTC
So it appears to be a regression introduced since -2

435:10.72 In file included from /builddir/build/BUILD/firefox-64.0/objdir/dom/canvas/Unified_cpp_dom_canvas5.cpp:101:
435:10.72 /builddir/build/BUILD/firefox-64.0/dom/canvas/WebGLUniformLocation.cpp: In member function 'JS::Value mozilla::WebGLUniformLocation::GetUniform(JSContext*) const':
435:10.72 /builddir/build/BUILD/firefox-64.0/dom/canvas/WebGLUniformLocation.cpp:177:32: note: parameter passing for argument of type 'JS::MutableHandle<JS::Value>' changed in GCC 7.1
435:10.72              if (!dom::ToJSValue(js, boolBuffer, elemSize, &val)) {
435:10.72                   ~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
435:16.00 toolkit/library/symverscript.stub
435:16.44 toolkit/library/libxul.so
461:37.52 /usr/bin/ld: final link failed: memory exhausted
461:37.52 collect2: error: ld returned 1 exit status
461:37.52 gmake[4]: *** [/builddir/build/BUILD/firefox-64.0/config/rules.mk:712: libxul.so] Error 1
461:37.52 gmake[3]: *** [/builddir/build/BUILD/firefox-64.0/config/recurse.mk:74: toolkit/library/target] Error 2
461:37.52 gmake[2]: *** [/builddir/build/BUILD/firefox-64.0/config/recurse.mk:34: compile] Error 2
461:37.53 gmake[1]: *** [/builddir/build/BUILD/firefox-64.0/config/rules.mk:431: default] Error 2
461:37.53 gmake: *** [client.mk:125: build] Error 2
461:37.57 287 compiler warnings present.

Comment 13 Martin Stransky 2019-01-12 08:59:49 UTC
Great! I'll fix that regression if you manage to address the memory issue.

Comment 14 Jeremy Linton 2019-01-29 17:10:51 UTC
Tried to reproduce this, but in the current rawhide rustc is crashing fairly early.

see bz:1670502

Comment 15 Martin Stransky 2019-02-06 12:58:05 UTC
Rawhide has recently been broken by the gcc9 update.

Comment 16 Jeremy Linton 2019-02-08 01:26:53 UTC
OK, so it's possible to get past the memory exhaustion by disabling debuginfo and optimization (-C opt-level=0 -C debuginfo=0). This is apparently a problem on all 32-bit arches at this point.

https://github.com/rust-lang/rust/issues/45854
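
For reference, a minimal sketch of how the two flags above could be applied only on 32-bit targets. The helper name rust_flags_for_arch and the architecture patterns are assumptions made for illustration; the spec file in this bug already exports RUSTFLAGS, but whether every rustc invocation in the Firefox build honors it is not confirmed here.

```shell
# Sketch only: pick rustc flags by target architecture.
# rust_flags_for_arch is a hypothetical helper, not part of the Fedora spec.
rust_flags_for_arch() {
    case "$1" in
        # 32-bit targets: give up optimization and debuginfo to save memory.
        armv[67]*|i?86)
            printf '%s\n' "-Copt-level=0 -Cdebuginfo=0"
            ;;
        # Everything else: no extra flags in this sketch.
        *)
            :
            ;;
    esac
}

# Apply the result for the machine the build runs on.
RUSTFLAGS=$(rust_flags_for_arch "$(uname -m)")
export RUSTFLAGS
```

Keeping the decision in one function makes it easy to check what a given builder would get, for example `rust_flags_for_arch armv7l`, without starting a build.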

Comment 17 Martin Stransky 2019-02-08 07:51:22 UTC
(In reply to Jeremy Linton from comment #16)
> OK, so it's possible to get past the memory exhaustion by disabling debuginfo
> and optimization (-C opt-level=0 -C debuginfo=0). This is apparently a
> problem on all 32-bit arches at this point.
> 
> https://github.com/rust-lang/rust/issues/45854

AFAIK the problem here is linking of libxul.so and it's not related to rust (at least not directly). Rust build failure was Bug 1523912.

Comment 18 Jeremy Linton 2019-02-19 22:53:18 UTC
Spent some more time trying to get a clean build. A couple of comments: the memory exhaustion in libxul can be avoided if all the files in that link pass are stripped with '-xd', which strips all local and debug symbols. This happens with both the gold and the normal BFD linker. It doesn't help that the rust library is a GB by itself before stripping.

I also continue to have issues with the rust pass, and need the opt-level and debuginfo reset, the LinuxSignal.h patch updated, and a few other tweaks. If/when I get a clean build from the .spec file I will post the delta. At the moment I've got some ugly hacks to work around problems with the firefox profiler and the fact that armv8 apparently doesn't trigger {arm} stanzas in the .spec files.
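
A rough sketch of the stripping step described above, assuming the link inputs are the .o and .a files under the object directory. The function name strip_link_inputs and the STRIP override are hypothetical, and where such a step would hook into the Firefox build before the libxul link is not covered in this bug.

```shell
# Sketch only: strip local and debug symbols from link inputs under a directory.
# strip_link_inputs and the STRIP override are hypothetical names.
strip_link_inputs() {
    dir=$1
    # GNU strip: -x discards non-global symbols, -d removes debugging symbols.
    # Set STRIP=echo for a dry run that only prints what would be stripped.
    find "$dir" -type f \( -name '*.o' -o -name '*.a' \) \
        -exec "${STRIP:-strip}" -x -d {} +
}
```

A dry run such as `STRIP=echo strip_link_inputs objdir` lists the affected files first, which is a cheap sanity check before modifying build outputs in place.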

Comment 19 Jeremy Linton 2019-03-04 22:16:57 UTC
Well, at the moment I'm at a loss as to why it's failing in koji. I'm going to spin up a different environment and see if I can duplicate it.

Anyway, right now its still failing with:

662:45.59 /usr/bin/ld.gold: fatal error: libxul.so: mmap: failed to allocate 562853416 bytes for output file: Cannot allocate memory

which should _NOT_ be happening given that I've got `-Wl,--no-mmap-output-file` in MOZ_LINK_FLAGS. In fact the whole thing is odd, since I'm sitting at just about 2G of address space utilization when I build locally. It almost looks like my flags aren't being propagated through to the autogenerated build files.

Anyway, the current build tweaks are roughly:

MOZ_LINK_FLAGS="-Wl,--no-keep-memory -Wl,--no-keep-files-mapped -Wl,--no-map-whole-files -Wl,--no-mmap-output-file"


MOZ_RUST_DEFAULT_FLAGS="-Cdebuginfo=0 -Copt-level=0"


I have a sed to turn off NEON support for ycbcr; otherwise it seems gas has problems with jumps that are too far away.

sed -i -e "s/MOZILLA_MAY_SUPPORT_NEON/MOZILLA_MAY_SUPPORT_NEONXXXX/g" gfx/ycbcr/*


I've also replaced patch415 with:


+++ firefox-66.0/mfbt/LinuxSignal.h     2019-02-19 22:32:03.127639819 +0000
@@ -22,7 +22,7 @@ __attribute__((naked)) void SignalTrampo
                                              void* aContext) {
   asm volatile("nop; nop; nop; nop" : : : "memory");
 
-  asm volatile("b %0" : : "X"(H) : "memory");
+  H(aSignal, aInfo, aContext);
 }
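
One way to test the suspicion that MOZ_LINK_FLAGS is not reaching the generated build files is to look for each flag in a saved build log. The sketch below assumes the log contains the full link command line (that is, the build was verbose enough to echo it); missing_link_flags is a hypothetical helper, and the flag list is copied from the tweaks in this comment.

```shell
# Sketch only: report which linker flags never appear in a saved build log.
# missing_link_flags is a hypothetical helper name.
MOZ_LINK_FLAGS="-Wl,--no-keep-memory -Wl,--no-keep-files-mapped -Wl,--no-map-whole-files -Wl,--no-mmap-output-file"

missing_link_flags() {
    log=$1
    for flag in $MOZ_LINK_FLAGS; do
        # -F matches a fixed string, -q suppresses output, and -e makes grep
        # accept a pattern that starts with a dash.
        grep -qF -e "$flag" "$log" || printf '%s\n' "$flag"
    done
}
```

Running `missing_link_flags build.log` prints one line per flag that was never seen; empty output only shows that each flag appears somewhere in the log, not that it was applied to the libxul.so link itself.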

Comment 20 Peter Robinson 2019-04-29 17:58:53 UTC
For reference a new gcc bug: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90273

Comment 21 Peter Robinson 2019-05-19 17:30:11 UTC
*** Bug 1641623 has been marked as a duplicate of this bug. ***

Comment 22 Ben Cotton 2019-10-31 19:17:59 UTC
This message is a reminder that Fedora 29 is nearing its end of life.
Fedora will stop maintaining and issuing updates for Fedora 29 on 2019-11-26.
It is Fedora's policy to close all bug reports from releases that are no longer
maintained. At that time this bug will be closed as EOL if it remains open with a
Fedora 'version' of '29'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 29 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora 
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 23 Ben Cotton 2019-11-27 22:34:28 UTC
Fedora 29 changed to end-of-life (EOL) status on 2019-11-26. Fedora 29 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

Comment 24 Peter Robinson 2024-01-22 11:09:13 UTC
arm32 is now EOL

