Bug 863801
Summary: | eclipse-4.3.0-0.12.git201301281400.fc19 is broken on ARM | |
---|---|---|---
Product: | [Fedora] Fedora | Reporter: | Peter Robinson <pbrobinson>
Component: | eclipse | Assignee: | Krzysztof Daniel <kdaniel>
Status: | CLOSED RAWHIDE | QA Contact: | Fedora Extras Quality Assurance <extras-qa>
Severity: | urgent | Docs Contact: |
Priority: | unspecified | |
Version: | rawhide | CC: | akurtako, andjrobins, aph, blc, jerboaa, jsmith.fedora, kdaniel, mbenitez, overholt, rgrunber, swagiaal
Target Milestone: | --- | Keywords: | Reopened
Target Release: | --- | |
Hardware: | Unspecified | |
OS: | Unspecified | |
Whiteboard: | | |
Fixed In Version: | | Doc Type: | Bug Fix
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2013-03-02 08:27:20 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Bug Depends On: | 913120 | |
Bug Blocks: | 245418, 901840 | |
Attachments: | | |
Description
Peter Robinson
2012-10-07 14:18:39 UTC
It breaks a chunk of sed scripts:
+ sed -i -e s/1407/1503/ rt.equinox.framework/launcher-binary-parent/pom.xml
+ sed -i -e s/1500/1503/ rt.equinox.framework/bundles/org.eclipse.equinox.launcher.gtk.linux.arm/pom.xml
sed: can't read rt.equinox.framework/bundles/org.eclipse.equinox.launcher.gtk.linux.arm/pom.xml: No such file or directory

Chris, please take care of this one together with the other secondary archs.

The change looks correct. The problem is that the arm fragments are missing.

Well, there are a number of ways to define ARM. I'm not sure where %{eclipse_arch} was defined or what the reason for changing it to %{ARCH} was, but due to the loveliness of the way ARM handles stuff we could have arm armv5tel armv6 armv6l armv6hl armv7 armv7l armv7hl. The difference between, say, armv7 and armv7l armv7hl is mostly around userspace and not the actual underlying HW.

Peter, is there something like %{ix86} for arm?

Ah, there is %{arm}. So we need to bring back eclipse_arch and have something like
%ifarch %{arm}
%define eclipse_arch arm
%endif

For clarity:
%ifarch %{arm}
%define eclipse_arch arm
%elifarch %{ix86}
%define eclipse_arch x86
%else
%define eclipse_arch %_arch
%endif

These mappings were done in eclipse-build before, but we need to resurrect them with CBI too.

Any update?

Sorry for the delay. Fixing this bug gives you nothing as the new upstream build system does not support arm. I should give you the version for arm building and testing any time soon (basically as soon as the ppc build completes successfully).

Peter,
please try to respin the build with the latest sources... I get the root.log error that the glassfish-jsp-api 2.2.1-4 is missing... It's weird because that package is built...

(In reply to comment #11)
> Peter,
> please try to respin the build with the latest sources... I get the root.log
> error that the glassfish-jsp-api 2.2.1-4 is missing... It's weird because
> that package is built...
It's only in updates-testing. What is the NVR I should be trying?
latest one - 4.2.1-5 for rawhide, 6 for f18.

(In reply to comment #13)
> latest one - 4.2.1-5 for rawhide, 6 for f18.
Just wanted to check as -6 hasn't been submitted as an update

Appears to have the same problem on rawhide
http://arm.koji.fedoraproject.org/koji/taskinfo?taskID=1214954

That's not the same issue. Could we focus on building for the rawhide first?
http://arm.koji.fedoraproject.org/koji/taskinfo?taskID=1216516
I don't want to break x86_64 which works for f18...

Please retry with the latest rawhide version (release -9)

(In reply to comment #17)
> Please retry with the latest rawhide version (release -9)
I only see a -6 in koji and that's FTBFS on mainline

Note: if it's not already built in mainline please build it as a scratch build

What is the status of this? We need this fix for F18 final so we need it fixed soon.

Peter, are you able to override build root with https://arm.koji.fedoraproject.org/koji/buildinfo?buildID=93420 ? The problem is that the build that is currently tagged (4.2.0-6) does not allow for further 4.x builds.

(In reply to comment #21)
> Peter, are you able to override build root with
> https://arm.koji.fedoraproject.org/koji/buildinfo?buildID=93420 ? The
> problem is that the build that is currently tagged (4.2.0-6) does not allow
> for further 4.x builds.
Done. Do realise we _MUST_ have the same NVR built on mainline

I hear you.
But building Eclipse is not as simple as that. If you take x86 build and put it on arm it will fail. Some of Eclipse parts are platform specific (not just arch packages, but separate fragments for x86, 64, ppc etc). There is no single fragment for arm in the new upstream build system.
What we need to do is to build latest Eclipse (master) and then merge all changes (x86, ppc, arm) and rebuild it with the same NVR. Building each time the version that is tagged in mainline f18 adds just overhead (and changes almost nothing for *that* build).

(In reply to comment #23)
> I hear you.
>
> But building Eclipse is not as simple as that. If you take x86 build and put
> it on arm it will fail. Some of Eclipse parts are platform specific (not
> just arch packages, but separate fragments for x86, 64, ppc etc). There is
> no single fragment for arm in the new upstream build system.
>
> What we need to do is to build latest Eclipse (master) and then merge all
> changes (x86, ppc, arm) and rebuild it with the same NVR. Building each time
> the version that is tagged in mainline f18 adds just overhead (and changes
> almost nothing for *that* build).
It needs to be from the same NVR and same source RPM. It's Fedora policy for _ALL_ secondary arches.

Hi Peter, there is some misunderstanding going on. What Chris is trying to say is that we would need at least one bootstrap (different from mainline) build to happen after which when we change something in mainline arm will be able to rebuild from the same srpm.

My build failed without apparent reason:
http://arm.koji.fedoraproject.org/koji/taskinfo?taskID=1226664
Any idea?

(In reply to comment #26)
> My build failed without apparent reason:
> http://arm.koji.fedoraproject.org/koji/taskinfo?taskID=1226664
>
> Any idea?
We just had some issues with koji, I've resubmitted the task
http://arm.koji.fedoraproject.org/koji/taskinfo?taskID=1226718

Any update?

Yes, there is a problem in tycho failing to properly recognize the jvm on arm only. Kdaniel is investigating the issue.

It appears tycho-0.16.0-17.fc19 built successfully today on primary. Is that the package we were waiting on?

(In reply to comment #30)
> It appears tycho-0.16.0-17.fc19 built successfully today on primary. Is
> that the package we were waiting on?
It seems not. It's a noarch package so I imported and tagged it and kicked off a new eclipse build. I'm not sure the failure below is what we were seeing in the past so I'll leave it for someone else to check out the details.
http://arm.koji.fedoraproject.org/koji/taskinfo?taskID=1257317

Any updates?
I'm very sorry but I still have no clue what's wrong with the build on arm. I'm currently debugging the tycho build remotely on an arm machine, but one debug cycle takes about 6 hours (assuming that there is no crash nor session timeout in the meantime). The build is integral and it is impossible to build only one bundle - it always fails for each bundle then. As I have said earlier - the build does the same steps for all platforms.
Debugging on a local arm emulator revealed that the target platform for the failing javase17 bundle is correctly calculated, but then it is lost. The local arm emulator (vexpress) turned out to be inefficient - it was possible to reproduce the issue in about 10 hours, but debugging was impossible. So I have got an arm machine and I'm trying my luck with it.
I have found and fixed another issue in the build - but that has no influence on this issue. I've looked for help amongst tycho developers, but you can check the result yourself: http://dev.eclipse.org/mhonarc/lists/tycho-user/msg03571.html.
As usual, any help, comments, hints are welcome.

So, the reproduction steps for now are:
* first attempt
fedpkg clone -a eclipse
cd eclipse
fedpkg local
* next attempts
cd R4_platform-aggregator
mvn-rpmbuild clean install -Dmaven.test.skip=true -Dnative=gtk.linux.arm\
-DskipTychoVersionCheck -Dmaven.local.mode=true -Dtycho.local.keepTarget

What is weird:
I have called
export MAVEN_OPTS="-Xdebug -Xrunjdwp:transport=dt_socket,address=8000,server=y,suspend=y"
to enable remote debug, as described in http://wiki.eclipse.org/Developing_Tycho, and the build proceeded past the critical place.

When my current build ends (for whatever reason) I'll try to run this again to check if this is persistent. Maybe I will check exactly which debug flag/breakpoint is responsible for "fixing" the build.

Andrew: what do those debug flags/breakpoints change in the jvm?
(In reply to comment #34)
> So, the reproduction steps for now are:
> * first attempt
> fedpkg clone -a eclipse
> cd eclipse
> fedpkg local
> * next attempts
> cd R4_platform-aggregator
> mvn-rpmbuild clean install -Dmaven.test.skip=true -Dnative=gtk.linux.arm\
> -DskipTychoVersionCheck -Dmaven.local.mode=true -Dtycho.local.keepTarget
>
> What is weird:
> I have called
> export MAVEN_OPTS="-Xdebug
> -Xrunjdwp:transport=dt_socket,address=8000,server=y,suspend=y"
> to enable remote debug, as described in
> http://wiki.eclipse.org/Developing_Tycho,
> the build proceeded past the critical place.
>
> When my current build ends (for whatever reason) I'll try to run this again
> to check if this is persistent. Maybe I will check which exactly debug
> flag/breakpoint is responsible for "fixing" the build.
>
> Andrew: what those debug flags/breakpoints change in the jvm?
Aha! This is a really good clue. The JIT is totally disabled when debugging. If this is a JIT bug, that would explain it. If you can point me to the exact spot where the build gets weird I can have a look.

> If you can point me to the exact spot where the build gets weird I can have a
> look.
I don't know the exact place where something goes wrong (and I am afraid it might be hard to get without debugger...). But will try.
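
A possible way to test the JIT theory without attaching a debugger is to run the build in interpreter-only mode. The sketch below is an assumption, not something tried in this bug: it relies on the JVM in use honouring the HotSpot `-Xint` switch, and `add_jvm_flag` is a hypothetical helper name. An interpreter-only build is likely to be much slower than the already long ARM build.

```shell
#!/bin/sh
# add_jvm_flag is a hypothetical helper (not part of Maven or the Eclipse spec).
# It appends a JVM flag to an options string unless the flag is already present.
add_jvm_flag() {
    opts=$1
    flag=$2
    case " $opts " in
        *" $flag "*) printf '%s\n' "$opts" ;;               # already there, keep as-is
        *)           printf '%s\n' "${opts:+$opts }$flag" ;; # append, with a space if needed
    esac
}

# Assumption: -Xint (interpreter only, no JIT) is accepted by the JVM on the builder.
MAVEN_OPTS=$(add_jvm_flag "${MAVEN_OPTS:-}" "-Xint")
export MAVEN_OPTS
echo "MAVEN_OPTS=$MAVEN_OPTS"
```

If the build gets past the critical place with the JIT off and fails again with it on, that would point at the compiler rather than at Tycho itself.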
Andrew, I'm afraid I will not be able to nail it further in a reasonable time. Here are my current findings:
I blame EquinoxResolver [1], because I have observed that (assuming that JIT does not remove objects sporadically) the resolveState method (line 67) gets a state with only one bundle in it (the resolved state has only one bundle, and there are no resolution errors). It is unlikely that the state suddenly loses all the bundles. But the state is created in the same class, in the newState method. From the code, it looks like the loop in line 155 got invoked only once, for a system bundle.
Is there a way to disable JIT just for that class?
[1] http://git.eclipse.org/c/tycho/org.eclipse.tycho.git/tree/tycho-core/src/main/java/org/eclipse/tycho/core/osgitools/EquinoxResolver.java

Andrew, Peter,
I nailed it - comment 37 is correct. Adding -XX:CompileCommand=exclude,org/eclipse/tycho/core/osgitools/EquinoxResolver,newState to the build seems to unbreak the build. How can we proceed further?

Chris,
add it to eclipse.ini together with the other excludes.

(In reply to comment #39)
> Chris,
> add it eclipse.ini together with the other excludes.
Sure I can add it, but it will not improve the process of building Eclipse...

Ah, sorry. I read org/eclipse and stopped thinking. So can we add this to maven options to unbreak the build.

Fabulous news!

(In reply to comment #41)
> Ah, sorry. I read org/eclipse and stopped thinking . So can we add this to
> maven options to unbreak the build.
I'll add it in both places. It could be optimized wrong in regular Eclipse, too.

(In reply to comment #38)
> Andrew, Peter,
>
> I nailed it - comment 37 is correct. Adding
> -XX:CompileCommand=exclude,org/eclipse/tycho/core/osgitools/EquinoxResolver,
> newState to the build seems to unbreak the build. How can we proceed further
> ?
Oh, well done! If you can tell me exactly how to get to this point I can debug the JIT.

(In reply to comment #43)
> (In reply to comment #41)
> > Ah, sorry. I read org/eclipse and stopped thinking . So can we add this to
> > maven options to unbreak the build.
>
> I'll add it in both places. It could be optimized wrong in regular Eclipse,
> too.
It could be, but that's very unlikely. ARM has its own JIT. Our Eclipse package still has a bunch of compiler excludes from some ancient version of HotSpot. We should take those out.

There are no simpler reproduction steps than in comment 34. The issue happens when project org.eclipse.pde.api.tools.ee.javase17 is being resolved.

(In reply to comment #46)
> There is no simpler reproduction steps than in comment 34. The issue happens
> when project org.eclipse.pde.api.tools.ee.javase17 is being resolved.
OK, that's fine. I'll try to reproduce it.

Peter,
https://koji.fedoraproject.org/koji/taskinfo?taskID=4720961
I'll push that into f18 this week.

Update that will go into f18.
https://admin.fedoraproject.org/updates/lucene-3.6.0-6.fc18,eclipse-4.2.1-21.fc18

Is there a build order? lucene or eclipse first?

lucene

(In reply to comment #47)
> (In reply to comment #46)
> > There is no simpler reproduction steps than in comment 34. The issue happens
> > when project org.eclipse.pde.api.tools.ee.javase17 is being resolved.
>
> OK, that's fine. I'll try to reproduce it.
I'm working my way through those instructions, but it's possible that the patch described in comment 34 has already been applied, so I won't see the problem.

Sorry for that. I have already added the workaround to the spec to unblock the work. The spec contains a line:
export MAVEN_OPTS="-Xmx640m -XX:CompileCommand=exclude,org/eclipse/tycho/core/osgitools/EquinoxResolver,newState"
It needs to be replaced with
export MAVEN_OPTS="-Xmx640m"
to reenable the issue.

Can someone have a look at http://arm.koji.fedoraproject.org/koji/taskinfo?taskID=1289044
/usr/bin/mvn-rpmbuild: line 78: 5262 Killed $M2_HOME/bin/mvn -o "$@"

It looks like the process was killed. No idea why - possibly timeout.
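
The MAVEN_OPTS workaround quoted above is applied unconditionally in the spec. As a minimal sketch only, the exclude could instead be limited to ARM builders, on the assumption (taken from the remark that ARM has its own JIT) that other architectures are unaffected. `maven_opts_for_arch` is a hypothetical helper name and is not in the Eclipse spec; the `-Xmx640m` value and the exclude string are copied from the comment above.

```shell
#!/bin/sh
# maven_opts_for_arch is a hypothetical helper, not taken from the Eclipse spec.
# It prints the Maven JVM options for a given machine name (as from `uname -m`),
# adding the EquinoxResolver.newState JIT exclude only for ARM machines.
maven_opts_for_arch() {
    machine=$1
    base="-Xmx640m"
    exclude="-XX:CompileCommand=exclude,org/eclipse/tycho/core/osgitools/EquinoxResolver,newState"
    case "$machine" in
        arm*) printf '%s %s\n' "$base" "$exclude" ;;  # armv5tel, armv6l, armv7l, armv7hl, ...
        *)    printf '%s\n' "$base" ;;                # everything else: no exclude
    esac
}

MAVEN_OPTS=$(maven_opts_for_arch "$(uname -m)")
export MAVEN_OPTS
echo "MAVEN_OPTS=$MAVEN_OPTS"
```

To reproduce the original failure for JIT debugging, the same helper can be called with a non-ARM name such as `x86_64` so that the exclude is left out.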
Seeing this error in the armv5 build now that we can get all the way through:
+ ./eclipse -application org.eclipse.equinox.initializer.configInitializer -justThisArchOSWS -fileInitializer ../../../../../../../../../extract_patterns.txt
/var/tmp/rpm-tmp.WYLJX6: line 119: ./eclipse: No such file or directory
error: Bad exit status from /var/tmp/rpm-tmp.WYLJX6 (%build)
Full build here: http://arm.koji.fedoraproject.org/koji/taskinfo?taskID=1326829

Created attachment 670774: Patch
The problem is that one of the scripts that builds the native parts does not properly recognize all arm architectures. I have prepared a patch which wildcards the arm archs, so it should work now, but I was unable to test it - koji reported root problems :-(.
So I'm attaching a patch to this bug.
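
As an illustration of the wildcard idea (this is not the attached patch, whose contents are not shown in this bug), the mapping proposed earlier in the thread can be sketched as a shell `case` statement: every `arm*` machine name maps to `arm`, the i?86 family maps to `x86`, and anything else passes through unchanged. `eclipse_arch_for` is a hypothetical function name.

```shell
#!/bin/sh
# eclipse_arch_for is a hypothetical name; the mapping mirrors the
# %{arm} -> arm, %{ix86} -> x86, otherwise %_arch proposal from this bug.
eclipse_arch_for() {
    case "$1" in
        arm*) echo arm ;;    # arm, armv5tel, armv6l, armv7l, armv7hl, ...
        i?86) echo x86 ;;    # i386, i486, i586, i686
        *)    echo "$1" ;;   # e.g. x86_64, ppc, ppc64: use the name as-is
    esac
}

# Show the mapping for a few of the machine names mentioned in the thread.
for m in armv5tel armv6l armv7hl i686 x86_64; do
    printf '%s -> %s\n' "$m" "$(eclipse_arch_for "$m")"
done
```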
To verify at the early stage that the patch is working, look for the ant build_eclipse_cbi in the build.log. If you find
[exec] *** Unknown MODEL <armv6l>
then the patch is not working. If the patch works, you should see
[exec] cp eclipse_1503.so ../../../../../rt.equinox.binaries/org.eclipse.equinox.launcher.gtk.linux.arm
a bit later (note the "arm" at the end - I've spent quite some time before I noticed that it may be missing).
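
The check described above can be scripted. The following is a minimal sketch assuming the two log lines look exactly as quoted in this comment; `check_launcher_log` is a hypothetical helper name, and the path to build.log is whatever the local or koji build produced.

```shell
#!/bin/sh
# check_launcher_log is a hypothetical helper. It scans a build.log for the
# two markers described in this bug and reports which one it found.
check_launcher_log() {
    log=$1
    # Failure marker: the native build script did not recognize the machine type.
    if grep -q 'Unknown MODEL' "$log"; then
        echo "patch not working"
        return 1
    fi
    # Success marker: the launcher library is copied into the ...gtk.linux.arm
    # fragment directory. The pattern requires "arm" at the end of the line,
    # since the arch suffix may be missing when the mapping fails.
    if grep -q 'cp eclipse_1503\.so .*org\.eclipse\.equinox\.launcher\.gtk\.linux\.arm[[:space:]]*$' "$log"; then
        echo "patch working"
        return 0
    fi
    echo "inconclusive"
    return 2
}

# Example call, guarded so the script does nothing when no log is present.
if [ -f build.log ]; then
    check_launcher_log build.log || echo "see build.log for details"
fi
```

An "inconclusive" result most likely means the build has not yet reached the ant build_eclipse_cbi step.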
try again, we're seeing intermittent repo issues on arm at the moment, should be fixed early in the new year

Created attachment 671286: Patch revisited
Peter, the build times out: https://arm.koji.fedoraproject.org/koji/taskinfo?taskID=1348746

The builder is misconfigured. Resubmitted and it has two known good hosts.
http://arm.koji.fedoraproject.org/koji/taskinfo?taskID=1351651

Hey Peter, it looks like the latest tycho was not in the buildroot. tycho-0.16.0-19.fc18 is needed. It's weird because it looks like it is tagged: http://arm.koji.fedoraproject.org/koji/buildinfo?buildID=105747

It's not tagged correctly. It's also not tagged in mainline. Can we work on getting it building on rawhide? What else do we need for the build other than tycho?

It's pending a release: https://admin.fedoraproject.org/updates/FEDORA-2012-21048/eclipse-4.2.2-0.1.git20121217.fc18,tycho-0.16.0-19.fc18?_csrf_token=dc2830c5fa13f9db467de28054dd175da51f11fe

I'm well aware of its status, I'm just telling you why it's not there. Can we get it building on rawhide which actually has it?

sure, I've started http://arm.koji.fedoraproject.org/koji/taskinfo?taskID=1354053

Hey Peter, I've had a successful rawhide arm build a while ago.

It looks like my Eclipse build on a local arm machine was killed. I really need a stable builder that can run a package build without the 24h timeout.

Can you communicate with me via IRC or email? BZ isn't really conducive to dealing with the builder issues.

I have concluded the build successfully. It required 36 hours.

Are the changes necessary for a successful build in both f18 and f19? We're blocked on F18 currently.

(In reply to comment #71)
> Are the changes necessary for a successful build in both f18 and f19? We're
> blocked on F18 currently.
All the changes should be there. Builds have been killed by arm koji due to insufficient storage, timeout, etc. or there were missing deps.
> All the changes should be there. Builds have been killed by arm koji due to
> insufficient storage, timeout, etc. or there were missing deps.
Which NVR?
Still seeing issues http://arm.koji.fedoraproject.org/koji/taskinfo?taskID=1378395

This is the version that I got built: 4.2.2-0.3.git20121217.fc19.armv7hl

There is no 4.2.2-0.3 build in Fedora mainline.
http://koji.fedoraproject.org/koji/packageinfo?packageID=183
There's 0.1 and 0.4. As stated previously in this bug report we _ONLY_ build builds that have been built and submitted as updates on releases or tagged for the next rawhide build. As there is no 0.3 build in mainline we can try 0.1 or 0.4.

try 0.4

It's not anymore :-) http://arm.koji.fedoraproject.org/koji/taskinfo?taskID=1381403

Seeing issues on rawhide. I thought from the error it might be due to tycho but we have the same version as mainline. Tried both 4.2.2 and 4.3.x
http://arm.koji.fedoraproject.org/koji/taskinfo?taskID=1423375

You're right. This is a Tycho issue. More specifically, I'll switch Tycho to BR/R on maven-local rather than maven itself. Maven on rawhide no longer has the functionality to resolve system artifacts (see http://pkgs.fedoraproject.org/cgit/maven.git/commit/?id=2cf4fd6a25ca10bdd9a579d499e9148612d18d1d).
Building with http://koji.fedoraproject.org/koji/buildinfo?buildID=382586 (tycho-0.16.0-21) should resolve this issue.
http://arm.koji.fedoraproject.org/koji/taskinfo?taskID=1426382
It looks like an issue with eclipse-emf but we have the same as mainline
[ERROR] Cannot resolve project dependencies:
[ERROR] Software being installed: org.eclipse.jdt.feature.group 3.9.0.qualifier
[ERROR] Missing requirement: org.eclipse.e4.rcp.feature.group 1.1.0.qualifier requires 'org.eclipse.emf.common.feature.group [2.7.0,3.0.0)' but it could not be found
[ERROR] Cannot satisfy dependency: org.eclipse.jdt.feature.group 3.9.0.qualifier depends on: org.eclipse.platform.feature.group 3.8.0
[ERROR] Cannot satisfy dependency: org.eclipse.platform.feature.group 4.3.0.qualifier depends on: org.eclipse.rcp.feature.group 0.0.0
[ERROR] Cannot satisfy dependency: org.eclipse.rcp.feature.group 4.3.0.qualifier depends on: org.eclipse.e4.rcp.feature.group 0.0.0
[ERROR]
[ERROR] Internal error: java.lang.RuntimeException: "No solution found because the problem is unsatisfiable.": ["Unable to satisfy dependency from org.eclipse.e4.rcp.feature.group 1.1.0.qualifier to org.eclipse.emf.common.feature.group [2.7.0,3.0.0).", "Unable to satisfy dependency from org.eclipse.e4.rcp.feature.group 1.1.0.qualifier to org.eclipse.emf.ecore.feature.group [2.7.0,3.0.0).", "No solution found because the problem is unsatisfiable."] -> [Help 1]
org.apache.maven.InternalErrorException: Internal error: java.lang.RuntimeException: "No solution found because the problem is unsatisfiable.": ["Unable to satisfy dependency from org.eclipse.e4.rcp.feature.group 1.1.0.qualifier to org.eclipse.emf.common.feature.group [2.7.0,3.0.0).", "Unable to satisfy dependency from org.eclipse.e4.rcp.feature.group 1.1.0.qualifier to org.eclipse.emf.ecore.feature.group [2.7.0,3.0.0).", "No solution found because the problem is unsatisfiable."]

http://arm.koji.fedoraproject.org/koji/taskinfo?taskID=1436738
Latest build problem

Tried to rebuild latest Eclipse version: http://arm.koji.fedoraproject.org/koji/buildinfo?buildID=112557
DEBUG util.py:257: --> cglib-2.2-14.fc19.noarch
DEBUG util.py:257: --> glassfish-jsp-2.2.6-3.fc19.noarch
DEBUG util.py:257: Error: Package: at-spi2-atk-2.7.5-1.fc19.armv7hl (build)
DEBUG util.py:257: Requires: at-spi2-core >= 2.7.5
DEBUG util.py:257: Installing: at-spi2-core-2.7.4.1-1.fc19.armv7hl (build)
DEBUG util.py:257: at-spi2-core = 2.7.4.1-1.fc19
DEBUG util.py:257: Error: Package: at-spi2-atk-2.7.5-1.fc19.armv7hl (build)
DEBUG util.py:257: Requires: at-spi2-core >= 2.7.5
DEBUG util.py:257: Available: at-spi2-core-2.7.4.1-1.fc19.armv7hl (build)
DEBUG util.py:257: at-spi2-core = 2.7.4.1-1.fc19

Latest build failure: http://arm.koji.fedoraproject.org/koji/taskinfo?taskID=1444789
We're on the new stable arm infra now so most of the transient repo issues should be gone unless it's a random soname bump that the builders are working their way through.

re-opening and updating

WOO HOO! Thank you for your assistance with getting this complete, it is very much appreciated!