Created attachment 744092 [details] Reproducer Description of problem: The Weld test org.jboss.weld.tests.beanDeployment.noclassdeffound.NoClassDefFoundErrorTest fails on EAP 6.1.0.ER6. The same problem was already fixed in Weld in https://issues.jboss.org/browse/WELD-1215, so it seems that it is now caused by a change in class loading/JBoss Modules. org.jboss.arquillian.container.spi.client.container.DeploymentException: Cannot deploy: 0e0d263b-72a0-49ca-9950-73d6403a679d.jar at org.jboss.as.arquillian.container.ArchiveDeployer.deployInternal(ArchiveDeployer.java:83) at org.jboss.as.arquillian.container.ArchiveDeployer.deployInternal(ArchiveDeployer.java:64) at org.jboss.as.arquillian.container.ArchiveDeployer.deploy(ArchiveDeployer.java:46) ... Caused by: java.lang.Exception: {"JBAS014671: Failed services" => {"jboss.deployment.unit.\"0e0d263b-72a0-49ca-9950-73d6403a679d.jar\".WeldStartService" => "org.jboss.msc.service.StartException in service jboss.deployment.unit.\"0e0d263b-72a0-49ca-9950-73d6403a679d.jar\".WeldStartService: Failed to start service Caused by: java.lang.LinkageError: Failed to link org/jboss/weld/tests/beanDeployment/noclassdeffound/Bar (Module \"deployment.0e0d263b-72a0-49ca-9950-73d6403a679d.jar:main\" from Service Module Loader) Caused by: java.lang.NoClassDefFoundError: org.jboss.weld.tests.beanDeployment.noclassdeffound.Foo Caused by: java.lang.ClassNotFoundException: org.jboss.weld.tests.beanDeployment.noclassdeffound.Foo from [Module \"deployment.0e0d263b-72a0-49ca-9950-73d6403a679d.jar:main\" from Service Module Loader]"}} at org.jboss.as.controller.client.helpers.standalone.impl.ServerDeploymentPlanResultFuture.getActionResult(ServerDeploymentPlanResultFuture.java:134) ... Version-Release number of selected component (if applicable): EAP 6.1.0.ER6 Weld 1.1.12.Final-redhat-1 How reproducible: Deploy the attached reproducer to EAP. Alternatively, run the org.jboss.weld.tests.beanDeployment.noclassdeffound.NoClassDefFoundErrorTest Actual results: Archive does not deploy, DeploymentException is thrown. Expected results: The NCDFE is caught, the invalid class is not regarded as a bean, and the archive is succesfully deployed.
Note that this used to work on EAP 6.1.0.ER5, so it is a very recent regression. I can also reproduce the problem on some Seam 2.3 examples, where seam-remoting contains org/jboss/seam/remoting/gwt/GWTService, which implements com.google.gwt.user.server.rpc.SerializationPolicyProvider, which is contained an optional gwt-servlet dependency. This example deploys on EAP6 <= 6.1.0.ER5 with a warning ( Failed to define class org.jboss.seam.remoting.gwt.GWTService ), but fails to deploy on EAP 6.1.0.ER6. (it works if I add the missing gwt-servlet jar)
How common is this problem? I mean, what is the expected behavior, since the outclass is already invalid, shouldn't the deployment fail, anyway? David, can you comment?
Well, on all the previous versions of EAP6, such deployment would produce a warning, such as 12:02:23,135 WARN [org.jboss.modules] (MSC service thread 1-2) Failed to define class org.jboss.weld.tests.beanDeployment.noclassdeffound.Bar in Module "deployment.NoClassDefFoundErrorTest.jar:main" from Service Module Loader: java.lang.LinkageError: Failed to link org/jboss/weld/tests/beanDeployment/noclassdeffound/Bar (Module "deployment.NoClassDefFoundErrorTest.jar:main" from Service Module Loader) but would continue with the deployment. e.g. Weld explicitly checks for this condition and will produce an additional INFO message: WELD-000119 Not generating any bean definitions from org.jboss.weld.tests.beanDeployment.noclassdeffound.Bar$ThisWillCauseAnError because of underlying class loading error so I assume this has been a feature that such deployments are allowed (with a warning). The use case could be an library with some optional dependencies, that should continue to work when those optional dependencies are not provided (as is the case of Seam Remoting and its GWT servlet dependency).
This commit https://github.com/jbossas/jboss-modules/commit/29015b4be5f015f02978e06a96ceca0781cbe6c4 is the culprit. I tried reverting it and it fixes the problem.
Unfortunately you are wrong. The commit changes the exception that is thrown; it is not the cause of the problem. The problem is that the code which catches these exceptions in the Weld deployment layer does not catch all the exceptions that it should; this commit just adds a circumstance in which that problem is triggered. Would it be possible to get the complete stack trace for the exception?
Created attachment 744738 [details] ER6-Weld Run of NoClassDefFoundErrorTest First run - unmodified ER6 Second run - ER6 with jboss-modules commit 29015b4b reverted Line 56 - first warn Line 172,215 - error Line 321 - second run
Created attachment 744739 [details] ER6-Seam-chatroom-log Deployment of Seam Chatroom example (uses remoting, has dependency on gwt-servlet) First run - unmodified ER6 Second run - ER6 with jboss-modules commit 29015b4b reverted L177 - first warn L392 - error L447 - second run
If that's the case, this would have to be also fixed in JBoss Seam integration.
Created attachment 744754 [details] Seam Chatroom reproducer
The problem is that BeanDeployer inside of Weld does not robustly handle linkage problems. The code: private boolean isBeanCandidate(Class<?> clazz) { try { return !clazz.isAnnotation() && !clazz.isEnum() && !Reflections.isNonStaticInnerClass(clazz); } catch (NoClassDefFoundError e) { logIgnoredClass(clazz.getName(), e); return false; } } should be: private boolean isBeanCandidate(Class<?> clazz) { try { return !clazz.isAnnotation() && !clazz.isEnum() && !Reflections.isNonStaticInnerClass(clazz); } catch (LinkageError e) { logIgnoredClass(clazz.getName(), e); return false; } } I'll submit a pull request to the Weld project to get this fixed.
https://github.com/weld/core/pull/303 has the fix for this issue.
The fix wouldn't fix the Seam case though, would it?
No, it definitely will not. I do not know the Seam code base so I cannot find the error, but odds are very good it is a similar problem.
I applied the weld patch to the product build of weld 1.1.12.Final and rebuilt it for ER7. https://brewweb.devel.redhat.com/buildinfo?buildID=270641
I have released Weld 1.1.13 with David's fix, and I have also identified and fixed a few other places in the Weld code base that were subject to similar problems. PR is here: https://github.com/jbossas/jboss-eap/pull/143
Right, Stuart's new Weld release fixes the bug in multiple locations, so we should be using that instead!
Verified on EAP 6.1.0.ER7. The Weld problem is fixed.
Even though this particular problem is fixed, there is another problem -- if we deploy a CDI portable extension with missing dependencies, the error is back -- #961470. I have a question in this regard: what was the reason for the change in exception/error handling (https://github.com/jbossas/jboss-modules/commit/29015b4b) in the JBoss Modules classloader? Should LinkageError be considered "unexpected" in ModuleClassLoader? (honestly interested)
The reason for the change in error handing is that when the exception is wrapped, the JDK tends to trim off the cause, resulting in useless stack traces and hard-to-diagnose problems. It also causes an OSGi TCK problem, and a few other miscellaneous issues related to the fact that the specific exception type is wrong so it causes the wrong action to be taken (this is especially true around class definition, which we do concurrently and which needs the right error type to be thrown to work correctly). LinkageError is the base class for run time class loading issues. NoClassDefFoundError is a subclass of LinkageError. What happens is, people coded for the specific exception they *saw* "that one time" without understanding the reason for the exception and the other similar possible problems that can occur. LinkageError is always the right thing to catch when you want to "absorb" a runtime class load problem. Even were that patch reverted and the other problems ignored, there would still be common failure scenarios in that code which would just immediately come back to haunt us as soon as someone hits them. This includes, wrong bytecode version errors, verify errors, missing dependencies on the immediate class (rather than a transitive class as is the case here).