Example build: http://lightning.mw.lab.eng.bos.redhat.com/viewLog.html?buildId=5188&tab=buildResultsDiv&buildTypeId=EAP_6xIgnoreLinux This is still going on in CI.
Currently most of the failing tests are ignored, but the list could potentially include the whole testsuite. Two flags; if WIP, clone for the proper branch.
Taking this build: https://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/eap-63x-patched-testsuite-windows/11/jdk=jdk1.7,label_exp=eap-sustaining%20&&%20w2k8r2%20&&%20x86_64/testReport/ ManagementOpTimeoutTestCase is the first to fail with a timeout exception here. It fails right on the first line of the @Before method (container.start(DEFAULT_JBOSSAS);). The server configuration should be clean at this point, but see the server startup log for all the mess: https://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/eap-63x-patched-testsuite-windows/11/jdk=jdk1.7,label_exp=eap-sustaining%20&&%20w2k8r2%20&&%20x86_64/testReport/org.jboss.as.test.manualmode.management.cli/ManagementOpTimeoutTestCase/testTimeoutCausesRestartRequired/ I've asked the security guys to go through the manual-mode tests and try to locate the test that breaks the configuration (see the part that keeps repeating while the server tries to boot up: "Remoting "management-client" read-1, fatal error: 46: General SSLEngine problem").
The problem is caused by HTTPSConnectioWithCLITestCase, which secures the ManagementNativeRealm with SSL and is unable to unsecure it again due to the issue described in BZ1105003. See https://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/eap-63x-patched-testsuite-windows/11/jdk=jdk1.8,label_exp=eap-sustaining && w2k8 && x86_64/testReport/org.jboss.as.test.manualmode.management.cli/HTTPSConnectioWithCLITestCase/resetConfigurationForNativeInterface/
Configuration snippet left behind by the test:
<security-realm name="ManagementNativeRealm">
    <server-identities>
        <ssl>
            <keystore path="W:\workspace\eap-63x-patched-testsuite-windows\2e26cbc8\testsuite\integration\manualmode\target\workdir\native-if-workdir\server.keystore" keystore-password="123456"/>
        </ssl>
    </server-identities>
    <authentication>
        <truststore path="W:\workspace\eap-63x-patched-testsuite-windows\2e26cbc8\testsuite\integration\manualmode\target\workdir\native-if-workdir\server.truststore" keystore-password="123456"/>
    </authentication>
</security-realm>
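A leftover like the one above could be detected programmatically before the next test starts the container. The following is only a minimal sketch, not part of the testsuite: the class RealmCheck and its method are hypothetical names, and it assumes the configuration is available as an XML string with unprefixed element names, as in the snippet above.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Hypothetical helper: checks whether a security realm still carries an <ssl> identity. */
final class RealmCheck {

    private RealmCheck() {}

    /**
     * Returns true if the realm named realmName contains an <ssl> element,
     * false if the realm has no <ssl> element or does not exist.
     */
    static boolean realmHasSsl(String configXml, String realmName) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(configXml.getBytes(StandardCharsets.UTF_8)));
            NodeList realms = doc.getElementsByTagName("security-realm");
            for (int i = 0; i < realms.getLength(); i++) {
                Element realm = (Element) realms.item(i);
                if (realmName.equals(realm.getAttribute("name"))) {
                    // Any <ssl> descendant means the realm is still secured.
                    return realm.getElementsByTagName("ssl").getLength() > 0;
                }
            }
            return false;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse configuration", e);
        }
    }
}
```

Calling RealmCheck.realmHasSsl(xml, "ManagementNativeRealm") on the content of the server configuration would flag the state shown above, which makes it easier to find the test that leaves the configuration broken.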
In these tests there is an intermittent problem with the initialization of the CLI tool, which is configured to use a custom jboss-cli.xml file with SSL settings. The CustomCLIExecutor class is used for testing the SSL connection to the server: https://github.com/jbossas/jboss-eap/blob/6fe2590e7f3ae6adb6987752ba0f3e44401f335b/testsuite/shared/src/main/java/org/jboss/as/test/integration/management/util/CustomCLIExecutor.java Sometimes the CLI initialization freezes, so the command is not executed and the test therefore fails. This is the output from a frozen initialization, when the CLI is unable to connect to the server and execute operations:
INFO [org.jboss.modules] JBoss Modules version 1.3.4.Final-redhat-1
INFO [org.xnio] XNIO Version 3.0.10.GA-redhat-1
INFO [org.xnio.nio] XNIO NIO Implementation Version 3.0.10.GA-redhat-1
INFO [org.jboss.remoting] JBoss Remoting version 3.3.3.Final-redhat-1
@Alexey: Could you please look at HTTPSConnectioWithCLITestCase, which uses this CustomCLIExecutor, and see whether it can be improved somehow so that it does not cause these intermittent failures? I've tried to investigate the problem, but I haven't found a fix. Or is there a better approach to testing an SSL connection with the CLI tool?
Fixing the link from comment 5. https://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/eap-63x-patched-testsuite-windows/11/jdk=jdk1.8,label_exp=eap-sustaining%20&&%20w2k8%20&&%20x86_64/testReport/org.jboss.as.test.manualmode.management.cli/HTTPSConnectioWithCLITestCase/resetConfigurationForNativeInterface/
Created attachment 936531 Hot fix
A hot fix for HTTPSConnectioWithCLITestCase could be to use a separate standalone.xml configuration just for this particular test, so that if it fails due to BZ1105003, other tests are not affected. See attachment 936531.
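The attachment itself is not reproduced in this thread. As a rough sketch of the idea only, a test could work on its own copy of standalone.xml so that a failed cleanup never touches the shared file. The class IsolatedConfig, the method copyFor, and the "standalone-<testName>.xml" naming are hypothetical and not taken from the attachment.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical helper: gives one test its own copy of standalone.xml. */
final class IsolatedConfig {

    private IsolatedConfig() {}

    /**
     * Copies configDir/standalone.xml to configDir/standalone-<testName>.xml,
     * overwriting a stale copy from an earlier run, and returns the new path.
     */
    static Path copyFor(Path configDir, String testName) {
        Path source = configDir.resolve("standalone.xml");
        Path target = configDir.resolve("standalone-" + testName + ".xml");
        try {
            return Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            throw new UncheckedIOException("Could not create isolated configuration", e);
        }
    }
}
```

How the container is then pointed at the copy depends on the testsuite setup; standalone.sh, for instance, accepts --server-config=<file>. Whatever the test leaves in the copy stays there, and the other manual-mode tests keep booting from the untouched standalone.xml.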
Created attachment 936539 hot_fix
What I would try first in this case is to use :reload instead of reload and see whether it makes any difference. For users we would advise the command rather than the operation: the command contains waiting and reconnecting logic, while the operation simply returns immediately. In my experience the reconnecting part of the command is not 100% reliable, simply because the means available to implement that logic didn't give consistent results. It works most of the time, but once in a while I saw connection timeouts for whatever reason. In this test, after reload is sent to the CLI (which has its own waiting-reconnecting logic), there is still another wait-for-the-server logic in place. So it would be fine to switch to :reload for the CLI, just to see whether it helps and whether what you see is a problem of the CLI waiting for and reconnecting to the controller.
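For illustration, wait-for-the-server logic of the kind a test needs after an operation that returns immediately can be sketched as a plain polling loop. This is not the code used in the testsuite; ServerWait and waitUntil are hypothetical names, and the readiness check is passed in by the caller (for example an attempt to connect to the management interface).

```java
import java.util.function.BooleanSupplier;

/** Hypothetical helper: polls a readiness check until it passes or a deadline expires. */
final class ServerWait {

    private ServerWait() {}

    /**
     * Calls ready every intervalMillis until it returns true or timeoutMillis
     * has elapsed. Returns true if the check passed in time, false on timeout
     * or if the waiting thread was interrupted.
     */
    static boolean waitUntil(BooleanSupplier ready, long timeoutMillis, long intervalMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            try {
                if (ready.getAsBoolean()) {
                    return true;
                }
            } catch (RuntimeException e) {
                // A failed connection attempt while the server restarts is expected; retry.
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                return false;
            }
        }
    }
}
```

With a loop like this in the test, the reconnect logic built into the reload command is not needed, which is why switching the CLI call to :reload is a cheap way to see whether the command's own reconnecting is what times out.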
Thanks Alexey, we will try that. I'll do a PR once we finish the CP testing https://github.com/pkremens/jboss-eap/commit/d7e44e9a5767856c05358413a4ad012d7c685dd3
Petr Kremensky <pkremens> updated the status of jira WFLY-3890 to Closed
EAP 6.4.0.DR2 run with the fix included. https://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/eap-6x-as-testsuite-solaris/lastCompletedBuild/RELEASE=6.4.0,jdk=java17_default,label_exp=solaris11%20&&%20sparc/testReport/org.jboss.as.test.manualmode.management.cli/HTTPSConnectioWithCLITestCase/resetConfigurationForNativeInterface/ Manual mode tests no longer fail due to 'Could not stop container' and 'Could not start container', but using :reload doesn't fix the root cause.