Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 900154 (JBPAPP6-493)

Summary: Highly unstable testsuite in ER 4.1 with frequent build failures due to timeout
Product: [JBoss] JBoss Enterprise Application Platform 6 Reporter: Madhumita Sadhukhan <msadhukh>
Component: Testsuite Assignee: Ondřej Žižka <ozizka>
Status: CLOSED NEXTRELEASE QA Contact:
Severity: high Docs Contact:
Priority: high    
Version: 6.0.0 CC: msadhukh, ozizka, pslavice, smcgowan
Target Milestone: ---   
Target Release: EAP 6.0.0   
Hardware: Unspecified   
OS: Unspecified   
URL: http://jira.jboss.org/jira/browse/JBPAPP6-493
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2012-04-13 12:43:24 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 899680    
Bug Blocks:    
Attachments:
Description Flags
org.jboss.as.test.smoke.messaging.client.messaging.MessagingClientTestCase.txt none
unstablebuildER4.1 none

Description Madhumita Sadhukhan 2012-04-04 12:35:11 UTC
project_key: JBPAPP6

The EAP6 testsuite run is highly unstable in ER4.1: the build often terminates abruptly because the basic and clustering targets fail on timeout.

Random build failures in the basic target due to timeout were introduced in ER4.1. A failure in the clustering target was seen earlier but was fixed with -Dsurefire.forked.process.timeout=1800.
However, even that property has no effect in the current testsuite run.

[ERROR] Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:2.10-redhat-1:test (basic-integration-default-web.surefire) on project jboss-as-ts-integ-basic: Failure or timeout -> [Help 1]
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:2.10-redhat-1:test (ts.surefire.clust.multinode-manual-udp) on project jboss-as-ts-integ-clust: There are test failures.

Comment 1 Madhumita Sadhukhan 2012-04-04 12:36:41 UTC
Ondrej,

I believe you and Paul are already aware of this. This is a tracker JIRA to verify testsuite stability in ER5.

Comment 2 Ondřej Žižka 2012-04-04 20:36:41 UTC
Hi Madhumita,

1) Is it the same error as before?
2) If so, have you tried increasing the timeout?
3) If not, there are some other configurable timeouts; see the testsuite docs.

Either way, this is the result of running the "prod" branch as it is now:
{code}
[INFO] JBoss Application Server Test Suite: Aggregator ... SUCCESS [22.884s]
[INFO] JBoss Application Server Test Suite: Integration .. SUCCESS [1.942s]
[INFO] JBoss Application Server Test Suite: Integration - Smoke  SUCCESS [1:38.630s]
[INFO] JBoss Application Server Test Suite: Integration - Basic  SUCCESS [12:18.438s]
[INFO] JBoss Application Server Test Suite: Integration - Clustering  SUCCESS [10:55.882s]
[INFO] JBoss Application Server Test Suite: Integration - IIOP  SUCCESS [21.210s]
[INFO] JBoss Application Server Test Suite: Integration - XTS  SUCCESS [17.163s]
[INFO] JBoss Application Server Test Suite: Integration - Multinode Tests  SUCCESS [23.226s]
[INFO] JBoss Application Server Test Suite: Integration - Manual Mode Tests  SUCCESS [3:54.841s]
[INFO] JBoss Application Server Test Suite: Compatibility Tests  SUCCESS [24.031s]
[INFO] JBoss Application Server Test Suite: Domain Mode Integration Tests  SUCCESS [2:33.664s]
[INFO] JBoss Application Server Test Suite: Benchmark Tests  SUCCESS [7.959s]
[INFO] JBoss Application Server Test Suite: Stress Tests . SUCCESS [1.210s]
{code}

Therefore, you need to increase the timeout limit until it passes, or figure out what's wrong in the QA lab.
If you discover an error in the testsuite, please describe the issue and assign it back to me.

Comment 3 Jan Lanik 2012-04-05 08:48:17 UTC
I'm attaching an error report I got while running the testsuite against ER4.1.

Comment 4 Jan Lanik 2012-04-05 08:48:17 UTC
Attachment: Added: org.jboss.as.test.smoke.messaging.client.messaging.MessagingClientTestCase.txt


Comment 5 Madhumita Sadhukhan 2012-04-05 12:41:12 UTC
Ondra, changing timeouts with every ER cycle and going through trial and error with several automated builds does not seem like a sane idea.
The option should be to use a standard timeout value that works.

As for the build output you posted above, I did get to see clean runs as well, but they were rare.
Are you getting stable runs every time against the prod repo used in ER4.1? If so, what timeout value is used?
Regarding the QA lab, the tests are still stable on ER3 (apart from benchmark and stress failures).

We will be using this testsuite for one-off regression testing, so instability is a major concern, and the reason I highlighted it is that it is alarming in ER4.1.

I am attaching a detailed stack trace for the build failures (attached file unstablebuildER4.1).

I can see that the clustering tests are failing not because of a timeout but because of certain test failures:
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:2.10-redhat-1:test (ts.surefire.clust.multinode-manual-udp) on project jboss-as-ts-integ-clust: There are test failures...........................................................
..................................................................................................................................
Caused by: org.apache.maven.plugin.MojoFailureException: There are test failures.

Comment 6 Madhumita Sadhukhan 2012-04-05 12:42:00 UTC
Ondra, it is surprising that the build is failing because of test failures (instead of a timeout, as in the basic target).

Comment 7 Madhumita Sadhukhan 2012-04-05 12:44:06 UTC
Attachment: Added: unstablebuildER4.1


Comment 8 Ondřej Žižka 2012-04-05 15:20:24 UTC
1) Changing timeout limits is the only idea I have regarding builds timing out.

2) What build are you referring to? I can quickly check if I can be of any help with the failures you see.

3) Regarding MessagingClientTestCase: if the test authors had followed my advice and put a message into the exception, we would see that the problem is that the queue was not removed after
{code:java}
        op = new ModelNode();
        op.get("operation").set("remove");
        op.get("address").add("subsystem", "messaging");
        op.get("address").add("hornetq-server", "default");
        op.get("address").add("queue", queueName);
        applyUpdate(op, client);
{code}
I don't know whether that's supposed to be a synchronous operation. Please refer to the docs.
My guess is that it's async, and with low-performance NFS it causes a race condition; therefore the test needs to actively wait for the queue to be removed, with some timeout.
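
A minimal sketch of such an active wait, assuming the removal really is asynchronous (not confirmed here). The class `QueueWait` and the method `waitFor` are hypothetical names, not part of the testsuite, and `BooleanSupplier` needs Java 8 or later:

```java
import java.util.function.BooleanSupplier;

/** Hypothetical helper, not part of the testsuite: poll until a condition holds. */
final class QueueWait {

    private QueueWait() {
    }

    /**
     * Polls the condition every pollMillis until it returns true or
     * timeoutMillis has elapsed. Returns true if the condition held in time,
     * false on timeout or if the waiting thread was interrupted.
     */
    static boolean waitFor(BooleanSupplier condition, long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            long remainingMillis = (deadline - System.nanoTime()) / 1_000_000L;
            if (remainingMillis <= 0) {
                return false;
            }
            try {
                // Never sleep past the deadline.
                Thread.sleep(Math.max(1L, Math.min(pollMillis, remainingMillis)));
            } catch (InterruptedException e) {
                // Restore the interrupt flag and give up waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }
}
```

In the test this could be called as `QueueWait.waitFor(() -> !queueExists(client, queueName), 10000, 100)` right after `applyUpdate(op, client)`, where `queueExists` is a hypothetical check the test would have to provide (for example a read operation on the queue address), and the test fails with a descriptive message if the call returns false.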

Comment 9 Ondřej Žižka 2012-04-05 15:24:45 UTC
All right, credit for the empty exception goes to Jeff, not QE :)  I'm going to discuss it with him.

Comment 10 Madhumita Sadhukhan 2012-04-05 15:28:55 UTC
Please check the stack trace in the attached file, where the clustering targets fail due to test failures.
Regarding your comment above, this is exactly the point I want to stress: in several runs I never saw the failure that Jan pointed out above. In fact, I saw several test failures caused by a server already running, a data source already existing, etc.
See https://issues.jboss.org/browse/JBPAPP-8448.
I believe the lack of proper test clean-up is the real cause of the instability issues.

Comment 11 Shelly McGowan 2012-04-09 11:56:30 UTC
Pull request for MessagingClientTestCase: https://github.com/jbossas/jboss-as/pull/1994

Comment 12 Paul Gier 2012-04-09 17:31:26 UTC
Link: Added: This issue depends JBPAPP-8368


Comment 13 Paul Gier 2012-04-09 17:31:50 UTC
This seems to have been resolved by JBPAPP-8368

Comment 14 Paul Gier 2012-04-09 22:02:12 UTC
I may have closed this one too soon; we had another timeout build after the successful build.
http://hudson.qa.jboss.com/hudson/job/JBoss-EAP-6.0.x-MEAD/347/

Comment 15 Ondřej Žižka 2012-04-09 22:44:17 UTC
But that job didn't have `surefire.forked.process.timeout` at all.
I've added it, let's see.  Run #348.

Comment 16 Ondřej Žižka 2012-04-10 03:23:33 UTC
Tadaa: https://hudson.qa.jboss.com/hudson/job/JBoss-EAP-6.0.x-MEAD/

Comment 17 Ondřej Žižka 2012-04-10 03:24:02 UTC
I mean... Tadaa:  https://hudson.qa.jboss.com/hudson/job/JBoss-EAP-6.0.x-MEAD/348/

Comment 18 Madhumita Sadhukhan 2012-04-13 12:43:16 UTC
ER5 looks much better and I got a couple of runs with no compilation failures. Closing this for now; will reopen if there is a regression in the future.

Comment 19 Anne-Louise Tangring 2012-11-05 17:23:33 UTC
Docs QE Status: Removed: NEW