Created attachment 876372 [details]
QA tests fail when run with the JDBC object store (failing consistently on all tested databases):
Running on Narayana 4.17.17.Final.
Created attachment 876376 [details]
Created attachment 876377 [details]
Created attachment 876379 [details]
Created attachment 876380 [details]
Created attachment 876382 [details]
The failure is because the test is timing out waiting for the recovery system to recover failed transactions. I have seen this before on very slow connections to the db hosting the transaction logs.
Is this reproducible? If not, I would be tempted to close it as cannot reproduce. Note that this is safe because the logs are still in the db and the recovery system would eventually replay the pending transactions.
OK, I see. So is there a way to increase the timeout or something similar?
The slightly strange thing is that, as far as I can tell, this was not happening in the previous testing cycle. But there could have been some changes in our networking infrastructure.
As far as I can tell, it is easily reproducible if you run the test the way that is described in comment #6. That run connects to our test Oracle database, but this is happening for any database in our lab. I've also tested against my local Postgres installation:
ant -f run-tests.xml -Dtest.name=crashrecovery12 -Dtest.methods="CrashRecovery12_Test02" -Dprofile=postgres -Djdbc.db.url="localhost" -Djdbc.db.name=crashrec -Djdbc.db.user=crashrec -Djdbc.db.password=crashrec -Djdbc.db.port=5432 onetest
And the failures are consistent there too.
Tom Jenkinson <firstname.lastname@example.org> updated the status of jira JBTM-2133 to Closed
Tom Jenkinson <email@example.com> updated the status of jira JBTM-2130 to Closed
I've tested this with EAP 6.3.0.ER5, which uses version 4.17.20.Final. The failures mentioned here still occur.
The same for EAP 6.3.0.ER9 on 4.17.21.Final.
I discussed this with Mike and found out that the issue was already solved by JBTM-2130.
The remaining failures were only a test issue.
There were two things. First, because the test with the JDBC object store was not working, I had set MFACTOR far too high (about 40), and that caused this test to fail. When I set it to something normal like 3, the test started to pass.
Second, the reproducer on my local machine was not working because I had set MFACTOR but ran the test manually, and I didn't consider that the MFACTOR system variable is recognized only when using the narayana.sh script. When running by hand, TaskImpl.properties needs to be changed directly (e.g. COMMAND_LINE_12=-DCoreEnvironmentBean.timeoutFactor=2, COMMAND_LINE_13=-DCoordinatorEnvironmentBean.defaultTimeout=240)
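For anyone else running the test by hand, here is a rough sketch of how the two properties mentioned above could be set from a shell. The set_prop helper and the scratch file are my own illustration, not part of the test suite; it works on a temporary stand-in file, so point PROPS at the real TaskImpl.properties in your checkout (I'm not giving a path here because it depends on your setup).

```shell
#!/bin/sh
# Hypothetical helper (not part of the Narayana QA suite):
# set or replace a key=value line in a properties file.
set_prop() {
    file=$1; key=$2; value=$3
    tmp=$(mktemp)
    # Drop any existing line for the key; grep exits 1 if nothing is left, hence "|| true".
    grep -v "^${key}=" "$file" > "$tmp" || true
    # Append the new value (the entry moves to the end of the file).
    printf '%s=%s\n' "$key" "$value" >> "$tmp"
    mv "$tmp" "$file"
}

# Assumption: a scratch stand-in for TaskImpl.properties, seeded with a made-up old value.
PROPS=$(mktemp)
printf 'COMMAND_LINE_12=-DCoreEnvironmentBean.timeoutFactor=40\n' > "$PROPS"

# The values below are the ones quoted in the comment above.
set_prop "$PROPS" COMMAND_LINE_12 "-DCoreEnvironmentBean.timeoutFactor=2"
set_prop "$PROPS" COMMAND_LINE_13 "-DCoordinatorEnvironmentBean.defaultTimeout=240"

cat "$PROPS"
```

The helper rewrites the file through a temporary copy rather than using sed -i, since the in-place flag behaves differently between GNU and BSD sed.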
Setting as verified on Narayana 4.17.21.Final (which is the version in EAP 6.3.0.GA)
Thanks and sorry Mike
Jenkins job tested here: