Bug 1091425 - 1 CMR and 2 XA crashrec: error in log while recovering
Summary: 1 CMR and 2 XA crashrec: error in log while recovering
Keywords:
Status: CLOSED DUPLICATE of bug 1104227
Alias: None
Product: JBoss Enterprise Application Platform 6
Classification: JBoss
Component: Transaction Manager
Version: 6.3.0
Hardware: Unspecified
OS: Unspecified
unspecified
unspecified
Target Milestone: ---
: EAP 6.4.0
Assignee: Michael
QA Contact: Hayk Hovsepyan
Russell Dickenson
URL:
Whiteboard:
: 1075008 (view as bug list)
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2014-04-25 14:51 UTC by Hayk Hovsepyan
Modified: 2014-10-25 12:17 UTC (History)
2 users (show)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2014-09-05 07:24:22 UTC
Type: Bug
Embargoed:


Attachments (Terms of Use)
ServerLog (430.41 KB, text/x-log)
2014-04-25 14:51 UTC, Hayk Hovsepyan
no flags Details
ConfigurationHints (2.21 KB, application/x-shellscript)
2014-04-25 14:52 UTC, Hayk Hovsepyan
no flags Details
NewServerLog (481.77 KB, text/x-log)
2014-05-02 07:40 UTC, Hayk Hovsepyan
no flags Details
TraceServerLog (952.71 KB, text/x-log)
2014-05-02 09:30 UTC, Hayk Hovsepyan
no flags Details
MySql55Log (876.27 KB, text/x-log)
2014-05-06 12:44 UTC, Hayk Hovsepyan
no flags Details
Oracle12cLog (902.40 KB, text/x-log)
2014-05-06 12:45 UTC, Hayk Hovsepyan
no flags Details
PGSQLServerLog (937.41 KB, text/x-log)
2014-05-09 11:08 UTC, Hayk Hovsepyan
no flags Details
Oracle12cTrace (864.64 KB, text/x-log)
2014-06-09 11:55 UTC, Hayk Hovsepyan
no flags Details


Links
System ID Private Priority Status Summary Last Updated
Red Hat Bugzilla 1104227 0 unspecified CLOSED Incorrent jndi of datasource provided to jdbc driver for recovery when more datasources exist 2021-02-22 00:41:40 UTC

Internal Links: 1104227

Description Hayk Hovsepyan 2014-04-25 14:51:42 UTC
Created attachment 889781 [details]
ServerLog

Description of problem:
When trying to execute crashrec scenario with 1 CMR and 2 XA respurces (working with postgresql), there are some errors in log which is attached.


Version-Release number of selected component (if applicable):
jboss-eap-6.3.0.ER2

It is covered in test case "JPAMultiXACMRCrashRecoveryTestCase#commitHaltFirstXA" of "lrco" branch of "git://git.app.eng.bos.redhat.com/jbossqe/eap-tests-transactions.git" repo.

You can reproduce it by running "mvn clean verify -Dtest=JPAMultiXACMRCrashRecoveryTestCase#commitHaltFirstXA -Dmaven.test.failure.ignore=true -fae -Djbossts.noJTS" in directory "~/eap-tests-transactions/integration/jbossts". 

You need postgresql92 database with 3 crashrec users. And "postgresql92-jdbc-driver.jar" put in some directory and mentioned in "ds.properties"

Please find some hints and examples to configure ds.properies file with postgresql in attached file.

How reproducible:
always

Steps to Reproduce:
1) enlist CMR datasource
2) enlist TestXAResource
3) enlist 2 XA datasources
4) prepare all
5) crash on "commit" method exit of TestXAResource
6) reboot the server and call recovery
7) at this point only the CMR is committed, and the 2 XA resources are not committed, but in fact they should 

Actual results:
only CMR is committed

Expected results:
all resources must be committed

Additional info:
the same when it is 1 CMR 1 TestXAResource and 1 XAResource

Comment 1 Hayk Hovsepyan 2014-04-25 14:52:34 UTC
Created attachment 889783 [details]
ConfigurationHints

Comment 3 Hayk Hovsepyan 2014-05-02 07:40:29 UTC
Created attachment 891730 [details]
NewServerLog

Adding the full log file once again.

Comment 4 Hayk Hovsepyan 2014-05-02 09:30:47 UTC
Created attachment 891754 [details]
TraceServerLog

Attached new log with TRACE level.

Comment 5 Michael 2014-05-06 11:04:51 UTC
The error is coming from the Postgresql resource manager not the CMR implementation. When recovery attempts to commit the resource an XAER_RMERR error is thrown which the TM reports as a rollback. So the question that needs resolving is why does the RM return XAER_RMERR.

The highlights from the log are:

The recovery attempt is here:
11:24:52,644 DEBUG [com.arjuna.ats.arjuna] (Periodic Recovery) RecoverAtomicAction.replayPhase2 recovering 0:ffff7f000001:-3e3426a8:5363643d:4a ActionStatus is ActionStatus.COMMITTED

Byteman triggers another crash:
11:24:52,647 INFO  [stdout] (Periodic Recovery) org.jboss.byteman.contrib.dtest.Instrumentor_com.arjuna.ats.internal.jta.resources.arjunacore.XAResourceRecord_topLevelCommit_crashatEXIT execute()

Server restarts:
11:24:54,919 INFO  [org.jboss.as] (MSC service thread 1-6) JBAS015899: JBoss EAP 6.3.0.Beta1 (AS 7.4.0.Final-redhat-10) starting

There is a byteman rule parsing error:
11:25:06,642 WARN  [com.arjuna.ats.jta] (Periodic Recovery) ARJUNA016009: Caught:: org.jboss.byteman.rule.exception.ExecuteException: MethodExpression.interpret : exception invoking method remoteTrace file /tmp/jbosstscrashrectests9039010908941169760 line 23
Caused by: java.rmi.NotBoundException: org.jboss.as.test.jbossts.common.TestXAResourceRecovered

There is another recovery attempt:
11:25:16,664 DEBUG [com.arjuna.ats.arjuna] (Periodic Recovery) RecoverAtomicAction.replayPhase2 recovering 0:ffff7f000001:-3e3426a8:5363643d:4a ActionStatus is ActionStatus.COMMITTED

The Postgresql resource manager throws an XAER_RMERR error:
11:25:16,677 WARN  [com.arjuna.ats.jta] (Periodic Recovery) ARJUNA016036: commit on < formatId=131077, gtrid_length=29, bqual_length=36, tx_uid=0:ffff7f000001:-3e3426a8:5363643d:4a, node_name=1, branch_uid=0:ffff7f000001:-3e3426a8:5363643d:5e, subordinatenodename=null, eis_name=java:jboss/xa-datasources/CrashRecoveryDS2 > (XAResourceWrapperImpl@5d0a8ec4[xaResource=org.jboss.jca.adapters.jdbc.xa.XAManagedConnection@60231594 pad=false overrideRmValue=null productName=PostgreSQL productVersion=9.1.7 jndiName=java:jboss/xa-datasources/CrashRecoveryDS1]) failed with exception $XAException.XAER_RMERR: org.postgresql.xa.PGXAException: Error committing prepared transaction

This means it rolled back

Comment 6 Hayk Hovsepyan 2014-05-06 12:44:48 UTC
Created attachment 892854 [details]
MySql55Log

Comment 7 Hayk Hovsepyan 2014-05-06 12:45:24 UTC
Created attachment 892856 [details]
Oracle12cLog

Comment 8 Hayk Hovsepyan 2014-05-06 12:49:54 UTC
Michael,

thanks for your research, 
I run the same scenario for other databases and it fails for other databases as well, with different errors in logs.

please find the server trace logs for the same scenario run on mysql55 and oracle12c, there is NPE there.

Comment 9 Michael 2014-05-06 13:17:07 UTC
The oracle problem is:

ORA-24774: cannot switch to specified transaction
Cause: The transaction specified in the call refers to a transaction created by a different user.
Action: Create transactions with the same authentication so that they can be switched.

which looks like a configuration error.

The NPE in the mysql log is triggered by your helper
org.jboss.as.test.jbossts.common.TestXAResourceRecoveryHelper.preDestroy

are you sure the test is configured correctly.

Comment 10 Hayk Hovsepyan 2014-05-09 11:08:32 UTC
Created attachment 894002 [details]
PGSQLServerLog

Hi Michael,

There was a problem with configuration and it is fixed now.
For MySQL and Oracle there are separate bugs which are already logged. So now the actual problem is related to PostgreSQL.

Please find the new attachment of trace log: PGSQLServerLog.
You can see that exception there "org.postgresql.util.PSQLException: ERROR: prepared transaction belongs to another database".
One more strange thing is that it tries to rollback XA resource, but should commit as crash happened in second phase.

There are also exceptions with "TestXAResource" (test framework problem), but you can ignore it as it does not affect on real XA resources.

Comment 11 Michael 2014-05-09 13:29:55 UTC
That postgressql error and the rollback attempt is coming from your test harness (JPACrashHelper#wipeOutTxsInDoubt()) not from the TM. How come you are attempting to rollback the transaction branch.

[Note: I do see later on in the log a recovery attempt on the same branch (131077_AAAAAAAAAAAAAP//fwAAAQVYtNhTbLRmAAAASjk=_AAAAAAAAAAAAAP//fwAAAQVYtNhTbLRmAAAAWQAAAAIAAAAA) which fails because the branch does not exist].

Comment 12 Hayk Hovsepyan 2014-06-09 11:55:17 UTC
Created attachment 904618 [details]
Oracle12cTrace

Hi Michael,

Please find attached the log of running under Oracle12c.
Data sources are configured in a way that only one XA datasource is recoverable.
Seems like the cause is the same as https://bugzilla.redhat.com/show_bug.cgi?id=1104227.

/Hayk.

Comment 13 Ondrej Chaloupka 2014-07-08 14:27:44 UTC
*** Bug 1075008 has been marked as a duplicate of this bug. ***

Comment 15 Hayk Hovsepyan 2014-09-05 07:24:22 UTC
Hi Mike,

Thanks for your investigations.
The problem of this issue is JCA related, when it does not take the correct datasource for recovery. The problem is described more detailed in: https://bugzilla.redhat.com/show_bug.cgi?id=1104227
As this BZ is duplicate I am going to close it.
Sorry for inconvenience.

/Hayk.

*** This bug has been marked as a duplicate of bug 1104227 ***


Note You need to log in before you can comment on or make changes to this bug.