Bug 1032641

Summary: IllegalStateException from periodic recovery when server is reloaded with XA datasource being defined
Product: [JBoss] JBoss Enterprise Application Platform 6 Reporter: Ondrej Chaloupka <ochaloup>
Component: Transaction ManagerAssignee: Michael <mmusgrov>
Status: CLOSED CURRENTRELEASE QA Contact: Ondrej Chaloupka <ochaloup>
Severity: high Docs Contact:
Priority: unspecified    
Version: 6.2.0CC: hhovsepy, jpederse, mmusgrov, msimka, ochaloup
Target Milestone: ---   
Target Release: EAP 6.4.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Previous versions of JBoss EAP 6 could encounter an `IllegalStateException` during some transaction recovery operations. The transaction recovery system operates by querying resources for their view of 'in-doubt' transaction branches. It uses a "RecoveryHelper" which JCA registers to achieve this. When a resource is removed from the system, the RecoveryHelper is de-registered. In previous versions of the product there was a race condition in the code whereby if the removal happened during a "recovery scan" then the helper was not removed. This resulted in the possibility that transaction recovery would continue using resources even though they had been removed from the server (which could produce the `IllegalStateExceptions`). In this release, if the current recovery scan is using the resource, it waits for it to finish and then remove the helper.
Story Points: ---
Clone Of: Environment:
Last Closed: Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1127272    

Description Ondrej Chaloupka 2013-11-20 14:13:19 UTC
When you have configured an XA datasource in the server, you will wait for more than 10 seconds and you will reload the server configuration (jboss cli reload command) you will get the IllegalStateException [1].

This occurs from EAP 6.2.0.ER7. We did not hit it under EAP 6.2.0.ER6. As there was transaction update from 4.17.12 to 4.17.15 I would speculate whether this problem is connected with https://issues.jboss.org/browse/JBTM-1700.

How to reproduce:
1) Define a XA resource e.g. like:
<xa-datasource jndi-name="java:jboss/xa-datasources/CrashRecoveryDS" pool-name="CrashRecoveryDS" enabled="true">
    <xa-datasource-property name="ServerName">db-hostname</xa-datasource-property>
    <xa-datasource-property name="PortNumber">3306</xa-datasource-property>
    <xa-datasource-property name="DatabaseName">dbname</xa-datasource-property>
    <xa-datasource-class>com.mysql.jdbc.jdbc2.optional.MysqlXADataSource</xa-datasource-class>
    <driver>mysql55-jdbc-driver.jar</driver>
    <security>
        <user-name>user</user-name>
        <password>user</password>
    </security>
</xa-datasource>
2) Run the e.g. EAP 6.2.0.CR2
3) Wait 10 or 15 seconds and reload the server
for I in `seq 1 3`; do echo "running #${I}"; ./bin/jboss-cli.sh -c ":reload"; sleep 15; done
4) You will see the IllegalArgumentException



[1]
ERROR [org.jboss.jca.core.tx.jbossts.XAResourceRecoveryImpl] (Periodic Recovery) IJ000906: Error during crash recovery: java:jboss/xa-datasources/CrashRecoveryDS (Could not create connection): javax.resource.ResourceException: Could not create connection                              
 at org.jboss.jca.core.tx.jbossts.XAResourceRecoveryImpl.getXAResources(XAResourceRecoveryImpl.java:170)
 at com.arjuna.ats.internal.jbossatx.jta.XAResourceRecoveryHelperWrapper.getXAResources(XAResourceRecoveryHelperWrapper.java:51) [jbossjts-integration-4.17.15.Final-redhat-2.jar:4.17.15.Final-redhat-2]
 at com.arjuna.ats.internal.jta.recovery.arjunacore.XARecoveryModule.resourceInitiatedRecoveryForRecoveryHelpers(XARecoveryModule.java:516) [jbossjts-jacorb-4.17.15.Final-redhat-2.jar:4.17.15.Final-redhat-2]
 at com.arjuna.ats.internal.jta.recovery.arjunacore.XARecoveryModule.periodicWorkFirstPass(XARecoveryModule.java:182) [jbossjts-jacorb-4.17.15.Final-redhat-2.jar:4.17.15.Final-redhat-2]
 at com.arjuna.ats.internal.arjuna.recovery.PeriodicRecovery.doWorkInternal(PeriodicRecovery.java:743) [jbossjts-jacorb-4.17.15.Final-redhat-2.jar:4.17.15.Final-redhat-2]
 at com.arjuna.ats.internal.arjuna.recovery.PeriodicRecovery.run(PeriodicRecovery.java:371) [jbossjts-jacorb-4.17.15.Final-redhat-2.jar:4.17.15.Final-redhat-2]
Caused by: java.lang.IllegalStateException
 at org.jboss.msc.value.InjectedValue.getValue(InjectedValue.java:47)
 at org.jboss.as.connector.subsystems.datasources.AbstractDataSourceService.driverClassLoader(AbstractDataSourceService.java:237)
 at org.jboss.as.connector.subsystems.datasources.AbstractDataSourceService.access$200(AbstractDataSourceService.java:89)
 at org.jboss.as.connector.subsystems.datasources.AbstractDataSourceService$AS7DataSourceDeployer$1.getClassLoader(AbstractDataSourceService.java:402)
 at org.jboss.jca.adapters.jdbc.xa.XAManagedConnectionFactory.getXADataSource(XAManagedConnectionFactory.java:581)
 at org.jboss.jca.adapters.jdbc.xa.XAManagedConnectionFactory.getXAManagedConnection(XAManagedConnectionFactory.java:444)
 ... 8 more

Comment 2 tom.jenkinson 2013-11-20 14:40:44 UTC
It looks more like a JCA issue as the root cause is coming from inside there where we request a connection?

I think there was a bug with ER6 that meant :reload didn;t work at all so that could have masked an underlying issue?

Jesper can you shed light on the IllegalStateException please?

Comment 3 Jesper Pedersen 2013-11-21 03:30:03 UTC
Looks like that <driver> has been set correctly, so post your entire configuration of the :datasources: subsystem, include the content of the module itself.

Comment 4 Martin Simka 2013-11-21 09:31:28 UTC
on EAP 6.2.0.CR3 same behavior

standalone.xml - datasources subsystem:
*********************

<subsystem xmlns="urn:jboss:domain:datasources:1.1">
            <datasources>
                <datasource jndi-name="java:jboss/datasources/ExampleDS" pool-name="ExampleDS" enabled="true" use-java-context="true">
                    <connection-url>jdbc:h2:mem:test;DB_CLOSE_DELAY=-1;DB_CLOSE_ON_EXIT=FALSE</connection-url>
                    <driver>h2</driver>
                    <security>
                        <user-name>sa</user-name>
                        <password>sa</password>
                    </security>
                </datasource>
                <xa-datasource jndi-name="java:jboss/xa-datasources/CrashRecoveryDS" pool-name="CrashRecoveryDS" enabled="true">
                   <xa-datasource-property name="ServerName">
                        db01.mw.lab.eng.bos.redhat.com
                   </xa-datasource-property>  
                   <xa-datasource-property name="PortNumber">  
                        3306
                   </xa-datasource-property>  
                   <xa-datasource-property name="DatabaseName">  
                        dballo01
                   </xa-datasource-property>  
                   <xa-datasource-class>com.mysql.jdbc.jdbc2.optional.MysqlXADataSource</xa-datasource-class>  
                   <driver>mysql-connector-java-5.1.23-bin.jar</driver>  
                   <security>  
                       <user-name>dballo01</user-name>  
                       <password>dballo01</password>  
                   </security>  
                </xa-datasource>
                <drivers>
                    <driver name="h2" module="com.h2database.h2">
                        <xa-datasource-class>org.h2.jdbcx.JdbcDataSource</xa-datasource-class>
                    </driver>
                </drivers>
            </datasources>
        </subsystem>

*********************

Driver mysql-connector-java-5.1.23-bin.jar is deployed in standalone/deployments directory.

Comment 5 Jesper Pedersen 2013-11-25 13:17:31 UTC
Use a proper module deployment of the MySQL JDBC driver.

Comment 6 Martin Simka 2013-11-25 14:08:01 UTC
Same behavior when I add jdbc driver as a module. 

jboss-eap-6.2$ ls modules/org/mysql/jdbc/main/
module.xml  mysql-connector-java-5.1.23-bin.jar

jboss-eap-6.2$ cat modules/org/mysql/jdbc/main/module.xml 
<?xml version="1.0" encoding="UTF-8"?>
<module xmlns="urn:jboss:module:1.1" name="org.mysql.jdbc">
    <resources>
        <resource-root path="mysql-connector-java-5.1.23-bin.jar"/>
    </resources>

    <dependencies>
        <module name="javax.transaction.api" />
        <module name="javax.api" />
    </dependencies>
</module>

standalone.xml - datasources subsystem:
*********************
<subsystem xmlns="urn:jboss:domain:datasources:1.1">
            <datasources>
                <datasource jndi-name="java:jboss/datasources/ExampleDS" pool-name="ExampleDS" enabled="true" use-java-context="true">
                    <connection-url>jdbc:h2:mem:test;DB_CLOSE_DELAY=-1;DB_CLOSE_ON_EXIT=FALSE</connection-url>
                    <driver>h2</driver>
                    <security>
                        <user-name>sa</user-name>
                        <password>sa</password>
                    </security>
                </datasource>
                <xa-datasource jndi-name="java:jboss/xa-datasources/CrashRecoveryDS" pool-name="CrashRecoveryDS" enabled="true">
                        <xa-datasource-property name="ServerName">     
                                db01.mw.lab.eng.bos.redhat.com
                        </xa-datasource-property>     
                        <xa-datasource-property name="PortNumber">     
                                3306
                        </xa-datasource-property>     
                        <xa-datasource-property name="DatabaseName">     
                                dballo00
                        </xa-datasource-property>     
                        <driver>mysql</driver>     
                        <security>     
                                <user-name>dballo00</user-name>
                                <password>dballo00</password>
                        </security>     
                </xa-datasource>
                <drivers>
                    <driver name="h2" module="com.h2database.h2">
                        <xa-datasource-class>org.h2.jdbcx.JdbcDataSource</xa-datasource-class>
                    </driver>
                    <driver name="mysql" module="org.mysql.jdbc">
                        <xa-datasource-class>com.mysql.jdbc.jdbc2.optional.MysqlXADataSource</xa-datasource-class>
                    </driver>
                </drivers>
            </datasources>

Comment 7 Stefano Maestri 2014-10-10 15:59:00 UTC
I have double checked and JCA (XaDataSourceService) is removing correctly recovery resource during its stop(). And I've double checked the ISE appear after a full stop() happen correctly calling this code

XAResourceRecoveryRegistry rr = transactionIntegrationValue.getValue().getRecoveryRegistry();

            for (XAResourceRecovery recovery : deploymentMD.getRecovery()) {
                if (recovery != null) {
                    try {
                        recovery.shutdown();
                    } catch (Exception e) {
                        ConnectorLogger.SUBSYSTEM_DATASOURCES_LOGGER.error("Error during recovery shutdown", e);
                    } finally {
                        rr.removeXAResourceRecovery(recovery);
                    }
                }
            }

I think problem is on TS side, and I'm reassigning the issue to Tom. Of course I'm more than open to work together on this one.
I have also a commit, w/ few changes to have a more correct services up/down order in border situation, but I'm not rising PR for this, since it's not related IMHO to this issue and it's not eligible for a backporting.

Comment 8 tom.jenkinson 2014-10-14 09:48:22 UTC
Does this happen always and repeatedly or just once? 

It may be possible to interleave:
https://github.com/jbosstm/narayana/blob/master/ArjunaJTA/jta/classes/com/arjuna/ats/internal/jta/recovery/arjunacore/XARecoveryModule.java#L109

With:
https://github.com/jbosstm/narayana/blob/master/ArjunaJTA/jta/classes/com/arjuna/ats/internal/jta/recovery/arjunacore/XARecoveryModule.java#L506

Assigning to Mike to take a look once you have answered.

Thanks,
Tom

Comment 9 Jesper Pedersen 2014-10-24 14:04:06 UTC
removeXAResourceRecovery must be a blocking call, if 'recovery' is currently active for recovery. Then an exclusive lock must be used for the removal of the instance. So, yes, race condition atm

Comment 10 Michael 2014-11-03 14:23:43 UTC
Ondrej,

I fixed some locking issues in this area in revision 4.17.23 (JBTM-2241, JBTM-2242). Would it be possible to retest using this version of jbossts?

Comment 11 Ondrej Chaloupka 2014-11-03 14:52:20 UTC
Hi Mike,

I've tested it on EAP 6.4.0.DR7 where the 4.17.23 is shipped in and I haven't been able to reproduce the issue yet.
(The issue is reproducable with 4.17.22)

Setting as verified.

Thanks
Ondra

Comment 12 JBoss JIRA Server 2014-11-03 15:05:37 UTC
Michael Musgrove <mmusgrov> updated the status of jira JBTM-2282 to Resolved

Comment 13 JBoss JIRA Server 2014-11-18 14:30:10 UTC
Tom Jenkinson <tom.jenkinson> updated the status of jira JBTM-2282 to Closed