Bug 1045659 - Error loading global condition cache: ORA-01427: single-row subquery returns more than one row
Summary: Error loading global condition cache: ORA-01427: single-row subquery returns ...
Keywords:
Status: ON_QA
Alias: None
Product: RHQ Project
Classification: Other
Component: Alerts
Version: 4.9
Hardware: Unspecified
OS: Unspecified
unspecified
unspecified vote
Target Milestone: ---
: RHQ 4.14
Assignee: Nobody
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks: 1198086
TreeView+ depends on / blocked
 
Reported: 2013-12-21 00:18 UTC by Elias Ross
Modified: 2022-03-31 04:27 UTC (History)
2 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:


Attachments (Terms of Use)

Description Elias Ross 2013-12-21 00:18:18 UTC
Description of problem:


21:00:56,617 ERROR [org.rhq.enterprise.server.alert.engine.internal.AlertConditionCacheCoordinator] (EJB default - 2) Error loading global condition cache: javax.ejb.EJBException: javax.persistence.Persis
tenceException: org.hibernate.exception.DataException: could not extract ResultSet
        at org.jboss.as.ejb3.tx.CMTTxInterceptor.handleExceptionInOurTx(CMTTxInterceptor.java:165) [jboss-as-ejb3-7.2.0.Alpha1-redhat-4.jar:7.2.0.Alpha1-redhat-4]
...
Caused by: org.hibernate.exception.DataException: could not extract ResultSet
        at org.hibernate.exception.internal.SQLStateConversionDelegate.convert(SQLStateConversionDelegate.java:134) [hibernate-core-4.2.0.CR1.jar:4.2.0.CR1]
        at org.hibernate.exception.internal.StandardSQLExceptionConverter.convert(StandardSQLExceptionConverter.java:49) [hibernate-core-4.2.0.CR1.jar:4.2.0.CR1]
        at org.hibernate.engine.jdbc.spi.SqlExceptionHelper.convert(SqlExceptionHelper.java:125) [hibernate-core-4.2.0.CR1.jar:4.2.0.CR1]
        at org.hibernate.engine.jdbc.spi.SqlExceptionHelper.convert(SqlExceptionHelper.java:110) [hibernate-core-4.2.0.CR1.jar:4.2.0.CR1]
        at org.hibernate.engine.jdbc.internal.ResultSetReturnImpl.extract(ResultSetReturnImpl.java:53) [hibernate-core-4.2.0.CR1.jar:4.2.0.CR1]
        at org.hibernate.loader.Loader.getResultSet(Loader.java:2031) [hibernate-core-4.2.0.CR1.jar:4.2.0.CR1]
        at org.hibernate.loader.Loader.executeQueryStatement(Loader.java:1832) [hibernate-core-4.2.0.CR1.jar:4.2.0.CR1]
...
Caused by: java.sql.SQLException: ORA-01427: single-row subquery returns more than one row

        at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:440) [ojdbc6.jar:11.2.0.2.0]
        at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:396) [ojdbc6.jar:11.2.0.2.0]
        at oracle.jdbc.driver.T4C8Oall.processError(T4C8Oall.java:837) [ojdbc6.jar:11.2.0.2.0]
        at oracle.jdbc.driver.T4CTTIfun.receive(T4CTTIfun.java:445) [ojdbc6.jar:11.2.0.2.0]
        at oracle.jdbc.driver.T4CTTIfun.doRPC(T4CTTIfun.java:191) [ojdbc6.jar:11.2.0.2.0]
        at oracle.jdbc.driver.T4C8Oall.doOALL(T4C8Oall.java:523) [ojdbc6.jar:11.2.0.2.0]
        at oracle.jdbc.driver.T4CPreparedStatement.doOall8(T4CPreparedStatement.java:207) [ojdbc6.jar:11.2.0.2.0]
        at oracle.jdbc.driver.T4CPreparedStatement.executeForRows(T4CPreparedStatement.java:1010) [ojdbc6.jar:11.2.0.2.0]
        at oracle.jdbc.driver.OracleStatement.executeMaybeDescribe(OracleStatement.java:1185) [ojdbc6.jar:11.2.0.2.0]
        at oracle.jdbc.driver.OracleStatement.doExecuteWithTimeout(OracleStatement.java:1275) [ojdbc6.jar:11.2.0.2.0]
        at oracle.jdbc.driver.OraclePreparedStatement.executeInternal(OraclePreparedStatement.java:3576) [ojdbc6.jar:11.2.0.2.0]
        at oracle.jdbc.driver.OraclePreparedStatement.executeQuery(OraclePreparedStatement.java:3620) [ojdbc6.jar:11.2.0.2.0]
        at oracle.jdbc.driver.OraclePreparedStatementWrapper.executeQuery(OraclePreparedStatementWrapper.java:1491) [ojdbc6.jar:11.2.0.2.0]
        at org.jboss.jca.adapters.jdbc.CachedPreparedStatement.executeQuery(CachedPreparedStatement.java:107)
        at org.jboss.jca.adapters.jdbc.WrappedPreparedStatement.executeQuery(WrappedPreparedStatement.java:462)
        at org.hibernate.engine.jdbc.internal.ResultSetReturnImpl.extract(ResultSetReturnImpl.java:48) [hibernate-core-4.2.0.CR1.jar:4.2.0.CR1]
        ... 270 more

Shorter version:

12:03:56,089 ERROR [org.jboss.as.ejb3.invocation] (EJB default - 4) JBAS014134: EJB Invocation failed on component AlertConditionManagerBean for method public abstract org.rhq.core.domain.util.PageList org.rhq.enterprise.server.alert.AlertConditionManagerLocal.getAlertConditionComposites(org.rhq.core.domain.auth.Subject,java.lang.Integer,org.rhq.core.domain.alert.AlertConditionCategory,org.rhq.core.domain.util.PageControl): javax.ejb.EJBException: javax.persistence.PersistenceException: org.hibernate.exception.DataException: could not extract ResultSet
        at org.rhq.enterprise.server.alert.AlertConditionManagerLocal$$$view118.getAlertConditionComposites(Unknown Source) [rhq-server.jar:4.9.0]
        at org.rhq.enterprise.server.alert.engine.internal.GlobalConditionCache.loadCaches(GlobalConditionCache.java:109) [rhq-server.jar:4.9.0]
        at org.rhq.enterprise.server.alert.engine.internal.GlobalConditionCache.<init>(GlobalConditionCache.java:82) [rhq-server.jar:4.9.0]
        at org.rhq.enterprise.server.alert.engine.internal.AlertConditionCacheCoordinator.reloadGlobalCache(AlertConditionCacheCoordinator.java:114) [rhq-server.jar:4.9.0]
        at org.rhq.enterprise.server.alert.engine.AlertConditionCacheManagerBean.reloadGlobalCache(AlertConditionCacheManagerBean.java:110) [rhq-server.jar:4.9.0]
        at org.rhq.enterprise.server.common.PerformanceMonitorInterceptor.monitorHibernatePerformance(PerformanceMonitorInterceptor.java:32) [rhq-server.jar:4.9.0]
        at org.rhq.enterprise.server.common.TransactionInterruptInterceptor.addCheckedActionToTransactionManager(TransactionInterruptInterceptor.java:78) [rhq-server.jar:4.9.0]
        at org.rhq.enterprise.server.authz.RequiredPermissionsInterceptor.checkRequiredPermissions(RequiredPermissionsInterceptor.java:156) [rhq-server.jar:4.9.0]
        at org.rhq.enterprise.server.alert.engine.AlertConditionCacheManagerLocal$$$view152.reloadGlobalCache(Unknown Source) [rhq-server.jar:4.9.0]
        at org.rhq.enterprise.server.cloud.instance.CacheConsistencyManagerBean.reloadGlobalCacheIfNeeded(CacheConsistencyManagerBean.java:111) [rhq-server.jar:4.9.0]
        at org.rhq.enterprise.server.cloud.instance.CacheConsistencyManagerBean.reloadServerCacheIfNeeded(CacheConsistencyManagerBean.java:96) [rhq-server.jar:4.9.0]
        at org.rhq.enterprise.server.common.PerformanceMonitorInterceptor.monitorHibernatePerformance(PerformanceMonitorInterceptor.java:32) [rhq-server.jar:4.9.0]
        at org.rhq.enterprise.server.common.TransactionInterruptInterceptor.addCheckedActionToTransactionManager(TransactionInterruptInterceptor.java:78) [rhq-server.jar:4.9.0]
        at org.rhq.enterprise.server.authz.RequiredPermissionsInterceptor.checkRequiredPermissions(RequiredPermissionsInterceptor.java:156) [rhq-server.jar:4.9.0]
        at org.rhq.enterprise.server.cloud.instance.CacheConsistencyManagerLocal$$$view16.reloadServerCacheIfNeeded(Unknown Source) [rhq-server.jar:4.9.0]
        at org.rhq.enterprise.server.cloud.instance.CacheConsistencyManagerBean.handleHeartbeatTimer(CacheConsistencyManagerBean.java:86) [rhq-server.jar:4.9.0]
        at org.rhq.enterprise.server.alert.AlertConditionManagerBean.getAlertConditionComposites(AlertConditionManagerBean.java:158) [rhq-server.jar:4.9.0]
        at org.rhq.enterprise.server.common.PerformanceMonitorInterceptor.monitorHibernatePerformance(PerformanceMonitorInterceptor.java:32) [rhq-server.jar:4.9.0]
        at org.rhq.enterprise.server.common.TransactionInterruptInterceptor.addCheckedActionToTransactionManager(TransactionInterruptInterceptor.java:78) [rhq-server.jar:4.9.0]
        at org.rhq.enterprise.server.authz.RequiredPermissionsInterceptor.checkRequiredPermissions(RequiredPermissionsInterceptor.java:156) [rhq-server.jar:4.9.0]
        at org.rhq.core.domain.util.JoinFetchReportingQueryTranslator.list(JoinFetchReportingQueryTranslator.java:84) [rhq-core-domain-ejb3.jar:4.9.0]

I was thinking it might be because of this query:


    @NamedQuery(name = AlertCondition.QUERY_BY_CATEGORY_AVAILABILITY, query = "" //
....
        + "           SELECT a.availabilityType " //
        + "             FROM Availability a " //
        + "             JOIN a.resource ar " //
        + "            WHERE ar = res " //
        + "              AND a.endTime IS NULL " //
....
public class AlertCondition implements Serializable {

The query that might be returning multiple rows, but I did not see this:

select resource_id from rhq_availability
where end_time is null
group by resource_id
HAVING ( COUNT(resource_id) > 1 )

... This is if rhq_availability has two availabilities for one resource.

This seems to have cleared itself up. Is it possible for this query to return multiple results?


Version-Release number of selected component (if applicable): 4.9


How reproducible: Unclear. Unknown.

Comment 1 Jay Shaughnessy 2014-01-08 21:41:52 UTC
That would seem like the most likely subquery although it should not happen that a resource has two null-end-time Availability entries.  Having said that, I won't say it's impossible, and as such I have code that corrects that situation if ever detected during availability report handling for that resource.

There has been some further work for 4.10 around cache reloading, although not specifically for an issue like this.

I suggest we monitor this one and see if it re-occurs in a 4.10+ release.

Comment 2 Elias Ross 2014-11-19 19:44:21 UTC
I've seen this again.

It seems to happen to a host (platform resource) due to an agent that was reimaged. It seemed to go away once the agent was reconnected.

Comment 3 Elias Ross 2014-11-19 20:32:49 UTC
Is it possible that the detection code simply fails to clean up certain entries like this, especially if the resource belongs to an agent that is in a bad state?

Comment 4 Jay Shaughnessy 2014-11-20 16:02:51 UTC
I think the code is cleaning things up appropriately but that there must be some time window leaving us vulnerable.  Platforms have various special handling and I suspect the window may happen, for example, during a backfill for an agent that was unexpectedly terminated (or reimaged).  If we get unlucky and there is a cache refresh during this window of having 2 entries with a null end time then we get this issue.

I'm not sure what the best approach is, trying to identify and close the window may be more difficult than maybe changing the query to ensure a single return value.  Unfortunately there is not a common syntax for LIMITing to 1 row, which means I think we'd have to remove this as a NamedQuery and construct it at runtime, with the proper vendor-specific syntax.

Comment 5 Michael Burman 2015-03-30 08:55:20 UTC
Prevent the error by modifying the query in master (not the window where this might happen):

commit 47e27a677fdfd230896afac74dc78c0241bdd66c
Author: Michael Burman <yak>
Date:   Mon Mar 30 11:53:12 2015 +0300

    [BZ 1045659] Add aggregate function max() to subquery to prevent loading more than one availability_type


Note You need to log in before you can comment on or make changes to this bug.