Bug 1045659

Summary: Error loading global condition cache: ORA-01427: single-row subquery returns more than one row
Product: [Other] RHQ Project Reporter: Elias Ross <genman>
Component: AlertsAssignee: Nobody <nobody>
Status: ON_QA --- QA Contact:
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 4.9CC: hrupp, jshaughn
Target Milestone: ---   
Target Release: RHQ 4.14   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Bug Depends On:    
Bug Blocks: 1198086    

Description Elias Ross 2013-12-21 00:18:18 UTC
Description of problem:


21:00:56,617 ERROR [org.rhq.enterprise.server.alert.engine.internal.AlertConditionCacheCoordinator] (EJB default - 2) Error loading global condition cache: javax.ejb.EJBException: javax.persistence.Persis
tenceException: org.hibernate.exception.DataException: could not extract ResultSet
        at org.jboss.as.ejb3.tx.CMTTxInterceptor.handleExceptionInOurTx(CMTTxInterceptor.java:165) [jboss-as-ejb3-7.2.0.Alpha1-redhat-4.jar:7.2.0.Alpha1-redhat-4]
...
Caused by: org.hibernate.exception.DataException: could not extract ResultSet
        at org.hibernate.exception.internal.SQLStateConversionDelegate.convert(SQLStateConversionDelegate.java:134) [hibernate-core-4.2.0.CR1.jar:4.2.0.CR1]
        at org.hibernate.exception.internal.StandardSQLExceptionConverter.convert(StandardSQLExceptionConverter.java:49) [hibernate-core-4.2.0.CR1.jar:4.2.0.CR1]
        at org.hibernate.engine.jdbc.spi.SqlExceptionHelper.convert(SqlExceptionHelper.java:125) [hibernate-core-4.2.0.CR1.jar:4.2.0.CR1]
        at org.hibernate.engine.jdbc.spi.SqlExceptionHelper.convert(SqlExceptionHelper.java:110) [hibernate-core-4.2.0.CR1.jar:4.2.0.CR1]
        at org.hibernate.engine.jdbc.internal.ResultSetReturnImpl.extract(ResultSetReturnImpl.java:53) [hibernate-core-4.2.0.CR1.jar:4.2.0.CR1]
        at org.hibernate.loader.Loader.getResultSet(Loader.java:2031) [hibernate-core-4.2.0.CR1.jar:4.2.0.CR1]
        at org.hibernate.loader.Loader.executeQueryStatement(Loader.java:1832) [hibernate-core-4.2.0.CR1.jar:4.2.0.CR1]
...
Caused by: java.sql.SQLException: ORA-01427: single-row subquery returns more than one row

        at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:440) [ojdbc6.jar:11.2.0.2.0]
        at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:396) [ojdbc6.jar:11.2.0.2.0]
        at oracle.jdbc.driver.T4C8Oall.processError(T4C8Oall.java:837) [ojdbc6.jar:11.2.0.2.0]
        at oracle.jdbc.driver.T4CTTIfun.receive(T4CTTIfun.java:445) [ojdbc6.jar:11.2.0.2.0]
        at oracle.jdbc.driver.T4CTTIfun.doRPC(T4CTTIfun.java:191) [ojdbc6.jar:11.2.0.2.0]
        at oracle.jdbc.driver.T4C8Oall.doOALL(T4C8Oall.java:523) [ojdbc6.jar:11.2.0.2.0]
        at oracle.jdbc.driver.T4CPreparedStatement.doOall8(T4CPreparedStatement.java:207) [ojdbc6.jar:11.2.0.2.0]
        at oracle.jdbc.driver.T4CPreparedStatement.executeForRows(T4CPreparedStatement.java:1010) [ojdbc6.jar:11.2.0.2.0]
        at oracle.jdbc.driver.OracleStatement.executeMaybeDescribe(OracleStatement.java:1185) [ojdbc6.jar:11.2.0.2.0]
        at oracle.jdbc.driver.OracleStatement.doExecuteWithTimeout(OracleStatement.java:1275) [ojdbc6.jar:11.2.0.2.0]
        at oracle.jdbc.driver.OraclePreparedStatement.executeInternal(OraclePreparedStatement.java:3576) [ojdbc6.jar:11.2.0.2.0]
        at oracle.jdbc.driver.OraclePreparedStatement.executeQuery(OraclePreparedStatement.java:3620) [ojdbc6.jar:11.2.0.2.0]
        at oracle.jdbc.driver.OraclePreparedStatementWrapper.executeQuery(OraclePreparedStatementWrapper.java:1491) [ojdbc6.jar:11.2.0.2.0]
        at org.jboss.jca.adapters.jdbc.CachedPreparedStatement.executeQuery(CachedPreparedStatement.java:107)
        at org.jboss.jca.adapters.jdbc.WrappedPreparedStatement.executeQuery(WrappedPreparedStatement.java:462)
        at org.hibernate.engine.jdbc.internal.ResultSetReturnImpl.extract(ResultSetReturnImpl.java:48) [hibernate-core-4.2.0.CR1.jar:4.2.0.CR1]
        ... 270 more

Shorter version:

12:03:56,089 ERROR [org.jboss.as.ejb3.invocation] (EJB default - 4) JBAS014134: EJB Invocation failed on component AlertConditionManagerBean for method public abstract org.rhq.core.domain.util.PageList org.rhq.enterprise.server.alert.AlertConditionManagerLocal.getAlertConditionComposites(org.rhq.core.domain.auth.Subject,java.lang.Integer,org.rhq.core.domain.alert.AlertConditionCategory,org.rhq.core.domain.util.PageControl): javax.ejb.EJBException: javax.persistence.PersistenceException: org.hibernate.exception.DataException: could not extract ResultSet
        at org.rhq.enterprise.server.alert.AlertConditionManagerLocal$$$view118.getAlertConditionComposites(Unknown Source) [rhq-server.jar:4.9.0]
        at org.rhq.enterprise.server.alert.engine.internal.GlobalConditionCache.loadCaches(GlobalConditionCache.java:109) [rhq-server.jar:4.9.0]
        at org.rhq.enterprise.server.alert.engine.internal.GlobalConditionCache.<init>(GlobalConditionCache.java:82) [rhq-server.jar:4.9.0]
        at org.rhq.enterprise.server.alert.engine.internal.AlertConditionCacheCoordinator.reloadGlobalCache(AlertConditionCacheCoordinator.java:114) [rhq-server.jar:4.9.0]
        at org.rhq.enterprise.server.alert.engine.AlertConditionCacheManagerBean.reloadGlobalCache(AlertConditionCacheManagerBean.java:110) [rhq-server.jar:4.9.0]
        at org.rhq.enterprise.server.common.PerformanceMonitorInterceptor.monitorHibernatePerformance(PerformanceMonitorInterceptor.java:32) [rhq-server.jar:4.9.0]
        at org.rhq.enterprise.server.common.TransactionInterruptInterceptor.addCheckedActionToTransactionManager(TransactionInterruptInterceptor.java:78) [rhq-server.jar:4.9.0]
        at org.rhq.enterprise.server.authz.RequiredPermissionsInterceptor.checkRequiredPermissions(RequiredPermissionsInterceptor.java:156) [rhq-server.jar:4.9.0]
        at org.rhq.enterprise.server.alert.engine.AlertConditionCacheManagerLocal$$$view152.reloadGlobalCache(Unknown Source) [rhq-server.jar:4.9.0]
        at org.rhq.enterprise.server.cloud.instance.CacheConsistencyManagerBean.reloadGlobalCacheIfNeeded(CacheConsistencyManagerBean.java:111) [rhq-server.jar:4.9.0]
        at org.rhq.enterprise.server.cloud.instance.CacheConsistencyManagerBean.reloadServerCacheIfNeeded(CacheConsistencyManagerBean.java:96) [rhq-server.jar:4.9.0]
        at org.rhq.enterprise.server.common.PerformanceMonitorInterceptor.monitorHibernatePerformance(PerformanceMonitorInterceptor.java:32) [rhq-server.jar:4.9.0]
        at org.rhq.enterprise.server.common.TransactionInterruptInterceptor.addCheckedActionToTransactionManager(TransactionInterruptInterceptor.java:78) [rhq-server.jar:4.9.0]
        at org.rhq.enterprise.server.authz.RequiredPermissionsInterceptor.checkRequiredPermissions(RequiredPermissionsInterceptor.java:156) [rhq-server.jar:4.9.0]
        at org.rhq.enterprise.server.cloud.instance.CacheConsistencyManagerLocal$$$view16.reloadServerCacheIfNeeded(Unknown Source) [rhq-server.jar:4.9.0]
        at org.rhq.enterprise.server.cloud.instance.CacheConsistencyManagerBean.handleHeartbeatTimer(CacheConsistencyManagerBean.java:86) [rhq-server.jar:4.9.0]
        at org.rhq.enterprise.server.alert.AlertConditionManagerBean.getAlertConditionComposites(AlertConditionManagerBean.java:158) [rhq-server.jar:4.9.0]
        at org.rhq.enterprise.server.common.PerformanceMonitorInterceptor.monitorHibernatePerformance(PerformanceMonitorInterceptor.java:32) [rhq-server.jar:4.9.0]
        at org.rhq.enterprise.server.common.TransactionInterruptInterceptor.addCheckedActionToTransactionManager(TransactionInterruptInterceptor.java:78) [rhq-server.jar:4.9.0]
        at org.rhq.enterprise.server.authz.RequiredPermissionsInterceptor.checkRequiredPermissions(RequiredPermissionsInterceptor.java:156) [rhq-server.jar:4.9.0]
        at org.rhq.core.domain.util.JoinFetchReportingQueryTranslator.list(JoinFetchReportingQueryTranslator.java:84) [rhq-core-domain-ejb3.jar:4.9.0]

I was thinking it might be because of this query:


    @NamedQuery(name = AlertCondition.QUERY_BY_CATEGORY_AVAILABILITY, query = "" //
....
        + "           SELECT a.availabilityType " //
        + "             FROM Availability a " //
        + "             JOIN a.resource ar " //
        + "            WHERE ar = res " //
        + "              AND a.endTime IS NULL " //
....
public class AlertCondition implements Serializable {

The query that might be returning multiple rows, but I did not see this:

select resource_id from rhq_availability
where end_time is null
group by resource_id
HAVING ( COUNT(resource_id) > 1 )

... This is if rhq_availability has two availabilities for one resource.

This seems to have cleared itself up. Is it possible for this query to return multiple results?


Version-Release number of selected component (if applicable): 4.9


How reproducible: Unclear. Unknown.

Comment 1 Jay Shaughnessy 2014-01-08 21:41:52 UTC
That would seem like the most likely subquery although it should not happen that a resource has two null-end-time Availability entries.  Having said that, I won't say it's impossible, and as such I have code that corrects that situation if ever detected during availability report handling for that resource.

There has been some further work for 4.10 around cache reloading, although not specifically for an issue like this.

I suggest we monitor this one and see if it re-occurs in a 4.10+ release.

Comment 2 Elias Ross 2014-11-19 19:44:21 UTC
I've seen this again.

It seems to happen to a host (platform resource) due to an agent that was reimaged. It seemed to go away once the agent was reconnected.

Comment 3 Elias Ross 2014-11-19 20:32:49 UTC
Is it possible that the detection code simply fails to clean up certain entries like this, especially if the resource belongs to an agent that is in a bad state?

Comment 4 Jay Shaughnessy 2014-11-20 16:02:51 UTC
I think the code is cleaning things up appropriately but that there must be some time window leaving us vulnerable.  Platforms have various special handling and I suspect the window may happen, for example, during a backfill for an agent that was unexpectedly terminated (or reimaged).  If we get unlucky and there is a cache refresh during this window of having 2 entries with a null end time then we get this issue.

I'm not sure what the best approach is, trying to identify and close the window may be more difficult than maybe changing the query to ensure a single return value.  Unfortunately there is not a common syntax for LIMITing to 1 row, which means I think we'd have to remove this as a NamedQuery and construct it at runtime, with the proper vendor-specific syntax.

Comment 5 Michael Burman 2015-03-30 08:55:20 UTC
Prevent the error by modifying the query in master (not the window where this might happen):

commit 47e27a677fdfd230896afac74dc78c0241bdd66c
Author: Michael Burman <yak>
Date:   Mon Mar 30 11:53:12 2015 +0300

    [BZ 1045659] Add aggregate function max() to subquery to prevent loading more than one availability_type