Bug 586503 - Compatible group containing over 1000 resources causes ORA-01795: maximum number of expressions in a list is 1000
Compatible group containing over 1000 resources causes ORA-01795: maximum num...
Status: CLOSED CURRENTRELEASE
Product: RHQ Project
Classification: Other
Component: Database (Show other bugs)
1.3
All All
urgent Severity high (vote)
: ---
: ---
Assigned To: Joseph Marques
Heiko W. Rupp
:
Depends On:
Blocks: jon24-perf jon-sprint11-bugs 610179
  Show dependency treegraph
 
Reported: 2010-04-27 14:32 EDT by Larry O'Leary
Modified: 2010-10-23 11:07 EDT (History)
6 users (show)

See Also:
Fixed In Version: 2.4
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
JON 2.3.0 using Oracle 10g
Last Closed: 2010-08-12 12:54:38 EDT
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:


Attachments (Terms of Use)
Complete error and stack trace from RHQ UI and server log (151.14 KB, text/plain)
2010-04-27 14:33 EDT, Larry O'Leary
no flags Details
screenshot - compatible group containing over 1000 resources (116.36 KB, image/png)
2010-06-24 08:20 EDT, Rajan Timaniya
no flags Details

  None (edit)
Description Larry O'Leary 2010-04-27 14:32:31 EDT
Description of problem:
It appears that some group types will place group members of the group in an IN expression when executing certain queries at the database.  When using Oracle as the database for RHQ, this will result in the operation failing with ORA-01795: maximum number of expressions in a list is 1000.

Query execution should be configurable to limit the maximum number of items in the IN expression to something the database can handle and in such cases, RHQ should break the queries up to prevent exceeding the limit.

This issue will impact any user who currently has a resource group exceeding 1000 members or that could potentially have a resource DynaGroup that would exceed 1000 members.


Version-Release number of selected component (if applicable):
JON 2.3
Oracle 10g

How reproducible:
Always

Steps to Reproduce:
# Inventory base EAP 5 install into JON
# Execute createTestScripts.sh to generate 1000+ shell scripts
# Copy 1000+ test shell scripts into $JBOSS_HOME/bin
# Execute manual discovery scan
   - Resource -> Platforms >> <platform which EAP is on>
   - Operations -> New tab
   - Manual Autodiscovery
   - Detailed Discovery = y
   - Click Schedule button
# Wait for schedule discoveryscan to complete
# Verify that scripts were added to inventory
   - Resources -> Services
   - View: Script
   - Should be 1000+ total
# Create DynaGroup containing script resources
   - Groups -> New Group Definition
   - Name: loleary - 779613 - Scripts 
   - Recursive: Yes
   - Click Create button
   - Group Definition Conditions:   resource.type.plugin = JBossAS5
                                    resource.type.name = Script
   - Click OK button
   - Click Calculate Groups button
# Verify new group contains 1000+ children/descendants
# View the newly created group
   - Groups -> Compatible Groups
   - DynaGroup - loleary - 779613 - Scripts

  
Actual results:
An error occurs: /rhq/common/monitor/events/data.xhtml @51,56 test="#{not empty EventsTimelineUIBean.data}" Cant instantiate class: org.rhq.enterprise.gui.measurement.graphs.EventsTimelineUIBean.
With root cause: java.sql.SQLException: ORA-01795: maximum number of expressions in a list is 1000
	at oracle.jdbc.driver.SQLStateMapping.newSQLException(SQLStateMapping.java:74)
	at oracle.jdbc.driver.DatabaseError.newSQLException(DatabaseError.java:131)
	at oracle.jdbc.driver.DatabaseError.throwSqlException(DatabaseError.java:204)
	at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:455)
	at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:413)
	at oracle.jdbc.driver.T4C8Oall.receive(T4C8Oall.java:1034)
	at oracle.jdbc.driver.T4CPreparedStatement.doOall8(T4CPreparedStatement.java:194)
	at oracle.jdbc.driver.T4CPreparedStatement.executeForDescribe(T4CPreparedStatement.java:791)
	at oracle.jdbc.driver.T4CPreparedStatement.executeMaybeDescribe(T4CPreparedStatement.java:866)
	at oracle.jdbc.driver.OracleStatement.doExecuteWithTimeout(OracleStatement.java:1186)
	at oracle.jdbc.driver.OraclePreparedStatement.executeInternal(OraclePreparedStatement.java:3387)
	at oracle.jdbc.driver.OraclePreparedStatement.executeQuery(OraclePreparedStatement.java:3431)
	at oracle.jdbc.driver.OraclePreparedStatementWrapper.executeQuery(OraclePreparedStatementWrapper.java:1203)
	at org.jboss.resource.adapter.jdbc.CachedPreparedStatement.executeQuery(CachedPreparedStatement.java:90)
	at org.jboss.resource.adapter.jdbc.WrappedPreparedStatement.executeQuery(WrappedPreparedStatement.java:342)
	at org.hibernate.jdbc.AbstractBatcher.getResultSet(AbstractBatcher.java:187)
	at org.hibernate.loader.Loader.getResultSet(Loader.java:1787)
	at org.hibernate.loader.Loader.doQuery(Loader.java:674)
	at org.hibernate.loader.Loader.doQueryAndInitializeNonLazyCollections(Loader.java:236)
	at org.hibernate.loader.Loader.doList(Loader.java:2213)
	... 190 more

Expected results:
Ability to perform operations on a group that contains more then 1000 members
Comment 1 Larry O'Leary 2010-04-27 14:33:48 EDT
Created attachment 409549 [details]
Complete error and stack trace from RHQ UI and server log
Comment 2 Larry O'Leary 2010-04-27 14:43:52 EDT
Here is the script to generate the 1000+ test scripts for testing:

#!/bin/sh
#

mkdir -p testScripts
i=1

while [ $i -le 1005 ]; do
   fileName="testScript${i}.sh"
   echo "# This is an empty script" > testScripts/$fileName
   chmod +x testScripts/$fileName
   let 'i=i+1'
done
Comment 3 Larry O'Leary 2010-04-27 14:52:15 EDT
This exception also is seen in the logs and is associated with the following relevant error/stack:

ERROR [facelets.viewhandler] Error Rendering View[/rhq/group/monitor/graphs.xhtml]
   ...
	at org.rhq.enterprise.server.event.EventManagerBean.findEventsForResources(EventManagerBean.java:257)
	at org.rhq.enterprise.server.event.EventManagerBean.getSeverityBucketsForResources(EventManagerBean.java:413)
	at org.rhq.enterprise.server.event.EventManagerBean.getSeverityBucketsForCompGroup(EventManagerBean.java:392)


ERROR [org.rhq.enterprise.gui.measurement.graphs.IndicatorChartsUIBean] Error while looking up metric chart data for EntityContext[category=ResourceGroup,resourceId=-1,groupId=10578,parent=-1,type=-1]
   ...
	at org.rhq.enterprise.server.measurement.MeasurementChartsManagerBean.getAggregateMetricDisplaySummaries(MeasurementChartsManagerBean.java:454)
	at org.rhq.enterprise.server.measurement.MeasurementChartsManagerBean.getMetricDisplaySummariesForCompatibleGroup(MeasurementChartsManagerBean.java:120)
	at org.rhq.enterprise.server.measurement.MeasurementChartsManagerBean.getMetricDisplaySummariesForCompatibleGroup(MeasurementChartsManagerBean.java:179)
   ...
	at org.rhq.enterprise.server.measurement.MeasurementScheduleManagerBean.findSchedulesByResourcesAndDefinitions(MeasurementScheduleManagerBean.java:269)
	at org.rhq.enterprise.server.measurement.MeasurementScheduleManagerBean.findSchedulesByResourceIdsAndDefinitionId(MeasurementScheduleManagerBean.java:253)
Comment 5 Larry O'Leary 2010-04-30 11:49:19 EDT
Note that this issue is actually something that should be fixed in Hibernate (http://opensource.atlassian.com/projects/hibernate/browse/HHH-1123) but we may still have to implement a workaround in the event that the fix is too far off.
Comment 6 Joseph Marques 2010-06-16 01:39:37 EDT
commit c1a231d4fd8eac13c0d42903181b48db1747a2e9
Author: Joseph Marques <joseph@redhat.com>
Date:   Tue Jun 15 17:00:59 2010 -0400

    BZ-586503: fix for listing event histories for (auto)groups w/1000+ members

-----

Note, still need to handle several other cases before this bug can be verified:

* measurement schedules
* measurement data
* event severity buckets
Comment 7 Joseph Marques 2010-06-16 03:12:45 EDT
commit 3bb9ae8d8d7e387a5ec769041c367483284ea68b
Author: Joseph Marques <joseph@redhat.com>
Date:   Wed Jun 16 03:07:04 2010 -0400

    BZ-586503: fix compile error loading events in the timeline view
Comment 8 Joseph Marques 2010-06-16 03:18:59 EDT
commit 6cd39ac10e7c6f8ba3b63fc9dd31eef482c457f2
Author: Joseph Marques <joseph@redhat.com>
Date:   Wed Jun 16 03:07:45 2010 -0400

    BZ-586503: fix for retrieving event buckets for (auto)groups w/1000+ members
Comment 9 Joseph Marques 2010-06-16 08:53:17 EDT
commit ae1e5c433aa686093e90b93b952c3ca983ca0efe
Author: Joseph Marques <joseph@redhat.com>
Date:   Wed Jun 16 08:50:41 2010 -0400

    BZ-586503: restore EventManageRemote methods for compatibility with previous relases
Comment 10 Joseph Marques 2010-06-17 12:10:39 EDT
commit d351dac5aca18abd34afa677cf882e3d18278ff5
Author: Joseph Marques <joseph@redhat.com>
Date:   Thu Jun 17 12:07:08 2010 -0400

    BZ-586503: fix for listing MeasurementSchedules for (auto)groups w/1000+ members
    
    * monitor>schedules now works for arbitrarily-sized groups
    * also, add sorting to schedule displayName / dataType columns
Comment 11 Joseph Marques 2010-06-18 06:29:54 EDT
commit 38474143b77cce6ac141cc62407fc8845834bec3
Author: Joseph Marques <joseph@redhat.com>
Date:   Fri Jun 18 06:24:50 2010 -0400

    BZ-586503: fix for listing MeasurementDisplaySummaries for (auto)groups w/1000+ members
Comment 12 Joseph Marques 2010-06-18 13:01:14 EDT
commit f595c483fcd288a1f796822742ede7aff4fb9742
Author: Joseph Marques <joseph@redhat.com>
Date:   Fri Jun 18 12:52:41 2010 -0400

    BZ-586503: fix for listing MeasurementDataComposites for (auto)groups w/1000+ members
Comment 13 Charles Crouch 2010-06-18 13:44:48 EDT
Assigning this to Heiko, from a dev perspective, to have our perf environment *upgraded*, not installed fresh, to include the latest updates to the release branch, which will include this fix. This should cover general testing of how we work at scale.

At the same time QE should be able to test the specific scenarios related to this bug.
Comment 14 Rajan Timaniya 2010-06-24 07:53:43 EDT
Verified on JON 2.4 GA_QA (tag-jon-release: 42) with Oracle 10g
http://hudson-qe.rhq.rdu.redhat.com:8080/view/JON/job/tag-jon-release/42/

Steps:
# Inventory base EAP 5 install into JON
# Execute createTestScripts.sh to generate 1029 shell scripts
# Copy 1029 test shell scripts into $JBOSS_HOME/bin
# Execute manual discovery scan
   - Resource -> Platforms >> <platform which EAP is on>
   - Operations -> New tab
   - Manual Autodiscovery
   - Detailed Discovery = y
   - Click Schedule button
# Wait for schedule discoveryscan to complete
# Verify that scripts were added to inventory
   - Resources -> Services
   - View: Script
   - Should be 1029 total
# Create DynaGroup containing script resources
   - Groups -> New Group Definition
   - Name: rajangroup - 1001 - scripts 
   - Recursive: Yes
   - Click Create button
   - Group Definition Conditions:   resource.type.plugin = JBossAS5
                                    resource.type.name = Script
   - Click OK button
   - Click Calculate Groups button

Observation:
There is new group contains 1029 children/descendants
# View the newly created group
   - Groups -> Compatible Groups
   - DynaGroup - rajangroup - 1001 - scripts
Comment 15 Rajan Timaniya 2010-06-24 08:20:41 EDT
Created attachment 426547 [details]
screenshot - compatible group containing over 1000 resources
Comment 16 Joseph Marques 2010-06-24 08:45:53 EDT
Test procedures appear to be incomplete.  Rajan, I see that you were trying to follow Larry O'Leary's instructions (perhaps because he was the one that filed the bug?) but as you can see from the comments/commits I made to the case there are several other pages in the UI that must be verified.

Tue Jun 15 17:00:59 2010 -0400
fix for listing event histories for (auto)groups w/1000+ members

Wed Jun 16 03:07:45 2010 -0400
fix for retrieving event buckets for (auto)groups w/1000+ members   

Thu Jun 17 12:07:08 2010 -0400
fix for listing MeasurementSchedules for (auto)groups w/1000+ members

Fri Jun 18 06:24:50 2010 -0400
fix for listing MeasurementDisplaySummaries for (auto)groups w/1000+ members   

Fri Jun 18 12:52:41 2010 -0400
fix for listing MeasurementDataComposites for (auto)groups w/1000+ members

As you can see, this bug is much much large than just verifying that you can navigate to the inventory>overview sub-tab.  In actuality, here are the sub-tabs that are affected by the above comments/commits:

Event>History (for resource groups and auto-groups)
Monitor>Graphs (for resource groups and auto-groups)
Monitor>Tables (for resource groups and auto-groups)
Monitor>Schedules (for resource groups and auto-groups)

Most importantly, considering the magnitude of this rewrite, we absolutely need to verify that there are no functional regressions (of which he found 2 that I pushed fixes for shortly thereafter).  To that end, I've already been working with Heiko on this BZ, who is using the performance environment to test drive these changes.

Accordingly, I'm going to move the bug back into ON_QA until Heiko is satisfied with his testing.  I'm also going to update the ticket to reflect that dev is taking on the responsibility of testing this feature.
Comment 17 Heiko W. Rupp 2010-06-25 07:11:15 EDT
Display of Metrics (Graph, Tables, Schedules) works on comp. groups of 6700 resources and 5metrics each
Comment 18 John Mazzitelli 2010-07-01 13:38:20 EDT
see BZ #610179 for another test case that should be performed to see if this fixes it. specifically, a dynagroup should create a group that has alot of JBossAS 4 instances and then try to uninventory the platform.
Comment 19 Heiko W. Rupp 2010-07-02 08:37:20 EDT
Autogroup with 1200 resources works on Metrics->{Graphs,Tables,Schedules( display+set)} and Events work. And the tree also shows all 1200 services.

Unfortunately I don't have Event data in the system, but at least I see now blow-up when showing the Event tab, which triggers a search over the events of those 1200 resources.
Comment 20 Corey Welton 2010-07-07 23:27:32 EDT
Testing in the perf environment has been deemed successful for this bz.  Closing.
Comment 21 Corey Welton 2010-08-12 12:54:38 EDT
Mass-closure of verified bugs against JON.

Note You need to log in before you can comment on or make changes to this bug.