Description of problem: It appears that some group types will place group members of the group in an IN expression when executing certain queries at the database. When using Oracle as the database for RHQ, this will result in the operation failing with ORA-01795: maximum number of expressions in a list is 1000. Query execution should be configurable to limit the maximum number of items in the IN expression to something the database can handle and in such cases, RHQ should break the queries up to prevent exceeding the limit. This issue will impact any user who currently has a resource group exceeding 1000 members or that could potentially have a resource DynaGroup that would exceed 1000 members. Version-Release number of selected component (if applicable): JON 2.3 Oracle 10g How reproducible: Always Steps to Reproduce: # Inventory base EAP 5 install into JON # Execute createTestScripts.sh to generate 1000+ shell scripts # Copy 1000+ test shell scripts into $JBOSS_HOME/bin # Execute manual discovery scan - Resource -> Platforms >> <platform which EAP is on> - Operations -> New tab - Manual Autodiscovery - Detailed Discovery = y - Click Schedule button # Wait for schedule discoveryscan to complete # Verify that scripts were added to inventory - Resources -> Services - View: Script - Should be 1000+ total # Create DynaGroup containing script resources - Groups -> New Group Definition - Name: loleary - 779613 - Scripts - Recursive: Yes - Click Create button - Group Definition Conditions: resource.type.plugin = JBossAS5 resource.type.name = Script - Click OK button - Click Calculate Groups button # Verify new group contains 1000+ children/descendants # View the newly created group - Groups -> Compatible Groups - DynaGroup - loleary - 779613 - Scripts Actual results: An error occurs: /rhq/common/monitor/events/data.xhtml @51,56 test="#{not empty EventsTimelineUIBean.data}" Cant instantiate class: org.rhq.enterprise.gui.measurement.graphs.EventsTimelineUIBean. With root cause: java.sql.SQLException: ORA-01795: maximum number of expressions in a list is 1000 at oracle.jdbc.driver.SQLStateMapping.newSQLException(SQLStateMapping.java:74) at oracle.jdbc.driver.DatabaseError.newSQLException(DatabaseError.java:131) at oracle.jdbc.driver.DatabaseError.throwSqlException(DatabaseError.java:204) at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:455) at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:413) at oracle.jdbc.driver.T4C8Oall.receive(T4C8Oall.java:1034) at oracle.jdbc.driver.T4CPreparedStatement.doOall8(T4CPreparedStatement.java:194) at oracle.jdbc.driver.T4CPreparedStatement.executeForDescribe(T4CPreparedStatement.java:791) at oracle.jdbc.driver.T4CPreparedStatement.executeMaybeDescribe(T4CPreparedStatement.java:866) at oracle.jdbc.driver.OracleStatement.doExecuteWithTimeout(OracleStatement.java:1186) at oracle.jdbc.driver.OraclePreparedStatement.executeInternal(OraclePreparedStatement.java:3387) at oracle.jdbc.driver.OraclePreparedStatement.executeQuery(OraclePreparedStatement.java:3431) at oracle.jdbc.driver.OraclePreparedStatementWrapper.executeQuery(OraclePreparedStatementWrapper.java:1203) at org.jboss.resource.adapter.jdbc.CachedPreparedStatement.executeQuery(CachedPreparedStatement.java:90) at org.jboss.resource.adapter.jdbc.WrappedPreparedStatement.executeQuery(WrappedPreparedStatement.java:342) at org.hibernate.jdbc.AbstractBatcher.getResultSet(AbstractBatcher.java:187) at org.hibernate.loader.Loader.getResultSet(Loader.java:1787) at org.hibernate.loader.Loader.doQuery(Loader.java:674) at org.hibernate.loader.Loader.doQueryAndInitializeNonLazyCollections(Loader.java:236) at org.hibernate.loader.Loader.doList(Loader.java:2213) ... 190 more Expected results: Ability to perform operations on a group that contains more then 1000 members
Created attachment 409549 [details] Complete error and stack trace from RHQ UI and server log
Here is the script to generate the 1000+ test scripts for testing: #!/bin/sh # mkdir -p testScripts i=1 while [ $i -le 1005 ]; do fileName="testScript${i}.sh" echo "# This is an empty script" > testScripts/$fileName chmod +x testScripts/$fileName let 'i=i+1' done
This exception also is seen in the logs and is associated with the following relevant error/stack: ERROR [facelets.viewhandler] Error Rendering View[/rhq/group/monitor/graphs.xhtml] ... at org.rhq.enterprise.server.event.EventManagerBean.findEventsForResources(EventManagerBean.java:257) at org.rhq.enterprise.server.event.EventManagerBean.getSeverityBucketsForResources(EventManagerBean.java:413) at org.rhq.enterprise.server.event.EventManagerBean.getSeverityBucketsForCompGroup(EventManagerBean.java:392) ERROR [org.rhq.enterprise.gui.measurement.graphs.IndicatorChartsUIBean] Error while looking up metric chart data for EntityContext[category=ResourceGroup,resourceId=-1,groupId=10578,parent=-1,type=-1] ... at org.rhq.enterprise.server.measurement.MeasurementChartsManagerBean.getAggregateMetricDisplaySummaries(MeasurementChartsManagerBean.java:454) at org.rhq.enterprise.server.measurement.MeasurementChartsManagerBean.getMetricDisplaySummariesForCompatibleGroup(MeasurementChartsManagerBean.java:120) at org.rhq.enterprise.server.measurement.MeasurementChartsManagerBean.getMetricDisplaySummariesForCompatibleGroup(MeasurementChartsManagerBean.java:179) ... at org.rhq.enterprise.server.measurement.MeasurementScheduleManagerBean.findSchedulesByResourcesAndDefinitions(MeasurementScheduleManagerBean.java:269) at org.rhq.enterprise.server.measurement.MeasurementScheduleManagerBean.findSchedulesByResourceIdsAndDefinitionId(MeasurementScheduleManagerBean.java:253)
Note that this issue is actually something that should be fixed in Hibernate (http://opensource.atlassian.com/projects/hibernate/browse/HHH-1123) but we may still have to implement a workaround in the event that the fix is too far off.
commit c1a231d4fd8eac13c0d42903181b48db1747a2e9 Author: Joseph Marques <joseph> Date: Tue Jun 15 17:00:59 2010 -0400 BZ-586503: fix for listing event histories for (auto)groups w/1000+ members ----- Note, still need to handle several other cases before this bug can be verified: * measurement schedules * measurement data * event severity buckets
commit 3bb9ae8d8d7e387a5ec769041c367483284ea68b Author: Joseph Marques <joseph> Date: Wed Jun 16 03:07:04 2010 -0400 BZ-586503: fix compile error loading events in the timeline view
commit 6cd39ac10e7c6f8ba3b63fc9dd31eef482c457f2 Author: Joseph Marques <joseph> Date: Wed Jun 16 03:07:45 2010 -0400 BZ-586503: fix for retrieving event buckets for (auto)groups w/1000+ members
commit ae1e5c433aa686093e90b93b952c3ca983ca0efe Author: Joseph Marques <joseph> Date: Wed Jun 16 08:50:41 2010 -0400 BZ-586503: restore EventManageRemote methods for compatibility with previous relases
commit d351dac5aca18abd34afa677cf882e3d18278ff5 Author: Joseph Marques <joseph> Date: Thu Jun 17 12:07:08 2010 -0400 BZ-586503: fix for listing MeasurementSchedules for (auto)groups w/1000+ members * monitor>schedules now works for arbitrarily-sized groups * also, add sorting to schedule displayName / dataType columns
commit 38474143b77cce6ac141cc62407fc8845834bec3 Author: Joseph Marques <joseph> Date: Fri Jun 18 06:24:50 2010 -0400 BZ-586503: fix for listing MeasurementDisplaySummaries for (auto)groups w/1000+ members
commit f595c483fcd288a1f796822742ede7aff4fb9742 Author: Joseph Marques <joseph> Date: Fri Jun 18 12:52:41 2010 -0400 BZ-586503: fix for listing MeasurementDataComposites for (auto)groups w/1000+ members
Assigning this to Heiko, from a dev perspective, to have our perf environment *upgraded*, not installed fresh, to include the latest updates to the release branch, which will include this fix. This should cover general testing of how we work at scale. At the same time QE should be able to test the specific scenarios related to this bug.
Verified on JON 2.4 GA_QA (tag-jon-release: 42) with Oracle 10g http://hudson-qe.rhq.rdu.redhat.com:8080/view/JON/job/tag-jon-release/42/ Steps: # Inventory base EAP 5 install into JON # Execute createTestScripts.sh to generate 1029 shell scripts # Copy 1029 test shell scripts into $JBOSS_HOME/bin # Execute manual discovery scan - Resource -> Platforms >> <platform which EAP is on> - Operations -> New tab - Manual Autodiscovery - Detailed Discovery = y - Click Schedule button # Wait for schedule discoveryscan to complete # Verify that scripts were added to inventory - Resources -> Services - View: Script - Should be 1029 total # Create DynaGroup containing script resources - Groups -> New Group Definition - Name: rajangroup - 1001 - scripts - Recursive: Yes - Click Create button - Group Definition Conditions: resource.type.plugin = JBossAS5 resource.type.name = Script - Click OK button - Click Calculate Groups button Observation: There is new group contains 1029 children/descendants # View the newly created group - Groups -> Compatible Groups - DynaGroup - rajangroup - 1001 - scripts
Created attachment 426547 [details] screenshot - compatible group containing over 1000 resources
Test procedures appear to be incomplete. Rajan, I see that you were trying to follow Larry O'Leary's instructions (perhaps because he was the one that filed the bug?) but as you can see from the comments/commits I made to the case there are several other pages in the UI that must be verified. Tue Jun 15 17:00:59 2010 -0400 fix for listing event histories for (auto)groups w/1000+ members Wed Jun 16 03:07:45 2010 -0400 fix for retrieving event buckets for (auto)groups w/1000+ members Thu Jun 17 12:07:08 2010 -0400 fix for listing MeasurementSchedules for (auto)groups w/1000+ members Fri Jun 18 06:24:50 2010 -0400 fix for listing MeasurementDisplaySummaries for (auto)groups w/1000+ members Fri Jun 18 12:52:41 2010 -0400 fix for listing MeasurementDataComposites for (auto)groups w/1000+ members As you can see, this bug is much much large than just verifying that you can navigate to the inventory>overview sub-tab. In actuality, here are the sub-tabs that are affected by the above comments/commits: Event>History (for resource groups and auto-groups) Monitor>Graphs (for resource groups and auto-groups) Monitor>Tables (for resource groups and auto-groups) Monitor>Schedules (for resource groups and auto-groups) Most importantly, considering the magnitude of this rewrite, we absolutely need to verify that there are no functional regressions (of which he found 2 that I pushed fixes for shortly thereafter). To that end, I've already been working with Heiko on this BZ, who is using the performance environment to test drive these changes. Accordingly, I'm going to move the bug back into ON_QA until Heiko is satisfied with his testing. I'm also going to update the ticket to reflect that dev is taking on the responsibility of testing this feature.
Display of Metrics (Graph, Tables, Schedules) works on comp. groups of 6700 resources and 5metrics each
see BZ #610179 for another test case that should be performed to see if this fixes it. specifically, a dynagroup should create a group that has alot of JBossAS 4 instances and then try to uninventory the platform.
Autogroup with 1200 resources works on Metrics->{Graphs,Tables,Schedules( display+set)} and Events work. And the tree also shows all 1200 services. Unfortunately I don't have Event data in the system, but at least I see now blow-up when showing the Event tab, which triggers a search over the events of those 1200 resources.
Testing in the perf environment has been deemed successful for this bz. Closing.
Mass-closure of verified bugs against JON.