Bug 1244941 - Group inventory pages (compatible, mixed, all) fail to display groups due to UI timeouts
Summary: Group inventory pages (compatible, mixed, all) fail to display groups due to ...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: JBoss Operations Network
Classification: JBoss
Component: Database
Version: JON 3.3.2
Hardware: Unspecified
OS: Unspecified
unspecified
high
Target Milestone: ER01
: JON 3.3.4
Assignee: Libor Zoubek
QA Contact: Filip Brychta
URL:
Whiteboard:
Depends On: 1240854
Blocks:
TreeView+ depends on / blocked
 
Reported: 2015-07-20 19:58 UTC by Larry O'Leary
Modified: 2019-08-15 04:55 UTC (History)
4 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of: 1240854
Environment:
Last Closed: 2015-10-28 14:37:01 UTC
Type: Bug
Embargoed:


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Red Hat Bugzilla 1295863 0 urgent CLOSED The number of resources in All Groups/Compatible Groups page is not correct all the time 2021-02-22 00:41:40 UTC
Red Hat Product Errata RHSA-2015:1947 0 normal SHIPPED_LIVE Important: Red Hat JBoss Operations Network 3.3.4 update 2015-10-28 18:36:15 UTC

Internal Links: 1295863

Description Larry O'Leary 2015-07-20 19:58:37 UTC
+++ This bug was initially created as a clone of Bug #1240854 +++

Description of problem:
When a JBoss ON system has a large number of groups (100) which have a large number of explicit (100) and implicit (100) members with an inventory that contains 10s of 1000s of resources with a vast availability history, the resource group pages take several seconds to display and in extreme cases will not display at all.

Version-Release number of selected component (if applicable):
3.1.2

How reproducible:
Frequently

Steps to Reproduce:
1. Install, configure, and start JBoss ON 3.1.2 system.
2. Add to inventory 100 EAP 6 servers or other resources that will allow a topology of 100 resource groups made up of 100 resources each with 100 or more child servers/services.
3. Create 100 compatible and recursive groups containing 100 explicit members each.
4. Navigate to the _Compatible Groups Inventory_ page.

Actual results:
After several seconds a modal dialog appears containing the message "Server returned FAILURE with no error message." The message center contains the following warning:

    Failed to load group composite data. This occurred because the server is taking a long time to complete this request. Please be aware that the server may still be processing your request and it may complete shortly. You can check the server logs to see if any abnormal errors occurred.

Expected results:
The list of groups should appear after a couple of seconds.

Additional info:
This issue is a direct result of Hibernate generating a very poor native query:

    select * from ( 
      select (
        select count(resourceav3_.ID) from RHQ_RESOURCE_GROUP_RES_EXP_MAP explicitre1_, RHQ_RESOURCE resource2_ 
        inner join RHQ_RESOURCE_AVAIL resourceav3_ on resource2_.ID=resourceav3_.RESOURCE_ID 
        where resourcegr0_.ID=explicitre1_.RESOURCE_GROUP_ID and explicitre1_.RESOURCE_ID=resource2_.ID and resource2_.INVENTORY_STATUS='COMMITTED'
      ) as col_0_0_, (
        select count(resourceav6_.ID) from RHQ_RESOURCE_GROUP_RES_EXP_MAP explicitre4_, RHQ_RESOURCE resource5_ 
        inner join RHQ_RESOURCE_AVAIL resourceav6_ on resource5_.ID=resourceav6_.RESOURCE_ID 
        where resourcegr0_.ID=explicitre4_.RESOURCE_GROUP_ID and explicitre4_.RESOURCE_ID=resource5_.ID and resource5_.INVENTORY_STATUS='COMMITTED' and resourceav6_.AVAILABILITY_TYPE=0
      ) as col_1_0_, (
        select count(resourceav9_.ID) from RHQ_RESOURCE_GROUP_RES_EXP_MAP explicitre7_, RHQ_RESOURCE resource8_ 
        inner join RHQ_RESOURCE_AVAIL resourceav9_ on resource8_.ID=resourceav9_.RESOURCE_ID 
        where resourcegr0_.ID=explicitre7_.RESOURCE_GROUP_ID and explicitre7_.RESOURCE_ID=resource8_.ID and resource8_.INVENTORY_STATUS='COMMITTED' and resourceav9_.AVAILABILITY_TYPE=2
      ) as col_2_0_, (
        select count(resourceav12_.ID) from RHQ_RESOURCE_GROUP_RES_EXP_MAP explicitre10_, RHQ_RESOURCE resource11_ 
        inner join RHQ_RESOURCE_AVAIL resourceav12_ on resource11_.ID=resourceav12_.RESOURCE_ID 
        where resourcegr0_.ID=explicitre10_.RESOURCE_GROUP_ID and explicitre10_.RESOURCE_ID=resource11_.ID and resource11_.INVENTORY_STATUS='COMMITTED' and resourceav12_.AVAILABILITY_TYPE=3
      ) as col_3_0_, (
        select count(resourceav15_.ID) from RHQ_RESOURCE_GROUP_RES_IMP_MAP implicitre13_, RHQ_RESOURCE resource14_ 
        inner join RHQ_RESOURCE_AVAIL resourceav15_ on resource14_.ID=resourceav15_.RESOURCE_ID 
        where resourcegr0_.ID=implicitre13_.RESOURCE_GROUP_ID and implicitre13_.RESOURCE_ID=resource14_.ID and resource14_.INVENTORY_STATUS='COMMITTED'
      ) as col_4_0_, (
        select count(resourceav18_.ID) from RHQ_RESOURCE_GROUP_RES_IMP_MAP implicitre16_, RHQ_RESOURCE resource17_ 
        inner join RHQ_RESOURCE_AVAIL resourceav18_ on resource17_.ID=resourceav18_.RESOURCE_ID 
        where resourcegr0_.ID=implicitre16_.RESOURCE_GROUP_ID and implicitre16_.RESOURCE_ID=resource17_.ID and resource17_.INVENTORY_STATUS='COMMITTED' and resourceav18_.AVAILABILITY_TYPE=0
      ) as col_5_0_, (
        select count(resourceav21_.ID) from RHQ_RESOURCE_GROUP_RES_IMP_MAP implicitre19_, RHQ_RESOURCE resource20_ 
        inner join RHQ_RESOURCE_AVAIL resourceav21_ on resource20_.ID=resourceav21_.RESOURCE_ID 
        where resourcegr0_.ID=implicitre19_.RESOURCE_GROUP_ID and implicitre19_.RESOURCE_ID=resource20_.ID and resource20_.INVENTORY_STATUS='COMMITTED' and resourceav21_.AVAILABILITY_TYPE=2
      ) as col_6_0_, (
        select count(resourceav24_.ID) from RHQ_RESOURCE_GROUP_RES_IMP_MAP implicitre22_, RHQ_RESOURCE resource23_ 
        inner join RHQ_RESOURCE_AVAIL resourceav24_ on resource23_.ID=resourceav24_.RESOURCE_ID 
        where resourcegr0_.ID=implicitre22_.RESOURCE_GROUP_ID and implicitre22_.RESOURCE_ID=resource23_.ID and resource23_.INVENTORY_STATUS='COMMITTED' and resourceav24_.AVAILABILITY_TYPE=3
      ) as col_7_0_, resourcegr0_.ID as col_8_0_ 
      from RHQ_RESOURCE_GROUP resourcegr0_ 
      where resourcegr0_.CATEGORY='COMPATIBLE' and resourcegr0_.visible=1
    ) where rownum <= 50; 

Because each column is a sub-select in itself, it is like executing 8 queries per 50 rows. In my own testing the above query took 11.1 seconds to return results. In the user's case which reported this issue, their database has 100s of 1000s more resource and resource availability records then mine. In their case execution time was 29.7 seconds.

Using an optimized approach to generate an identical result set using a native query which will work with PostgreSQL and Oracle execution time was reduced. In my case the 50 rows were returned in 0.2 seconds and in the user's case it was 1.1 seconds. 

It should also be noted that using the optimized query, the number of items per page could be increased from 50 and yield similar query execution times.

The optimized query I tested with is:

    SELECT  explicit.totalCount AS col_0_0_, explicit.availDown AS col_1_0_, explicit.availUnknown AS col_2_0_, explicit.availDisabled AS col_3_0_, implicit.totalCount AS col_4_0_, implicit.availDown AS col_5_0_, implicit.availUnknown AS col_6_0_, implicit.availDisabled AS col_7_0_, explicit.resource_group_id AS col_8_0_
    FROM    (
                SELECT      COUNT(rhq_resource_avail.resource_id) AS totalCount, SUM(CASE WHEN rhq_resource.inventory_status = 'COMMITTED' AND rhq_resource_avail.availability_type = 0 THEN 1 ELSE 0 END) AS availDown, SUM(CASE WHEN rhq_resource.inventory_status = 'COMMITTED' AND rhq_resource_avail.availability_type = 2 THEN 1 ELSE 0 END) AS availUnknown, SUM(CASE WHEN rhq_resource.inventory_status = 'COMMITTED' AND rhq_resource_avail.availability_type = 3 THEN 1 ELSE 0 END) AS availDisabled, rhq_resource_group.id AS resource_group_id
                FROM        rhq_resource_group
                LEFT JOIN   rhq_resource_group_res_exp_map ON rhq_resource_group.id = rhq_resource_group_res_exp_map.resource_group_id
                LEFT JOIN   rhq_resource ON rhq_resource_group_res_exp_map.resource_id = rhq_resource.id
                LEFT JOIN   rhq_resource_avail ON rhq_resource.id = rhq_resource_avail.resource_id
                WHERE       rhq_resource_group.category = 'COMPATIBLE' AND rhq_resource_group.visible = 1
                GROUP BY    rhq_resource_group.id
            ) explicit
            LEFT JOIN 
            (
                SELECT      COUNT(rhq_resource_avail.resource_id) AS totalCount, SUM(CASE WHEN rhq_resource.inventory_status = 'COMMITTED' AND rhq_resource_avail.availability_type = 0 THEN 1 ELSE 0 END) AS availDown, SUM(CASE WHEN rhq_resource.inventory_status = 'COMMITTED' AND rhq_resource_avail.availability_type = 2 THEN 1 ELSE 0 END) AS availUnknown, SUM(CASE WHEN rhq_resource.inventory_status = 'COMMITTED' AND rhq_resource_avail.availability_type = 3 THEN 1 ELSE 0 END) AS availDisabled, rhq_resource_group.id AS resource_group_id
                FROM        rhq_resource_group
                LEFT JOIN   rhq_resource_group_res_imp_map ON rhq_resource_group.id = rhq_resource_group_res_imp_map.resource_group_id
                LEFT JOIN   rhq_resource ON rhq_resource_group_res_imp_map.resource_id = rhq_resource.id
                LEFT JOIN   rhq_resource_avail ON rhq_resource.id = rhq_resource_avail.resource_id
                WHERE       rhq_resource_group.category = 'COMPATIBLE' AND rhq_resource_group.visible = 1
                GROUP BY    rhq_resource_group.id
            ) implicit ON explicit.resource_group_id = implicit.resource_group_id
    WHERE   rownum <= 50;

Please note that the projected column names were used to match that of the Hibernate native query. This was for testing and comparison purposes.

--- Additional comment from Larry O'Leary on 2015-07-07 19:14:56 EDT ---

This seems to be related/regression of some upstream bug reports that discuss the same issue.

Bug 865698 - Usage of implicitResources in ClusterManagerBean#buildQuery causes performance issue
Bug 534293 - (RHQ-1103) slowness on resource group browser
Bug 535237 - (RHQ-1955) group rendering is slow
Bug 720497 - w/ lots of compat groups defined, scrolling quickly to bottom of table on #Inventory/Groups/AllGroups view, causes gwt RequestTimeoutException
    

However, many of these appear to have resulted in some performance optimization or discussion of such optimizations. I would therefore think this is either a regression or the upstream bugs were closed without actually resulting in any changes.

--- Additional comment from Jay Shaughnessy on 2015-07-08 09:43:58 EDT ---

The code for ResourceGroupManagerBean.findResourceGroupCompositesByCriteria() is specialized in various ways to return the composite data.  It's likely these variations are causing an inefficiency.  We likely need to carefully examine the criteria query being generated and see if we can optimize.  I haven't studied Larry's optimized query above, but if it's a dedicated query just to solve this use case it would require a new call chain, because the current logic utilizes the general-purpose criteria call.

As far as I can tell this code is not changed between 3.1.x and 3.3.x.  Let me know if you want me to look into this, it's probably code I wrote...

--- Additional comment from Libor Zoubek on 2015-07-08 11:03:21 EDT ---

Yeah, it is pretty tricky, because the code combines ResourceGroupCriteria together with JPA queries.

I am working on rewriting the original JPA subquery to what Larry suggests. (not sure if it's doable, since I did not get to valid query yet and JPA may not support CASE statements inside SUM statements.

--- Additional comment from Libor Zoubek on 2015-07-14 03:40:47 EDT ---

branch:  master
link:    https://github.com/rhq-project/rhq/commit/5cc77db6b
time:    2015-07-09 14:34:39 +0200
commit:  5cc77db6b8ec6873a95d596d04376f4dccf08b32
author:  Libor Zoubek - lzoubek
message: Bug 1240854 - Group inventory pages (compatible, mixed, all) fail to
         display groups due to UI timeouts

         Replace subqueries by JOINs. Due to JPA limitation of
         subqueries (One cannot write select x from (select y from z)).
         One query has been replaced with 2
         (or 3) queries which we have to join in memory. This should not
         be a perf issue, since we're joining composite instance sets
         (max count of such set equals to total resource group count in
         DB). A case when 2 (or 3) subsequent queries could return sets
         of different sizes (due to underlying group deletion/addition)
         is handled as well.


branch:  master
link:    https://github.com/rhq-project/rhq/commit/0f6a0f949
time:    2015-07-13 16:17:30 +0200
commit:  0f6a0f9498139d0baa49f4cd830204a1b80138b8
author:  Libor Zoubek - lzoubek
message: Bug 1240854 - Group inventory pages (compatible, mixed, all) fail to
         display
         Fix counting total resource count (only count COMMITTED
         resources)

--- Additional comment from Jay Shaughnessy on 2015-07-14 09:35:31 EDT ---


Just mentioning that I reviewed Libor's solution and I could not find any further optimizations.

Comment 3 Libor Zoubek 2015-09-30 20:18:13 UTC
in 3.3.x branch


commit d822e24d62af87e5a201194f7ffa4cf854e080f6
Author: Libor Zoubek <lzoubek>
Date:   Fri Sep 25 16:15:59 2015 +0200

    Bug 1240854 - Group inventory pages (compatible, mixed, all) fail to

    Update copyright
    (cherry picked from commit 8ebc027a5fbc93902f1ff1c4cea4b30a9740deaa)

    Signed-off-by: Libor Zoubek <lzoubek>

commit 09ae6ab2dee9ed9f3eaaf6d7115173712c68e71f
Author: Libor Zoubek <lzoubek>
Date:   Tue Sep 22 10:49:14 2015 +0200

    Bug 1240854 - Group inventory pages (compatible, mixed, all) fail to display
    groups due to UI timeouts

    Load group facets to the right ResouceGroupComposite instances

    (cherry picked from commit 6e881b44f77beacaa83a1b9154c7439d417fec15)
    Signed-off-by: Libor Zoubek <lzoubek>

commit bec870df3a5f3dfeba1038108930b3a6befbd266
Author: Libor Zoubek <lzoubek>
Date:   Thu Sep 17 10:51:02 2015 +0200

    Bug 1240854 - Group inventory pages (compatible, mixed, all) fail to …
    …display

    Use non-generic (JPA 1.0 compatible) way to create queries

    (cherry picked from commit fb177ff5b354af14fc05d13b165c5df953070cf0)
    Signed-off-by: Libor Zoubek <lzoubek>

commit ddc8ac6f6489fd9fb82fb247fe2f7d82adac3d0a
Author: Libor Zoubek <lzoubek>
Date:   Tue Sep 15 14:42:16 2015 +0200

    Bug 1240854 - Group inventory pages (compatible, mixed, all) fail to display

    Re-write to make queries even faster. We select only groups based on initial
    criteria, then we select composites (implicit/explicit) but only by group
    ids from first query. This should be faster, because database has to order
    results only once - when selecting plain groups. Expensive join/composite
    queries can be unordered.

    (cherry picked from commit 62f0e6bf27f0ee6aeb09a1581795ba15c80f20c1)
    Signed-off-by: Libor Zoubek <lzoubek>

commit cc9f5c0d50f5977a792d1aa23b692e00ef1957c1
Author: Libor Zoubek <lzoubek>
Date:   Thu Aug 27 11:52:17 2015 +0200

    Bug 1240854 - Group inventory pages (compatible, mixed, all) fail to display
    groups due to UI timeouts

    Fix generated GROUP BY clauses. On oracle it is not enough to use group by
    resourcegroup, but it need also the field name (so it has to look like
    'resourcegroup.id'). Added various tests for
    findResourceGroupCompositesByCriteria

    (cherry picked from commit 03d63872248b6eb07173cdb02f2a546bbb02ab9f)
    Signed-off-by: Libor Zoubek <lzoubek>

commit a44c9a11f2f240610fd51c54a1469db300b9d6f5
Author: Libor Zoubek <lzoubek>
Date:   Tue Aug 18 23:17:48 2015 +0200

    Bug 1240854 - Group inventory pages (compatible, mixed, all) fail to display
    groups due to UI timeouts

    CriteriaQueryGenerator is now able to write group by aliases, based on
    ordering fields. In a specific case, when we alter projection, override FROM
    clause (whith several JOINs) override GROUP BY clause the resulting query
    must be grouped by ordering fields - this fix in CriteriaQueryGenerator adds
    a new flag, which enables this feature - so default behaviour does not
    change. This fixes issue with ordering of ResourceGroupComposites based on
    criteria.

    (cherry picked from commit e835d8957e5d0a61d85d67f9755cf56c0f65c655)
    Signed-off-by: Libor Zoubek <lzoubek>

commit b6639167915dbeb6eb040a475d59b9416100b985
Author: Libor Zoubek <lzoubek>
Date:   Tue Aug 18 23:15:48 2015 +0200

    Bug 1240854 - Group inventory pages (compatible, mixed, all) fail to display
    groups due to UI timeouts

    use left (outer) joins, so we don't miss empty groups

    (cherry picked from commit ee0902a5508e16363bbe6b482e423d109f4a0738)
    Signed-off-by: Libor Zoubek <lzoubek>

commit e3fc0ba6e615661feec26a65599479cbb5f0e2ef
Author: Libor Zoubek <lzoubek>
Date:   Fri Jul 31 19:28:39 2015 +0200

    Bug 1240854 - Group inventory pages (compatible, mixed, all) fail to display

    PageList which contains in memory joined results *must* contain paging info
    taken from one of original pagelists

    (cherry picked from commit abfcbdbdcc0122cfef82a78aea213a5c696195c5)
    Signed-off-by: Libor Zoubek <lzoubek>

commit 152a1085b301f444e6795a69bed59290fbfc80f5
Author: Libor Zoubek <lzoubek>
Date:   Mon Jul 13 16:17:30 2015 +0200

    Bug 1240854 - Group inventory pages (compatible, mixed, all) fail to display

    Fix counting total resource count (only count COMMITTED resources)

    (cherry picked from commit 0f6a0f9498139d0baa49f4cd830204a1b80138b8)
    Signed-off-by: Libor Zoubek <lzoubek>

commit d013835ff2fad09c548a61df04bd0de386890090
Author: Libor Zoubek <lzoubek>
Date:   Thu Jul 9 14:34:39 2015 +0200

    Bug 1240854 - Group inventory pages (compatible, mixed, all) fail to display
    groups due to UI timeouts

    Replace subqueries by JOINs. Due to JPA limitation of subqueries (One cannot
    write select x from (select y from z)). One query has been replaced with 2
    (or 3) queries which we have to join in memory. This should not be a perf
    issue, since we're joining composite instance sets (max count of such set
    equals to total resource group count in DB). A case when 2 (or 3) subsequent
    queries could return sets of different sizes (due to underlying group
    deletion/addition) is handled as well.

    (cherry picked from commit 5cc77db6b8ec6873a95d596d04376f4dccf08b32)
    Signed-off-by: Libor Zoubek <lzoubek>

commit 5419181f4f88e6e4b4dd4de492598f4acf43c5a0
Author: Michael Burman <miburman>
Date:   Tue Sep 29 18:49:42 2015 +0300

    [BZ 1258870] Expose purgeEventForContext and deleteEventsForContext to remote API

    (cherry picked from commit fc3849c64739399d95c76c01316775e2b7b044f2)
    Signed-off-by: Libor Zoubek <lzoubek>

commit ee4afd78df30af016539b925de06179827c40773
Author: Libor Zoubek <lzoubek>
Date:   Tue Sep 8 16:04:11 2015 +0200

    Bug 1234912 - Do not authenticate against new storage node when
    replication_factor of system_auth keyspace is wrong

    taken from one of original pagelists

    (cherry picked from commit abfcbdbdcc0122cfef82a78aea213a5c696195c5)
    Signed-off-by: Libor Zoubek <lzoubek>

commit 152a1085b301f444e6795a69bed59290fbfc80f5
Author: Libor Zoubek <lzoubek>
Date:   Mon Jul 13 16:17:30 2015 +0200

    Bug 1240854 - Group inventory pages (compatible, mixed, all) fail to display

    Fix counting total resource count (only count COMMITTED resources)

    (cherry picked from commit 0f6a0f9498139d0baa49f4cd830204a1b80138b8)
    Signed-off-by: Libor Zoubek <lzoubek>

commit d013835ff2fad09c548a61df04bd0de386890090
Author: Libor Zoubek <lzoubek>
Date:   Thu Jul 9 14:34:39 2015 +0200

    Bug 1240854 - Group inventory pages (compatible, mixed, all) fail to display
    groups due to UI timeouts

    Replace subqueries by JOINs. Due to JPA limitation of subqueries (One cannot
    write select x from (select y from z)). One query has been replaced with 2
    (or 3) queries which we have to join in memory. This should not be a perf
    issue, since we're joining composite instance sets (max count of such set
    equals to total resource group count in DB). A case when 2 (or 3) subsequent
    queries could return sets of different sizes (due to underlying group
    deletion/addition) is handled as well.

    (cherry picked from commit 5cc77db6b8ec6873a95d596d04376f4dccf08b32)
    Signed-off-by: Libor Zoubek <lzoubek>

Comment 4 Simeon Pinder 2015-10-09 04:40:20 UTC
Moving to ON_QA as available to test with the following build:
https://brewweb.devel.redhat.com/buildinfo?buildID=460382

 *Note: jon-server-patch-3.3.0.GA.zip maps to ER01 build of
 jon-server-3.3.0.GA-update-04.zip.

Comment 5 Filip Brychta 2015-10-14 12:03:17 UTC
Verified on:
Version :	
3.3.0.GA Update 04
Build Number :	
821a526:fa7b1a1

Comment 7 errata-xmlrpc 2015-10-28 14:37:01 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2015-1947.html


Note You need to log in before you can comment on or make changes to this bug.