Red Hat Bugzilla – Bug 784571
Resource tree not complete when more than 200 children
Last modified: 2013-09-03 11:19:22 EDT
Description of problem:
UI Resource tree is not complete when there are more than 200 child resources for a parent resource.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. deploy many applications in the monitored JBossAS resource (in order to have more than 200 resources)
2. open the UI Resource Tree and expand JBossAS node
Only a part of the child resources are displayed. Sometimes, at refresh, different child resources are displayed.
The complete resource tree should be displayed
I think the problem is that the resources are retrieved from the backend in a PageList, but not all pages are fetched.
See the class :
cause: ResourceCriteria uses by default a PageControl(0, 200),
solution 1. disable paging: criteria.clearPaging(),
solution 2. process all result pages, not only the first one.
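A minimal sketch of solution 2, for illustration only. It assumes a hypothetical `fetch_page(page_number, page_size)` callable standing in for the backend query; it is not the RHQ API, and Python is used only to keep the example short. The loop keeps requesting pages until a short (or empty) page signals the end:

```python
def fetch_all_children(fetch_page, page_size=200):
    """Collect every page of results instead of only the first one.

    fetch_page is a hypothetical callable (page_number, page_size) -> list
    that stands in for the paged backend query.
    """
    if page_size <= 0:
        raise ValueError("page_size must be positive")

    results = []
    page_number = 0
    while True:
        page = fetch_page(page_number, page_size)
        results.extend(page)
        # A page shorter than page_size means there is nothing left to fetch.
        if len(page) < page_size:
            break
        page_number += 1
    return results


# Usage with an in-memory stand-in for the backend (450 child resources).
all_rows = [f"resource-{i:04d}" for i in range(450)]
fetched = fetch_all_children(lambda p, s: all_rows[p * s:(p + 1) * s])
print(len(fetched))
```

Note that this trades the incomplete tree for an unbounded fetch, which is the concern raised in the later comments.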
We should see if we can pick this up in the new smartgwt work
I think it may actually make sense to limit the number of children that are displayed in the tree. If a Resource had 1000 children, adding them all to the tree would not be very usable, since unlike listgrids, the tree provides no filtering, sorting, or paging. And I think 200 seems like a pretty reasonable maximum. What do other people think?
Costel, note that to see all of the child Resources, you can always just go to the parent Resource's Inventory > Children tab.
[master http://git.fedorahosted.org/git/?p=rhq/rhq.git;a=commitdiff;h=f879a71] adds a Child Resources item to the context menu of Resource nodes that provides a slightly quicker way to get to a Resource's Inventory > Children tab. Costel, hopefully this will make things a little easier for you.
It is frustrating for a user when he doesn't see all the resources, and it is confusing when he refreshes and sees a different set of resources than the first time. He cannot know about the limitation of 200 children and he thinks it is a bug, just like we did :). At least an explanatory message should be displayed in this case.
About the possibility to see all children in the Children tab: indeed they are all there, but they are all together in a bunch and not arranged in any way. EJBs, WARs, EARs, etc. are all mixed up and it is difficult to find a particular resource.
Maybe here the resources could be grouped by type, just like they are in the tree browser.
Yeah, I hear you. I will look into why a different set of children are coming back on subsequent visits; I suspect we may just not be specifying any sorting on the underlying datasource fetch.
I agree a message informing the user when not all children are being shown would be helpful. I'm just not sure where the message could be displayed; do you have any ideas?
We can't use grouping on the Inventory > Children, because grouping can't be used for SmartGWT grids backed by a paged server-side datasource. SmartGWT grouping is only designed to work with grids with a relatively small fixed-size data set.
However, you should be able to use the search bar to easily filter the results by type. For example, entering "type=WAR" will list all children that are WARs.
I think the message could be displayed on the top banner, where warn and error messages are usually displayed.
JON 3.1.1 ER1 build is available. Moving to ON_QA.
Moving back to ON_DEV. This was incorrectly moved to ON_QA in comment 8.
This is a sort of tricky problem and we already have some inconsistency in the way the tree is handled. For example, if you navigate directly to a resource, via a link in another view or a bookmark, we *always* display all siblings. So, for the issue above, if you really wanted to see more than 200 you could navigate to any one of them directly and they will all show up.
The problem mentioned occurs only when expanding a node to see children. In that case we do limit to 200 and we don't perform server-side sort.
The latter can be easily fixed, so that you'd get the same 200 each time. But I think we need a more consistent solution here. I can think of two:
First, we could simply return all of the children all of the time. This may actually be OK as it is unlikely in general to have a lot of children. The obvious downside is the case when there are a lot of children and then the tree fetch/build will take a lot of time and expand vertically to a great degree. Note that with direct navigation, as mentioned above, we can currently already get into this situation (and no one has really complained).
The second idea would be to introduce "paging" nodes in the tree for when there are a large number of siblings. In this case we'd likely fetch them all but instead of rendering each node we'd introduce a number of nodes that basically "page" the data, or represent ranges. For example, if we set the tree paging at 100 then we'd have a paging node for each 100 and one for the remainder, if necessary.
The advantage here would be much less vertical expansion (not likely any less work). Also, it should be more consistent in the UI, the same for direct navigation and node expansion. The disadvantage is that it's more work and adds more nesting to avoid vertical expansion.
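To make the "paging" node idea concrete, here is a rough sketch under stated assumptions: children are plain name strings, and `build_paging_nodes` and the range-label format are hypothetical, not anything in the RHQ tree code. It splits the sorted children into range nodes of at most `page_size` members, with the last node holding the remainder:

```python
def build_paging_nodes(children, page_size=100):
    """Group sorted children into (label, members) range nodes.

    Hypothetical helper: each node covers up to page_size siblings and is
    labelled with the first and last name it contains.
    """
    if page_size <= 0:
        raise ValueError("page_size must be positive")

    ordered = sorted(children)
    nodes = []
    for start in range(0, len(ordered), page_size):
        members = ordered[start:start + page_size]
        label = f"{members[0]} .. {members[-1]}"
        nodes.append((label, members))
    return nodes


# 250 children with a tree paging size of 100: two full nodes and a remainder.
children = [f"app-{i:04d}" for i in range(250)]
for label, members in build_paging_nodes(children, 100):
    print(label, len(members))
```

All children are still fetched here; only the rendering is grouped, which matches the "not likely any less work" caveat.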
Returning all the children all the time is not a bad idea. We run the RHQ 4.4 server patched to do exactly this and it is OK. We monitor JBoss instances with a lot of children and it takes a few seconds to load all of them, but this is acceptable (almost; a "please wait, loading..." message would be nice). Also, vertical expansion is not a big problem.
What about lazy loading of the node's children ? I mean when you click on a node, only the first level of children will be loaded.
I think all-children-all-time with lazy loading would be the perfect solution.
release/jon3.1.x commit 21458aefaa3bd0ec66fb53d670af38505d6512b8
[Bug 847014 - Resource tree not complete when more than 200 children]
Allow an unlimited number of children when expanding the tree node. Although
this may create a large vertical expansion, it may also be what the
user wants. See BZ comments for more on the approach and alternatives
but basically, take this [easy] approach until it's clear that users, or UX,
want something else.
Cherry-pick of master dd7803912491f2c610baedb0023a9c1c2cc142a4
"What about lazy loading of the node's children ? I mean when you click on a node, only the first level of children will be loaded."
Actually, this is what happens, we don't fetch more than one level below the expanded parent resource when expanding nodes manually. By "all of the children" I actually did mean all of the immediate children, not grandchildren etc...
I totally agree with the request for a "loading..." message while the tree is rendering/updating. So I added:
Bug 847138 - RFE: Add indication that resource tree is loading or expanding
And I updated master with initial support for this. It will need to be reviewed to see if people like it.
As Jay describes in https://bugzilla.redhat.com/show_bug.cgi?id=784571#c10 this is a non-trivial problem. The concern I have with the latest changes is that we are exchanging a sub-optimal (but not terrible) solution for one with the possibility of unknown (therefore potentially very large) consequences.
My main concern is around the behaviour of the tree in the presence of a very large number of resources (>>200). For example, I'm aware of customers that have had 3000+ EJBs deployed on a single (massive) JBAS instance. Those would all be rendered under a single node, under all circumstances, with the latest implementation. [I realise that direct navigation to a particular resource would trigger this issue today, but I believe we are significantly increasing the likelihood of someone hitting this in the new implementation]. The unknown consequences that I was referring to above are things like excessively long load times, higher browser mem/cpu usage, high server-side mem/cpu usage...
Given the unknowns, I think it makes sense to reinstate the 200-child limit when navigating the tree while we try to determine a better solution, i.e. a solution that addresses both tree browsing and direct navigation to a resource. Bigger picture, I'm skeptical of a tree-based widget really being the best tool for showing an arbitrarily large number of elements. We already have the Inventory>Child Resources or Inventory>Members subtabs, which show the full set of entries under a node using our infinitely scrollable table implementation. Going forward I would also like to get UXD input on how they think we should be handling this situation.
One additional point which Jay brought up while discussing this topic was the case of directly navigating to a resource which has >200 sibling resources.
Consider the case of a resource with 1000 child resources where we restrict the number of resources rendered under any tree node to 200 in *all* circumstances. Then if the user navigates directly to child #777, from the Inventory browser say, which 200 sibling resources do we want to show in the tree?
a) child#1, child#2, .. child#199, child#777
b) child#677, child#678, .. child#777 .., child#875, child#876
or something else?
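The two candidate selections can be sketched as follows. This is only an illustration; both function names are hypothetical and siblings are assumed to be an already ordered list of names. With 1000 siblings, a limit of 200 and child#777 as the target, the first function yields option (a) and the second yields option (b):

```python
def first_n_plus_target(siblings, target, limit=200):
    """Option (a): the first limit-1 siblings plus the navigated-to target."""
    head = siblings[:limit]
    if target in head:
        return head
    return siblings[:limit - 1] + [target]


def window_around_target(siblings, target, limit=200):
    """Option (b): a contiguous window of `limit` siblings around the target."""
    index = siblings.index(target)
    start = max(0, index - limit // 2)
    end = min(len(siblings), start + limit)
    # Shift the window back if it ran off the end of the sibling list.
    start = max(0, end - limit)
    return siblings[start:end]


siblings = [f"child#{i}" for i in range(1, 1001)]
option_a = first_n_plus_target(siblings, "child#777")
option_b = window_around_target(siblings, "child#777")
print(option_a[0], option_a[-2], option_a[-1])
print(option_b[0], option_b[-1])
```

Either way the tree would still need some indication that siblings are hidden, which neither sketch addresses.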
Wouldn't an alternative be to have a "See all" button for the cases when the tree is not complete?
release/jon3.1.x commit 70f081b48da38147a661d0e1fbc04f9b87609516
We decided not to allow an unlimited number of child resources to be returned
when expanding a node in the tree. But there are major problems with
simply limiting the children to 200. This is what we do today, and the
results are unordered. Meaning you can lose any number of resources of
any types and it is not possible to know what you are missing. And we don't
indicate that the tree is incomplete. If we add ordering (alphabetical by
resource name) you still lose any number of resources, but always the same
ones. In this way it is much easier to realize that you are missing
child resources (and/or entire child types) because the resources returned
obviously stop at some lexicographical point.
Still, we don't want to allow an unbounded fetch in the GUI. The solution
here is to move the unbounded fetch to the server side and to introduce
SLSB support to prune the results and return the pruned child set. In this
way we can limit the max child resources while returning a much more
reasonable resource set.
The way it works is as follows: There are two variables in use:
maxResources
Will not return more than this number of resources. If the original fetch
exceeds maxResources then maxResourcesByType will be enforced. If, after
trimming by type, maxResources is still exceeded, then the tail
resources (assuming a sorted set) will be removed to enforce the limit.
If <=0 the default (currently set to 1000) will be used.
maxResourcesByType
If maxResources is exceeded by the initial result set then members of each
type will be trimmed down to meet this limit.
If <=0 the default (currently set to 200) will be used.
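The pruning rules described above can be sketched roughly like this. It is not the server-side SLSB code: `prune_children`, the `(name, type_name)` tuple shape and the sort-by-name ordering are assumptions made for illustration, and only the two limits and their 1000/200 defaults come from the description:

```python
from collections import defaultdict

DEFAULT_MAX_RESOURCES = 1000         # default stated in the commit message
DEFAULT_MAX_RESOURCES_BY_TYPE = 200  # default stated in the commit message


def prune_children(children, max_resources=0, max_resources_by_type=0):
    """Prune a child set given as (name, type_name) tuples.

    Assumption: the "sorted set" is ordered by resource name.
    """
    if max_resources <= 0:
        max_resources = DEFAULT_MAX_RESOURCES
    if max_resources_by_type <= 0:
        max_resources_by_type = DEFAULT_MAX_RESOURCES_BY_TYPE

    ordered = sorted(children)
    # Within the overall limit: any distribution across types is returned.
    if len(ordered) <= max_resources:
        return ordered

    # Over the limit: keep at most max_resources_by_type members per type.
    kept_per_type = defaultdict(int)
    trimmed = []
    for name, type_name in ordered:
        if kept_per_type[type_name] < max_resources_by_type:
            kept_per_type[type_name] += 1
            trimmed.append((name, type_name))

    # Still over the limit: drop the tail of the sorted set.
    return trimmed[:max_resources]


# Three types with 500 children each exceed the overall limit,
# so each type is trimmed to the per-type limit.
children = [(f"{t}-{i:04d}", t) for t in ("EAR", "EJB", "WAR") for i in range(500)]
print(len(prune_children(children)))
```

In this sketch the tail trim can remove an entire type when many types are present, which follows from reading "tail resources" as the end of the name-sorted list.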
The defaults may change after this check-in based on testing results. The
GUI will not set these values in the fetch, so the defaults will be applied.
The default values can be overridden (and applied to all fetches/sessions) by
setting the following system properties (in rhq-server.properties and restarting):
So, this allows us to:
- Avoid an unbounded fetch.
- Have any distribution of child resources of various types as long as the
maxResources limit is not exceeded.
- Apply an even and understandable pruning when necessary.
It does not yet allow us to indicate in the tree which resource types have
been pruned. This will be a future enhancement and tracked as
Cherry-pick of master 30a6818e7ae101dc8df7edc198c5ee9ac790683e
** note - verification of the 1000/200 defaults is in progress. The defaults
** are expected to be fine, but are still subject to change. A further comment
** will be added noting the final values.
I've performed several tests and the limits seem fine. Using perftest plugin I ran the following scenarios:
Here we have 1000 children across 3 child types fetched when expanding the server node. On my weak laptop (2 core, 3G, and not much processing to spare)
Expansion took 4s. All expected resources (all 1000) were pulled and scrolling worked fine when expanding them all. They all come back despite the fact that all three child types exceed maxResourcesByType (200), because the total does not exceed maxResources (1000).
Expansion took 4s. All expected resources (600) were pulled and scrolling worked fine when expanding them all. Each service type was trimmed to 200, the max for each service. This is sort of the worst case scenario, where we have way too many resources and they are evenly distributed across types.
So, leaving defaults as is.
To test this with the perftest plugin deploy the perftest plugin and ensure your agent is updated.
Note that you'll need to grab this plugin from master. It was updated with the necessary scenario in commit f0f52f180b65682996586b4a0438a1865cae8381.
Once in place, restart the agent with the first set of properties in the above comment (either on the command line or set them in rhq-agent-env.sh via RHQ_AGENT_ADDITIONAL_JAVA_OPTS).
Start the agent and it will generate and "discover" the perftest resources.
Navigate to your platform resource and then expand the "server-omegas" node, and then the actual server. This will then fetch all the services. Once the child types are rendered in the tree they can be expanded to validate the resources.
To run new numbers:
1) shut down agent
2) uninventory the server-omega resource
3) restart the agent with the new scenario properties
4) import, etc...
Also, if you want to override the actual defaults for bounding the fetch, see comment 18 about the properties you can set in rhq-server.properties.
Moving to ON_QA. The JON 3.1.1 ER3 build is available at https://brewweb.devel.redhat.com/buildinfo?buildID=230321.
qe task for this is on this QE ticket --> https://engineering.redhat.com/trac/jon/ticket/283
smoketest for JON 3.1.1 is complete (per discussion with armine)
Bulk closing of old issues in VERIFIED state.