Description of problem:
Need metrics on task execution time and history, per arch, to determine overall recipe effectiveness, e.g. time used by the tasks themselves / overall recipe time.

Example: a machine's time is taken up solely by running tasks. Machines that are occupied by a small number of tasks running for a very long time are probably more indicative of resource abuse than machines running a large number of tasks for short periods.

Save this data per arch so that a graph can be shown together with min, max and median. The counts should not include harness overhead.

Motivation:
Admins want to find tasks which run for 'extreme' lengths of time, investigate what proportion of overall machine time they account for, and decide whether we need to place limits on them. Users want to see how efficient their tasks are so that they can schedule long tasks last in their recipes, notice significant changes in execution time over time, and search for the reasons.
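A minimal sketch of the "time used by tasks / overall recipe time" ratio described above, in Python. The function name, its arguments and the timestamps are made up for illustration; it assumes the recipe start/finish and each task's start/finish are available as datetime values.

```python
from datetime import datetime


def recipe_effectiveness(recipe_start, recipe_finish, task_intervals):
    """Return the fraction of recipe wall-clock time spent inside tasks.

    task_intervals is an iterable of (start, finish) datetime pairs.
    Hypothetical helper, not part of Beaker.
    """
    total = (recipe_finish - recipe_start).total_seconds()
    if total <= 0:
        return 0.0
    in_tasks = sum((finish - start).total_seconds()
                   for start, finish in task_intervals)
    return in_tasks / total


# Made-up example: a two hour recipe with two tasks of 40 and 45 minutes.
ratio = recipe_effectiveness(
    datetime(2013, 1, 1, 10, 0),
    datetime(2013, 1, 1, 12, 0),
    [(datetime(2013, 1, 1, 10, 20), datetime(2013, 1, 1, 11, 0)),
     (datetime(2013, 1, 1, 11, 5), datetime(2013, 1, 1, 11, 50))],
)
```

A low ratio would suggest that most of the recipe's time went to provisioning or harness overhead rather than to the tasks.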
ncoghlan: probably not too hard to do, since we'll likely be building up the machine level metrics from task level metrics
Amit, another one where the main question is whether or not we're already recording the required data in the DB.
Using the data from the distro, distro_tree, recipe_task, recipe_resource, task, arch tables, it is possible to retrieve the data per arch and system:

mysql> select task.name, recipe_resource.fqdn, arch.arch, task.avg_time,
    ->        timediff(recipe_task.finish_time, recipe_task.start_time)
    -> from distro, distro_tree, recipe, recipe_task, task, arch, recipe_resource
    -> where recipe_task.task_id=task.id
    ->   and recipe.id=recipe_task.recipe_id
    ->   and recipe.distro_tree_id=distro_tree.id
    ->   and arch.id=distro_tree.arch_id
    ->   and recipe_resource.recipe_id=recipe_task.recipe_id
    -> order by task.name;

Returns sample data such as:

| /installation/mytask/Sanity/ext4-test | virt-10 | x86_64 | 300 | 01:12:35 |
| /installation/mytask/Sanity/ext4-test | virt-12 | x86_64 | 300 | 00:08:22 |

The per system/arch data can then be aggregated and inferences drawn from them. The recorded time does, however, include the installation of the task RPM. I am not sure if we have any data that is recorded exclusively for the duration during which the task is executing.
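As a rough illustration of the aggregation step, the rows returned by the query could be grouped per arch and task and reduced to min, max and median. This is only a sketch: the function names are hypothetical, and it assumes each row arrives as a (task name, fqdn, arch, avg_time, elapsed) tuple with the elapsed time as a non-negative HH:MM:SS string.

```python
from collections import defaultdict
from statistics import median


def parse_hms(text):
    """Convert an 'HH:MM:SS' string (hours may exceed 24) to seconds."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def summarise_by_arch(rows):
    """Group elapsed times by (arch, task name) and summarise each group."""
    grouped = defaultdict(list)
    for task_name, _fqdn, arch, _avg_time, elapsed in rows:
        grouped[(arch, task_name)].append(parse_hms(elapsed))
    return {
        key: {"min": min(values), "max": max(values),
              "median": median(values), "count": len(values)}
        for key, values in grouped.items()
    }


# The two sample rows from the query output above.
rows = [
    ("/installation/mytask/Sanity/ext4-test", "virt-10", "x86_64", 300, "01:12:35"),
    ("/installation/mytask/Sanity/ext4-test", "virt-12", "x86_64", 300, "00:08:22"),
]
summary = summarise_by_arch(rows)
```

Grouping on fqdn instead of (or as well as) arch would give the per-system view.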
(In reply to comment #3) This works except for the first task in each recipe (usually but not necessarily /distribution/install) which will include installation time. I think this report will need to filter those out.
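One possible way to filter those out after the fact, sketched in Python: find the earliest start time in each recipe and drop the task that carries it. The row layout (recipe id, task name, start time) and the function name are assumptions for illustration, and the timestamps are made up.

```python
from datetime import datetime


def drop_first_task_per_recipe(rows):
    """Drop the earliest-started task of every recipe.

    rows: iterable of (recipe_id, task_name, start_time) tuples.
    Tasks tied for the earliest start time in a recipe are all dropped.
    """
    rows = list(rows)
    earliest = {}
    for recipe_id, _name, start_time in rows:
        if recipe_id not in earliest or start_time < earliest[recipe_id]:
            earliest[recipe_id] = start_time
    return [row for row in rows if row[2] != earliest[row[0]]]


# Made-up example: two recipes, each starting with /distribution/install.
rows = [
    (1, "/distribution/install", datetime(2013, 1, 1, 10, 0)),
    (1, "/installation/mytask/Sanity/ext4-test", datetime(2013, 1, 1, 10, 30)),
    (2, "/distribution/install", datetime(2013, 1, 1, 11, 0)),
    (2, "/installation/mytask/Sanity/ext4-test", datetime(2013, 1, 1, 11, 20)),
]
kept = drop_first_task_per_recipe(rows)
```

Filtering on position rather than on the task name covers recipes whose first task is not /distribution/install.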
(In reply to comment #4)
> (In reply to comment #3)
>
> This works except for the first task in each recipe (usually but not
> necessarily /distribution/install) which will include installation time. I
> think this report will need to filter those out.

Right. The initial report that gets generated will have all these tasks. It is probably a fair assumption that those who look at the reports are well-informed users, so they can either choose to ignore /distribution/install and similar tasks, or specify tasks that they don't want to see, and the report can then be regenerated *without* making any further database queries. Hence, the implementation can fetch the data for all tasks when the report is generated and regenerate the graphs/reports according to the user's preferences. The latter would be a secondary requirement, if it's not too difficult to implement in the reporting environment.
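The fetch-once, regenerate-on-demand idea could look roughly like this in Python. The class and method names are hypothetical, and the rows are assumed to be (task name, arch, seconds) tuples already fetched by a single query.

```python
class TaskDurationReport:
    """Hold rows fetched once and re-render them with different filters."""

    def __init__(self, rows):
        # Cached copy of the query result; no further database access.
        self._rows = list(rows)

    def render(self, exclude=()):
        """Return the cached rows minus any task names listed in exclude."""
        excluded = set(exclude)
        return [row for row in self._rows if row[0] not in excluded]


# Made-up rows standing in for one query result.
report = TaskDurationReport([
    ("/distribution/install", "x86_64", 1800),
    ("/installation/mytask/Sanity/ext4-test", "x86_64", 4355),
    ("/installation/mytask/Sanity/ext4-test", "x86_64", 502),
])
full_view = report.render()
filtered_view = report.render(exclude=["/distribution/install"])
```

Each call to render() works on the cached rows, so changing the exclusion list does not touch the database.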
Admin guide updated to include the above information: http://gerrit.beaker-project.org/#/c/1546/
http://gerrit.beaker-project.org/1576
Another case that needs to use a subquery in order to work in more SQL dialects.
http://gerrit.beaker-project.org/1630
Latest version also works in Teiid with the standard MySQL -> Teiid conversions of adding the SQL_TSI_ prefix and replacing the date/time strings with explicit PARSETIMESTAMP calls.
Beaker 0.11.0 has been released.