Bug 1005443 - [Docs][RFE][DWH] Document data model for Data Warehouse
Status: CLOSED CURRENTRELEASE
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: Documentation
Version: 3.5.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ovirt-4.1.6
Target Release: ---
Assigned To: Emma Heftman
Byron Gravenorst
Keywords: FutureFeature
Depends On:
Blocks:
Reported: 2013-09-06 21:22 EDT by Bryan Yount
Modified: 2017-11-08 05:35 EST

See Also:
Fixed In Version:
Doc Type: Enhancement
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-11-08 05:35:09 EST
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Docs
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Description Bryan Yount 2013-09-06 21:22:54 EDT
Description of problem:
A customer has requested that we publish our data model on RHEV-M reports. Specifically, they stated "It would really help if Red-hat published their data model on reporting. [The] current document describes the tables and the fields, but there doesn’t seem to be any documentation on how to join tables and what the indexes are."

Version-Release number of selected component (if applicable):
3.2

Additional info:
Customers like to run their own reports against the rhevm_history database. Publishing this information would assist them in this endeavor.
Comment 2 Tim Hildred 2013-10-27 20:04:05 EDT
Reassigning to Jodi Biddle (jbiddle@redhat.com) as I am no longer working on Red Hat Enterprise Virtualization documentation.
Comment 4 Yaniv Lavi 2015-09-02 08:32:56 EDT
We should try to describe in text, even without charts, the different types of connector fields that are used:
- IDs
- Data points
- History version
and describe how to use them.
Comment 5 Andrew Dahms 2015-09-02 20:37:52 EDT
Changing status back to 'New' until re-assignment.
Comment 6 Sandro Bonazzola 2015-10-26 08:33:19 EDT
This is an automated message. oVirt 3.6.0 RC3 has been released and GA is targeted for next week, Nov 4th 2015.
Please review this bug and, if it is not a blocker, postpone it to a later release.
All bugs not postponed on GA release will be automatically re-targeted to

- 3.6.1 if severity >= high
- 4.0 if severity < high
Comment 7 Yaniv Lavi 2016-05-10 07:27:16 EDT
(In reply to Yaniv Dary from comment #4)
> We should try to describe in text even without charts the different types of
> connector field that are used:
> - IDs
> - Data point.
> - history version 
> and describe how to use them.

Also indexes.
Comment 10 Yaniv Lavi 2016-11-08 07:50:33 EST
Can you provide some text on the column types in DWH:
- Object IDs
- Metrics\configuration values
- History version fields.
Comment 11 Shirly Radco 2016-12-19 04:35:35 EST
(In reply to Yaniv Dary from comment #10)
> Can you provide some text on the column types in DWH:

- Object IDs

In the configuration tables, each object (host, VM, disk, NIC, ...) has a unique ID.
For example, each host in the "host_configuration" table has a unique "host_id".

> - History version fields.

Each configuration table holds a "history_id" column that identifies the specific configuration of the object at that moment in time.

Each time the configuration of an object changes, the "history_id" is updated, whereas the object's ID stays constant for that object.
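
A minimal sketch of the difference between the two columns, not taken from the product: it uses an in-memory SQLite table as a stand-in for the PostgreSQL history database, with a heavily reduced "host_configuration" table. The "host_name" column, the sample rows, and the assumption that a higher "history_id" means a later configuration are illustrative only.

```python
import sqlite3

# Hypothetical, reduced stand-in for the DWH "host_configuration" table.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE host_configuration (
        history_id INTEGER PRIMARY KEY,  -- changes with every configuration change
        host_id    TEXT NOT NULL,        -- constant for a given host
        host_name  TEXT                  -- illustrative configuration attribute
    )
""")
conn.executemany(
    "INSERT INTO host_configuration VALUES (?, ?, ?)",
    [
        (1, "host-a", "node1"),
        (2, "host-b", "node2"),
        (3, "host-a", "node1-renamed"),  # same host_id, new history_id
    ],
)

# Latest configuration per host: the row with the highest history_id
# for each host_id (assumes history_id grows over time).
latest = conn.execute("""
    SELECT c.host_id, c.history_id, c.host_name
    FROM host_configuration AS c
    JOIN (SELECT host_id, MAX(history_id) AS max_id
          FROM host_configuration
          GROUP BY host_id) AS m
      ON c.host_id = m.host_id AND c.history_id = m.max_id
    ORDER BY c.host_id
""").fetchall()
print(latest)
```

Here "host-a" appears twice in the table under the same "host_id", and only the "history_id" distinguishes the two configurations.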

- Metrics\configuration values

Each statistics table (samples, hourly, and daily) holds the object's unique ID.

To retrieve the object's configuration as it was when a statistic was collected, each statistics row also stores the object's configuration version.

For example, each host statistic in the "host_samples_history", "host_hourly_history", and "host_daily_history" tables has the "host_id" and "host_configuration_version" columns.

The object's configuration version is used to join with the configuration table on the "history_id" column.
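
The join described above could look roughly like the following sketch. It is an illustration under assumptions, not the documented schema: SQLite stands in for PostgreSQL, both tables are reduced to a few columns, and "host_name", "cpu_usage_percent", and the sample values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Reduced, hypothetical versions of the two tables.
    CREATE TABLE host_configuration (
        history_id INTEGER PRIMARY KEY,
        host_id    TEXT NOT NULL,
        host_name  TEXT
    );
    CREATE TABLE host_daily_history (
        history_datetime           TEXT NOT NULL,
        host_id                    TEXT NOT NULL,
        host_configuration_version INTEGER NOT NULL,
        cpu_usage_percent          INTEGER
    );
    INSERT INTO host_configuration VALUES
        (1, 'host-a', 'node1'),
        (3, 'host-a', 'node1-renamed');
    INSERT INTO host_daily_history VALUES
        ('2017-08-01', 'host-a', 1, 40),
        ('2017-08-02', 'host-a', 3, 55);
""")

# Each statistic is matched to the configuration that was current
# when it was collected: configuration version -> history_id.
rows = conn.execute("""
    SELECT s.history_datetime, c.host_name, s.cpu_usage_percent
    FROM host_daily_history AS s
    JOIN host_configuration AS c
      ON s.host_configuration_version = c.history_id
    ORDER BY s.history_datetime
""").fetchall()
print(rows)
```

Joining on "host_id" alone would instead pair every statistic with every configuration version of that host, which is why the version column is the join key.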


A configuration table may also have ID and configuration version columns for a related object.
For example, the "host_configuration" table also holds the "cluster_id" and "cluster_configuration_version" columns.
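
As a rough sketch of how a related object's configuration could be reached from a configuration table, the example below joins a host configuration to a cluster configuration. It assumes that "cluster_configuration_version" matches the "history_id" of a "cluster_configuration" table, by analogy with the statistics tables; that table, the name columns, and the data are hypothetical, and SQLite stands in for PostgreSQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Reduced, hypothetical tables.
    CREATE TABLE cluster_configuration (
        history_id   INTEGER PRIMARY KEY,
        cluster_id   TEXT NOT NULL,
        cluster_name TEXT
    );
    CREATE TABLE host_configuration (
        history_id                    INTEGER PRIMARY KEY,
        host_id                       TEXT NOT NULL,
        host_name                     TEXT,
        cluster_id                    TEXT NOT NULL,
        cluster_configuration_version INTEGER NOT NULL
    );
    INSERT INTO cluster_configuration VALUES
        (10, 'cluster-1', 'Default'),
        (11, 'cluster-1', 'Production');
    INSERT INTO host_configuration VALUES
        (1, 'host-a', 'node1', 'cluster-1', 10),
        (2, 'host-a', 'node1', 'cluster-1', 11);
""")

# Resolve the cluster configuration that applied to each host configuration.
rows = conn.execute("""
    SELECT h.history_id, h.host_name, c.cluster_name
    FROM host_configuration AS h
    JOIN cluster_configuration AS c
      ON h.cluster_configuration_version = c.history_id
    ORDER BY h.history_id
""").fetchall()
print(rows)
```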

- indexes

The tables are indexed according to the way they are usually queried.

The configuration tables are indexed by the object's ID and by the IDs of related objects.
For example, the "host_configuration" table is indexed by "host_id" and also by "cluster_id".


The statistics tables are indexed by the object's ID, configuration version, and history_datetime, and by the IDs and configuration versions of related objects.
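
To illustrate why filtering on the indexed columns matters, here is a small sketch that creates two indexes like the ones described for "host_configuration" and asks the database how it would run a lookup by "host_id". The index names are invented for the example, and SQLite's EXPLAIN QUERY PLAN is used in place of PostgreSQL's EXPLAIN, so the plan text will not match what the real database prints.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Reduced, hypothetical table with indexes on the object ID
    -- and on the related object ID.
    CREATE TABLE host_configuration (
        history_id INTEGER PRIMARY KEY,
        host_id    TEXT NOT NULL,
        host_name  TEXT,
        cluster_id TEXT NOT NULL
    );
    CREATE INDEX idx_host_configuration_host_id
        ON host_configuration (host_id);
    CREATE INDEX idx_host_configuration_cluster_id
        ON host_configuration (cluster_id);
""")

# Ask SQLite how it would execute a lookup by host_id.
plan = conn.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT host_name FROM host_configuration WHERE host_id = ?",
    ("host-a",),
).fetchall()

# The human-readable plan step is the last column of each row.
details = [row[-1] for row in plan]
uses_index = any("idx_host_configuration_host_id" in d for d in details)
print(details, uses_index)
```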
Comment 12 Lucy Bopf 2017-08-02 04:21:20 EDT
Assigning to Emma for review.
Comment 13 Emma Heftman 2017-08-02 11:15:56 EDT
Hi Shirly      
I've been assigned to this old bug and I see you were also involved a while back. 

Can you please clarify what the "host_configuration_version" column contains?
Thanks!
Comment 15 Emma Heftman 2017-08-06 06:29:44 EDT
I will be making the following changes to the DWH Guide:
1. A new column will appear in each table indicating whether the field is indexed.

2. Additional explanations will be added to the unique ID, configuration ID, and history ID fields, explaining how the tables can be joined.

3. Extra descriptions will be added to the other fields, where necessary.
Comment 17 Emma Heftman 2017-08-07 07:27:27 EDT
Hey Shirly
I'm copying the questions that I asked you by mail...please answer them here.

1. We have two configuration tables with storage domains. Why do we need two, and how are they joined?

v4_1_map_history_datacenters_storage_domains

v4_1_configuration_history_storage_domains

2. Which of the history_id columns from these two tables is used for storage_configuration_version in the storage domain statistics views?

3. Is it a mistake that device_configuration_version appears in the v4_1_configuration_history_vms_devices view? Surely it's the same as history_id.

Shouldn't it only be vm_configuration?

Should device configuration version appear in another configuration view instead?

4.  The following table has no configuration_version fields at all. I think something is missing - vm_configuration_version?  

v4_1_configuration_history_vms_disks
Comment 18 Shirly Radco 2017-08-07 08:48:36 EDT
(In reply to Emma Heftman from comment #17)
> Hey Shirly
> I'm copying the questions that I asked you by mail...please answer them here.
> 
> 1. Hi
> we have two configuration tables with storage domains, why do we need two
> and how are they joined? 
> 
> v4_1_map_history_datacenters_storage_domains

This is a map between the datacenters and the storage_domains.
It holds the keys of both entities and the attach and detach dates, so we can tell when they were connected and when they were not.

We join it to the datacenter configuration table on datacenter_id and to the storage_domain_configuration table on storage_domain_id.
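
A sketch of how the map could be used to list the storage domains attached to a data center on a given day. The table names, the "attach_date" and "detach_date" column names, and the data are assumptions made for the example, and SQLite stands in for PostgreSQL. Each configuration table holds a single version per object here; with several versions per ID, the query would also need to pick one of them (for example, the latest).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Reduced, hypothetical tables.
    CREATE TABLE datacenter_configuration (
        history_id      INTEGER PRIMARY KEY,
        datacenter_id   TEXT NOT NULL,
        datacenter_name TEXT
    );
    CREATE TABLE storage_domain_configuration (
        history_id          INTEGER PRIMARY KEY,
        storage_domain_id   TEXT NOT NULL,
        storage_domain_name TEXT
    );
    CREATE TABLE datacenter_storage_domain_map (
        history_id        INTEGER PRIMARY KEY,
        storage_domain_id TEXT NOT NULL,
        datacenter_id     TEXT NOT NULL,
        attach_date       TEXT NOT NULL,
        detach_date       TEXT           -- NULL while still attached
    );
    INSERT INTO datacenter_configuration VALUES (1, 'dc-1', 'Default');
    INSERT INTO storage_domain_configuration VALUES
        (1, 'sd-1', 'data1'),
        (2, 'sd-2', 'iso1');
    INSERT INTO datacenter_storage_domain_map VALUES
        (1, 'sd-1', 'dc-1', '2017-01-01', NULL),
        (2, 'sd-2', 'dc-1', '2017-01-01', '2017-06-01');
""")

# Storage domains attached to each data center on the given day.
rows = conn.execute("""
    SELECT dc.datacenter_name, sd.storage_domain_name
    FROM datacenter_storage_domain_map AS m
    JOIN datacenter_configuration AS dc
      ON m.datacenter_id = dc.datacenter_id
    JOIN storage_domain_configuration AS sd
      ON m.storage_domain_id = sd.storage_domain_id
    WHERE m.attach_date <= :day
      AND (m.detach_date IS NULL OR m.detach_date > :day)
    ORDER BY sd.storage_domain_name
""", {"day": "2017-07-01"}).fetchall()
print(rows)
```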

> 
> v4_1_configuration_history_storage_domains
> 
> 2. Which of the history_id from these two tables is used for
> storage_configuration_version in storage domain statistics views.

storage_configuration_version is used with the history_id of the storage_domain_configuration table.

> 
> 3. Is it a mistake that device_configuration_version appears in the
> v4_1_configuration_history_vms_devices view? Surely it's the same as
> history_id.

device_configuration_version is joined either with the vm_interface_configuration table on the history_id field or with the vm_disk_configuration table on the history_id field, depending on the value of the type field in the vm_device_history table (the v4_1_configuration_history_vms_devices view is based on this table). The type value will be 'disk' or 'interface'.
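
The type-dependent join could be sketched as follows. This is an illustration under assumptions: SQLite stands in for PostgreSQL, the tables are reduced to a few columns, and the name columns and sample rows are hypothetical. Both devices deliberately carry the same configuration version value to show why the type field has to be part of the join condition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Reduced, hypothetical tables.
    CREATE TABLE vm_device_history (
        history_id                   INTEGER PRIMARY KEY,
        vm_id                        TEXT NOT NULL,
        type                         TEXT NOT NULL,  -- 'disk' or 'interface'
        vm_configuration_version     INTEGER,
        device_configuration_version INTEGER
    );
    CREATE TABLE vm_disk_configuration (
        history_id   INTEGER PRIMARY KEY,
        vm_disk_name TEXT
    );
    CREATE TABLE vm_interface_configuration (
        history_id        INTEGER PRIMARY KEY,
        vm_interface_name TEXT
    );
    INSERT INTO vm_disk_configuration VALUES (1, 'disk0');
    INSERT INTO vm_interface_configuration VALUES (1, 'nic0');
    INSERT INTO vm_device_history VALUES
        (100, 'vm-1', 'disk', 7, 1),
        (101, 'vm-1', 'interface', 7, 1);
""")

# The type column decides which configuration table the version refers to.
rows = conn.execute("""
    SELECT d.vm_id, d.type,
           COALESCE(dk.vm_disk_name, nic.vm_interface_name) AS device_name
    FROM vm_device_history AS d
    LEFT JOIN vm_disk_configuration AS dk
      ON d.type = 'disk'
     AND d.device_configuration_version = dk.history_id
    LEFT JOIN vm_interface_configuration AS nic
      ON d.type = 'interface'
     AND d.device_configuration_version = nic.history_id
    ORDER BY d.type
""").fetchall()
print(rows)
```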

> 
> Shouldn't it only be vm_configuration?
> 
> Should device configuration version appear in another configuration view
> instead?
> 
> 4.  The following table has no configuration_version fields at all. I think
> something is missing - vm_configuration_version?  
> 
> v4_1_configuration_history_vms_disks

It's not required. The vm_configuration_version will be taken from vm_device_history.
Comment 19 Emma Heftman 2017-08-07 09:52:54 EDT
(In reply to Shirly Radco from comment #18)

vm_device_history does not appear in the documentation. Should it? If so, which table does it belong to, and what does it join with?
Comment 20 Shirly Radco 2017-08-07 10:22:55 EDT
The v4_1_configuration_history_vms_devices view is based on vm_device_history.
If the view is documented, that is enough.
Comment 26 Emma Heftman 2017-10-26 04:46:02 EDT
The updated documentation is available on the Customer Portal:

https://access.redhat.com/documentation/en-us/red_hat_virtualization/4.1/html-single/data_warehouse_guide/
