Bug 970586 - [rhevm-dwh] History DB - vm_disks_usage Tables have many rows of data where VMs dont have any disk
Status: CLOSED NOTABUG
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine-dwh
Version: 3.2.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 3.3.0
Assigned To: Yaniv Lavi
QA Contact: David Botzer
Whiteboard: infra
Keywords: Triaged
Depends On:
Blocks:
Reported: 2013-06-04 08:17 EDT by David Botzer
Modified: 2016-02-10 14:10 EST

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2013-07-28 10:05:35 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Infra
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments
pg_log (22.75 KB, application/x-gzip), 2013-06-04 08:17 EDT, David Botzer
rhevm-logs (375.51 KB, application/x-gzip), 2013-06-04 08:18 EDT, David Botzer
multi-map (144.73 KB, image/png), 2013-06-09 08:54 EDT, David Botzer
delete-date-more than 24H (73.46 KB, image/png), 2013-06-09 08:55 EDT, David Botzer
more tables (64.58 KB, image/png), 2013-06-09 08:55 EDT, David Botzer
more2 (57.61 KB, image/png), 2013-06-09 08:56 EDT, David Botzer
ftftf (72.42 KB, image/png), 2013-06-09 08:56 EDT, David Botzer

Description David Botzer 2013-06-04 08:17:06 EDT
Created attachment 756736 [details]
pg_log

Description of problem:
In the History DB, the vm_disks_usage tables contain many rows of data for VMs that don't have any disk.
With 8 pools of 1000 VMs each, the tables gather up to 25M rows
(vm_disks_usage_samples_history, 25519585)

Version-Release number of selected component (if applicable):
3.2/sf17.3

How reproducible:
Always

Steps to Reproduce:
1. Install rhevm + dwh + reports
2. Create a VM without NIC or disk
3. Create a template
4. Create several pools of 1000 VMs
5. Leave the setup for 24H
6. Delete all pools and VMs
7. Via psql, vacuum both DBs (engine and history)
8. Reboot the server/rhevm

Actual results:
- History DB gets up to 12GB.
- The Postgres ../base/ folder takes a lot of disk space and its data cannot be
  deleted: /var/lib/pgsql/data/base/* --> *5.2G*
- Several history tables remain large (table, row count):
 (vm_configuration,133379)
 (vm_disks_usage_hourly_history,835190)
 (vm_disks_usage_samples_history,25519585)
 (vm_hourly_history,835192)
 (vm_samples_history,25518585) 
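
The row counts above can be compared directly. The following is a minimal sketch that uses only the numbers quoted in this report; the helper name `ratio` is made up for illustration. It shows that the disk-usage tables hold almost exactly one row per VM statistics row, which is what the report describes for VMs without disks.

```python
# Row counts copied from the list above.
ROW_COUNTS = {
    "vm_configuration": 133379,
    "vm_disks_usage_hourly_history": 835190,
    "vm_disks_usage_samples_history": 25519585,
    "vm_hourly_history": 835192,
    "vm_samples_history": 25518585,
}

# 8 pools of 1000 VMs, as stated in the description.
VM_COUNT = 8 * 1000


def ratio(disk_table, vm_table):
    """Disk-usage rows per VM statistics row (hypothetical helper)."""
    return ROW_COUNTS[disk_table] / ROW_COUNTS[vm_table]


samples_ratio = ratio("vm_disks_usage_samples_history", "vm_samples_history")
hourly_ratio = ratio("vm_disks_usage_hourly_history", "vm_hourly_history")
rows_per_vm = ROW_COUNTS["vm_disks_usage_samples_history"] / VM_COUNT

print(f"samples ratio: {samples_ratio:.5f}")
print(f"hourly ratio:  {hourly_ratio:.5f}")
print(f"disk-usage sample rows per VM: {rows_per_vm:.0f}")
```

Both ratios come out within 0.01% of 1, and the samples table averages roughly 3190 rows per pool VM.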

Expected results:
No data should be collected in the vm_disks_usage tables for VMs that have no disk

Additional info:
This also happens when I stop DWH

More stats on tables: Engine
               relation                |  size
----------------------------------------+---------
 public.audit_log                       | 8896 kB
 public.vm_device                       | 4816 kB
 public.pk_vm_device                    | 2072 kB
 public.vm_static                       | 2032 kB
 public.vm_dynamic                      | 1736 kB
 public.idx_audit_log_vm_template_name  | 1192 kB
 public.idx_audit_correlation_id        | 1104 kB
 public.idx_audit_log_user_name         | 1104 kB
 public.vm_statistics                   | 1072 kB
 public.idx_audit_log_storage_pool_name | 1040 kB
(10 rows)

./Run_Hist_Tables_Size.sh
                     relation                     |  size
--------------------------------------------------+---------
 public.vm_samples_history                        | 2638 MB
 public.vm_disks_usage_samples_history            | 1935 MB
 public.vm_samples_history_vm_id_idx              | 1282 MB
 public.idx_disks_usage_vm_id_samples             | 1276 MB
 public.idx_vm_configuration_version_samples      | 961 MB
 public.idx_vm_history_datetime_samples           | 817 MB
 public.idx_disks_usage_history_datetime_samples  | 815 MB
 public.idx_vm_current_host_configuration_samples | 808 MB
 public.vm_samples_history_pkey                   | 694 MB
 public.vm_disks_usage_samples_history_pkey       | 682 MB
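
The listing above can be totalled to see how much of the space is index data. This is a rough sketch, not part of Run_Hist_Tables_Size.sh (whose contents are not shown in this report); the parsing helper and the index-name heuristic are assumptions based only on the names in this listing.

```python
import re

# Output copied from the history DB listing above.
PSQL_OUTPUT = """\
 public.vm_samples_history                        | 2638 MB
 public.vm_disks_usage_samples_history            | 1935 MB
 public.vm_samples_history_vm_id_idx              | 1282 MB
 public.idx_disks_usage_vm_id_samples             | 1276 MB
 public.idx_vm_configuration_version_samples      | 961 MB
 public.idx_vm_history_datetime_samples           | 817 MB
 public.idx_disks_usage_history_datetime_samples  | 815 MB
 public.idx_vm_current_host_configuration_samples | 808 MB
 public.vm_samples_history_pkey                   | 694 MB
 public.vm_disks_usage_samples_history_pkey       | 682 MB
"""


def parse_sizes(text):
    """Return {relation: size_in_MB} for lines shaped like 'name | 123 MB'."""
    sizes = {}
    for line in text.splitlines():
        match = re.match(r"\s*(\S+)\s*\|\s*(\d+)\s*MB\s*$", line)
        if match:
            sizes[match.group(1)] = int(match.group(2))
    return sizes


def is_index(relation):
    """Heuristic for this listing only: index names contain 'idx' or end in '_pkey'."""
    name = relation.split(".", 1)[-1]
    return "idx" in name or name.endswith("_pkey")


sizes = parse_sizes(PSQL_OUTPUT)
index_mb = sum(mb for rel, mb in sizes.items() if is_index(rel))
table_mb = sum(mb for rel, mb in sizes.items() if not is_index(rel))
total_mb = index_mb + table_mb

print(f"tables:  {table_mb} MB")
print(f"indexes: {index_mb} MB")
print(f"total:   {total_mb} MB")
```

The ten relations add up to 11908 MB, in line with the 12GB reported for the history DB, and the indexes (7335 MB) take more space than the two tables themselves (4573 MB).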
Comment 1 David Botzer 2013-06-04 08:18:10 EDT
Created attachment 756737 [details]
rhevm-logs
Comment 2 David Botzer 2013-06-04 08:40:46 EDT
Database files that occupy a lot of disk space under /var/lib/pgsql/data/base
In History Tables:
 vm_disks_usage_samples_history
 idx_disks_usage_vm_id_samples
 vm_samples_history
 vm_samples_history_vm_id_idx
 idx_vm_configuration_version_samples
 idx_vm_history_datetime_samples
 idx_disks_usage_history_datetime_samples
 idx_vm_current_host_configuration_samples
and more
-rw-------. 1 postgres postgres  820M Jun  4 15:29 101341
-rw-------. 1 postgres postgres  917M Jun  4 15:01 100095.1
-rw-------. 1 postgres postgres  963M Jun  4 15:29 101342
-rw-------. 1 postgres postgres  1.0G Jun  4 15:29 101344
-rw-------. 1 postgres postgres  1.0G Jun  4 11:57 101334.1
-rw-------. 1 postgres postgres  1.0G Jun  4 15:29 101334
-rw-------. 1 postgres postgres  1.0G Jun  4 11:56 100105
-rw-------. 1 postgres postgres  1.0G Jun  4 15:01 100095

ovirt_engine_history=# select relname from pg_class  where relfilenode = 100095;
            relname             
--------------------------------
 vm_disks_usage_samples_history

ovirt_engine_history=# select relname from pg_class  where relfilenode = 100105;
            relname            
-------------------------------
 idx_disks_usage_vm_id_samples
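
PostgreSQL stores a large relation as several segment files (1 GB each by default) named after its filenode, with `.1`, `.2` and so on appended for the later segments, so `100095` and `100095.1` in the listing above belong to the same relation. A small sketch of how the file names could be grouped before running the `pg_class` lookup; the helper names are hypothetical.

```python
from collections import defaultdict

# File names copied from the ls listing above.
FILES = [
    "101341", "100095.1", "101342", "101344",
    "101334.1", "101334", "100105", "100095",
]


def filenode_of(filename):
    """Strip the '.N' segment suffix to get the relfilenode.

    Only handles plain segment files such as those listed above; other
    file names found in the data directory would need extra handling.
    """
    return int(filename.split(".", 1)[0])


def group_by_filenode(filenames):
    """Map each relfilenode to the segment files that belong to it."""
    groups = defaultdict(list)
    for name in filenames:
        groups[filenode_of(name)].append(name)
    return dict(groups)


def lookup_query(filenode):
    """Build the same pg_class lookup that is used in this comment."""
    return f"select relname from pg_class where relfilenode = {int(filenode)};"


groups = group_by_filenode(FILES)
for filenode in sorted(groups):
    print(filenode, sorted(groups[filenode]))
    print("  ", lookup_query(filenode))
```

The eight files in the listing collapse to six filenodes, so six lookups are enough to name every relation shown.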
Comment 3 Yaniv Lavi 2013-06-09 03:49:33 EDT
Do you have any floating disks or templates?



Yaniv
Comment 4 David Botzer 2013-06-09 04:06:41 EDT
I have 3 templates and no floating disk or points :)
Comment 5 Yaniv Lavi 2013-06-09 04:09:15 EDT
Check which disks are collected in the table and match them to a VM. They are probably template disks.
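
One way such a check could look, as a sketch only: the two tables below are simplified in-memory stand-ins built with SQLite, and their column names (`vm_id`, `vm_name`, `disks_usage`) and sample values are assumptions, not the actual history DB schema. It also assumes one configuration row per VM. The idea is to count sample rows per `vm_id` and see which IDs do not match a known VM.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Simplified stand-ins for the history tables; column names are assumptions.
    CREATE TABLE vm_configuration (vm_id TEXT, vm_name TEXT);
    CREATE TABLE vm_disks_usage_samples_history (vm_id TEXT, disks_usage TEXT);

    -- Made-up sample data for illustration.
    INSERT INTO vm_configuration VALUES
        ('vm-1', 'pool-vm-1'),
        ('vm-2', 'pool-vm-2');
    INSERT INTO vm_disks_usage_samples_history VALUES
        ('vm-1', '[]'), ('vm-1', '[]'), ('vm-2', '[]'), ('tmpl-1', '[]');
""")

# Count sample rows per id and attach the VM name where one exists.
rows = conn.execute("""
    SELECT s.vm_id, c.vm_name, COUNT(*) AS sample_rows
    FROM vm_disks_usage_samples_history AS s
    LEFT JOIN vm_configuration AS c ON c.vm_id = s.vm_id
    GROUP BY s.vm_id, c.vm_name
    ORDER BY sample_rows DESC, s.vm_id
""").fetchall()

# Ids with no matching VM name are candidates for template or floating disks.
unmatched = [vm_id for vm_id, vm_name, _ in rows if vm_name is None]

for row in rows:
    print(row)
print("unmatched ids:", unmatched)
```

Against the real history DB the same join would have to be adapted to the actual column names and to multiple configuration versions per VM.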
Comment 6 David Botzer 2013-06-09 04:35:58 EDT
So? Why does collecting template disks grow the history DB to 12G?
Why not collect them once?

And what disk usage statistics does a template disk have?
Comment 7 Barak 2013-06-09 07:56:43 EDT
David,

Can you please provide the above information?

The redundant disk information you claim is saved in the history DB:
what is its origin?

This should be easy to check and it is already in your environment.

Please supply the information.
Comment 8 David Botzer 2013-06-09 08:54:26 EDT
--- The VM is created from a template, and the template has a provisioned disk!

Issue 1:
In the picture vm_samples-disk-usage.png
it is possible to see several VM IDs, and not only 3 VM IDs for the 3 templates.
- This table takes a long time to load since it has 25M rows or more
!! The table shows each VM as if it has a disk, regardless of the number of templates, only because the template has a disk !!

Issue 2: the VMs were deleted more than 24H ago

VMs-and-templates-DISKS.png
Issue 3: I have 3 unique VM IDs and 3 unique templates,
so the question is: should all 6 entities have history records?
Comment 9 David Botzer 2013-06-09 08:54:52 EDT
Created attachment 758789 [details]
multi-map
Comment 10 David Botzer 2013-06-09 08:55:19 EDT
Created attachment 758790 [details]
delete-date-more than 24H
Comment 11 David Botzer 2013-06-09 08:55:45 EDT
Created attachment 758791 [details]
more tables
Comment 12 David Botzer 2013-06-09 08:56:12 EDT
Created attachment 758792 [details]
more2
Comment 13 David Botzer 2013-06-09 08:56:55 EDT
Created attachment 758793 [details]
ftftf
Comment 14 Simon Grinberg 2013-06-16 07:09:35 EDT
David, there is a lot of data here but it's hard to understand what the complaints are, so I'll try to summarize; please correct me if I'm missing something.

You are saying that:
-1- VMs without disks or NICs still have statistics collected for them
-2- Usage statistics are collected for templates
-3- I did not get the claim (Issue 3, comment #8)
-4- You are claiming there are deleted items that still occupy entries in the DB and should be removed.


For -1-: This is not the common case and it's not worth the optimization. In the common case you'll have a disk (and a stateless/pool VM still has a disk).
For -2-: Though a valid argument, this will not be the real size generator; templates in large-scale deployments are not the majority of objects, so the benefit from optimizing here will be minor.
-----> Need to re-evaluate the above if Glance and Cinder objects are treated as templates in the internal implementation.
For -4-: If we commit to keeping history data at 24-hour granularity for a few years, then we can't just delete the history data when the objects are deleted. Or are you claiming we keep it at seconds granularity?

Please explain -3-
Comment 15 David Botzer 2013-06-16 07:19:28 EDT
Issue 3: I have 3 unique VM IDs and 3 unique templates,
so the question is: should all 6 entities have history records?

Simon,
I was referring to the fact that a template shouldn't have any history on its disks,
nor statistics.
Why else would a template need a history?!
I think you answered it in -2-.
Comment 16 David Botzer 2013-07-28 10:05:35 EDT
Fixed, 3.3/is7
Not reproducible
