Created attachment 756736 [details]
pg_log

Description of problem:
In the History DB, the vm_disks_usage tables hold many rows of data for VMs that don't have any disk.
For 8 pools of 1000 VMs each, the tables grow to about 25M rows (vm_disks_usage_samples_history, 25519585).

Version-Release number of selected component (if applicable):
3.2/sf17.3

How reproducible:
Always

Steps to Reproduce:
1. Install rhevm + dwh + reports
2. Create a VM with neither NIC nor disk
3. Create a template
4. Create several pools of 1000 VMs
5. Leave the setup for 24H
6. Delete all pools and VMs
7. Via pgsql, vacuum both DBs (engine and history)
8. Reboot the server/rhevm

Actual results:
- History DB gets up to 12GB.
- The Postgres ../base/ folder takes a lot of disk space and its data cannot be deleted:
  /var/lib/pgsql/data/base/* --> 5.2G
- Several History tables hold a large number of rows (table, row count):
  (vm_configuration,133379)
  (vm_disks_usage_hourly_history,835190)
  (vm_disks_usage_samples_history,25519585)
  (vm_hourly_history,835192)
  (vm_samples_history,25518585)

Expected results:
No data should be collected in the vm_disks_usage tables for VMs that have no disk.

Additional info:
This also happens when I stop DWH.

More stats on tables:

Engine

                relation                |  size
----------------------------------------+---------
 public.audit_log                       | 8896 kB
 public.vm_device                       | 4816 kB
 public.pk_vm_device                    | 2072 kB
 public.vm_static                       | 2032 kB
 public.vm_dynamic                      | 1736 kB
 public.idx_audit_log_vm_template_name  | 1192 kB
 public.idx_audit_correlation_id        | 1104 kB
 public.idx_audit_log_user_name         | 1104 kB
 public.vm_statistics                   | 1072 kB
 public.idx_audit_log_storage_pool_name | 1040 kB
(10 rows)

./Run_Hist_Tables_Size.sh

                     relation                     |  size
--------------------------------------------------+---------
 public.vm_samples_history                        | 2638 MB
 public.vm_disks_usage_samples_history            | 1935 MB
 public.vm_samples_history_vm_id_idx              | 1282 MB
 public.idx_disks_usage_vm_id_samples             | 1276 MB
 public.idx_vm_configuration_version_samples      | 961 MB
 public.idx_vm_history_datetime_samples           | 817 MB
 public.idx_disks_usage_history_datetime_samples  | 815 MB
 public.idx_vm_current_host_configuration_samples | 808 MB
 public.vm_samples_history_pkey                   | 694 MB
 public.vm_disks_usage_samples_history_pkey       | 682 MB
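As a rough sanity check on the figures above, the sizes reported by ./Run_Hist_Tables_Size.sh can be added up to see how much of the space is taken by indexes and how much belongs to disk-usage sampling. This is only a sketch: the numbers are copied from the listing, the helper names (summarize, rows_per_vm) are made up for illustration, and two assumptions are made: that a relation belongs to disk-usage sampling if its name contains "disks_usage", and that about 8 x 1000 VMs were being sampled.

```python
# Sizes in MB, copied from the ./Run_Hist_Tables_Size.sh output in this report.
HISTORY_RELATIONS_MB = {
    "vm_samples_history": 2638,
    "vm_disks_usage_samples_history": 1935,
    "vm_samples_history_vm_id_idx": 1282,
    "idx_disks_usage_vm_id_samples": 1276,
    "idx_vm_configuration_version_samples": 961,
    "idx_vm_history_datetime_samples": 817,
    "idx_disks_usage_history_datetime_samples": 815,
    "idx_vm_current_host_configuration_samples": 808,
    "vm_samples_history_pkey": 694,
    "vm_disks_usage_samples_history_pkey": 682,
}

# The two tables in the listing; every other relation there is an index.
TABLES = {"vm_samples_history", "vm_disks_usage_samples_history"}


def summarize(sizes_mb):
    """Split the listed sizes into tables, indexes and disk-usage relations."""
    total = sum(sizes_mb.values())
    table_mb = sum(mb for name, mb in sizes_mb.items() if name in TABLES)
    # Assumption: "disks_usage" in the name marks a disk-usage relation.
    disk_usage_mb = sum(mb for name, mb in sizes_mb.items() if "disks_usage" in name)
    return {
        "total_mb": total,
        "table_mb": table_mb,
        "index_mb": total - table_mb,
        "disk_usage_mb": disk_usage_mb,
    }


def rows_per_vm(row_count, vm_count=8 * 1000):
    """Average rows per VM, assuming 8 pools of 1000 VMs (hypothetical default)."""
    return row_count / vm_count


if __name__ == "__main__":
    summary = summarize(HISTORY_RELATIONS_MB)
    for key, value in summary.items():
        print(f"{key}: {value} MB ({100 * value / summary['total_mb']:.1f}%)")
    print(f"disk usage samples per VM: {rows_per_vm(25519585):.0f}")
```

With the listed values, the ten relations add up to 11908 MB, which is in line with the "up to 12GB" reported for the History DB. About 62% of that (7335 MB) is indexes, and about 40% (4708 MB) belongs to the disk-usage samples table and its indexes.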
Created attachment 756737 [details] rhevm-logs
Database files that occupy a large amount of disk space under /var/lib/pgsql/data/base

In History tables:
vm_disks_usage_samples_history
idx_disks_usage_vm_id_samples
vm_samples_history
vm_samples_history_vm_id_idx
idx_vm_configuration_version_samples
idx_vm_history_datetime_samples
idx_disks_usage_history_datetime_samples
idx_vm_current_host_configuration_samples
and more

-rw-------. 1 postgres postgres 820M Jun 4 15:29 101341
-rw-------. 1 postgres postgres 917M Jun 4 15:01 100095.1
-rw-------. 1 postgres postgres 963M Jun 4 15:29 101342
-rw-------. 1 postgres postgres 1.0G Jun 4 15:29 101344
-rw-------. 1 postgres postgres 1.0G Jun 4 11:57 101334.1
-rw-------. 1 postgres postgres 1.0G Jun 4 15:29 101334
-rw-------. 1 postgres postgres 1.0G Jun 4 11:56 100105
-rw-------. 1 postgres postgres 1.0G Jun 4 15:01 100095

ovirt_engine_history=# select relname from pg_class where relfilenode = 100095;
            relname
--------------------------------
 vm_disks_usage_samples_history

ovirt_engine_history=# select relname from pg_class where relfilenode = 100105;
            relname
-------------------------------
 idx_disks_usage_vm_id_samples
Do you have any floating disks or templates? Yaniv
I have 3 templates and no floating disk or points :)
Check which disks are collected in the table and match them to a VM. They are probably template disks.
So? Why does collecting template disks bring the history DB up to 12G? Why not collect once? And what disk usage statistics does a template disk have?
David,
Can you please provide the above information?
What is the origin of the redundant disk information you claim is saved in the history DB?
This should be easy to check and it is already in your environment.
Please supply the information.
The VM is made from a template, and the template has a provisioned disk!

Issue 1:
In the picture vm_samples-disk-usage.png it is possible to see several VM IDs, and not only 3 VM IDs for the 3 templates.
- This table takes a long time to load since it has 25M rows or more.
- The table shows each VM as if it has a disk, regardless of the number of templates, only because the template has a disk.

Issue 2:
The VMs were deleted more than 24H ago (VMs-and-templates-DISKS.png).

Issue 3:
I have 3 unique VM IDs and 3 unique templates.
So the question is: should all 6 entities have history records?
Created attachment 758789 [details] multi-map
Created attachment 758790 [details] delete-date-more than 24H
Created attachment 758791 [details] more tables
Created attachment 758792 [details] more2
Created attachment 758793 [details] ftftf
David, there is lots of data but it's hard to understand what the complaints are, so I'll try to summarize. Please correct me if I'm missing something.

You are saying that:
-1- VMs without disks or NICs still have statistics collected for them
-2- Usage statistics are collected for Templates
-3- I did not get the claim (Issue 3, comment #8)
-4- You are claiming there are deleted items that still occupy entries in the DB and should be removed.

For -1-: This is not the common case and it's not worth the optimization. In the common case you'll have a disk (a stateless/pool VM still has a disk).

For -2-: Though a valid argument, it will not be the real size generator. Templates, when used in large scale deployments, are not the majority of objects, so the benefit from optimizing here will be minor.
-----> Need to re-evaluate the above if treating Glance and Cinder objects as templates under internal implementation.

For -4-: If we commit to keeping history data in 24 hours granularity for a few years, then we can't just delete the history data when the objects are deleted. Or are you claiming we keep it in seconds granularity?

Please explain -3-
Issue 3:
I have 3 unique VM IDs and 3 unique templates.
So the question is: should all 6 entities have history records?

Simon, I was referring to the fact that a template shouldn't have any history on its disks, nor statistics. Why else would a template need a history?!
I think you answered it in -2-
Fixed, 3.3/is7
Not reproducible