Description of problem: The vacuum process is not able to remove the dead rows as they are held by an unfinished transaction from ovirt-dwh. vacuumdb -d ovirt_engine_history -t vm_disk_samples_history -v vacuumdb: vacuuming database "ovirt_engine_history" INFO: vacuuming "public.vm_disk_samples_history" INFO: index "idx_vm_disk_history_datetime_samples" now contains 63428 row versions in 176 pages DETAIL: 0 index row versions were removed. 0 index pages have been deleted, 0 are currently reusable. CPU 0.00s/0.00u sec elapsed 0.00 sec. INFO: index "idx_vm_disk_configuration_version_samples" now contains 63428 row versions in 181 pages DETAIL: 0 index row versions were removed. 0 index pages have been deleted, 0 are currently reusable. CPU 0.00s/0.00u sec elapsed 0.00 sec. INFO: index "vm_disk_samples_history_vm_disk_id_idx" now contains 63428 row versions in 253 pages DETAIL: 0 index row versions were removed. 0 index pages have been deleted, 0 are currently reusable. CPU 0.00s/0.00u sec elapsed 0.00 sec. INFO: index "vm_disk_samples_history_pkey" now contains 63428 row versions in 176 pages DETAIL: 0 index row versions were removed. 0 index pages have been deleted, 0 are currently reusable. CPU 0.00s/0.00u sec elapsed 0.00 sec. INFO: "vm_disk_samples_history": found 0 removable, 47958 nonremovable row versions in 738 out of 976 pages ==> DETAIL: 45888 dead row versions cannot be removed yet. There were 0 unused item pointers. Skipped 0 pages due to buffer pins. 0 pages are entirely empty. CPU 0.00s/0.01u sec elapsed 0.01 sec. Here is the idle in transaction connection: select pid, query from pg_stat_activity where state='idle in transaction' and datname='ovirt_engine_history' pid | query -------+------------------------------------------------------------------------------------ 17061 | SELECT 'continueAgg', '1' + | FROM history_configuration + | WHERE var_name = 'lastHourAggr' + | AND var_datetime < '2019-02-08 06:00:00.000000+0000' + | 17062 | SELECT DISTINCT 'continueAgg', '1' + | FROM history_configuration + | WHERE var_name = 'lastDayAggr' + | AND var_datetime < '2019-02-07 00:00:00.000000+0000' + | AND + | CASE + | WHEN + | var_datetime = '2019-02-07 00:00:00.000000+0000' + | AND 'true' = + | ( + | SELECT var_value + | FROM history_configuration + | WHERE var_name = 'HourlyAggFailed'+ | ) + | THEN FALSE + | ELSE TRUE + | END + | (2 rows) Version-Release number of selected component (if applicable): ovirt-engine-dwh-4.2.4.3-1.el7ev.noarch How reproducible: Check for the idle in transaction connections Steps to Reproduce: select pid, query from pg_stat_activity where state='idle in transaction' and datname='ovirt_engine_history' Actual results: The connection prevents the garbage collection Expected results: The trnasaction is closed as soon as the aggregation is finished. Additional info: After killing the process 17062, the dead rows are removed. The transaction can cause significant growth of the DB in some busy environments.
Not ready for GA, so will be handled during 4.3.z.
Thanks Shirly that's a very welcomed fix!
Verified on: redhat-release-server-7.7-10.el7.x86_64 rhv-release-4.3.5-6-001.noarch ovirt-engine-dwhd.log (DEBUG) Actions: 1) No idle transactions ovirt_engine_history=# select pid, query from pg_stat_activity where state='idle in transaction' and datname='ovirt_engine_history'; pid | query -----+------- (0 rows) 2) Deletion *_samples_history - have data from 03:00AM (~38Hrs) (host / vm) 3) running 'vacuum' - no dead rows left -bash-4.2$ vacuumdb -d ovirt_engine_history -t vm_disk_samples_history -v vacuumdb: vacuuming database "ovirt_engine_history" INFO: vacuuming "public.vm_disk_samples_history" INFO: index "idx_vm_disk_history_datetime_samples" now contains 750672 row versions in 2219 pages DETAIL: 0 index row versions were removed. 337 index pages have been deleted, 337 are currently reusable. CPU: user: 0.00 s, system: 0.01 s, elapsed: 0.02 s. INFO: index "idx_vm_disk_configuration_version_samples" now contains 750672 row versions in 3481 pages DETAIL: 0 index row versions were removed. 198 index pages have been deleted, 198 are currently reusable. CPU: user: 0.01 s, system: 0.02 s, elapsed: 0.03 s. INFO: index "vm_disk_samples_history_vm_disk_id_idx" now contains 750672 row versions in 4554 pages DETAIL: 0 index row versions were removed. 281 index pages have been deleted, 281 are currently reusable. CPU: user: 0.02 s, system: 0.01 s, elapsed: 0.03 s. INFO: index "vm_disk_samples_history_pkey" now contains 750672 row versions in 2426 pages DETAIL: 0 index row versions were removed. 364 index pages have been deleted, 364 are currently reusable. CPU: user: 0.00 s, system: 0.01 s, elapsed: 0.02 s. INFO: "vm_disk_samples_history": found 0 removable, 176443 nonremovable row versions in 3077 out of 13669 pages ==> DETAIL: 0 dead row versions cannot be removed yet, oldest xmin: 320072 There were 22916 unused item pointers. Skipped 0 pages due to buffer pins, 1707 frozen pages. 0 pages are entirely empty. CPU: user: 0.09 s, system: 0.11 s, elapsed: 0.22 s. Open new bug - Closed connection issue https://bugzilla.redhat.com/show_bug.cgi?id=1730384
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:2444
sync2jira