Bug 1673808 - ovirt-dwh prevents autovacuum from data garbage collection.
Summary: ovirt-dwh prevents autovacuum from data garbage collection.
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine-dwh
Version: 4.2.8-2
Hardware: Unspecified
OS: Unspecified
high
high
Target Milestone: ovirt-4.3.5
: 4.3.5
Assignee: Shirly Radco
QA Contact: David Vaanunu
URL:
Whiteboard:
Depends On:
Blocks: 1723168 1723791
TreeView+ depends on / blocked
 
Reported: 2019-02-08 08:11 UTC by Roman Hodain
Modified: 2022-03-13 17:24 UTC (History)
14 users (show)

Fixed In Version: ovirt-engine-dwh-4.3.5-1.el7ev
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-08-12 11:55:10 UTC
oVirt Team: Metrics
Target Upstream Version:
Embargoed:
lsvaty: testing_plan_complete-


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker RHV-45257 0 None None None 2022-03-13 17:24:19 UTC
Red Hat Knowledge Base (Solution) 3902941 0 None None None 2019-02-12 07:40:10 UTC
Red Hat Product Errata RHBA-2019:2444 0 None None None 2019-08-12 11:55:15 UTC
oVirt gerrit 100998 0 'None' MERGED Close DWH db connections open transactions 2021-01-09 11:32:17 UTC
oVirt gerrit 101053 0 'None' MERGED Close DWH db connections open transactions 2021-01-09 11:32:20 UTC

Description Roman Hodain 2019-02-08 08:11:26 UTC
Description of problem:
The vacuum process is not able to remove the dead rows as they are held by an unfinished transaction from ovirt-dwh.

    vacuumdb -d ovirt_engine_history -t vm_disk_samples_history -v
    vacuumdb: vacuuming database "ovirt_engine_history"
    INFO:  vacuuming "public.vm_disk_samples_history"
    INFO:  index "idx_vm_disk_history_datetime_samples" now contains 63428 row versions in 176 pages
    DETAIL:  0 index row versions were removed.
    0 index pages have been deleted, 0 are currently reusable.
    CPU 0.00s/0.00u sec elapsed 0.00 sec.
    INFO:  index "idx_vm_disk_configuration_version_samples" now contains 63428 row versions in 181 pages
    DETAIL:  0 index row versions were removed.
    0 index pages have been deleted, 0 are currently reusable.
    CPU 0.00s/0.00u sec elapsed 0.00 sec.
    INFO:  index "vm_disk_samples_history_vm_disk_id_idx" now contains 63428 row versions in 253 pages
    DETAIL:  0 index row versions were removed.
    0 index pages have been deleted, 0 are currently reusable.
    CPU 0.00s/0.00u sec elapsed 0.00 sec.
    INFO:  index "vm_disk_samples_history_pkey" now contains 63428 row versions in 176 pages
    DETAIL:  0 index row versions were removed.
    0 index pages have been deleted, 0 are currently reusable.
    CPU 0.00s/0.00u sec elapsed 0.00 sec.
    INFO:  "vm_disk_samples_history": found 0 removable, 47958 nonremovable row versions in 738 out of 976 pages
==> DETAIL:  45888 dead row versions cannot be removed yet.
    There were 0 unused item pointers.
    Skipped 0 pages due to buffer pins.
    0 pages are entirely empty.
    CPU 0.00s/0.01u sec elapsed 0.01 sec.

Here is the idle in transaction connection:
    select pid, query from pg_stat_activity where state='idle in transaction' and datname='ovirt_engine_history'

      pid  |                                       query                                        
    -------+------------------------------------------------------------------------------------
     17061 | SELECT 'continueAgg', '1'                                                         +
           | FROM history_configuration                                                        +
           | WHERE var_name = 'lastHourAggr'                                                   +
           | AND var_datetime < '2019-02-08 06:00:00.000000+0000'                              +
           | 
     17062 | SELECT DISTINCT 'continueAgg', '1'                                                +
           | FROM history_configuration                                                        +
           | WHERE var_name = 'lastDayAggr'                                                    +
           |         AND var_datetime < '2019-02-07 00:00:00.000000+0000'                      +
           |         AND                                                                       +
           |                 CASE                                                              +
           |                         WHEN                                                      +
           |                                 var_datetime = '2019-02-07 00:00:00.000000+0000'  +
           |                                 AND 'true' =                                      +
           |                                         (                                         +
           |                                                 SELECT var_value                  +
           |                                                 FROM history_configuration        +
           |                                                 WHERE var_name = 'HourlyAggFailed'+
           |                                         )                                         +
           |                                 THEN FALSE                                        +
           |                         ELSE TRUE                                                 +
           |                 END                                                               +
           | 
    (2 rows)


Version-Release number of selected component (if applicable):
    ovirt-engine-dwh-4.2.4.3-1.el7ev.noarch

How reproducible:
    Check for the idle in transaction connections

Steps to Reproduce:
    select pid, query from pg_stat_activity where state='idle in transaction' and datname='ovirt_engine_history'


Actual results:
    The connection prevents the garbage collection

Expected results:
    The trnasaction is closed as soon as the aggregation is finished.

Additional info:
    After killing the process 17062, the dead rows are removed. The transaction can cause significant growth of the DB in some busy environments.

Comment 11 Doron Fediuck 2019-03-28 08:18:12 UTC
Not ready for GA, so will be handled during 4.3.z.

Comment 15 Roy Golan 2019-06-25 09:06:24 UTC
Thanks Shirly that's a very welcomed fix!

Comment 20 David Vaanunu 2019-07-16 14:58:19 UTC
Verified on:

redhat-release-server-7.7-10.el7.x86_64
rhv-release-4.3.5-6-001.noarch

ovirt-engine-dwhd.log (DEBUG)


Actions:

1) No idle transactions

ovirt_engine_history=# select pid, query from pg_stat_activity where state='idle in transaction' and datname='ovirt_engine_history';
 pid | query 
-----+-------
(0 rows)


2) Deletion

 *_samples_history - have data from 03:00AM (~38Hrs)
 (host / vm)

3) running 'vacuum' - no dead rows left


-bash-4.2$ vacuumdb -d ovirt_engine_history -t vm_disk_samples_history -v
vacuumdb: vacuuming database "ovirt_engine_history"
INFO:  vacuuming "public.vm_disk_samples_history"
INFO:  index "idx_vm_disk_history_datetime_samples" now contains 750672 row versions in 2219 pages
DETAIL:  0 index row versions were removed.
337 index pages have been deleted, 337 are currently reusable.
CPU: user: 0.00 s, system: 0.01 s, elapsed: 0.02 s.
INFO:  index "idx_vm_disk_configuration_version_samples" now contains 750672 row versions in 3481 pages
DETAIL:  0 index row versions were removed.
198 index pages have been deleted, 198 are currently reusable.
CPU: user: 0.01 s, system: 0.02 s, elapsed: 0.03 s.
INFO:  index "vm_disk_samples_history_vm_disk_id_idx" now contains 750672 row versions in 4554 pages
DETAIL:  0 index row versions were removed.
281 index pages have been deleted, 281 are currently reusable.
CPU: user: 0.02 s, system: 0.01 s, elapsed: 0.03 s.
INFO:  index "vm_disk_samples_history_pkey" now contains 750672 row versions in 2426 pages
DETAIL:  0 index row versions were removed.
364 index pages have been deleted, 364 are currently reusable.
CPU: user: 0.00 s, system: 0.01 s, elapsed: 0.02 s.
INFO:  "vm_disk_samples_history": found 0 removable, 176443 nonremovable row versions in 3077 out of 13669 pages
==> DETAIL:  0 dead row versions cannot be removed yet, oldest xmin: 320072
There were 22916 unused item pointers.
Skipped 0 pages due to buffer pins, 1707 frozen pages.
0 pages are entirely empty.
CPU: user: 0.09 s, system: 0.11 s, elapsed: 0.22 s.



Open new bug - Closed connection issue

https://bugzilla.redhat.com/show_bug.cgi?id=1730384

Comment 22 errata-xmlrpc 2019-08-12 11:55:10 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:2444

Comment 23 Daniel Gur 2019-08-28 13:12:48 UTC
sync2jira

Comment 24 Daniel Gur 2019-08-28 13:17:01 UTC
sync2jira


Note You need to log in before you can comment on or make changes to this bug.