Bug 1718165 - ETL service sampling has encountered an error. Please consult the service log for more details.
Summary: ETL service sampling has encountered an error. Please consult the service log...
Keywords:
Status: CLOSED EOL
Alias: None
Product: ovirt-engine-dwh
Classification: oVirt
Component: ETL
Version: 4.2.0
Hardware: Unspecified
OS: Unspecified
unspecified
unspecified
Target Milestone: ---
: ---
Assignee: Shirly Radco
QA Contact: Lukas Svaty
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2019-06-07 07:01 UTC by nicolas
Modified: 2019-06-13 06:28 UTC (History)
1 user (show)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-06-13 06:28:44 UTC
oVirt Team: Metrics
Embargoed:


Attachments (Terms of Use)
engine log (307.30 KB, application/gzip)
2019-06-07 07:01 UTC, nicolas
no flags Details
dwhd log (128.37 KB, application/gzip)
2019-06-07 07:02 UTC, nicolas
no flags Details

Description nicolas 2019-06-07 07:01:42 UTC
Created attachment 1578236 [details]
engine log

Description of problem:

We're running oVirt 4.1.9, not 4.2.0 as specified in this bug (cannot upgrade yet until [1] is released). Since a few days ago our event list if full of lines like this:

  ETL service sampling has encountered an error. Please consult the service log for more details.

Having a look at the log I see events like:

2019-06-06 13:37:11|NJ4C8T|TOlL8U|FdlWtU|OVIRT_ENGINE_DWH|StatisticsSync|Default|6|Java Exception|tJDBCOutput_7|org.postgresql.util.PSQLException:ERROR: current transaction is aborted, commands ignored until end of transaction block|1
2019-06-06 13:37:11|NJ4C8T|TOlL8U|FdlWtU|OVIRT_ENGINE_DWH|StatisticsSync|Default|6|Java Exception|tJDBCOutput_4|org.postgresql.util.PSQLException:ERROR: current transaction is aborted, commands ignored until end of transaction block|1
Exception in component tJDBCOutput_5
org.postgresql.util.PSQLException: ERROR: current transaction is aborted, commands ignored until end of transaction block
        at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2157)
        at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1886)
        at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:255)
        at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:555)
        at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:417)
        at org.postgresql.jdbc2.AbstractJdbc2Statement.executeUpdate(AbstractJdbc2Statement.java:363)
        at ovirt_engine_dwh.statisticssync_4_1.StatisticsSync.tJDBCInput_10Process(StatisticsSync.java:9030)
        at ovirt_engine_dwh.statisticssync_4_1.StatisticsSync$5.run(StatisticsSync.java:16071)
2019-06-06 13:37:11|NJ4C8T|TOlL8U|FdlWtU|OVIRT_ENGINE_DWH|StatisticsSync|Default|6|Java Exception|tJDBCOutput_5|org.postgresql.util.PSQLException:ERROR: current transaction is aborted, commands ignored until end of transaction block|1
Exception in component tRunJob_5
java.lang.RuntimeException: Child job running failed
        at ovirt_engine_dwh.samplerunjobs_4_1.SampleRunJobs.tRunJob_5Process(SampleRunJobs.java:1654)
        at ovirt_engine_dwh.samplerunjobs_4_1.SampleRunJobs.tRunJob_6Process(SampleRunJobs.java:1456)
        at ovirt_engine_dwh.samplerunjobs_4_1.SampleRunJobs.tRunJob_1Process(SampleRunJobs.java:1228)
        at ovirt_engine_dwh.samplerunjobs_4_1.SampleRunJobs.tRunJob_4Process(SampleRunJobs.java:1000)
        at ovirt_engine_dwh.samplerunjobs_4_1.SampleRunJobs.tJDBCConnection_2Process(SampleRunJobs.java:767)
        at ovirt_engine_dwh.samplerunjobs_4_1.SampleRunJobs.tJDBCConnection_1Process(SampleRunJobs.java:642)
        at ovirt_engine_dwh.samplerunjobs_4_1.SampleRunJobs$2.run(SampleRunJobs.java:2683)
2019-06-06 13:37:11|FdlWtU|TOlL8U|KNLNa4|OVIRT_ENGINE_DWH|SampleRunJobs|Default|6|Java Exception|tRunJob_5|java.lang.RuntimeException:Child job running failed|1
Exception in component tRunJob_1
java.lang.RuntimeException: Child job running failed
        at ovirt_engine_dwh.sampletimekeepingjob_4_1.SampleTimeKeepingJob.tRunJob_1Process(SampleTimeKeepingJob.java:6067)
        at ovirt_engine_dwh.sampletimekeepingjob_4_1.SampleTimeKeepingJob.tJDBCInput_2Process(SampleTimeKeepingJob.java:5809)
        at ovirt_engine_dwh.sampletimekeepingjob_4_1.SampleTimeKeepingJob.tJDBCConnection_1Process(SampleTimeKeepingJob.java:4444)
        at ovirt_engine_dwh.sampletimekeepingjob_4_1.SampleTimeKeepingJob.tJDBCConnection_2Process(SampleTimeKeepingJob.java:4319)
        at ovirt_engine_dwh.sampletimekeepingjob_4_1.SampleTimeKeepingJob.tRowGenerator_2Process(SampleTimeKeepingJob.java:4188)
        at ovirt_engine_dwh.sampletimekeepingjob_4_1.SampleTimeKeepingJob.tJDBCInput_3Process(SampleTimeKeepingJob.java:3593)
        at ovirt_engine_dwh.sampletimekeepingjob_4_1.SampleTimeKeepingJob.tJDBCInput_5Process(SampleTimeKeepingJob.java:2977)
        at ovirt_engine_dwh.sampletimekeepingjob_4_1.SampleTimeKeepingJob.tJDBCInput_4Process(SampleTimeKeepingJob.java:2295)
        at ovirt_engine_dwh.sampletimekeepingjob_4_1.SampleTimeKeepingJob.tJDBCConnection_3Process(SampleTimeKeepingJob.java:1649)
        at ovirt_engine_dwh.sampletimekeepingjob_4_1.SampleTimeKeepingJob$2.run(SampleTimeKeepingJob.java:11363)

Version-Release number of selected component (if applicable):

4.1.9

How reproducible:

All the time

Additional info:

engine and dwhd logs attached

Comment 1 nicolas 2019-06-07 07:02:07 UTC
Created attachment 1578237 [details]
dwhd log

Comment 2 Shirly Radco 2019-06-10 07:07:42 UTC
This maybe related to https://bugzilla.redhat.com/show_bug.cgi?id=1541924 that was fixed in version 4.2.2.


I believe you can update manually the history_id field in tables  - vm_interface_daily_history, vm_interface_hourly_history, vm_interface_samples_history
from integer to bigint and it should fix the issue, but I did not test on 4.1.

like in:
https://gerrit.ovirt.org/#/c/87139/4/packaging/dbscripts/upgrade/04_02_0020__updated_vm_interface_history_id_to_bigint.sql

Comment 3 nicolas 2019-06-10 08:07:24 UTC
(In reply to Shirly Radco from comment #2)
> This maybe related to https://bugzilla.redhat.com/show_bug.cgi?id=1541924
> that was fixed in version 4.2.2.
> 
> 
> I believe you can update manually the history_id field in tables  -
> vm_interface_daily_history, vm_interface_hourly_history,
> vm_interface_samples_history
> from integer to bigint and it should fix the issue, but I did not test on
> 4.1.
> 
> like in:
> https://gerrit.ovirt.org/#/c/87139/4/packaging/dbscripts/upgrade/
> 04_02_0020__updated_vm_interface_history_id_to_bigint.sql

I created a secondary database to test the change.

I tried doing it that way, however, the update fails because the history_id fields are referenced by the v4_1_statistics_vms_interfaces_resources_usage_daily view.

bla=# SELECT fn_db_change_column_type('vm_interface_daily_history', 'history_id', 'integer', 'bigint');
ERROR:  cannot alter type of a column used by a view or rule
DETALLE:  rule _RETURN on view v4_1_statistics_vms_interfaces_resources_usage_daily depends on column "history_id"
CONTEXTO:  SQL statement "ALTER TABLE vm_interface_daily_history ALTER COLUMN history_id TYPE bigint"
PL/pgSQL function fn_db_change_column_type(character varying,character varying,character varying,character varying) line 9 at EXECUTE statement
bla=# SELECT fn_db_change_column_type('vm_interface_hourly_history', 'history_id', 'integer', 'bigint');
ERROR:  cannot alter type of a column used by a view or rule
DETALLE:  rule _RETURN on view v4_1_statistics_vms_interfaces_resources_usage_hourly depends on column "history_id"
CONTEXTO:  SQL statement "ALTER TABLE vm_interface_hourly_history ALTER COLUMN history_id TYPE bigint"
PL/pgSQL function fn_db_change_column_type(character varying,character varying,character varying,character varying) line 9 at EXECUTE statement
bla=# SELECT fn_db_change_column_type('vm_interface_samples_history', 'history_id', 'integer', 'bigint');
ERROR:  cannot alter type of a column used by a view or rule
DETALLE:  rule _RETURN on view v4_1_statistics_vms_interfaces_resources_usage_samples depends on column "history_id"
CONTEXTO:  SQL statement "ALTER TABLE vm_interface_samples_history ALTER COLUMN history_id TYPE bigint"
PL/pgSQL function fn_db_change_column_type(character varying,character varying,character varying,character varying) line 9 at EXECUTE statement

Comment 4 Shirly Radco 2019-06-10 09:39:20 UTC
Please try adding a file to your 
packaging/dbscripts/upgrade/

like the one in the patch:
packaging/dbscripts/upgrade/04_02_0020__updated_vm_interface_history_id_to_bigint.sql

(make sure that the file number is higher then the last upgrade script)
After that run 
schema.sh -d ovirt_engine_history -u <user> -c apply
that should drop the views make the alterations and recreate the views.

Comment 5 nicolas 2019-06-10 10:47:32 UTC
/usr/share/ovirt-engine-dwh/dbscripts/upgrade# cat 04_01_0021_bigint_instead_of_int.sql
SELECT fn_db_change_column_type('vm_interface_daily_history', 'history_id', 'integer', 'bigint');
SELECT fn_db_change_column_type('vm_interface_hourly_history', 'history_id', 'integer', 'bigint');
SELECT fn_db_change_column_type('vm_interface_samples_history', 'history_id', 'integer', 'bigint');

/usr/share/ovirt-engine-dwh/dbscripts/upgrade# PGPASSWORD=... /usr/share/ovirt-engine-dwh/dbscripts/schema.sh -d ovirt_engine_history -u ovirt_engine_history -c apply
Creating schema ovirt_engine_history@localhost:5432/ovirt_engine_history
Saving custom users permissions on database objects...
upgrade script detected a change in Config, View or Stored Procedure...
Running upgrade sql script '/usr/share/ovirt-engine-dwh/dbscripts/upgrade/pre_upgrade/add_comment_col_to_schema_version.sql'...
Running upgrade sql script '/usr/share/ovirt-engine-dwh/dbscripts/upgrade/pre_upgrade/add_function_to_update_enum_table.sql'...
Running upgrade sql script '/usr/share/ovirt-engine-dwh/dbscripts/upgrade/pre_upgrade/changed_len_of_installed_by.sql'...
Running upgrade sql script '/usr/share/ovirt-engine-dwh/dbscripts/upgrade/pre_upgrade/set_default_lang_code.sql'...
Running upgrade sql script '/usr/share/ovirt-engine-dwh/dbscripts/upgrade/pre_upgrade/set_etl_minimal_version.sql'...
Running upgrade sql script '/usr/share/ovirt-engine-dwh/dbscripts/upgrade/04_01_0021_bigint_instead_of_int.sql'...
Creating views API 3.6...
Creating views API 4.0...
Creating views API 4.1...
Creating ovirt engine reports views...
Creating stored procedures...
Creating stored procedures from /usr/share/ovirt-engine-dwh/dbscripts/common_sp.sql...
Creating stored procedures from /usr/share/ovirt-engine-dwh/dbscripts/create_sp.sql...
Running upgrade sql script '/usr/share/ovirt-engine-dwh/dbscripts/upgrade/post_upgrade/0010_change_enum_lang_code.sql'...
Applying custom users permissions on database objects...

Having a look at the Dashboard, this seems to have fixed the problem.

Just one more question: Once I upgrade to 4.2, will I have any issue because version 4.2 expects to be those fields integers and not bigints? In other words, should I undo this change before upgrading to 4.2?

Thanks for the help.

Comment 6 Shirly Radco 2019-06-10 11:33:03 UTC
That is great! I'm very happy to hear that.
No. There shouldn't be any issue on upgrade.
Can we close this bug?

Comment 7 nicolas 2019-06-10 11:33:58 UTC
Sure. Thanks for the help.

Comment 8 Sandro Bonazzola 2019-06-13 06:28:44 UTC
Closing with resolution EOL since oVirt 4.1 reached end of life and no fix will be pushed to that version.


Note You need to log in before you can comment on or make changes to this bug.