Bug 1331168

Summary: engine-setup should check postgresql version compatibility for remote DBs
Product: [oVirt] ovirt-engine Reporter: Jiri Belka <jbelka>
Component: Setup.EngineAssignee: Yedidyah Bar David <didi>
Status: CLOSED CURRENTRELEASE QA Contact: Jiri Belka <jbelka>
Severity: high Docs Contact:
Priority: unspecified    
Version: 4.0.0CC: bugs, dfediuck, didi, emesika, fabrice.bacchella, michal.skrivanek, sbonazzo, ylavi
Target Milestone: ovirt-4.0.2Keywords: Reopened
Target Release: 4.0.2.4Flags: rule-engine: ovirt-4.0.z+
rule-engine: exception+
rule-engine: planning_ack+
dfediuck: devel_ack+
pstehlik: testing_ack+
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Cause: engine-setup needs identical postgresql client and server versions, but didn't check for this. Consequence: If a remote database was used, of version different than the client version on the engine machine, sometimes hard-to-understand, random errors might happen. Fix: engine-setup now explicitly verifies that client and server postgresql versions are identical, and behaves accordingly - for interactive sessions, prompts again the user, and for unattended ones just aborts, both with a more meaningful error message. Result: Incompatible postgresql client and server versions cause easier to understand and earlier errors.
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-08-12 14:31:33 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Integration RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1324882    

Description Jiri Belka 2016-04-27 20:30:01 UTC
Description of problem:

I'm testing migration from 3.6 EL6 with remote DBs (PG 8.x) to 4.0 EL7 while still using remote DBs (which I restored from backup).

Problem is 4.0 supports PG 9.x but I was still original PG 8.x. engine-setup from 4.0 should know it needs PG 9.x and should check PG version even for remote DBs.

~~~
...
[ INFO  ] Creating/refreshing Engine database schema
[ ERROR ] Failed to execute stage 'Misc configuration': Command '/usr/share/ovirt-engine/dbscripts/schema.sh' failed to execute
[ INFO  ] Yum Performing yum transaction rollback
[ INFO  ] Rolling back database schema
[ INFO  ] Clearing Engine database testengine
[ INFO  ] Restoring Engine database testengine
[ INFO  ] Restoring file '/var/lib/ovirt-engine/backups/engine-20160427200207.ZpK3CP.dump' to database 10.34.63.204:testengine.
[ ERROR ] Errors while restoring testengine database, please check the log file for details
[ INFO  ] Stage: Clean up
          Log file is located at /var/log/ovirt-engine/setup/ovirt-engine-setup-20160427200150-j6u8z0.log
[ INFO  ] Generating answer file '/var/lib/ovirt-engine/setup/answers/20160427200254-setup.conf'
[ INFO  ] Stage: Pre-termination
[ INFO  ] Stage: Termination
[ ERROR ] Execution of setup failed
~~~

The failure above is not clear about PG version incompatibility. Checking PG version remotely is easy:

~~~
]# psql -h 10.34.63.204 -U testengine -d testengine -W -c 'SELECT version();'
Password for user testengine: 
                                                      version                                                      
-------------------------------------------------------------------------------------------------------------------
 PostgreSQL 8.4.20 on x86_64-redhat-linux-gnu, compiled by GCC gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-16), 64-bit
(1 row)
~~~

Setup log reveals syntax issue related to DB.

~~~
...
psql:/usr/share/ovirt-engine/dbscripts/materialized_views_sp.sql:125: ERROR:  invalid type name "materialized_views % ROWTYPE"
CONTEXT:  compilation of PL/pgSQL function "creatematerializedview" near line 1
FATAL: Cannot execute sql command: --file=/usr/share/ovirt-engine/dbscripts/materialized_views_sp.sql

2016-04-27 20:02:16 DEBUG otopi.context context._executeMethod:142 method exception
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/otopi/context.py", line 132, in _executeMethod
    method['method']()
  File "/usr/share/ovirt-engine/setup/bin/../plugins/ovirt-engine-setup/ovirt-engine/db/schema.py", line 302, in _misc
    oenginecons.EngineDBEnv.PGPASS_FILE
  File "/usr/lib/python2.7/site-packages/otopi/plugin.py", line 931, in execute
    command=args[0],
RuntimeError: Command '/usr/share/ovirt-engine/dbscripts/schema.sh' failed to execute
2016-04-27 20:02:16 ERROR otopi.context context._executeMethod:151 Failed to execute stage 'Misc configuration': Command '/usr/share/ovirt-engine/dbscripts/schema.sh' failed to execute
~~~

Migration works OK if DB is PG 9.x and local.

Version-Release number of selected component (if applicable):
ovirt-engine-setup-base-4.0.0-0.0.master.20160427051413.git340728a.el7.centos.noarch

How reproducible:
100%

Steps to Reproduce:
1. 3.6 EL6 with remote DBs using PG 8.x
2. engine-backup all
3. delete remote DBs and recreate then (see engine-backup --help)
4. restore all using engine-backup
5. engine-setup

Actual results:
engine-setup on 4.0 fails if it is using remote DBs which run on PG 8.x due to syntax

Expected results:
engine-setup should detect remote DBs are on incompatibile PG version and state clearly what should be done

Additional info:

Comment 2 Red Hat Bugzilla Rules Engine 2016-05-05 13:20:49 UTC
This bug report has Keywords: Regression or TestBlocker.
Since no regressions or test blockers are allowed between releases, it is also being identified as a blocker for this release. Please resolve ASAP.

Comment 3 Yedidyah Bar David 2016-05-08 07:34:21 UTC
Eli - please reply here on bug 1333746 comment 1. Thanks.

Comment 4 Yedidyah Bar David 2016-05-08 07:35:10 UTC
BTW, not sure why this is marked "Regression". IIRC this bug existed "forever".

Comment 5 Eli Mesika 2016-05-08 08:59:55 UTC
see https://bugzilla.redhat.com/show_bug.cgi?id=1333746#c3

Comment 6 Yedidyah Bar David 2016-05-08 09:08:41 UTC
What exactly should we check re client and server versions?

Comment 7 Eli Mesika 2016-05-08 09:50:44 UTC
el6 => 8.4.x 
el7 => 9.2.x
fedora 23 => 9.4.x

Comment 8 Yedidyah Bar David 2016-05-08 10:10:50 UTC
(In reply to Jiri Belka from comment #0)
> Description of problem:
> 
> I'm testing migration from 3.6 EL6 with remote DBs (PG 8.x) to 4.0 EL7 while
> still using remote DBs (which I restored from backup).
> 
> Problem is 4.0 supports PG 9.x but I was still original PG 8.x. engine-setup
> from 4.0 should know it needs PG 9.x and should check PG version even for
> remote DBs.

That's a reasonable flow, probably best covered by documentation, depending on how we handle the rest.

RHEV 3.6 docs say to use a RHEL6 machine as a remote database.

> Expected results:
> engine-setup should detect remote DBs are on incompatibile PG version and
> state clearly what should be done

I understand that the main cause for opening this bug is bug 1324882, meaning allow using a remote EnterpriseDB 9.5.

Let's move the discussion there for now.

Comment 9 Yedidyah Bar David 2016-05-08 11:59:59 UTC
Closing as duplicate for now. If you think something else is needed on engineering side, before deciding how to continue with bug 1324882, please reopen.

For RHEV 4.0 on RHEL7, we probably want documentation to be updated to say that remote db should be on RHEL7 as well. Feel free to open a doc bug for this. There are probably many other doc changes needed for 4.0, didn't check currently open bugs.

For upstream, I do not think we should do any code change. We support the engine on many different OSes, users can of course install many versions of a remote pg on many different OSes, and I am not aware of actual compatibility complaints. I guess most people that do this either simply use same version.

*** This bug has been marked as a duplicate of bug 1324882 ***

Comment 10 Yaniv Lavi 2016-05-08 12:02:26 UTC
This has nothing to do with that RFE. This is to make sure you are working with the correct version of PGSQL. If we add support for another DB then we can consider other options to check that db.

Comment 11 Yedidyah Bar David 2016-05-08 15:13:09 UTC
(In reply to Yaniv Dary from comment #10)
> This has nothing to do with that RFE. This is to make sure you are working
> with the correct version of PGSQL. If we add support for another DB then we
> can consider other options to check that db.

I think it's actually related to that RFE.

Otherwise, we can just require same version. Is this enough?

Comment 12 Yaniv Lavi 2016-05-09 11:21:28 UTC
(In reply to Yedidyah Bar David from comment #11)
> (In reply to Yaniv Dary from comment #10)
> > This has nothing to do with that RFE. This is to make sure you are working
> > with the correct version of PGSQL. If we add support for another DB then we
> > can consider other options to check that db.
> 
> I think it's actually related to that RFE.
> 
> Otherwise, we can just require same version. Is this enough?

We should require 9.2 minimum, since that is what we want to support from now on.

Comment 13 Yedidyah Bar David 2016-05-10 09:23:15 UTC
(In reply to Yaniv Dary from comment #12)
> (In reply to Yedidyah Bar David from comment #11)
> > Otherwise, we can just require same version. Is this enough?
> 
> We should require 9.2 minimum, since that is what we want to support from
> now on.

Is it enough to require _same_ version?

If you want to allow > 9.2, backup must be optional. 9.2 pg_dump refuses to backup a newer version db. This was already discussed.

Other than for bug 1324882, I don't see why we can't require _same_ version.

So please choose:

1. Require _same_ version

Or

2. Make backup optional - reopen bug 1333746

Our documentation currently says (1.), code does not enforce it.

Comment 14 Yaniv Lavi 2016-05-10 10:20:48 UTC
(In reply to Yedidyah Bar David from comment #13)
> (In reply to Yaniv Dary from comment #12)
> > (In reply to Yedidyah Bar David from comment #11)
> > > Otherwise, we can just require same version. Is this enough?
> > 
> > We should require 9.2 minimum, since that is what we want to support from
> > now on.
> 
> Is it enough to require _same_ version?
> 
> If you want to allow > 9.2, backup must be optional. 9.2 pg_dump refuses to
> backup a newer version db. This was already discussed.
> 
> Other than for bug 1324882, I don't see why we can't require _same_ version.
> 
> So please choose:
> 
> 1. Require _same_ version

What will happen in fedora installations that use db on CentOS?

> 
> Or
> 
> 2. Make backup optional - reopen bug 1333746
> 
> Our documentation currently says (1.), code does not enforce it.

Comment 15 Yedidyah Bar David 2016-05-10 10:29:45 UTC
(In reply to Yaniv Dary from comment #14)
> (In reply to Yedidyah Bar David from comment #13)
> > (In reply to Yaniv Dary from comment #12)
> > > (In reply to Yedidyah Bar David from comment #11)
> > > > Otherwise, we can just require same version. Is this enough?
> > > 
> > > We should require 9.2 minimum, since that is what we want to support from
> > > now on.
> > 
> > Is it enough to require _same_ version?
> > 
> > If you want to allow > 9.2, backup must be optional. 9.2 pg_dump refuses to
> > backup a newer version db. This was already discussed.
> > 
> > Other than for bug 1324882, I don't see why we can't require _same_ version.
> > 
> > So please choose:
> > 
> > 1. Require _same_ version
> 
> What will happen in fedora installations that use db on CentOS?
> 

Do we want to support this?

Not sure. It might work.

Comment 16 Yaniv Lavi 2016-05-23 13:25:56 UTC
oVirt 4.0 beta has been released, moving to RC milestone.

Comment 17 Yaniv Lavi 2016-05-23 13:26:24 UTC
oVirt 4.0 beta has been released, moving to RC milestone.

Comment 18 Yedidyah Bar David 2016-06-13 08:46:22 UTC
Not sure why this is High Severity.

Setting needinfo on Yaniv again, still not sure what to do. Current bottom line is comment 13.

Answering again comment 14:

(In reply to Yaniv Dary from comment #14)
> 
> What will happen in fedora installations that use db on CentOS?

I didn't test this, no idea if it works _today_. I think it does not work.

If we decide to require same version, it will not work (at least with the default pg versions supplied in them).

Comment 19 Yaniv Lavi 2016-06-19 13:23:48 UTC
Require same version for now and we can change, if we are required to.

Comment 20 Michal Skrivanek 2016-06-29 13:38:44 UTC
(In reply to Yedidyah Bar David from comment #4)
> BTW, not sure why this is marked "Regression". IIRC this bug existed
> "forever".

based on this removing a regression kw and consequently the blocker flag

Comment 21 Yaniv Lavi 2016-06-29 13:44:41 UTC
Readding blocker since user can easily get to a unsupported state like this.
If you use CentOS 7 you should use the CentOS postgres db version, same for fedora.

Comment 22 Yedidyah Bar David 2016-06-29 14:47:12 UTC
Still not sure, but if you say so, who am I to object? Especially since it's a pretty simple change.

Comment 23 Red Hat Bugzilla Rules Engine 2016-07-03 10:26:25 UTC
Target release should be placed once a package build is known to fix a issue. Since this bug is not modified, the target version has been reset. Please use target milestone to plan a fix for a oVirt release.

Comment 24 Jiri Belka 2016-07-29 08:15:47 UTC
fail, ovirt-engine-setup-plugin-ovirt-engine-common-4.0.2.1-0.1.el7ev.noarch

- client

# psql -V
psql (PostgreSQL) 9.2.15

- server

# psql -h jb-rhevm36.example.com -U remoteengine remoteengine -c "select version();"
                                                      version                                                      
-------------------------------------------------------------------------------------------------------------------
 PostgreSQL 8.4.20 on x86_64-redhat-linux-gnu, compiled by GCC gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-17), 64-bit
(1 row)

While restoring on 4.0 client to a remote DB with PostgreSQL 8.4.20 there is _no_ PG versions check.

...
Preparing to restore:
- Setting credentials for Engine database 'remoteengine'
- Setting credentials for DWH database 'remotedwh'
- Setting credentials for Reports database 'remotereports'
- Unpacking file 'backup'
Restoring:
- Files
- Engine database 'remoteengine'
FATAL: Errors while restoring database remoteengine

Although the code has the diff:

# sed -n '888,912p' /usr/share/ovirt-engine/setup/ovirt_engine_setup/engine_common/database.py
            {
                'key': 'server_version',
                'expected': self._plugin.execute(
                    args=(
                        self.command.get('psql'),
                        '-V',
                    ),
                )[
                    1  # stdout
                ][
                    0  # first line. E.g. on Fedora 23: psql (PostgreSQL) 9.4.8
                ].split(
                    ' '
                )[
                    -1
                ],
                'ok': self._lower_equal,
                'check_on_use': True,
                'needed_on_create': False,
                'error_msg': _(
                    "Postgresql client version is '{expected}', whereas "
                    "the version on {pg_host} is '{current}'. "
                    "Please use a Postgresql server of version '{expected}'."
                ),
            },

Comment 25 Red Hat Bugzilla Rules Engine 2016-07-29 08:15:52 UTC
Target release should be placed once a package build is known to fix a issue. Since this bug is not modified, the target version has been reset. Please use target milestone to plan a fix for a oVirt release.

Comment 26 Red Hat Bugzilla Rules Engine 2016-07-29 08:19:02 UTC
Target release should be placed once a package build is known to fix a issue. Since this bug is not modified, the target version has been reset. Please use target milestone to plan a fix for a oVirt release.

Comment 27 Jiri Belka 2016-07-29 08:23:28 UTC
engine-setup != engine-backup, thus taking back

Comment 28 Jiri Belka 2016-07-29 08:24:40 UTC
ok, ovirt-engine-setup-plugin-ovirt-engine-common-4.0.2.1-0.1.el7ev.noarch

...
         Make sure that database can be accessed remotely.
         
          DWH database host [localhost]: jb-rhevm36.example.com
          DWH database port [5432]: 
          DWH database secured connection (Yes, No) [No]: 
          DWH database name [ovirt_engine_history]: remotedwh
          DWH database user [ovirt_engine_history]: remotedwh
          DWH database password: 
[ ERROR ] Cannot connect to DWH database: Postgresql client version is '9.2.15', whereas the version on jb-rhevm36.example.com is '8.4.20'. Please use a Postgresql server of version '9.2.15'.
          DWH database host [localhost]:

Comment 29 Fabrice Bacchella 2017-04-23 08:03:26 UTC
Because of incompatibility in major version, now ovirt requires minor version to be the same ?