Bug 1498351

Summary: [Docs][RFE][Upgrade] Provide upgrade procedure(s) for standard Manager upgrade from 4.1 to 4.2
Product: Red Hat Enterprise Virtualization Manager Reporter: Lucy Bopf <lbopf>
Component: Documentation Assignee: Emma Heftman <eheftman>
Status: CLOSED CURRENTRELEASE QA Contact: Tahlia Richardson <trichard>
Severity: high Docs Contact:
Priority: high    
Version: 4.2.0 CC: apinnick, asabadra, didi, emesika, gveitmic, jbelka, lbopf, lsurette, mkalinin, mperina, ratamir, rbalakri, rhodain, sradco, srevivo, trichard, ykaul
Target Milestone: ovirt-4.2.3 Keywords: FutureFeature, Triaged
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard: docs-accepted
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-05-23 12:26:04 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Docs RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1516295, 1560462    
Bug Blocks: 1522620    

Description Lucy Bopf 2017-10-04 06:12:00 UTC
The procedure for upgrading a standard (not self-hosted) Red Hat Virtualization environment, including host updates, must be documented.

Assignee to work with QE to determine workflows, prerequisites, and known issues.

Comment 2 Lucy Bopf 2017-12-12 08:12:29 UTC
Correcting target for GA.

Comment 3 Yedidyah Bar David 2018-01-10 11:04:10 UTC
The current behavior if upgrading with a remote 9.2 database is to simply fail with this error message:

                'Please upgrade the PostgreSQL instance that serves the {db}'
                'database to {v} and retry.\n'
                'If the remote DBMS is on an EL7 system, install '
                'PostgreSQL and the scl utility, and use\n'
                '    postgresql-setup upgrade\n'
                'to upgrade it on the EL7 system.\n'
                'Otherwise please consult the documentation shipped with your '
                'PostgreSQL distribution.'

So the current assumed flow is:

1. Stop the engine and dwh (and whatever else using the database)

2. Upgrade the remote database using a procedure we'll provide, the skeleton of which is in the above error message (which we might want to amend a bit too).

3. Start the engine, dwh, etc., and see that all is ok. We need to decide if this step is allowed at all (can RHVM 4.1 work well with 9.5 PG?), optional, or even mandatory (or at least recommended).

4. Upgrade RHV-M.

Step (3.) could theoretically last a long time, up to several months or so, if it turns out that there are blockers for the upgrade. During this time, at least some things might not work well; at the very least, engine-backup will refuse to back up. So we might also want/need to provide a procedure to revert (2.).
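
Before starting this flow, it can help to confirm which PostgreSQL version the remote instance actually runs. The following is a minimal sketch, assuming a psql client is available on the machine where it is run. The function names and the host, user, and database arguments are hypothetical placeholders; none of this is part of engine-setup.

```shell
#!/bin/sh
# Hypothetical helper: print the server version of a remote PostgreSQL instance.
# -A (unaligned), -t (tuples only) and -c (run one command) are standard psql options.
remote_pg_version() {
    # $1 = host, $2 = user, $3 = database (all placeholders)
    psql -h "$1" -U "$2" -d "$3" -Atc 'SHOW server_version;'
}

# Return 0 when version $1 (for example 9.2.23) is older than 9.5.
needs_pg95_upgrade() {
    major=${1%%.*}      # text before the first dot, e.g. 9
    rest=${1#*.}        # text after the first dot, e.g. 2.23
    minor=${rest%%.*}   # second component, e.g. 2
    [ "$major" -lt 9 ] || { [ "$major" -eq 9 ] && [ "$minor" -lt 5 ]; }
}
```

For example, needs_pg95_upgrade "$(remote_pg_version HOST USER DB)" succeeds for a 9.2 server and fails for a 9.5 server, so it can gate step 2.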

Comment 4 Sandro Bonazzola 2018-01-10 16:00:33 UTC
Martin, did we test RHV 4.1 with SCL PostgreSQL 9.5?
AFAIK oVirt 4.1 works on Fedora 24, which has 9.5.7, so I assume this works well too.

Didi, the flow in comment #3 looks good to me. I would ask users to upgrade RHV-M right after PostgreSQL is updated. I think step 3 should be allowed but not required, and I strongly suggest upgrading RHV-M and DWH as soon as possible after the PostgreSQL update.

Comment 5 Martin Perina 2018-01-10 16:19:35 UTC
It's required that engine database (it doesn't matter if local or remote) is switched from 9.2 to 9.5 as a part of engine upgrade from 4.1 to 4.2. And we don't support anything in between (for example upgrading db to 9.5 and then testing that with 4.1 engine, although it could work).

Also, we have tested the upgrade for both local and remote dbs on CentOS 7, but, as mentioned above, in a single step:

1. Stop 4.1 engine
2. Execute engine-setup upgrade to 4.2
  a. If db is managed by engine-setup, everything is automatic
  b. If db is unmanaged (local or remote) admin needs to perform upgrade and execute setup again afterwards

AFAIK we haven't tested the upgrade on Fedora, as Fedora 24 users (if any) were already on PG 9.5.

Also please be aware that we will need to add additional manual step around uuid-ossp extension for non-managed engine databases (covered by BZ1515635) for upgrades from 4.1 to 4.2.1 or 4.2.0 to 4.2.1.

I think we have already updated the installation guide with the Software Collection requirements for a remote database [1], so we should also add to the upgrade guide the need for the repo change, the exact steps to upgrade the database, and, after a successful upgrade, how to uninstall PostgreSQL 9.2 from that machine.


[1] https://access.redhat.com/documentation/en-us/red_hat_virtualization/4.2-beta/html/installation_guide/appe-preparing_a_remote_postgresql_database_for_use_with_the_red_hat_enterprise_virtualization_manager
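
For the uuid-ossp step on non-managed databases, a hedged sketch follows. The authoritative step is the one tracked in BZ1515635, which is not reproduced in this bug; the SQL here is what the tested flow in comment 12 ran. The function names are hypothetical, and the database names passed in are site-specific.

```shell
#!/bin/sh
# Print the SQL that the tested flow ran against each database.
uuid_ossp_sql() {
    cat <<'EOF'
DROP FUNCTION IF EXISTS uuid_generate_v1();
CREATE EXTENSION "uuid-ossp";
EOF
}

# Feed the SQL to the 9.5 psql for every database named in "$@".
# Assumption: run as root on the database machine after the 9.5 service is up.
apply_uuid_ossp() {
    for db in "$@"; do
        uuid_ossp_sql | su - postgres -c ". /opt/rh/rh-postgresql95/enable && psql -d $db"
    done
}
```

Calling apply_uuid_ossp with the engine and DWH database names would run both statements once per database.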

Comment 6 Yedidyah Bar David 2018-01-11 06:43:43 UTC
(In reply to Martin Perina from comment #5)
> It's required that engine database (it doesn't matter if local or remote) is
> switched from 9.2 to 9.5 as a part of engine upgrade from 4.1 to 4.2.

"Required" as in "I want the code to enforce that", and/or "The docs should say it's required", and/or "I want relevant logs to be able to tell if it was like that". Which?

> And we
> don't support anything in between (for example upgrading db to 9.5 and then
> testing that with 4.1 engine, although it could work).

"don't support" is not very strong, imo.

engine-setup requires that pg client and server are same version. AFAIU the engine itself does not. So a user can upgrade a remote pg without the engine (or us) knowing. If you want to enforce that, or at least log/warn, perhaps open a bug against 4.1 engine.

Comment 10 Martin Perina 2018-01-17 09:04:16 UTC
(In reply to Yedidyah Bar David from comment #6)
> (In reply to Martin Perina from comment #5)
> > It's required that engine database (it doesn't matter if local or remote) is
> > switched from 9.2 to 9.5 as a part of engine upgrade from 4.1 to 4.2.
> 
> "Required" as in "I want the code to enforce that", and/or "The docs should
> say it's required", and/or "I want relevant logs to be able to tell if it
> was like that". Which?

"Required" meaning that this is a step in the upgrade. We haven't tested RHV 4.1 with SCL PG 9.5, and RHV 4.2 cannot run on PG 9.2. So it's required to upgrade the database to SCL 9.5 as a part of the upgrade to RHV 4.2.

> 
> > And we
> > don't support anything in between (for example upgrading db to 9.5 and then
> > testing that with 4.1 engine, although it could work).
> 
> "don't support" is not very strong, imo.
> 
> engine-setup requires that pg client and server are same version. AFAIU the
> engine itself does not. So a user can upgrade a remote pg without the engine
> (or us) knowing. If you want to enforce that, or at least log/warn, perhaps
> open a bug against 4.1 engine.

Isn't the pg client-server version check performed only for y-releases (9.2.5 on the client and 9.2.3 on the server should be fine, but 9.5 on the client and 9.2 on the server is not)?

Comment 11 Yedidyah Bar David 2018-01-17 09:18:46 UTC
(In reply to Martin Perina from comment #10)
> Isn't the pg client - server performed only around y-releases (9.2.5 on
> client and 9.2.3 on server should be fine, but 9.5 on client on 9.2 on
> server is not)?

Not sure what you are asking. engine-setup currently requires exact same version. It was discussed in the past to loosen that, never saw actual bug/rfe.

PG itself allows more-or-less any version match for psql. pg_dump requires stricter matching. I haven't checked the details recently. But it makes sense: you might not be able to correctly back up (and restore) a database if you do not know about its newer features. So we did the same for engine-setup.
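
The difference between an exact match and a match on the major release only can be illustrated with a short sketch. These helpers are hypothetical and are not the check that engine-setup implements. They assume pre-10 PostgreSQL version numbers, where the first two components (for example 9.5) identify the major release.

```shell
#!/bin/sh
# Hypothetical helpers; not engine-setup code.

# Print the first two components of a version string: 9.5.9 -> 9.5
pg_major() {
    echo "$1" | cut -d. -f1,2
}

# Succeed when client ($1) and server ($2) are on the same major release.
same_major() {
    [ "$(pg_major "$1")" = "$(pg_major "$2")" ]
}

# Succeed only when both version strings are identical.
same_exact() {
    [ "$1" = "$2" ]
}
```

With these definitions, 9.2.5 against 9.2.3 passes same_major but fails same_exact, while 9.5.9 against 9.2.23 fails both.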

Comment 12 Jiri Belka 2018-01-25 16:20:01 UTC
The following upgrade flow for 4.1 using a remote db worked for me. It does not use an "in-place" upgrade; it uses 'postgresql-setup upgrade':

-- engine node

# systemctl stop ovirt-engine

-- on db node

# yum install rh-postgresql95 rh-postgresql95-postgresql-contrib
# systemctl is-active postgresql
inactive

# scl enable rh-postgresql95 -- postgresql-setup upgrade
WARNING: using obsoleted argument syntax, try --help
WARNING: arguments transformed to: postgresql-setup --upgrade --unit rh-postgresql95-postgresql
 * upgrading from 'postgresql.service' to 'rh-postgresql95-postgresql.service'
 * Upgrading database.
 * Upgraded OK.
WARNING: The configuration files were replaced by default configuration.
WARNING: The previous configuration and data are stored in folder
WARNING: /var/lib/pgsql/data.
 * See /var/lib/pgsql/upgrade_rh-postgresql95-postgresql.log for details.

# systemctl start rh-postgresql95-postgresql.service
# systemctl status rh-postgresql95-postgresql.service
● rh-postgresql95-postgresql.service - PostgreSQL database server
   Loaded: loaded (/usr/lib/systemd/system/rh-postgresql95-postgresql.service; disabled; vendor preset: disabled)
   Active: active (running) since Thu 2018-01-25 15:34:50 CET; 3s ago
  Process: 21617 ExecStart=/opt/rh/rh-postgresql95/root/usr/libexec/postgresql-ctl start -D ${PGDATA} -s -w -t ${PGSTARTTIMEOUT} (code=exited, status=0/SUCCESS)
  Process: 21614 ExecStartPre=/opt/rh/rh-postgresql95/root/usr/libexec/postgresql-check-db-dir %N (code=exited, status=0/SUCCESS)
 Main PID: 21623 (postgres)
   CGroup: /system.slice/rh-postgresql95-postgresql.service
           ├─21623 /opt/rh/rh-postgresql95/root/usr/bin/postgres -D /var/opt/rh/rh-postgresql95/lib/pgsql/data
           ├─21624 postgres: logger process   
           ├─21626 postgres: checkpointer process   
           ├─21627 postgres: writer process   
           ├─21628 postgres: wal writer process   
           ├─21629 postgres: autovacuum launcher process   
           └─21630 postgres: stats collector process   

Jan 25 15:34:49 10-37-137-41.rhev.lab.eng.brq.redhat.com systemd[1]: Starting PostgreSQL database server...
Jan 25 15:34:49 10-37-137-41.rhev.lab.eng.brq.redhat.com postgresql-ctl[21617]: LOG:  redirecting log output to logging collect...ess
Jan 25 15:34:49 10-37-137-41.rhev.lab.eng.brq.redhat.com postgresql-ctl[21617]: HINT:  Future log output will appear in directo...g".
Jan 25 15:34:50 10-37-137-41.rhev.lab.eng.brq.redhat.com systemd[1]: Started PostgreSQL database server.
Hint: Some lines were ellipsized, use -l to show in full.

# su - postgres
$ . /opt/rh/rh-postgresql95/enable
$ psql
psql (9.5.9)
Type "help" for help.

postgres=# show data_directory ;
               data_directory               
--------------------------------------------
 /var/opt/rh/rh-postgresql95/lib/pgsql/data
(1 row)


- for both remote dbs: engine/remotedb, ovirt_engine_history/remoteodb

# su - postgres -c "scl enable rh-postgresql95 -- psql -d remoteodb"
psql (9.5.9)
Type "help" for help.

remoteodb=# DROP FUNCTION IF EXISTS uuid_generate_v1();
DROP FUNCTION
remoteodb=# CREATE EXTENSION "uuid-ossp";
CREATE EXTENSION

- edit postgresql 9.5 configuration

# vi /var/opt/rh/rh-postgresql95/lib/pgsql/data/pg_hba.conf 

host    remotedb             remotedb             0.0.0.0/0            md5
host    remotedb             remotedb             ::/32            md5
host    remotedb             remotedb             ::/128            md5
host    remoteodb             remoteodb             0.0.0.0/0            md5
host    remoteodb             remoteodb             ::/32            md5
host    remoteodb             remoteodb             ::/128            md5

# vi /var/opt/rh/rh-postgresql95/lib/pgsql/data/postgresql.conf 

autovacuum_vacuum_scale_factor='0.01'
autovacuum_analyze_scale_factor='0.075'
autovacuum_max_workers='6'
maintenance_work_mem='65536'
max_connections='150'
work_mem = 8192


# systemctl enable rh-postgresql95-postgresql.service
Created symlink from /etc/systemd/system/multi-user.target.wants/rh-postgresql95-postgresql.service to /usr/lib/systemd/system/rh-postgresql95-postgresql.service.
# systemctl status postgresql
● postgresql.service - PostgreSQL database server
   Loaded: loaded (/usr/lib/systemd/system/postgresql.service; enabled; vendor preset: disabled)
   Active: inactive (dead) since Thu 2018-01-25 15:31:16 CET; 15min ago
 Main PID: 3288 (code=exited, status=0/SUCCESS)

Jan 25 14:15:27 10-37-137-41.rhev.lab.eng.brq.redhat.com systemd[1]: Starting PostgreSQL database server...
Jan 25 14:15:28 10-37-137-41.rhev.lab.eng.brq.redhat.com systemd[1]: Started PostgreSQL database server.
Jan 25 15:31:15 10-37-137-41.rhev.lab.eng.brq.redhat.com systemd[1]: Stopping PostgreSQL database server...
Jan 25 15:31:16 10-37-137-41.rhev.lab.eng.brq.redhat.com systemd[1]: Stopped PostgreSQL database server.
# systemctl disable postgresql
Removed symlink /etc/systemd/system/multi-user.target.wants/postgresql.service.

-- engine node

# engine-setup

...and success.
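
The database-node part of this flow can be condensed into a script. This is a sketch only: the run wrapper and the DRY_RUN variable are assumptions added for illustration, and the explicit stop of the old service is implied by the flow (which shows the service already inactive) rather than shown in it. The uuid-ossp statements and the configuration edits still have to be done separately.

```shell
#!/bin/sh
# Sketch of the db-node steps from the flow above. DRY_RUN and run() are
# illustrative additions; with DRY_RUN=1 (the default) commands are only printed.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

upgrade_db_node() {
    run yum install rh-postgresql95 rh-postgresql95-postgresql-contrib
    # Assumption: the flow only checks that the old service is inactive;
    # stopping it explicitly makes that precondition hold.
    run systemctl stop postgresql
    run scl enable rh-postgresql95 -- postgresql-setup upgrade
    run systemctl start rh-postgresql95-postgresql.service
    run systemctl enable rh-postgresql95-postgresql.service
    run systemctl disable postgresql
}
```

Calling upgrade_db_node with the default DRY_RUN=1 only prints the commands; setting DRY_RUN=0 would execute them, which requires root on the db node.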

Comment 15 Lucy Bopf 2018-04-13 01:53:59 UTC
Accepting into the GA program and assigning to Emma for review.

I'm bumping up the estimate by a couple of hours to account for any investigation required for the extra points that have been added to this bug since the initial estimate was made.

Comment 16 Lucy Bopf 2018-04-13 04:33:46 UTC
Adding dependency on repository update BZs for GA. The same repo information can be used in the upgrade procedures.

Comment 17 Emma Heftman 2018-04-15 09:36:37 UTC
Hi Didi
1. Do you have an end-to-end process ready for me to document including local and remote DBs?

2. Have you resolved all the open issues discussed above such as this step:

"Start the engine, dwh, etc., and see that all is ok. We need to decide if this step is allowed at all (can RHVM 4.1 work well with 9.5 PG?), optional, or even mandatory (or at least recommended)."

3. Has the entire procedure been successfully tested end-to-end by Jiri?

Thanks!

Comment 18 Emma Heftman 2018-04-15 10:17:40 UTC
Hey Didi
Another question:

4. Does the 4.1 manager have to be upgraded to the latest minor version before running the upgrade? (this was a condition in the 4.0 > 4.1 upgrade)
Thanks

Comment 19 Emma Heftman 2018-04-15 10:19:48 UTC
Hi Lucy
Do you have a separate RFE for reviewing the section called "Upgrading to RHVH While Preserving Local Storage"?

https://access.redhat.com/documentation/en-us/red_hat_virtualization/4.1/html-single/upgrade_guide/#Upgrading_RHVH_Local_Storage

Or should I check this out as part of this RFE?

I know that in the past I reviewed it as a separate task.
Thanks!

Comment 20 Lucy Bopf 2018-04-16 05:07:10 UTC
(In reply to Emma Heftman from comment #19)
> Hi Lucy
> Do you have a separate RFE for reviewing the section called "Upgrading to
> RHVH While Preserving Local Storage"?
> 
> https://access.redhat.com/documentation/en-us/red_hat_virtualization/4.1/
> html-single/upgrade_guide/#Upgrading_RHVH_Local_Storage
> 
> Or should I check this out as part of this RFE?
> 
> I know that in the past I reviewed it as a separate task.
> Thanks!

Hey Emma,

Thanks for checking in. Because we generally write the upgrade procedures from scratch with each release, I removed all of the 4.1-specific content from the 'Upgrading to Red Hat Virtualization 4.x' chapter, including that section. The first step is probably to confirm whether that procedure is still needed for 4.2, and what the scope of changes would be. If it adds too much additional time to the estimate for this BZ, and it's not critical for a minimum viable upgrade procedure, we can split it out into a new RFE to schedule separately.

Comment 23 Emma Heftman 2018-05-03 10:13:21 UTC
Hi Jiri

Please confirm that this is the procedure that you recommend. 

In particular, for some steps from 7 onwards I need to explain to the customer exactly what we are doing, so please fill in the blanks.


Also, my understanding is that this has only been tested on internal channels. Could you please test this again on beta channels (I'll document the non-beta channel as I'm documenting for GA)?

Procedure for remote dbs:

1. -- engine node

# systemctl stop ovirt-engine

2.  Subscribe either to the RHV Manager channel:

# subscription-manager repos --enable=rhel-7-server-rhv-4-beta-rpms

or to the SCL channel:

# subscription-manager repos --enable rhel-server-rhscl-7-rpms


3. On the db node, install the PostgreSQL 9.5 packages:

# yum install rh-postgresql95 rh-postgresql95-postgresql-contrib

4. Check that the postgresql service is inactive:
# systemctl is-active postgresql
inactive

5. Upgrade 'postgresql.service' to 'rh-postgresql95-postgresql.service':

# scl enable rh-postgresql95 -- postgresql-setup upgrade
WARNING: using obsoleted argument syntax, try --help
WARNING: arguments transformed to: postgresql-setup --upgrade --unit rh-postgresql95-postgresql
 * upgrading from 'postgresql.service' to 'rh-postgresql95-postgresql.service'
 * Upgrading database.
 * Upgraded OK.
WARNING: The configuration files were replaced by default configuration.
WARNING: The previous configuration and data are stored in folder
WARNING: /var/lib/pgsql/data.
 * See /var/lib/pgsql/upgrade_rh-postgresql95-postgresql.log for details.

6. Start the rh-postgresql95-postgresql.service and check its status:

# systemctl start rh-postgresql95-postgresql.service
# systemctl status rh-postgresql95-postgresql.service
● rh-postgresql95-postgresql.service - PostgreSQL database server
   Loaded: loaded (/usr/lib/systemd/system/rh-postgresql95-postgresql.service; disabled; vendor preset: disabled)
   Active: active (running) since Thu 2018-01-25 15:34:50 CET; 3s ago
  Process: 21617 ExecStart=/opt/rh/rh-postgresql95/root/usr/libexec/postgresql-ctl start -D ${PGDATA} -s -w -t ${PGSTARTTIMEOUT} (code=exited, status=0/SUCCESS)
  Process: 21614 ExecStartPre=/opt/rh/rh-postgresql95/root/usr/libexec/postgresql-check-db-dir %N (code=exited, status=0/SUCCESS)
 Main PID: 21623 (postgres)
   CGroup: /system.slice/rh-postgresql95-postgresql.service
           ├─21623 /opt/rh/rh-postgresql95/root/usr/bin/postgres -D /var/opt/rh/rh-postgresql95/lib/pgsql/data
           ├─21624 postgres: logger process   
           ├─21626 postgres: checkpointer process   
           ├─21627 postgres: writer process   
           ├─21628 postgres: wal writer process   
           ├─21629 postgres: autovacuum launcher process   
           └─21630 postgres: stats collector process   

Jan 25 15:34:49 10-37-137-41.rhev.lab.eng.brq.redhat.com systemd[1]: Starting PostgreSQL database server...
Jan 25 15:34:49 10-37-137-41.rhev.lab.eng.brq.redhat.com postgresql-ctl[21617]: LOG:  redirecting log output to logging collect...ess
Jan 25 15:34:49 10-37-137-41.rhev.lab.eng.brq.redhat.com postgresql-ctl[21617]: HINT:  Future log output will appear in directo...g".
Jan 25 15:34:50 10-37-137-41.rhev.lab.eng.brq.redhat.com systemd[1]: Started PostgreSQL database server.
Hint: Some lines were ellipsized, use -l to show in full.

7. Please explain what is happening here.....

# su - postgres
$ . /opt/rh/rh-postgresql95/enable
$ psql
psql (9.5.9)
Type "help" for help.

postgres=# show data_directory ;
               data_directory               
--------------------------------------------
 /var/opt/rh/rh-postgresql95/lib/pgsql/data
(1 row)

8. What does this do??

- for both remote dbs: engine/remotedb, ovirt_engine_history/remoteodb

# su - postgres -c "scl enable rh-postgresql95 -- psql -d remoteodb"
psql (9.5.9)
Type "help" for help.

remoteodb=# DROP FUNCTION IF EXISTS uuid_generate_v1();
DROP FUNCTION
remoteodb=# CREATE EXTENSION "uuid-ossp";
CREATE EXTENSION

9. Edit postgresql 9.5 configuration - what exactly did you change here and what for???

# vi /var/opt/rh/rh-postgresql95/lib/pgsql/data/pg_hba.conf 

host    remotedb             remotedb             0.0.0.0/0            md5
host    remotedb             remotedb             ::/32            md5
host    remotedb             remotedb             ::/128            md5
host    remoteodb             remoteodb             0.0.0.0/0            md5
host    remoteodb             remoteodb             ::/32            md5
host    remoteodb             remoteodb             ::/128            md5


10. Again, what did you change in the file and what for??? Did the file contain other params and these are the ones that need to be changed??

# vi /var/opt/rh/rh-postgresql95/lib/pgsql/data/postgresql.conf 

autovacuum_vacuum_scale_factor='0.01'
autovacuum_analyze_scale_factor='0.075'
autovacuum_max_workers='6'
maintenance_work_mem='65536'
max_connections='150'
work_mem = 8192

11. Enable the rh-postgresql95-postgresql.service and check that it is functioning:

# systemctl enable rh-postgresql95-postgresql.service
Created symlink from /etc/systemd/system/multi-user.target.wants/rh-postgresql95-postgresql.service to /usr/lib/systemd/system/rh-postgresql95-postgresql.service.

# systemctl status postgresql
● postgresql.service - PostgreSQL database server
   Loaded: loaded (/usr/lib/systemd/system/postgresql.service; enabled; vendor preset: disabled)
   Active: inactive (dead) since Thu 2018-01-25 15:31:16 CET; 15min ago
 Main PID: 3288 (code=exited, status=0/SUCCESS)

Jan 25 14:15:27 10-37-137-41.rhev.lab.eng.brq.redhat.com systemd[1]: Starting PostgreSQL database server...
Jan 25 14:15:28 10-37-137-41.rhev.lab.eng.brq.redhat.com systemd[1]: Started PostgreSQL database server.
Jan 25 15:31:15 10-37-137-41.rhev.lab.eng.brq.redhat.com systemd[1]: Stopping PostgreSQL database server...
Jan 25 15:31:16 10-37-137-41.rhev.lab.eng.brq.redhat.com systemd[1]: Stopped PostgreSQL database server.


12. Can you explain what this is for? Basically, we install the new version and check that it is functioning and then disable it in order to upgrade the manager?
# systemctl disable postgresql
Removed symlink /etc/systemd/system/multi-user.target.wants/postgresql.service.

13. On the Manager machine, run engine-setup to upgrade the Manager.

# engine-setup

Comment 24 Emma Heftman 2018-05-03 12:13:58 UTC
Shirly, as we discussed, I believe that this procedure must be reviewed by a DBA. 

Could you also review the procedure in the previous comment and comment on any issues that you can think of?

Comment 25 Emma Heftman 2018-05-03 12:33:26 UTC
After discussing with Shirly, it seems we also need to stop the dwhd service:

systemctl stop ovirt-engine-dwhd

which we don't have, so I'm going to add this as step 12 so that we stop both the postgresql service and dwhd before upgrading the manager.

Comment 26 Emma Heftman 2018-05-03 12:42:49 UTC
(In reply to Emma Heftman from comment #25)
> After discussing with Shirly, it seems we also need to stop the dwhd service
> 
> systemctl stop ovirt-engine-dwhd
> 
> which we don't have, so I'm going to add this as step 12 so that we stop
> both postgresql service and dwhd before upgrading the manager

Shirly, should both the dwhd and the postgresql services be restarted manually on the remote machines after upgrading the manager?

If yes:

Is this correct?
systemctl start ovirt-engine-dwhd
systemctl enable postgresql

Comment 28 Emma Heftman 2018-05-03 15:02:40 UTC
Hey Didi
In addition to reviewing the Upgrade procedure, I have some feedback with regard to the procedure itself.

Would it be possible to write a script to configure the 9.5 configuration files rather than having to update the parameters manually?
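
One possible shape for such a script, as a minimal sketch: a helper that sets a parameter in a postgresql.conf-style file, replacing an existing uncommented assignment or appending a new one. The function names set_pg_param and apply_engine_pg_params are hypothetical, and the values are the ones listed in step 10 of comment 23. pg_hba.conf is not handled here, and the 9.5 service would still need a restart afterwards.

```shell
#!/bin/sh
# Hypothetical helper: set "key = 'value'" in a postgresql.conf-style file.
set_pg_param() {
    file=$1 key=$2 value=$3
    if grep -q "^[[:space:]]*${key}[[:space:]]*=" "$file"; then
        # Replace the existing uncommented assignment. Writing back with cat
        # keeps the original file's owner and permissions.
        tmp=$(mktemp) &&
        sed "s|^[[:space:]]*${key}[[:space:]]*=.*|${key} = '${value}'|" "$file" > "$tmp" &&
        cat "$tmp" > "$file" &&
        rm -f "$tmp"
    else
        # No active assignment yet: append one.
        printf "%s = '%s'\n" "$key" "$value" >> "$file"
    fi
}

# Apply the values from the procedure to the file named in $1.
apply_engine_pg_params() {
    conf=$1
    set_pg_param "$conf" autovacuum_vacuum_scale_factor 0.01
    set_pg_param "$conf" autovacuum_analyze_scale_factor 0.075
    set_pg_param "$conf" autovacuum_max_workers 6
    set_pg_param "$conf" maintenance_work_mem 65536
    set_pg_param "$conf" max_connections 150
    set_pg_param "$conf" work_mem 8192
}
```

Running apply_engine_pg_params twice on the same file leaves a single assignment per parameter, so the script can be rerun safely.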

Comment 32 Eli Mesika 2018-05-06 09:29:13 UTC
(In reply to Emma Heftman from comment #26)

> Is this correct?
> systemctl start ovirt-engine-dwhd
> systemctl enable postgresql

> systemctl restart ovirt-engine-dwhd
> systemctl restart postgresql

Comment 80 Raz Tamir 2018-05-11 09:44:49 UTC
Verified.
Executed the steps under 'Upgrading to Red Hat Virtualization Manager 4.2' section to upgrade rhevm-3 production environment

Comment 81 Emma Heftman 2018-05-11 12:13:24 UTC
Moving back to ON_QA. Verified is used by the documentation team.