Bug 1332463

Summary: [RFE] restore: ensure that 3.6 on el6 backup can be restored on 3.6 on el7
Product: [oVirt] ovirt-engine Reporter: Yedidyah Bar David <didi>
Component: Backup-Restore.Engine Assignee: Yedidyah Bar David <didi>
Status: CLOSED CURRENTRELEASE QA Contact: Jiri Belka <jbelka>
Severity: unspecified Docs Contact:
Priority: urgent    
Version: 3.6.5 CC: bugs, didi, jbelka, lsvaty, mgoldboi, nicolas, sbonazzo, ylavi
Target Milestone: ovirt-3.6.6 Keywords: FutureFeature
Target Release: --- Flags: rule-engine: ovirt-3.6.z+
ylavi: planning_ack+
sbonazzo: devel_ack+
pnovotny: testing_ack+
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Enhancement
Doc Text:
Feature: Allow engine-backup on el7 to restore backups taken on el6.

Reason: engine 4.0 does not support el6. Users who want to upgrade from 3.6 on el6 to 4.0 on el7 have to back up the engine on 3.6/el6 and restore on 4.0/el7. This feature, backported from 4.0, allows such a migration in 3.6 as well.

Result: Using this flow, it's possible to migrate an el6 setup to el7:

On the existing engine machine run:
1. engine-backup --mode=backup --file=engine-3.6.bck --log=backup.log

On a new el7 machine:
2. Install engine, including dwh if it was set up on el6.
3. Copy engine-3.6.bck to the el7 machine
4. engine-backup --mode=restore --file=engine-3.6.bck --log=restore.log --provision-db --no-restore-permissions
5. engine-setup

Check engine-backup documentation for other options, including using remote databases, extra grants/permissions, etc.
Story Points: ---
Clone Of: 1318580 Environment:
Last Closed: 2016-05-30 10:53:47 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Integration RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1318580, 1323201, 1332088    
Bug Blocks: 1302228    

Description Yedidyah Bar David 2016-05-03 09:21:56 UTC
+++ This bug was initially created as a clone of Bug #1318580 +++

Description of problem:
Since we're not supporting el6 anymore on 4.0 we should ensure that a migration from el6 to el7 can be done easily.
- backup 3.6 engine running on el6
- install 4.0 engine on el7
- restore engine in the new environment
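
For the 3.6-on-el6 to 3.6-on-el7 variant that this clone tracks, the flow could be sketched as a small shell script. The engine-backup options are the ones given in the Doc Text field; the run wrapper, the DRY_RUN and BACKUP_FILE variables and the function names are illustrative assumptions. Installing the engine packages and copying the file between hosts are left out.

```shell
#!/bin/sh
# Sketch only: wraps the commands from the Doc Text in a dry-run helper.
# DRY_RUN, BACKUP_FILE, run, backup_on_el6 and restore_on_el7 are
# illustrative names, not part of engine-backup.
BACKUP_FILE="${BACKUP_FILE:-engine-3.6.bck}"
DRY_RUN="${DRY_RUN:-1}"

run() {
    # Print the command in dry-run mode, otherwise execute it.
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

backup_on_el6() {
    # Run on the existing 3.6/el6 engine machine.
    run engine-backup --mode=backup --file="$BACKUP_FILE" --log=backup.log
}

restore_on_el7() {
    # Run on the new el7 machine after installing the engine packages
    # and copying the backup file over.
    run engine-backup --mode=restore --file="$BACKUP_FILE" --log=restore.log \
        --provision-db --no-restore-permissions &&
    run engine-setup
}
```

With DRY_RUN left at its default of 1, the functions only print the commands, which makes it possible to review them before running anything on a real engine host.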

--- Additional comment from Yedidyah Bar David on 2016-03-17 12:18:49 IST ---

One potential problem I noticed re this (thanks to testing by nsednev) is that engine-setup starts the engine *before* the upgrade in asynctasks.py . Not sure how well this will work.

--- Additional comment from Mike McCune on 2016-03-29 01:28:36 IDT ---

This bug was accidentally moved from POST to MODIFIED via an error in automation; please see mmccune with any questions.

--- Additional comment from Yedidyah Bar David on 2016-04-07 08:45:36 IDT ---



--- Additional comment from Yedidyah Bar David on 2016-04-12 13:23:14 IDT ---

(In reply to Yedidyah Bar David from comment #1)
> One potential problem I noticed re this (thanks to testing by nsednev) is
> that engine-setup starts the engine *before* the upgrade in asynctasks.py .
> Not sure how well this will work.

Two problems I currently see with this:

1. We'll need to patch the new jboss location prior to starting the engine. We can decide that this isn't a major issue and do it during restore (not setup, where we try hard to not touch the system before STAGE_TRANSACTION_BEGIN ("Stage: Transaction setup")).

2. The inherent issue is that we start a new engine code with an old engine db. This is different from an upgrade flow, where this stage runs with the old engine code installed.

Some solutions/ideas:

1. It seems we can already optionally skip this action by passing 'OVESETUP_ASYNC/clearTasks=bool:False'. This also prevents the taskcleaner from running, so we need to think about making each of them optional separately (the core problem still applies to both, although taskcleaner is a much smaller piece of code and hopefully easier to make compatible with an older db).

2. We can add to engine-backup --mode=backup, a check whether there are pending tasks etc., and if so, abort, unless an option (say '--force-backup-pending-tasks') is passed. Or perhaps the opposite ('--prevent-backup-pending-tasks'), assuming that the normal flow is routine backup, and migration is the special case.

Sandro, what do you think?

--- Additional comment from Yedidyah Bar David on 2016-04-12 14:34:48 IDT ---

Ignore this. I now realized that we clean these on restore anyway, so should never need to handle them. The only problem is that we always start the engine. I'll fix.

--- Additional comment from Jiri Belka on 2016-04-25 19:46:57 IDT ---

Can this be merged to 3.6? Otherwise migration from 3.6 EL6 to 3.6 EL7 does fail.

--- Additional comment from Yedidyah Bar David on 2016-05-01 09:50:18 IDT ---

This is not a clean cherry-pick, but should be simple enough. But see my comment on bug 1323201 .

Comment 1 Red Hat Bugzilla Rules Engine 2016-05-03 09:22:12 UTC
Bug tickets must have version flags set prior to targeting them to a release. Please ask maintainer to set the correct version flags and only then set the target milestone.

Comment 2 Yedidyah Bar David 2016-05-03 14:36:03 UTC
copied doc text from 4.0 bug 1318580 and edited.

Comment 3 Jiri Belka 2016-05-09 17:28:34 UTC
an issue about ssl.conf and SSLMutex https://bugzilla.redhat.com/show_bug.cgi?id=1323220 is still present, please check it so this RFE can work, thx.

ovirt-engine-tools-backup-3.6.6.2-1.el7.centos.noarch

[ ERROR ] Failed to execute stage 'Closing up': Failed to start service 'httpd'
[ INFO  ] Stage: Clean up
          Log file is located at /var/log/ovirt-engine/setup/ovirt-engine-setup-20160509165905-rqz9uz.log
[ INFO  ] Generating answer file '/var/lib/ovirt-engine/setup/answers/20160509171250-setup.conf'
[ INFO  ] Stage: Pre-termination
[ INFO  ] Stage: Termination
[ ERROR ] Execution of setup failed
[root@10-34-60-185 ~]# less /var/log/ovirt-engine/setup/ovirt-engine-setup-20160509165905-rqz9uz.log 
[root@10-34-60-185 ~]# systemctl status httpd
● httpd.service - The Apache HTTP Server
   Loaded: loaded (/usr/lib/systemd/system/httpd.service; enabled; vendor preset: disabled)
   Active: failed (Result: exit-code) since Mon 2016-05-09 17:12:50 CEST; 58s ago
     Docs: man:httpd(8)
           man:apachectl(8)
  Process: 25221 ExecStop=/bin/kill -WINCH ${MAINPID} (code=exited, status=1/FAILURE)
  Process: 25209 ExecStart=/usr/sbin/httpd $OPTIONS -DFOREGROUND (code=exited, status=1/FAILURE)
 Main PID: 25209 (code=exited, status=1/FAILURE)

May 09 17:12:49 10-34-60-185.example.com systemd[1]: Starting The Apache HTTP Server...
May 09 17:12:50 10-34-60-185.example.com httpd[25209]: [Mon May 09 17:12:50.267027 2016] [so:warn] [pid 25209] AH01574: module ssl_module is already loaded, skipping
May 09 17:12:50 10-34-60-185.example.com httpd[25209]: AH00526: Syntax error on line 42 of /etc/httpd/conf.d/ssl.conf:
May 09 17:12:50 10-34-60-185.example.com httpd[25209]: Invalid command 'SSLMutex', perhaps misspelled or defined by a module not included in the server configuration
May 09 17:12:50 10-34-60-185.example.com systemd[1]: httpd.service: main process exited, code=exited, status=1/FAILURE
May 09 17:12:50 10-34-60-185.example.com kill[25221]: kill: cannot find process ""
May 09 17:12:50 10-34-60-185.example.com systemd[1]: httpd.service: control process exited, code=exited status=1
May 09 17:12:50 10-34-60-185.example.com systemd[1]: Failed to start The Apache HTTP Server.
May 09 17:12:50 10-34-60-185.example.com systemd[1]: Unit httpd.service entered failed state.
May 09 17:12:50 10-34-60-185.example.com systemd[1]: httpd.service failed.
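
The journal output above points at an SSLMutex line in ssl.conf. Apache httpd 2.4, which el7 ships, no longer accepts that directive, so an ssl.conf written for the el6 httpd will stop the service from starting. A minimal sketch for spotting such leftovers follows; the function name find_sslmutex is hypothetical, and the default path is the one from the log.

```shell
#!/bin/sh
# Sketch: list uncommented SSLMutex directives in an Apache config file.
# find_sslmutex is an illustrative helper, not part of engine-backup.
find_sslmutex() {
    conf="${1:-/etc/httpd/conf.d/ssl.conf}"
    # -n prints line numbers; the pattern skips commented-out lines.
    # grep exits 0 when at least one directive is found, 1 otherwise.
    grep -n '^[[:space:]]*SSLMutex' "$conf"
}
```

A non-empty result means the listed lines need to be removed or replaced; "httpd -t" can then be used to re-check the configuration syntax before starting the service again.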

Comment 5 Red Hat Bugzilla Rules Engine 2016-05-10 01:03:01 UTC
Target release should be placed once a package build is known to fix an issue. Since this bug is not modified, the target version has been reset. Please use target milestone to plan a fix for an oVirt release.

Comment 6 Yedidyah Bar David 2016-05-10 09:04:19 UTC
The attached setup log indicates it upgraded from 3.6.5 to 3.6.6.

Which version was used to do the restore?

The fix is in 3.6.6, not 3.6.5.

Comment 7 Jiri Belka 2016-05-10 11:16:28 UTC
ok, ovirt-engine-tools-backup-3.6.6.2-1.el7.centos.noarch

it does work fine if restore was done with 3.6.6 engine-backup tool and later engine-setup from 3.6.6.

Comment 8 Nicolas Ecarnot 2017-03-29 14:13:54 UTC
Didi,

What if I use your proposed workflow from 3.6 on el6 to 4.1.1 on el7 ?

Comment 9 Yedidyah Bar David 2017-03-29 14:24:54 UTC
(In reply to Nicolas Ecarnot from comment #8)
> Didi,
> 
> What if I use your proposed workflow from 3.6 on el6 to 4.1.1 on
> el7 ?

engine-backup will currently allow *upgrading* the *engine* only from 3.6 to 4.0. That is, the 4.0 version of engine-backup will agree to restore a backup taken by 3.6 engine-backup.

Current bug isn't about that, but is about restoring on el7/3.6 an engine backup taken on el6/3.6 - that is, engine is 3.6 on both.

At the time, it was done to allow users to try to migrate the engine OS before upgrading to 4.0.

Today you can also:
1. backup on 3.6/el6
2. restore on 4.0/el7
3. upgrade to 4.1

We decided to not allow direct 3.6->4.1, see bug 1425788.
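
The restore combinations discussed in this thread can be summarized as a small check. This is only an illustration of the rules stated above, not engine-backup's actual code; the function name restore_allowed and its two-argument interface are assumptions, and version pairs the thread does not mention are simply treated as not covered.

```shell
#!/bin/sh
# Sketch: which backup-version / target-version pairs this thread
# describes as restorable. restore_allowed is a hypothetical helper.
restore_allowed() {
    backup_ver="$1"
    target_ver="$2"
    case "$backup_ver:$target_ver" in
        3.6:3.6) return 0 ;;  # this bug: el6 backup restored on el7, engine 3.6 on both
        3.6:4.0) return 0 ;;  # 4.0 engine-backup accepts a 3.6 backup
        *)       return 1 ;;  # not covered here; 3.6 -> 4.1 is explicitly not allowed
    esac
}
```

For example, "restore_allowed 3.6 4.1" returns a non-zero status, matching the three-step path of restoring on 4.0 first and then upgrading to 4.1.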