Bug 1416050
| Summary: | [downstream clone - 4.0.7] engine-setup refuses to run over a DB restored from a hosted-engine env if it wasn't in global maintenance mode at backup time | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Virtualization Manager | Reporter: | rhev-integ |
| Component: | ovirt-engine | Assignee: | Simone Tiraboschi <stirabos> |
| Status: | CLOSED ERRATA | QA Contact: | Artyom <alukiano> |
| Severity: | urgent | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | unspecified | CC: | alukiano, bgraveno, bugs, dfediuck, didi, lsurette, mavital, mgoldboi, nsednev, rbalakri, Rhev-m-bugs, srevivo, stirabos, ykaul |
| Target Milestone: | ovirt-4.0.7 | Keywords: | Regression, Triaged, ZStream |
| Target Release: | --- | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | This update fixes an issue where engine-setup would not run over a restored database if the backup was taken from a hosted-engine environment that was not in global maintenance mode. | | |
| Story Points: | --- | | |
| Clone Of: | 1403903 | Environment: | |
| Last Closed: | 2017-03-16 15:31:18 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | Integration | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | 1403903 | | |
| Bug Blocks: | | | |
Description
rhev-integ
2017-01-24 13:41:57 UTC
Created attachment 1230829 [details]
ovirt-hosted-engine-setup-20161212162707-5ujgti.log
(Originally by Nikolai Sednev)
Adding sosreport from the host here: https://drive.google.com/a/redhat.com/file/d/0B85BEaDBcF88NDByaFhiZ3NwMW8/view?usp=sharing

(Originally by Nikolai Sednev)

1. What's the difference between this bz and bug 1399053?
2. Which upgrade documentation are you following?

(Originally by Doron Fediuck)

(In reply to Doron Fediuck from comment #3)
> 1. What's the difference between this bz and bug 1399053?
> 2. Which upgrade documentation are you following?

1. I was trying to verify bug 1399053, but since I failed, I opened this bug.
2. I'm following the same steps for the engine DB backup that worked before, i.e. making a backup of the engine DB, copying it to the host on which I'm trying to upgrade, and then, during the engine upgrade, providing the path to the backup files on the host.

(Originally by Nikolai Sednev)

(In reply to Nikolai Sednev from comment #4)
> 1. I was trying to verify bug 1399053, but since I failed, I opened this bug.

This is indeed a different issue. The root cause here is:

2016-12-12 16:27:07 DEBUG otopi.plugins.otopi.packagers.dnfpackager dnfpackager._boot:163 Cannot initialize minidnf
Traceback (most recent call last):
  File "/usr/share/otopi/plugins/otopi/packagers/dnfpackager.py", line 150, in _boot
    constants.PackEnv.DNF_DISABLED_PLUGINS
  File "/usr/share/otopi/plugins/otopi/packagers/dnfpackager.py", line 60, in _getMiniDNF
    from otopi import minidnf
  File "/usr/lib/python2.7/site-packages/otopi/minidnf.py", line 16, in <module>
    import dnf
ImportError: No module named dnf

> 2. I'm following the same steps for the engine DB backup that worked before [...]

We're now ensuring upgrade flows use the supported documentation, since there are multiple environments (el6/el7, HE/non-HE, 3.5/3.6). So unless this is a standard backup and restore (to the same version), please use only the relevant upgrade flow.

(Originally by Doron Fediuck)

(In reply to Doron Fediuck from comment #5)
> This is indeed a different issue. The root cause here is:
> [traceback quoted above]
> So unless this is a standard backup and restore (to the same version), please use only the relevant upgrade flow.

2. In this case I tried backing up and restoring the same DB on the same appliance, just to see whether the /root issue was fixed, before going further and running the whole upgrade flow.

(Originally by Nikolai Sednev)

(In reply to Doron Fediuck from comment #5)
> This is indeed a different issue. The root cause here is:
> [traceback quoted above, ending in "ImportError: No module named dnf"]

This is not critical. On RHEL and CentOS there is no dnf, so yum is used as a fallback.

(Originally by Sandro Bonazzola)
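The dnf-to-yum fallback Sandro describes is the usual optional-import pattern. A minimal sketch of the idea (illustrative only, not otopi's actual code; the surrounding variable names are assumptions):

```python
# Minimal sketch of the packager fallback Sandro describes: prefer dnf
# where it exists (Fedora), fall back to yum (RHEL/CentOS 7).
# Not otopi's actual implementation.
try:
    import dnf  # missing on RHEL/CentOS 7, raising the ImportError above
    PACKAGER = 'dnf'
except ImportError:
    import yum  # the packager that is actually present there
    PACKAGER = 'yum'

print('Using packager: %s' % PACKAGER)
```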
As per the instructions provided on the failure:

RuntimeError: Engine setup failed on the appliance. Please check its log on the appliance.

Can you please attach the engine-setup logs from within the appliance?

(Originally by Sandro Bonazzola)

Created attachment 1231169 [details]
ovirt-engine-setup-20161212190249-4t6x44.log
(Originally by Nikolai Sednev)
The real issue is here:

2016-12-12 19:02:49 DEBUG otopi.ovirt_engine_setup.engine_common.database database.getCredentials:164 dbenv: {'OVESETUP_DWH_DB/database': 'ovirt_engine_history', 'OVESETUP_DWH_DB/host': 'localhost', 'OVESETUP_DWH_DB/port': 5432, 'OVESETUP_DWH_DB/securedHostValidation': False, 'OVESETUP_DWH_DB/secured': False, 'OVESETUP_DWH_DB/password': '1', 'OVESETUP_DWH_DB/user': 'ovirt_engine_history'}
2016-12-12 19:02:49 DEBUG otopi.ovirt_engine_setup.engine_common.database database.execute:177 Database: 'None', Statement: 'select count(*) as count from pg_catalog.pg_tables where schemaname = 'public';', args: {}
2016-12-12 19:02:49 DEBUG otopi.ovirt_engine_setup.engine_common.database database.execute:182 Creating own connection
2016-12-12 19:02:49 DEBUG otopi.ovirt_engine_setup.engine_common.database database.getCredentials:189 database connection failed
Traceback (most recent call last):
  File "/usr/share/ovirt-engine/setup/ovirt_engine_setup/engine_common/database.py", line 187, in getCredentials
    ] = self.isNewDatabase()
  File "/usr/share/ovirt-engine/setup/ovirt_engine_setup/engine_common/database.py", line 370, in isNewDatabase
    transaction=False,
  File "/usr/share/ovirt-engine/setup/ovirt_engine_setup/engine_common/database.py", line 191, in execute
    database=database,
  File "/usr/share/ovirt-engine/setup/ovirt_engine_setup/engine_common/database.py", line 125, in connect
    sslmode=sslmode,
  File "/usr/lib64/python2.7/site-packages/psycopg2/__init__.py", line 164, in connect
    conn = _connect(dsn, connection_factory=connection_factory, async=async)
OperationalError: could not connect to server: Connection refused
    Is the server running on host "localhost" (::1) and accepting TCP/IP connections on port 5432?
could not connect to server: Connection refused
    Is the server running on host "localhost" (127.0.0.1) and accepting TCP/IP connections on port 5432?

(Originally by Simone Tiraboschi)

Nikolay, you uploaded the wrong engine-setup log file. The real issue was here, in the hosted-engine-setup logs:

2016-12-12 16:57:16 DEBUG otopi.plugins.otopi.dialog.human dialog.__logString:204 DIALOG:SEND |- [ ERROR ] It seems that you are running your engine inside of the hosted-engine VM and are not in "Global Maintenance" mode. In that case you should put the system into the "Global Maintenance" mode before running engine-setup, or the hosted-engine HA agent might kill the machine, which might corrupt your data.

hosted-engine-setup checked for global maintenance mode and it was fine:

2016-12-12 16:27:11 INFO otopi.plugins.gr_he_common.vm.misc misc._late_setup:65 Checking maintenance mode
2016-12-12 16:27:11 DEBUG otopi.plugins.gr_he_common.vm.misc misc._late_setup:68 hosted-engine-status: {'engine_vm_up': True, 'all_host_stats': {1: {'live-data': True, 'extra': 'metadata_parse_version=1\nmetadata_feature_version=1\ntimestamp=99379 (Mon Dec 12 16:26:40 2016)\nhost-id=1\nscore=3400\nmaintenance=False\nstate=GlobalMaintenance\nstopped=False\n', 'hostname': 'alma04.qa.lab.tlv.redhat.com', 'host-id': 1, 'engine-status': '{"health": "good", "vm": "up", "detail": "up"}', 'score': 3400, 'stopped': False, 'maintenance': False, 'crc32': 'c315b57c', 'host-ts': 99379}}, 'engine_vm_host': 'alma04.qa.lab.tlv.redhat.com', 'global_maintenance': True}

The point is that engine-setup checks for global maintenance mode in the restored DB, not on the host. So if you really took the DB when hosted-engine-setup asked you to create it, you are fine.

Nikolay, were you trying to restore a previous DB taken outside global maintenance mode? The open gap is that engine-backup is not able to restore a DB taken on a hosted-engine env that was not in maintenance mode at backup time.

(Originally by Simone Tiraboschi)
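To make Simone's point concrete: engine-setup first probes the restored DB (the isNewDatabase() check in the traceback above) and then reads the HA maintenance state out of that DB rather than asking the host. A minimal sketch of both steps, assuming for illustration that the flag lives in a vds_statistics.ha_global_maintenance column (the exact table and credentials are assumptions, not confirmed by this report):

```python
# Minimal sketch, not ovirt-engine-setup code: probe the restored engine DB
# as database.py's isNewDatabase() does, then read the HA maintenance flag
# from the DB itself. The vds_statistics.ha_global_maintenance column is an
# assumed location of the flag, used here only for illustration.
import psycopg2

conn = psycopg2.connect(
    host='localhost',
    port=5432,
    user='engine',
    password='engine-db-password',  # placeholder credentials
    database='engine',
)
cur = conn.cursor()

# The connectivity probe from the traceback above; this is what fails with
# "Connection refused" when PostgreSQL is not up on localhost:5432.
cur.execute(
    "select count(*) as count from pg_catalog.pg_tables "
    "where schemaname = 'public';"
)
print('tables in public schema: %d' % cur.fetchone()[0])

# Read the maintenance flag recorded in the restored DB: if the backup was
# taken outside global maintenance, this is what engine-setup trips over,
# regardless of the host's current state.
cur.execute('select ha_global_maintenance from vds_statistics;')
for (in_maintenance,) in cur.fetchall():
    print('global maintenance recorded in DB: %s' % in_maintenance)

conn.close()
```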
I manually created the DB backup on the engine, using "engine-backup --mode=backup --file=nsednev1 --log=Log_1". I took the DB backup from the engine and copied it to a non-root folder on the host while the host was not in global maintenance; then I set the host to global maintenance. So when I was trying to upgrade, the DB contained information about a host that was not in global maintenance.

(Originally by Nikolai Sednev)

If the user follows the instructions and creates the backup when asked by hosted-engine-setup, everything will be fine.

(Originally by Simone Tiraboschi)

(In reply to Simone Tiraboschi from comment #13)
> If the user follows the instructions and creates the backup when asked by hosted-engine-setup, everything will be fine.

That's true for migration. For backup/restore, we do not require maintenance during backup. Perhaps we should; so far we have tried not to.

(Originally by didi)

IMHO the docs at https://access.redhat.com/labs/rhevupgradehelper/ should be updated: the first step there is "Step 1: Stop the ovirt engine service.", while "Step 4: Disable the high-availability agents on all the self-hosted engine hosts. To do this run the following command on any host in the cluster." should come first.

(Originally by Nikolai Sednev)

Also, after the engine's DB has been copied to the host, the engine service should be started again, otherwise "hosted-engine --upgrade-appliance" will fail, as the service must be up on the engine during the upgrade.

(Originally by Nikolai Sednev)

Verified on: rhevm-4.0.7.4-0.1.el7ev.noarch

# rpm -qa | grep hosted
ovirt-hosted-engine-setup-2.0.4.3-3.el7ev.noarch
ovirt-hosted-engine-ha-2.0.7-2.el7ev.noarch

1. Deploy the HE environment.
2. Add the storage domain to the engine (to start the auto-import process).
3. Wait until the engine has the HE VM.
4. Back up the engine: # engine-backup --mode=backup --file=engine.backup --log=engine-backup.log
5. Copy the backup file from the HE VM to the host.
6. Clean the host from the HE deployment (reprovisioning).
7. Run the HE deployment again.
8. Answer No to the question "Automatically execute engine-setup on the engine appliance on first boot (Yes, No)[Yes]?".
9. Enter the HE VM and copy the backup file from the host to the HE VM.
10. Run the restore command: # engine-backup --mode=restore --scope=all --file=engine.backup --log=engine-restore.log --he-remove-storage-vm --he-remove-hosts --restore-permissions --provision-dwh-db --provision-db
11. Run engine setup: # engine-setup --offline
12. Finish the HE deployment process.

The engine is up and has the HE SD and HE VM in the active state. (A condensed sketch of this flow is shown below.)

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2017-0542.html
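As referenced in the verification steps above, here is a condensed sketch of the commands run on the restored appliance, driven via Python's subprocess. The commands and flags are taken verbatim from the verification; the surrounding script is an illustration and assumes it runs inside the HE VM with the backup file already in place:

```python
# Condensed sketch of the verified restore flow above. Illustrative only:
# it assumes it runs inside the HE VM with engine.backup already copied
# over, and that the surrounding HE deployment steps were already done.
import subprocess

def run(cmd):
    print('running: %s' % ' '.join(cmd))
    subprocess.check_call(cmd)

# Step 10: restore the backup on the fresh appliance.
run(['engine-backup', '--mode=restore', '--scope=all',
     '--file=engine.backup', '--log=engine-restore.log',
     '--he-remove-storage-vm', '--he-remove-hosts',
     '--restore-permissions', '--provision-dwh-db', '--provision-db'])

# Step 11: run engine-setup offline over the restored DB. With this fix it
# no longer refuses to run when the backup was taken outside global
# maintenance mode.
run(['engine-setup', '--offline'])
```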