In an internal discussion, we decided to support either engine+dwh+grafana all on the same machine, or: engine on one machine, dwh+grafana on another. Verify that everything works fine. Check upgrade, engine-setup rollback, backup/restore, etc. See also bug 1846279.
Another relevant flow that I tested:

- Install and set up engine (only, no dwh or grafana) on machine A
- Install and set up dwh on machine B
- Run 'engine-setup --reconfigure-optional-components' on machine A, and reply 'Yes' to 'Configure Grafana?'

This flow might have the advantage of splitting the load between the machines (similarly to comment 0) while having only a single machine accessed from the outside for management (so simpler firewall configuration etc., and also a bit simpler to set up). The main disadvantage is for people who want to split things for extra security: if, e.g., a bug is found in grafana that allows taking over the machine it's running on, the engine can also be attacked this way.
fresh install of dwh 4.4.1 - OK

backup-restore - failed, can't start postgres service

    Jul 11 11:06:23 10-37-140-71 systemd[1]: postgresql.service: Start request repeated too quickly.
    Jul 11 11:06:23 10-37-140-71 systemd[1]: postgresql.service: Failed with result 'start-limit-hit'.
    Jul 11 11:06:23 10-37-140-71 systemd[1]: Failed to start PostgreSQL database server.

upgrade from beta 4.4.0.2-1.el8ev - failed, can't enrol certificates

    Failed to execute stage 'Environment customization': str, bytes or bytearray expected, not NoneType

Tested in ovirt-engine-4.4.1.8-0.7.el8ev.noarch with ovirt-engine-dwh-setup-4.4.1.2-1.el8ev.noarch
Reproduction steps for backup-restore:
1. install dwh on separate machine
2. yum install ovirt-engine-tools-backup
3. engine-backup --file=/tmp/dwh.bck --log=/tmp/backup.log
4. engine-cleanup
5. engine-backup --mode=restore --provision-all-databases --file=/tmp/dwh.bck
Upgrade 4.3 -> 4.4 worked OK.
(In reply to Lucie Leistnerova from comment #4)
> Reproduction steps for backup-restore:
> 1. install dwh on separate machine
> 2. yum install ovirt-engine-tools-backup
> 3. engine-backup --file=/tmp/dwh.bck --log=/tmp/backup.log
> 4. engine-cleanup
> 5. engine-backup --mode=restore --provision-all-databases --file=/tmp/dwh.bck

There is no bug in the code; this failed for you simply because the machine was too fast :-)

systemd limits service restarts, by default, to up to 5 times per 10 seconds. For various reasons, some of which are probably no longer relevant, we restart PG quite a few times during db/user provisioning, and adding the creation of the grafana user was probably the straw that broke the camel's back.

Not sure what the best solution is, but IMO it's not here (in dwh), so for now I am opening a bug on (and pushing a patch for) the engine, to call 'systemctl reset-failed postgresql' after we restart it. No problem making the current bug depend on the new one, but in principle you can verify it without any change, if the machine is a little bit slower.

This was a bit hard to diagnose, because only partial logs were provided. comment 2 does clarify the reason, but the error there does not appear in any of the attached logs. restore/postgresql-07.log does indeed show that the last 6 restarts happened within 4 seconds (the first at 09:06:19.847, the last at 09:06:23.054).
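To illustrate the rate limit described above, here is a minimal Python sketch that checks whether a series of start times would exceed "5 starts per 10 seconds". It is a simplified sliding-window model, not systemd's actual accounting, and it is not part of the engine patch. The function name `exceeds_start_limit` and the sample timestamps are made up for illustration; only the 5-per-10-seconds defaults and the rough "6 restarts within about 4 seconds" shape come from this comment.

```python
from collections import deque


def exceeds_start_limit(start_times, burst=5, interval=10.0):
    """Return True if more than `burst` starts fall inside any window
    of `interval` seconds.

    Simplified sliding-window model (an assumption): systemd's real
    bookkeeping may differ in detail.
    """
    window = deque()
    for t in sorted(start_times):
        window.append(t)
        # Drop starts that are `interval` seconds or more older than `t`.
        while t - window[0] >= interval:
            window.popleft()
        if len(window) > burst:
            return True
    return False


# Hypothetical start times in seconds, NOT taken from the attached logs:
# six starts packed into a few seconds trip the limit ...
fast_machine = [0.0, 0.6, 1.3, 1.9, 2.6, 3.2]
# ... while the same six starts spread out a little do not.
slower_machine = [0.0, 3.0, 6.0, 9.0, 12.0, 15.0]

print(exceeds_start_limit(fast_machine))
print(exceeds_start_limit(slower_machine))
```

In this model, the same number of restarts passes on a slower machine only because they are spread over a longer period, which matches the remark that the flow can be verified unchanged on slightly slower hardware.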
Merged the patch for bug 1856677, moving to MODIFIED.
Verified in:
ovirt-engine-4.4.2.3-0.6.el8ev.noarch
ovirt-engine-dwh-4.4.2.1-1.el8ev.noarch
grafana-6.3.6-2.el8_2.x86_64

Setup: engine on machine A, DWH+Grafana on machine B.

Fresh installation - passed.
Upgrade from 4.4.1 to 4.4.2 - passed.
Backup & restore of DWH+Grafana - passed.
oVirt SSO - failed (I will file a separate bug for it, as it's not a blocker. I am currently investigating whether it's caused by the upgrade or by backup & restore, or whether it's already broken after a fresh install).
This bugzilla is included in oVirt 4.4.2 release, published on September 17th 2020. Since the problem described in this bug report should be resolved in oVirt 4.4.2 release, it has been closed with a resolution of CURRENT RELEASE. If the solution does not work for you, please open a new bug report.