Description of problem:
In certain flows, e.g. engine+dwh+grafana restore, we restart postgresql quite a lot. If the machine is fast enough, we are hit by systemd's default limit of up to 5 restarts per 10 seconds.

Version-Release number of selected component (if applicable):
Current master

How reproducible:
Always, probably, on a fast-enough machine

Steps to Reproduce:
1. Take a backup of 4.4 engine+dwh+grafana
2. Restore the backup

Actual results:
If the machine is fast enough, one of the restarts will fail, e.g.:

Jul 11 11:06:23 10-37-140-71 systemd[1]: postgresql.service: Start request repeated too quickly.
Jul 11 11:06:23 10-37-140-71 systemd[1]: postgresql.service: Failed with result 'start-limit-hit'.
Jul 11 11:06:23 10-37-140-71 systemd[1]: Failed to start PostgreSQL database server.

Expected results:
I think we want this to always succeed, and without requiring permanent changes to postgresql's configuration - so I think we want our code to call 'systemctl reset-failed postgresql' after restarting it.

Additional info:
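The suggestion under "Expected results" could look roughly like the sketch below. This is not the actual engine-setup code: the function name restart_and_reset and the injectable run parameter are made up for illustration. It relies on 'systemctl reset-failed' also clearing the unit's start rate limit counter, which is what the systemctl man page describes for recent systemd versions.

```python
import subprocess
from typing import Callable, List, Sequence

# Hypothetical type for the command runner, so the helper can be
# exercised without touching a real systemd.
Runner = Callable[[Sequence[str]], None]


def _run(cmd: Sequence[str]) -> None:
    """Run a command and raise CalledProcessError if it fails."""
    subprocess.run(list(cmd), check=True)


def restart_and_reset(service: str = "postgresql",
                      run: Runner = _run) -> List[List[str]]:
    """Restart a unit, then reset its failed state and start counter.

    Hypothetical helper, not taken from the oVirt code base.
    """
    commands = [
        ["systemctl", "restart", service],
        # Assumption: reset-failed also zeroes the start rate limit
        # counter, so the next restart is counted from scratch.
        ["systemctl", "reset-failed", service],
    ]
    for cmd in commands:
        run(cmd)
    return commands
```

Calling restart_and_reset() wherever the restore flow currently restarts postgresql would keep each restart from counting against the next one, with no change to the unit file.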
Workaround:

If you try restore and it fails due to this bug, you can change systemd to allow more restarts:

1. Edit /usr/lib/systemd/system/postgresql.service: under section '[Unit]', add a line:

   StartLimitBurst=20

2. Reload systemd:

   systemctl daemon-reload

3. Stop and clean PostgreSQL:

   systemctl stop postgresql
   rm -rf /var/lib/pgsql/data/*

Then try restore again.
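To illustrate why raising StartLimitBurst helps, here is a simplified model of the start limit: at most "burst" starts are accepted within one "interval" window, and the window restarts once it has expired. The class and function names are hypothetical, and the model only approximates systemd's behaviour (it ignores the unit's failed state, for example). The defaults of 5 starts per 10 seconds match the limit described in this bug.

```python
from typing import Iterable, Optional


class StartLimit:
    """Simplified model of a per-unit start rate limit (illustrative only)."""

    def __init__(self, burst: int = 5, interval: float = 10.0) -> None:
        self.burst = burst
        self.interval = interval
        self.begin: Optional[float] = None  # start of the current window
        self.count = 0                      # starts seen in this window

    def allow(self, now: float) -> bool:
        """Return True if a start at time `now` (seconds) is accepted."""
        if self.begin is None or now - self.begin > self.interval:
            # Window expired (or first start): open a new window.
            self.begin = now
            self.count = 0
        if self.count < self.burst:
            self.count += 1
            return True
        return False  # corresponds to 'start-limit-hit'

    def reset(self) -> None:
        """Model of clearing the counter, as 'reset-failed' is meant to do."""
        self.begin = None
        self.count = 0


def count_rejected(times: Iterable[float], burst: int = 5,
                   interval: float = 10.0) -> int:
    """Count how many of the given start times would be rejected."""
    limit = StartLimit(burst, interval)
    return sum(1 for t in times if not limit.allow(t))


# Seven restarts, one per second, as a fast machine might issue them.
fast_restore = [0, 1, 2, 3, 4, 5, 6]
print(count_rejected(fast_restore))            # default limit: 2 rejected
print(count_rejected(fast_restore, burst=20))  # workaround: 0 rejected
```

In this model a slower machine that spreads the same restarts over more than 10 seconds never reaches the limit, which is consistent with the bug only showing up on fast hosts.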
Verified in ovirt-engine-4.4.2.3-0.6.el8ev.

Engine backup & restore with full scope succeeded, no PostgreSQL errors.
This bugzilla is included in oVirt 4.4.2 release, published on September 17th 2020. Since the problem described in this bug report should be resolved in oVirt 4.4.2 release, it has been closed with a resolution of CURRENT RELEASE. If the solution does not work for you, please open a new bug report.