Bug 1856677
Summary: | postgresql restarts too much, eventually fails | ||
---|---|---|---|
Product: | [oVirt] ovirt-engine | Reporter: | Yedidyah Bar David <didi> |
Component: | Setup.EngineCommon | Assignee: | Yedidyah Bar David <didi> |
Status: | CLOSED CURRENTRELEASE | QA Contact: | Pavel Novotny <pnovotny> |
Severity: | unspecified | Docs Contact: | |
Priority: | unspecified | ||
Version: | 4.4.1 | CC: | bugs, lleistne |
Target Milestone: | ovirt-4.4.2 | Flags: | sbonazzo: ovirt-4.4? sbonazzo: planning_ack? sbonazzo: devel_ack+ lleistne: testing_ack+ |
Target Release: | 4.4.2.1 | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | ovirt-engine-4.4.2.1 | Doc Type: | Bug Fix |
Doc Text: |
Note to doc team: In principle this bug could happen also earlier, in 4.3 etc., if things are quick enough. But in 4.4, with grafana integration and backup/restore of (also) the grafana db user, it's much more likely, see dependent bug. Since grafana is new in 4.4, it's likely that people didn't actually run into this bug so far, so feel free to mark 'requires doc text' '-'. Actual suggested doc text, in case you do want it, follows:
engine-setup and ovirt-engine-provisiondb (used by engine-backup when provisioning databases) need to restart postgresql several times, depending on the exact flow. Under certain circumstances, if these restarts happened quickly enough, they could hit systemd's default start rate limit of 5 starts within 10 seconds, so that postgresql failed to start again and engine-setup/engine-backup failed. With this release, we run 'systemctl reset-failed postgresql' after every restart of postgresql, which prevents hitting systemd's limit.
|
Story Points: | --- |
Clone Of: | Environment: | ||
Last Closed: | 2020-09-18 07:12:01 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | Integration | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | |||
Bug Blocks: | 1847963 |
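
To illustrate the limit described in the Doc Text above, here is a minimal sketch of a start rate limiter with the defaults mentioned there (5 starts per 10 seconds). It is a simplified model, not systemd's actual implementation; the class name `StartRateLimit` and the helper `simulate` are hypothetical, and `reset()` only stands in for the counter flush that 'systemctl reset-failed postgresql' performs.

```python
from typing import Iterable, List, Optional


class StartRateLimit:
    """Simplified model of a start rate limit (assumed: burst per interval)."""

    def __init__(self, burst: int = 5, interval: float = 10.0) -> None:
        self.burst = burst
        self.interval = interval
        self.window_start: Optional[float] = None
        self.count = 0

    def allow(self, now: float) -> bool:
        """Return True if a start at time `now` (seconds) is permitted."""
        # Open a new window on the first start or once the interval has passed.
        if self.window_start is None or now - self.window_start > self.interval:
            self.window_start = now
            self.count = 0
        if self.count < self.burst:
            self.count += 1
            return True
        return False

    def reset(self) -> None:
        """Forget past starts (stand-in for 'systemctl reset-failed')."""
        self.window_start = None
        self.count = 0


def simulate(times: Iterable[float], burst: int = 5,
             reset_after_each: bool = False) -> List[bool]:
    """Run a series of starts at the given times and record each outcome."""
    limit = StartRateLimit(burst=burst)
    results = []
    for now in times:
        results.append(limit.allow(now))
        if reset_after_each:
            limit.reset()
    return results


if __name__ == "__main__":
    # Six quick restarts, one per second.
    print(simulate(range(6)))
    # Same restarts, flushing the counter after each one.
    print(simulate(range(6), reset_after_each=True))
```

In this model, six restarts one second apart fail on the sixth attempt with the default burst of 5, while the same sequence succeeds when the counter is reset after every restart.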
Description
Yedidyah Bar David
2020-07-14 08:20:23 UTC
Workaround: If you try a restore and it fails due to this bug, you can change systemd to allow more restarts:

1. Edit /usr/lib/systemd/system/postgresql.service: under section '[Unit]', add the line: StartLimitBurst=20
2. Run: systemctl daemon-reload
3. Stop and clean PostgreSQL:
   systemctl stop postgresql
   rm -rf /var/lib/pgsql/data/*

Then try the restore again.

Verified in ovirt-engine-4.4.2.3-0.6.el8ev. Engine backup & restore with full scope succeeded, no PostgreSQL errors.

This bugzilla is included in oVirt 4.4.2 release, published on September 17th 2020.

Since the problem described in this bug report should be resolved in oVirt 4.4.2 release, it has been closed with a resolution of CURRENT RELEASE. If the solution does not work for you, please open a new bug report.
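
The first step of the workaround above (adding StartLimitBurst=20 under '[Unit]') can be sketched as a plain text transformation on the unit file contents. This is only an illustration: the function name `add_start_limit_burst` is hypothetical, the sample unit text in the usage below is made up, and the sketch does not touch any file or call systemctl.

```python
def add_start_limit_burst(unit_text: str, burst: int = 20) -> str:
    """Return unit_text with StartLimitBurst=<burst> set in the [Unit] section."""
    out = []
    in_unit = False
    done = False
    for line in unit_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("[") and stripped.endswith("]"):
            # Section header: track whether we are inside [Unit].
            in_unit = stripped == "[Unit]"
            out.append(line)
            if in_unit and not done:
                out.append(f"StartLimitBurst={burst}")
                done = True
            continue
        if in_unit and stripped.startswith("StartLimitBurst="):
            # Drop an older value; the new line was added after the header.
            continue
        out.append(line)
    if not done:
        raise ValueError("no [Unit] section found")
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # Made-up sample, not the real postgresql.service contents.
    sample = "[Unit]\nDescription=Example service\n\n[Service]\nType=notify\n"
    print(add_start_limit_burst(sample), end="")
```

After writing the changed text back to the unit file, steps 2 and 3 of the workaround (systemctl daemon-reload, then stopping and cleaning PostgreSQL) still apply as described.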