Bug 1342255
| Summary: | Replication stops if network connection is lost for over 60s | |||
|---|---|---|---|---|
| Product: | Red Hat CloudForms Management Engine | Reporter: | Nick Carboni <ncarboni> | |
| Component: | Replication | Assignee: | Nick Carboni <ncarboni> | |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Alex Newman <anewman> | |
| Severity: | high | Docs Contact: | ||
| Priority: | unspecified | |||
| Version: | 5.6.0 | CC: | cpelland, greartes, jdeubel, jhardy, obarenbo, simaishi | |
| Target Milestone: | GA | Keywords: | TestOnly, ZStream | |
| Target Release: | 5.7.0 | |||
| Hardware: | Unspecified | |||
| OS: | Unspecified | |||
| Whiteboard: | distributed | |||
| Fixed In Version: | 5.7.0.0 | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | ||
| Clone Of: | ||||
| : | 1344050 | Environment: | ||
| Last Closed: | 2017-01-11 19:52:49 UTC | Type: | Bug | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Embargoed: | ||||
| Bug Depends On: | ||||
| Bug Blocks: | 1344050 | |||
Description
Nick Carboni
2016-06-02 18:45:10 UTC
Also, even if the WAL sender times out, it doesn't stop accumulating WAL logs. We would still need to remove the replication slot if the global server goes away for real.

Given that, for now it seems like there is no downside to disabling the timeout, so I'll make that change for now, and it will still be configurable if we run into any issues with it.

New commit detected on ManageIQ/manageiq-appliance/master:
https://github.com/ManageIQ/manageiq-appliance/commit/cc675e11f71dfa4bbf26952094cdf1ee91c7c532

commit cc675e11f71dfa4bbf26952094cdf1ee91c7c532
Author: Nick Carboni <ncarboni>
AuthorDate: Thu Jun 2 15:17:26 2016 -0400
Commit: Nick Carboni <ncarboni>
CommitDate: Fri Jun 3 14:53:28 2016 -0400

Disable WAL sender timeout behavior

The default behavior is to disable replication (the wal sender) after the destination is unreachable for 60 seconds

This behavior is controlled by the wal_sender_timeout parameter in postgresql.conf
(https://www.postgresql.org/docs/current/static/runtime-config-replication.html#GUC-WAL-SENDER-TIMEOUT)

Setting this to 0 would disable the disconnect behavior and allow replication to continue when the connection was restored

https://bugzilla.redhat.com/show_bug.cgi?id=1342255

TEMPLATE/var/opt/rh/rh-postgresql94/lib/pgsql/data/postgresql.conf.erb | 1 +
1 file changed, 1 insertion(+)
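
The commit's one-line change is not shown here, but from its message it presumably sets wal_sender_timeout to 0 in the rendered postgresql.conf. As a rough way to confirm which value an appliance would actually pick up, the sketch below reads the text of a postgresql.conf file and reports the effective timeout in milliseconds. The helper name wal_sender_timeout_ms and its parsing rules are illustrative assumptions, not part of ManageIQ or PostgreSQL. It assumes a bare number is in milliseconds, the last assignment in the file wins, and an unset parameter falls back to the 60 second default described above.

```python
import re

# Multipliers to milliseconds for the time units this sketch accepts.
_UNIT_TO_MS = {"ms": 1, "s": 1000, "min": 60_000, "h": 3_600_000, "d": 86_400_000}

# Matches e.g. "wal_sender_timeout = 0", "wal_sender_timeout = 60s",
# or "wal_sender_timeout = '1 min'" (the equals sign is optional).
_SETTING = re.compile(r"wal_sender_timeout\s*=?\s*'?(\d+)\s*([a-z]*)'?$", re.IGNORECASE)


def wal_sender_timeout_ms(conf_text, default_ms=60_000):
    """Return the effective wal_sender_timeout in ms (hypothetical helper).

    0 means the timeout is disabled, so the WAL sender is not terminated
    while the destination is unreachable.
    """
    value = default_ms
    for raw_line in conf_text.splitlines():
        # Drop trailing comments; a commented-out setting becomes an empty line.
        line = raw_line.split("#", 1)[0].strip()
        match = _SETTING.match(line)
        if not match:
            continue
        unit = match.group(2).lower() or "ms"
        if unit not in _UNIT_TO_MS:
            raise ValueError(f"unsupported time unit: {unit!r}")
        # Later assignments override earlier ones.
        value = int(match.group(1)) * _UNIT_TO_MS[unit]
    return value


if __name__ == "__main__":
    sample = "#wal_sender_timeout = 60s\nwal_sender_timeout = 0\n"
    timeout = wal_sender_timeout_ms(sample)
    print("timeout disabled" if timeout == 0 else f"timeout after {timeout} ms")
```

Note that, as the comment above points out, disabling the timeout does not stop WAL from accumulating on the source. If the global server is gone for good, the replication slot still has to be removed separately.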