Bug 1413320
Summary: keepalived starts too soon
Product: [Fedora] Fedora
Component: keepalived
Version: 25
Hardware: x86_64
OS: Linux
Status: CLOSED ERRATA
Severity: high
Priority: unspecified
Type: Bug
Reporter: Steve Bennett <s.bennett>
Assignee: Ryan O'Hara <rohara>
QA Contact: Fedora Extras Quality Assurance <extras-qa>
CC: athmanem, bperkins, matthias, rohara, ruben, s.bennett
Fixed In Version: keepalived-1.3.5-1.fc26 keepalived-1.3.5-1.fc24 keepalived-1.3.5-1.fc25
Doc Type: If docs needed, set a value
Last Closed: 2017-04-12 14:50:07 UTC
Bug Blocks: 1425828
Created attachment 1240832
keepalived config file used to reproduce the problem

Thank you for reporting this. I am now seeing similar errors with keepalived-1.3.2 in F25 virtual machines. I am working on verifying your suggested fix. Once confirmed I will get an update out ASAP.

Hi, Steve. I changed the keepalived.service file on my development machines to have "After=network.target" instead of "After=network-online.target" and it did not solve the problem. Are you using the LSB network service or NetworkManager? My machines were using NetworkManager.

I came across this link [1] and it seems quite useful. There are two approaches we can take. First, users could enable NetworkManager-wait-online.service or systemd-networkd-wait-online.service, depending on whether they use NetworkManager or systemd-networkd. The other solution is to add "Wants=network-online.target" just below the "After=network-online.target". I tried this and it appears to work. Would you mind adding "Wants=network-online.target" to the keepalived.service file and testing? I want to make sure that this also works in your environment before doing a new build. Be sure to do 'systemctl daemon-reload' after you edit the service file. Thanks.

[1] https://www.freedesktop.org/wiki/Software/systemd/NetworkTarget/

Hi Ryan, Adding "Wants=network-online.target" seems to work for me too, and both the 'failing' behaviour (before making that change) and the 'succeeding' behaviour (after the change) make sense too: without that entry there's no guarantee that the network-online target will be pulled in. It also explains why the problem is difficult to reproduce: you need a fast/simple service configuration (I only have keepalived and httpd), a 'slow' network configuration (e.g. I'm using DHCP) and no other services that request the network-online target. Thanks!

keepalived-1.3.5-1.fc26 has been pushed to the Fedora 26 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2017-a5c4484334

keepalived-1.3.5-1.fc24 has been pushed to the Fedora 24 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2017-4f7cb7b55f

keepalived-1.3.5-1.fc25 has been pushed to the Fedora 25 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2017-5337793e7c

keepalived-1.3.5-1.fc26 has been pushed to the Fedora 26 stable repository. If problems still persist, please make note of it in this bug report.

keepalived-1.3.5-1.fc24 has been pushed to the Fedora 24 stable repository. If problems still persist, please make note of it in this bug report.

keepalived-1.3.5-1.fc25 has been pushed to the Fedora 25 stable repository. If problems still persist, please make note of it in this bug report.
Created attachment 1240831
journal log showing that keepalived starts too soon

Description of problem:
Service dependency on keepalived appears to be incorrect. Keepalived attempts to start before network initialisation is complete, and in some circumstances this will cause service startup to fail.

Version-Release number of selected component (if applicable):
1.3.2-1

How reproducible:
For me, it happens on every reboot (YMMV)

Steps to Reproduce:
1. Configure keepalived to use VRRP
2. enable keepalived
3. reboot

Actual results:
keepalived fails to start, reporting error similar to:
Keepalived_vrrp[642]: (VI_1): Cannot find an IP address to use for interface ens192

Expected results:
keepalived starts successfully

Additional info:
Manually starting the service works as expected. Changing the systemd service dependency from 'network-online.target' to 'network.target' appears to fix the problem. I believe that the behaviour of 'network.target' vs 'network-online.target' is the opposite of the behaviour implied by the systemd documentation, but many other (working) network services also appear to use 'network.target' rather than 'network-online.target'.

In my environment (VMs + fast storage) I'm seeing reasonably quick boot times (<5s from kernel start to network available), perhaps that's a contributory factor in the manifestation of this problem.

I've attached a journal log showing that keepalived is being started before the network is ready (and also before any other network-dependent services)
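
For anyone still on an affected build, the "Wants=network-online.target" change tested in the comments above can be tried without editing the packaged keepalived.service, by placing it in a systemd drop-in. This is only a sketch: the drop-in file name is hypothetical, and it assumes the packaged unit already contains "After=network-online.target" as described in this report.

```ini
# Hypothetical drop-in path (the file name is an assumption, any *.conf works):
#   /etc/systemd/system/keepalived.service.d/network-online.conf
[Unit]
# Pull network-online.target into the boot transaction. The packaged unit is
# assumed to already order keepalived after it via After=network-online.target.
Wants=network-online.target
```

Run 'systemctl daemon-reload' after creating the file, as noted in the comments. According to the page linked as [1], network-online.target only delays startup when a wait-online service (NetworkManager-wait-online.service or systemd-networkd-wait-online.service) is enabled, so it may be worth checking that one of them is.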