Description of problem: There should be a way to configure the routing daemon to run a final config 'sync' command on the LTM when it is done with its updates. This would then use the F5's clustering (if configured) to sync configuration between the load balancers. This is important when you have a pair of LTMs set up for HA, because it allows them to share the same configuration. Additional info: As a current workaround, multiple daemons should be used to configure multiple load balancers. Because we use a topic to send messages, a published message goes to all the subscribers that are interested (the routing daemons), so zero to many subscribers will receive a copy of the message. Note: Only subscribers that had an active subscription at the time the broker receives the message will get a copy of it, so it is possible for load balancers to get out of sync if connections to the message broker are lost. That is the driver for this request.
The iControl REST User Guides for 11.4, 11.5, or 11.6 do not document any config-sync API, but I found a report that such an API does in fact exist in 11.5 and later: https://devcentral.f5.com/questions/rest-api-and-config-sync-question Per the comments at the above link, the following should work: curl -svku admin:password https://bigip_host/mgmt/tm/util/tmsh -X POST -H 'Content-Type: application/json' -H 'Accept: application/json' -d '{"apiOptions":"to-group sync-failover-1"}' I tried this command and got back an HTTP 400 Bad Request with the error message "Missing name." The reason for this error may be that the F5 deployment I have available for testing is not a clustered environment, so I have no device groups configured. Is there any chance you could test the above curl command in a clustered F5 deployment to confirm that it works and does what the customer is expecting? If it does, we can add the REST call to the routing daemon and add two settings in routing-daemon.conf for the device-group name ("sync-failover-1") and the interval at which the config-sync should be performed.
(In reply to Miciah Dashiel Butler Masters from comment #2) > the interval at which the config-sync should be performed. Shouldn't this just be run after each update event?
I get the impression that a config-sync is a somewhat heavy-weight operation, so we would want to rate-limit it. Whether the rate limit should be on the order of once per second, once per minute, or once per hour is something we may need to research or get some advice on from an F5 engineer.
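The rate-limiting idea above could be sketched roughly as follows. This is only an illustration under assumptions: the SyncRateLimiter class, its method names, and the 60-second interval in the usage note are hypothetical and do not come from the routing daemon; the appropriate interval is still an open question.

```ruby
# Hypothetical helper, not part of the routing daemon: it allows a sync to
# run only when a change is pending and a minimum interval has elapsed.
class SyncRateLimiter
  # min_interval: minimum number of seconds between two syncs (assumed value).
  # clock: callable returning the current time in seconds; injectable for tests.
  def initialize(min_interval,
                 clock = -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @min_interval = min_interval
    @clock = clock
    @last_sync = nil
    @pending = false
  end

  # Record that the load-balancer configuration changed and needs a sync.
  def mark_dirty
    @pending = true
  end

  # Yield (to run the sync) only if a change is pending and the minimum
  # interval has elapsed since the previous sync. Returns true if it ran.
  def sync_if_due
    return false unless @pending
    now = @clock.call
    return false if @last_sync && (now - @last_sync) < @min_interval
    yield
    @last_sync = now
    @pending = false
    true
  end
end
```

With something like this, the daemon would call mark_dirty after each update event and sync_if_due { ... } on every pass of its loop, for example with SyncRateLimiter.new(60), so that bursts of updates collapse into at most one sync per interval.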
(In reply to Miciah Dashiel Butler Masters from comment #2) > Is there any chance you could test the above curl command in a clustered F5 deployment to confirm that it works and does what the customer is expecting? I get an HTTP 400 when running that command in a Sync-Failover configuration. However, I did find that this command successfully does the sync: curl -svku 'admin:password' https://bigip_host/mgmt/tm/cm -H 'Content-Type: application/json' -H 'Accept: application/json' -X POST -d '{"command":"run","utilCmdArgs":"config-sync to-group Sync_Failover"}'
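For reference, the working curl call above might translate into Ruby along these lines. This is a sketch using only the standard library; the function names build_config_sync_request and send_config_sync are hypothetical, and the host, credentials, and device-group name are placeholders supplied by the caller. The URL path and JSON body are taken from the curl command.

```ruby
require 'json'
require 'net/http'
require 'openssl'
require 'uri'

# Build (but do not send) the config-sync request from the curl command.
# host, user, password, and device_group are caller-supplied placeholders.
def build_config_sync_request(host, user, password, device_group)
  uri = URI("https://#{host}/mgmt/tm/cm")
  request = Net::HTTP::Post.new(uri)
  request.basic_auth(user, password)
  request['Content-Type'] = 'application/json'
  request['Accept'] = 'application/json'
  request.body = JSON.generate({
    'command' => 'run',
    'utilCmdArgs' => "config-sync to-group #{device_group}"
  })
  [uri, request]
end

# Send the request. VERIFY_NONE mirrors curl's -k flag and is only
# appropriate for a test appliance with a self-signed certificate.
def send_config_sync(uri, request)
  Net::HTTP.start(uri.host, uri.port, use_ssl: true,
                  verify_mode: OpenSSL::SSL::VERIFY_NONE) do |http|
    http.request(request)
  end
end
```

Separating request construction from sending keeps the payload easy to inspect without a BIG-IP available.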
Thanks, Nicholas Schuetz! What version of F5 did you test against? I'm going to guess that you have 11.4, and that the command that works for you is using an old API that works in 11.4 whereas the command that I provided is the new API in 11.5 onwards. If we're lucky, the old API also works on newer F5 versions. I tried your curl command against 11.6.0 and got "01070734:3: Configuration error: Device group (Sync_Failover) not found in device group sync", which suggests to me that we have indeed hit on the correct API.
(In reply to Miciah Dashiel Butler Masters from comment #6) > Thanks, Nicholas Schuetz! What version of F5 did you test against? I tested this on 11.5.2.
PR: https://github.com/openshift/origin-server/pull/6154 I will need to perform some manual testing and get the PR merged before I can mark this report ON_QA.
QE is now starting to set up an F5 BIG-IP instance in AWS. We foresee needing a lot of help from you; thanks in advance. To reduce noise, issues blocking QE from setting up the environment will be discussed via email. Once the environment is set up, QE will verify this bug.
Commit pushed to master at https://github.com/openshift/origin-server https://github.com/openshift/origin-server/commit/f783dfab3a0555dab6170d1e7a39e4e5dcf37f1c routing-daemon: F5: Sync device-group on update routing-daemon.conf: Add commented-out example BIGIP_DEVICE_GROUP setting. F5IControlRestLoadBalancerModel#read_config: Read BIGIP_DEVICE_GROUP setting and assign it to @device_group. F5IControlRestLoadBalancerModel: Add update method that synchronizes the device group if @device_group is set. This commit fixes bug 1217572.
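The behavior described in the commit message could look roughly like the following. This is a simplified sketch, not the code from the commit: the class name DeviceGroupSyncModel, the parsed-config hash, and the injected post callable are assumptions made for illustration. Only the BIGIP_DEVICE_GROUP setting name, the @device_group variable, and the config-sync payload come from this thread.

```ruby
# Hypothetical stand-in for the model; the class name, the config hash, and
# the injected `post` callable are illustrative, not the daemon's real API.
class DeviceGroupSyncModel
  attr_reader :device_group

  # post: callable taking (path, payload_hash) that performs the REST POST.
  def initialize(post)
    @post = post
    @device_group = nil
  end

  # Read the optional BIGIP_DEVICE_GROUP setting from a parsed config hash.
  # A missing or blank value leaves the device group unset.
  def read_config(cfg)
    value = cfg['BIGIP_DEVICE_GROUP']
    @device_group = value.nil? || value.strip.empty? ? nil : value
  end

  # Synchronize the device group, but only if one is configured.
  # Returns true if a sync request was issued.
  def update
    return false unless @device_group
    @post.call('/mgmt/tm/cm', {
      'command' => 'run',
      'utilCmdArgs' => "config-sync to-group #{@device_group}"
    })
    true
  end
end
```

Leaving the setting commented out in routing-daemon.conf would correspond to the no-op branch here, so unclustered deployments are unaffected.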
Created attachment 1071598: Failed to create openshift_application_aliases during initialization
Do you encounter any problems if you `yum install ruby193-rubygem-rest-client` before starting the daemon?
Actually, some of those errors are coming from a problem that this pull request fixes: https://github.com/openshift/origin-server/pull/6234 Thanks!
That is great. I had created openshift_application_aliases manually, so it is not a blocker.
When is update called? It looks like update is in an endless loop.
update is called by the daemon in its listen loop: https://github.com/openshift/origin-server/blob/f783dfab3a0555dab6170d1e7a39e4e5dcf37f1c/routing-daemon/lib/openshift/routing/daemon.rb#L243 The UPDATE_INTERVAL setting in /etc/openshift/routing-daemon.conf (with a default value of 5) specifies the interval at which the daemon calls update: https://github.com/openshift/origin-server/blob/f783dfab3a0555dab6170d1e7a39e4e5dcf37f1c/routing-daemon/conf/routing-daemon.conf#L6 https://github.com/openshift/origin-server/blob/f783dfab3a0555dab6170d1e7a39e4e5dcf37f1c/routing-daemon/lib/openshift/routing/daemon.rb#L87 If you are trying to trace the callpath, note that SimpleLoadBalancerController inherits its update method from LoadBalancerController, and these controllers' update method simply calls the update method of the model: https://github.com/openshift/origin-server/blob/f783dfab3a0555dab6170d1e7a39e4e5dcf37f1c/routing-daemon/lib/openshift/routing/controllers/simple.rb#L14 https://github.com/openshift/origin-server/blob/f783dfab3a0555dab6170d1e7a39e4e5dcf37f1c/routing-daemon/lib/openshift/routing/controllers/load_balancer.rb#L143 https://github.com/openshift/origin-server/blob/f783dfab3a0555dab6170d1e7a39e4e5dcf37f1c/routing-daemon/lib/openshift/routing/models/f5-icontrol-rest.rb#L337
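The call path described above can be pictured with a small sketch. The Sketch* classes below are hypothetical simplifications, not the daemon's real classes, and the interval is assumed to be in seconds (the configuration default of 5 is stated above without a unit). The point is that update is expected to run repeatedly, once per interval, rather than being stuck in a runaway loop.

```ruby
# Hypothetical, simplified classes; the real daemon, controller, and model
# are more involved. Only the delegation pattern is illustrated here.
class SketchModel
  attr_reader :updates

  def initialize
    @updates = 0
  end

  # In the real model this is where the device group would be synchronized.
  def update
    @updates += 1
  end
end

class SketchController
  def initialize(model)
    @model = model
  end

  # The controller's update simply forwards to the model's update.
  def update
    @model.update
  end
end

class SketchDaemon
  # update_interval: assumed to be seconds between update calls.
  # clock: callable returning the current time in seconds.
  def initialize(controller, update_interval, clock)
    @controller = controller
    @update_interval = update_interval
    @clock = clock
    @last_update = clock.call
  end

  # One pass of the listen loop: call update if the interval has elapsed.
  def tick
    now = @clock.call
    return unless now - @last_update >= @update_interval
    @controller.update
    @last_update = now
  end
end
```

In a real loop, tick would be invoked between message-handling passes, so update fires about once every update_interval for as long as the daemon runs.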
The fix works. The ConfigSync status changed from 'Awaiting Initial Sync' to 'In Sync' after a new application was created in OpenShift.
The F5 cluster can sync configuration data now. I downloaded the code and did some testing, and the bug fix works well. Waiting for a new puddle to do more verification.
Does the sync save and persist the F5 configuration as well?
(In reply to Nicholas Schuetz from comment #23) > Does the sync save and persist the F5 configuration as well? Yes, the data is saved and persisted on the failover F5 instance.
Moving to MODIFIED status. Once the puddle is kicked off, I will verify it.
rubygem-openshift-origin-routing-daemon-0.25.1.1-1.el6op wasn't in puddle-2-2-2015-09-17. Another puddle is required.
rubygem-openshift-origin-gear-placement-0.0.2.1-1.el6op should not have been in fixed_in for this Bugzilla report; rather, that package is related to bug 1241750. I am fixing that mistake in this Bugzilla report and in bug 1241750. I'll look into why you are still seeing that syntax error.
Verified; the fix passes. The configuration can be synced to the standby F5 instance.
1. Create a Sync-Failover manual device group.
2. Set the device group in routing-daemon.conf.
3. Create scalable applications, add aliases, etc.
4. The new pools and pool members are synced to the standby F5 instance.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHSA-2015-1844.html