Bug 1576434
Summary: Amphorae ACTIVE_STANDBY topology fail to recover when the amphora-agent stops working
Product: Red Hat OpenStack
Reporter: Nir Magnezi <nmagnezi>
Component: openstack-octavia
Assignee: Carlos Goncalves <cgoncalves>
Status: CLOSED DUPLICATE
QA Contact: Bruna Bonguardo <bbonguar>
Severity: high
Priority: high
Version: 13.0 (Queens)
CC: amuller, cgoncalves, fiezzi, ihrachys, lpeer, majopela
Target Milestone: z8
Keywords: Triaged, ZStream
Target Release: 13.0 (Queens)
Hardware: Unspecified
OS: Unspecified
Last Closed: 2019-10-01 12:42:36 UTC
Type: Bug
Bug Blocks: 1698576
The end result:

$ openstack loadbalancer show nir_ha | grep provisioning_status
| provisioning_status | ERROR |

Created attachment 1433833 [details]
lb deletion fails

Also fails to delete the ERROR state loadbalancer
(In reply to Nir Magnezi from comment #2)
> Created attachment 1433833 [details]
> lb deletion fails
>
> Also fails to delete the ERROR state loadbalancer

Fixed in https://review.opendev.org/#/c/574215/

Looking at the log in comment #0, I see this:

2018-05-09 12:29:02.250 22 ERROR octavia.controller.worker.controller_worker Traceback (most recent call last):
2018-05-09 12:29:02.250 22 ERROR octavia.controller.worker.controller_worker   File "/usr/lib/python2.7/site-packages/taskflow/engines/action_engine/executor.py", line 53, in _execute_task
2018-05-09 12:29:02.250 22 ERROR octavia.controller.worker.controller_worker     result = task.execute(**arguments)
2018-05-09 12:29:02.250 22 ERROR octavia.controller.worker.controller_worker   File "/usr/lib/python2.7/site-packages/octavia/controller/worker/tasks/amphora_driver_tasks.py", line 219, in execute
2018-05-09 12:29:02.250 22 ERROR octavia.controller.worker.controller_worker     amphora, loadbalancer, amphorae_network_config)
2018-05-09 12:29:02.250 22 ERROR octavia.controller.worker.controller_worker   File "/usr/lib/python2.7/site-packages/octavia/amphorae/drivers/haproxy/rest_api_driver.py", line 137, in post_vip_plug
2018-05-09 12:29:02.250 22 ERROR octavia.controller.worker.controller_worker     net_info)
2018-05-09 12:29:02.250 22 ERROR octavia.controller.worker.controller_worker   File "/usr/lib/python2.7/site-packages/octavia/amphorae/drivers/haproxy/rest_api_driver.py", line 388, in plug_vip
2018-05-09 12:29:02.250 22 ERROR octavia.controller.worker.controller_worker     json=net_info)
2018-05-09 12:29:02.250 22 ERROR octavia.controller.worker.controller_worker   File "/usr/lib/python2.7/site-packages/octavia/amphorae/drivers/haproxy/rest_api_driver.py", line 255, in request
2018-05-09 12:29:02.250 22 ERROR octavia.controller.worker.controller_worker     _url = self._base_url(amp.lb_network_ip) + path
2018-05-09 12:29:02.250 22 ERROR octavia.controller.worker.controller_worker   File "/usr/lib/python2.7/site-packages/octavia/amphorae/drivers/haproxy/rest_api_driver.py", line 241, in _base_url
2018-05-09 12:29:02.250 22 ERROR octavia.controller.worker.controller_worker     if utils.is_ipv6_lla(ip):
2018-05-09 12:29:02.250 22 ERROR octavia.controller.worker.controller_worker   File "/usr/lib/python2.7/site-packages/octavia/common/utils.py", line 64, in is_ipv6_lla
2018-05-09 12:29:02.250 22 ERROR octavia.controller.worker.controller_worker     ip = netaddr.IPAddress(ip_address)
2018-05-09 12:29:02.250 22 ERROR octavia.controller.worker.controller_worker   File "/usr/lib/python2.7/site-packages/netaddr/ip/__init__.py", line 306, in __init__
2018-05-09 12:29:02.250 22 ERROR octavia.controller.worker.controller_worker     'address from %r' % addr)
2018-05-09 12:29:02.250 22 ERROR octavia.controller.worker.controller_worker AddrFormatError: failed to detect a valid IP address from None

I believe this was fixed in BZ #1577976. Closing as duplicate.

*** This bug has been marked as a duplicate of bug 1577976 ***
Created attachment 1433832 [details]
Failover attempt logs

Description of problem:
=======================
The Amphorae ACTIVE_STANDBY topology fails to recover when the amphora-agent stops working. Tested with a single-controller topology.

Version-Release number of selected component (if applicable):
=============================================================
OSP13
openstack-octavia-common-2.0.1-4.el7ost.noarch
openstack-octavia-health-manager-2.0.1-4.el7ost.noarch
python-octavia-2.0.1-4.el7ost.noarch
openstack-octavia-api-2.0.1-4.el7ost.noarch
openstack-octavia-housekeeping-2.0.1-4.el7ost.noarch
openstack-octavia-worker-2.0.1-4.el7ost.noarch

Steps to Reproduce:
===================
1. Change amphora topology to ACTIVE_STANDBY
2. Restart Octavia services
3. Create a loadbalancer
4. Switch off the amphora-agent on the MASTER amphora

Actual results:
===============
Loadbalancer ends up in an ERROR state

Expected results:
=================
Should failover to the BACKUP amphora and spawn a new amphora as BACKUP.

Additional info:
================
Attaching logs.
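
For anyone reading the traceback in this bug: the failure comes down to an IP parser being handed None, because the amphora record had no lb_network_ip when the worker tried to build the agent URL. The sketch below reproduces that failure mode with Python's standard ipaddress module rather than netaddr, so it raises ValueError instead of AddrFormatError. It is not the Octavia code or the actual fix; the base_url helper, its early None check, and the 9443 default port are assumptions made for illustration.

```python
import ipaddress
from typing import Optional


def is_ipv6_lla(ip_address: Optional[str]) -> bool:
    """Stand-in for the check named in the traceback (not the Octavia code)."""
    # ipaddress.ip_address(None) raises ValueError, which mirrors how
    # netaddr.IPAddress(None) raised AddrFormatError in the log.
    ip = ipaddress.ip_address(ip_address)
    return ip.version == 6 and ip.is_link_local


def base_url(lb_network_ip: Optional[str], port: int = 9443) -> str:
    """Hypothetical URL builder that rejects a missing address up front."""
    if lb_network_ip is None:
        # Fail with a message that names the real problem instead of
        # letting the IP parser report "None" as an invalid address.
        raise ValueError("amphora has no lb_network_ip; cannot reach the agent")
    ip = ipaddress.ip_address(lb_network_ip)
    # IPv6 literals must be bracketed inside a URL.
    host = "[{}]".format(lb_network_ip) if ip.version == 6 else lb_network_ip
    return "https://{}:{}/".format(host, port)


if __name__ == "__main__":
    print(base_url("192.168.0.5"))
    try:
        is_ipv6_lla(None)
    except ValueError as exc:
        print("parser rejected None:", exc)
```

The explicit None check only makes the error easier to read; it does not make the failover succeed, which is what the duplicate bug tracks.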