Description of problem:
In order to test that the amphora is recreated when the health monitor detects a failure, I executed "ifdown eth0" in the amphora to kill the management interface.

How reproducible:
100%

Steps to Reproduce:
1. Deploy Octavia
2. Workaround - manually configure the "amphora-image" tag in octavia.conf on all controllers and restart all Octavia containers
3. Create a LB
4. Log in to the amphora
5. # ifdown eth0

Actual results:
The LB goes into ERROR state. The amphora is recreated but also goes into ERROR state.

Expected results:
The amphora should be recreated and NOT go into ERROR state. The LB should NOT go into ERROR state.

Additional info:
Logs: http://pastebin.test.redhat.com/677403
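For reference, a minimal command-level sketch of the reproduction steps above (the load balancer name, VIP subnet, SSH user/key and amphora address are placeholders, not values from this deployment):

(overcloud) [stack@undercloud-0 ~]$ openstack loadbalancer create --name lb1 --vip-subnet-id private-subnet
(overcloud) [stack@undercloud-0 ~]$ openstack loadbalancer show lb1 -c provisioning_status -c operating_status
# log in to the amphora over the lb-mgmt network (user and key depend on the amphora image)
$ ssh -i octavia_ssh_key cloud-user@<amphora-lb-mgmt-ip>
# inside the amphora, take down the management interface
$ sudo ifdown eth0
# back on the undercloud: failover should bring the LB back to ACTIVE; here it goes to ERROR instead
(overcloud) [stack@undercloud-0 ~]$ openstack loadbalancer show lb1 -c provisioning_status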
2018-12-02 17:02:51.792 22 ERROR octavia.controller.worker.controller_worker AddrFormatError: failed to detect a valid IP address from None

This seems to be a duplicate of [1], which you opened a while ago against OSP 13, and a patch has been up for review for some time now [2].

Could you please check whether amp_boot_network_list is set and contains a list of IP addresses pointing to controller/network nodes? Also, please attach sosreports from the controller/networker nodes so that we can confirm this is indeed the same bug.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1577976
[2] https://review.openstack.org/#/c/596373/
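For reference, amp_boot_network_list lives in the [controller_worker] section of the Octavia configuration and holds one or more Neutron network UUIDs for the lb-mgmt network; a deployment with the option set would contain something along these lines (the UUID below is purely illustrative):

[controller_worker]
amp_boot_network_list = 22222222-3333-4444-5555-666666666666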
(In reply to Carlos Goncalves from comment #2)
> 2018-12-02 17:02:51.792 22 ERROR octavia.controller.worker.controller_worker
> AddrFormatError: failed to detect a valid IP address from None
>
> This seems to be a duplicate of [1], which you opened a while ago against
> OSP 13, and a patch has been up for review for some time now [2].
>
> Could you please check whether amp_boot_network_list is set and contains a
> list of IP addresses pointing to controller/network nodes? Also, please
> attach sosreports from the controller/networker nodes so that we can
> confirm this is indeed the same bug.
>
> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1577976
> [2] https://review.openstack.org/#/c/596373/

[root@controller-0 ~]# grep -ir amp_boot_network_list /var/lib/config-data/puppet-generated/octavia/
/var/lib/config-data/puppet-generated/octavia/etc/octavia/octavia.conf:# - - amp_boot_network_list = 22222222-3333-4444-5555-666666666666
/var/lib/config-data/puppet-generated/octavia/etc/octavia/octavia.conf:# - - amp_boot_network_list = 11111111-2222-33333-4444-555555555555, 22222222-3333-4444-5555-666666666666
/var/lib/config-data/puppet-generated/octavia/etc/octavia/octavia.conf:# amp_boot_network_list =
/var/lib/config-data/puppet-generated/octavia/etc/octavia/conf.d/octavia-worker/worker-post-deploy.conf:amp_boot_network_list = 1b21f0a9-b2c2-4cb1-b16b-a1e6ee9b796d
[root@controller-0 ~]#

[stack@undercloud-0 ~]$ . overcloudrc
(overcloud) [stack@undercloud-0 ~]$ openstack network list
+--------------------------------------+-------------+--------------------------------------+
| ID                                   | Name        | Subnets                              |
+--------------------------------------+-------------+--------------------------------------+
| 1b21f0a9-b2c2-4cb1-b16b-a1e6ee9b796d | lb-mgmt-net | a81e607d-66dd-4951-960a-79d92338d9fe |
+--------------------------------------+-------------+--------------------------------------+
(overcloud) [stack@undercloud-0 ~]$

I see it is set, unless you meant something else :)
Ok, this confirms that the configuration option amp_boot_network_list is only loaded by the worker service. The health manager needs to consume this option too, but for the health manager it is not set, hence this is a duplicate of rhbz #1577976. Please reopen if you think this is a different issue.

I cloned #1577976 (OSP 13) for OSP 14: https://bugzilla.redhat.com/show_bug.cgi?id=1655431

*** This bug has been marked as a duplicate of bug 1655431 ***
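For context only, and not the actual patch under review: the health manager would need the same setting the worker already gets via worker-post-deploy.conf, for example through a per-service drop-in along these lines, followed by a restart of the Octavia health manager container. The drop-in path below is hypothetical and simply mirrors the worker one shown above:

[root@controller-0 ~]# cat /var/lib/config-data/puppet-generated/octavia/etc/octavia/conf.d/octavia-health-manager/manager-post-deploy.conf   (hypothetical path)
[controller_worker]
amp_boot_network_list = 1b21f0a9-b2c2-4cb1-b16b-a1e6ee9b796d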