Description of problem:
In order to test that the amphora is recreated when the health monitor detects a failure, I executed "ifdown eth0" in the amphora to kill the management interface.

How reproducible:
100%

Steps to Reproduce:
1. Deploy Octavia
2. Workaround - manually configure the "amphora-image" tag in octavia.conf on all controllers and restart all Octavia containers
3. Create a LB
4. Log in to the amphora
5. # ifdown eth0

Actual results:
The LB goes into ERROR state. The amphora is recreated but also goes into ERROR state.

Expected results:
The amphora should be recreated and NOT go into ERROR state. The LB should NOT go into ERROR state.

Additional info:
Logs: http://pastebin.test.redhat.com/677403
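For reference, a minimal command-level sketch of the reproduction steps above (the load balancer name, VIP subnet, SSH user/key and amphora address are placeholders, not values from this deployment):

(overcloud) [stack@undercloud-0 ~]$ openstack loadbalancer create --name lb1 --vip-subnet-id private-subnet
(overcloud) [stack@undercloud-0 ~]$ openstack loadbalancer show lb1 -c provisioning_status -c operating_status
# log in to the amphora over the lb-mgmt network (user and key depend on the amphora image)
$ ssh -i octavia_ssh_key cloud-user@<amphora-lb-mgmt-ip>
# inside the amphora, take down the management interface
$ sudo ifdown eth0
# back on the undercloud: failover should bring the LB back to ACTIVE; here it goes to ERROR instead
(overcloud) [stack@undercloud-0 ~]$ openstack loadbalancer show lb1 -c provisioning_status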
2018-12-02 17:02:51.792 22 ERROR octavia.controller.worker.controller_worker AddrFormatError: failed to detect a valid IP address from None

This seems to be a duplicate of [1], which you opened a while ago against OSP 13, and a patch has been up for review for some time now [2].

Could you please check whether amp_boot_network_list is set and contains a list of IP addresses pointing to controller/network nodes? Also, please attach sosreports from the controller/networker nodes so that we can confirm this is indeed the same bug.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1577976
[2] https://review.openstack.org/#/c/596373/
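For reference, amp_boot_network_list lives in the [controller_worker] section of the Octavia configuration and holds one or more Neutron network UUIDs for the lb-mgmt network; a deployment with the option set would contain something along these lines (the UUID below is purely illustrative):

[controller_worker]
amp_boot_network_list = 22222222-3333-4444-5555-666666666666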
(In reply to Carlos Goncalves from comment #2)
> 2018-12-02 17:02:51.792 22 ERROR octavia.controller.worker.controller_worker
> AddrFormatError: failed to detect a valid IP address from None
>
> This seems to be a duplicate of [1], which you opened a while ago against
> OSP 13, and a patch has been up for review for some time now [2].
>
> Could you please check whether amp_boot_network_list is set and contains a
> list of IP addresses pointing to controller/network nodes? Also, please
> attach sosreports from the controller/networker nodes so that we can
> confirm this is indeed the same bug.
>
> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1577976
> [2] https://review.openstack.org/#/c/596373/

[root@controller-0 ~]# grep -ir amp_boot_network_list /var/lib/config-data/puppet-generated/octavia/
/var/lib/config-data/puppet-generated/octavia/etc/octavia/octavia.conf:# - - amp_boot_network_list = 22222222-3333-4444-5555-666666666666
/var/lib/config-data/puppet-generated/octavia/etc/octavia/octavia.conf:# - - amp_boot_network_list = 11111111-2222-33333-4444-555555555555, 22222222-3333-4444-5555-666666666666
/var/lib/config-data/puppet-generated/octavia/etc/octavia/octavia.conf:# amp_boot_network_list =
/var/lib/config-data/puppet-generated/octavia/etc/octavia/conf.d/octavia-worker/worker-post-deploy.conf:amp_boot_network_list = 1b21f0a9-b2c2-4cb1-b16b-a1e6ee9b796d
[root@controller-0 ~]#

[stack@undercloud-0 ~]$ . overcloudrc
(overcloud) [stack@undercloud-0 ~]$ openstack network list
+--------------------------------------+-------------+--------------------------------------+
| ID                                   | Name        | Subnets                              |
+--------------------------------------+-------------+--------------------------------------+
| 1b21f0a9-b2c2-4cb1-b16b-a1e6ee9b796d | lb-mgmt-net | a81e607d-66dd-4951-960a-79d92338d9fe |
+--------------------------------------+-------------+--------------------------------------+
(overcloud) [stack@undercloud-0 ~]$

I see it is set, unless you meant something else :)
Ok, this confirms that the configuration option amp_boot_network_list is only loaded by the worker service. The health manager needs to consume this option too, but for the health manager it is not set, hence this is a duplicate of rhbz #1577976. Please reopen if you think this is a different issue.

I cloned #1577976 (OSP 13) for OSP 14: https://bugzilla.redhat.com/show_bug.cgi?id=1655431

*** This bug has been marked as a duplicate of bug 1655431 ***
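For context only, and not the actual patch under review: the health manager would need the same setting the worker already gets via worker-post-deploy.conf, for example through a per-service drop-in along these lines, followed by a restart of the Octavia health manager container. The drop-in path below is hypothetical and simply mirrors the worker one shown above:

[root@controller-0 ~]# cat /var/lib/config-data/puppet-generated/octavia/etc/octavia/conf.d/octavia-health-manager/manager-post-deploy.conf   (hypothetical path)
[controller_worker]
amp_boot_network_list = 1b21f0a9-b2c2-4cb1-b16b-a1e6ee9b796d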