Description of problem:
Happened after the Amphora VMs were shut down (due to a power outage), which did not update the load balancer and Octavia service status.

Version-Release number of selected component (if applicable):
OSP: 13
Puddle: 2018-05-04.1

How reproducible:
Always

Steps to Reproduce:
1) Shut down the Amphora VMs and wait for their status to become SHUTOFF.
2) Check the loadbalancer status.
3) Check the loadbalancer amphora list status.

Actual results:

(overcloud) 04:52:09.296 [stack@undercloud-0 ~]$ nova list --all
+--------------------------------------+----------------------------------------------+----------------------------------+---------+------------+-------------+--------------------------------------------------+
| ID                                   | Name                                         | Tenant ID                        | Status  | Task State | Power State | Networks                                         |
+--------------------------------------+----------------------------------------------+----------------------------------+---------+------------+-------------+--------------------------------------------------+
| cf6efbc5-d02f-4af7-9c70-4c488986a23a | amphora-9b983ed7-5a3b-4197-8981-1695cc8a0897 | 4ee04fbdad964bda99e6ac3b16c2398f | SHUTOFF | -          | Shutdown    | int_net=172.16.0.218; lb-mgmt-net=192.168.199.56 |
| c51fdaff-16c3-4989-b291-511d199b05b9 | amphora-f3ad1038-e6ff-4868-a78d-1a6c7fcec3a1 | 4ee04fbdad964bda99e6ac3b16c2398f | SHUTOFF | -          | Shutdown    | int_net=172.16.0.225; lb-mgmt-net=192.168.199.57 |
| 5cff72b9-8b46-4236-bd56-c7163a560849 | vm-rht-1                                     | f421dd896bcb47d28f692036f687fcd8 | SHUTOFF | -          | Shutdown    | int_net=172.16.0.216, 10.0.0.219                 |
| 9cb2f05b-ee90-4940-b191-17dc248dbf94 | vm-rht-2                                     | f421dd896bcb47d28f692036f687fcd8 | SHUTOFF | -          | Shutdown    | int_net=172.16.0.219, 10.0.0.214                 |
+--------------------------------------+----------------------------------------------+----------------------------------+---------+------------+-------------+--------------------------------------------------+

(overcloud) 04:55:11.452 [stack@undercloud-0 ~]$ openstack loadbalancer show LB
+---------------------+--------------------------------------+
| Field               | Value                                |
+---------------------+--------------------------------------+
| admin_state_up      | True                                 |
| created_at          | 2018-05-01T08:30:17                  |
| description         |                                      |
| flavor              |                                      |
| id                  | 88b46da5-1685-4e78-a7d3-85ed8942d183 |
| listeners           | 446539b4-e35c-4e62-81b9-b7f94ba27eb0 |
|                     | ef3681ee-5de2-4546-a35f-5145e16ee235 |
| name                | LB                                   |
| operating_status    | ONLINE                               |
| pools               | a7d69658-ddfe-4087-9665-f6be92ba2ccd |
|                     | d001cc2c-a349-4495-a9b2-13b865676245 |
| project_id          | f421dd896bcb47d28f692036f687fcd8     |
| provider            | octavia                              |
| provisioning_status | ACTIVE                               |
| updated_at          | 2018-05-06T13:57:32                  |
| vip_address         | 172.16.0.220                         |
| vip_network_id      | a898f074-05ad-4882-afd9-1564a08ad18a |
| vip_port_id         | a8b388f2-9b1a-4c7e-a85e-9dc4b0759436 |
| vip_qos_policy_id   | None                                 |
| vip_subnet_id       | 34e5abbf-b084-40c0-8c62-846ae64968e0 |
+---------------------+--------------------------------------+

(overcloud) 04:52:21.700 [stack@undercloud-0 ~]$ openstack loadbalancer amphora list
+--------------------------------------+--------------------------------------+-----------+--------+----------------+--------------+
| id                                   | loadbalancer_id                      | status    | role   | lb_network_ip  | ha_ip        |
+--------------------------------------+--------------------------------------+-----------+--------+----------------+--------------+
| 9b983ed7-5a3b-4197-8981-1695cc8a0897 | 88b46da5-1685-4e78-a7d3-85ed8942d183 | ALLOCATED | MASTER | 192.168.199.56 | 172.16.0.220 |
| f3ad1038-e6ff-4868-a78d-1a6c7fcec3a1 | 88b46da5-1685-4e78-a7d3-85ed8942d183 | ALLOCATED | BACKUP | 192.168.199.57 | 172.16.0.220 |
+--------------------------------------+--------------------------------------+-----------+--------+----------------+--------------+

Expected results:
The loadbalancer and Octavia services should show an OFFLINE / SHUTOFF status.
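The status checks in the steps above can be scripted. A minimal sketch, assuming overcloud credentials are sourced; the helper name `lb_status` is hypothetical, while the `openstack loadbalancer show` output options (`-f value -c <column>`) are standard OpenStack CLI flags:

```shell
# Hypothetical helper: print the operating and provisioning status
# of a load balancer by name, using machine-readable CLI output
# instead of scraping the ASCII table.
lb_status() {
  local name="$1"
  openstack loadbalancer show "$name" \
    -f value -c operating_status -c provisioning_status
}

# Usage (against a live overcloud):
#   lb_status LB
```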
(On the Network DFG triage call) Noam, can you check whether Octavia spawned another Amphora to replace the one that was shut off?
(In reply to Assaf Muller from comment #1)
> (On the Network DFG triage call) Noam, can you check whether Octavia spawned
> another Amphora to replace the one that was shut off?

On puddle 2018-05-15.2, the load balancer operating status after an undercloud reboot is "DEGRADED", and the amphora list is empty:

(overcloud) [stack@undercloud-0 ~]$ openstack loadbalancer show LB
+---------------------+--------------------------------------+
| Field               | Value                                |
+---------------------+--------------------------------------+
| admin_state_up      | True                                 |
| created_at          | 2018-05-27T14:38:59                  |
| description         |                                      |
| flavor              |                                      |
| id                  | 65882eb6-3620-4afb-839e-e855128897bf |
| listeners           | d3570f1e-4ba7-4325-b197-84f9a466b4b7 |
|                     | 8b9722f8-fdd3-4ecb-885c-3aefe3383ff9 |
| name                | LB                                   |
| operating_status    | DEGRADED                             |
| pools               | 0bb0c497-8d18-4016-aa1e-2bf8e5426c7a |
| project_id          | 200299aeef93499b9453a212d3fdd7cd     |
| provider            | octavia                              |
| provisioning_status | ACTIVE                               |
| updated_at          | 2018-06-04T05:47:08                  |
| vip_address         | 192.168.2.35                         |
| vip_network_id      | 986e1796-a83b-473b-bf5b-c9267d66f555 |
| vip_port_id         | 17402591-2db8-4866-88dc-00f1dc270164 |
| vip_qos_policy_id   | None                                 |
| vip_subnet_id       | a2729003-bf3f-48f9-9501-3273dac65048 |
+---------------------+--------------------------------------+

(overcloud) [stack@undercloud-0 ~]$ openstack loadbalancer amphora list
(overcloud) [stack@undercloud-0 ~]$
Noam,
Can you please attach logs from all Octavia services so we can see what Octavia tried to do to recover from this?

I think you can use the SOS report tool to capture everything Octavia related.
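Short of a full sosreport, the relevant logs can be scanned directly. A hedged sketch: the default path assumes OSP 13 containerized services (/var/log/containers/octavia), and the function name and grep patterns are illustrative, not exact Octavia log messages:

```shell
# Illustrative helper: scan Octavia service logs on a controller for
# failover-related activity after the amphora was shut off.
scan_failover() {
  local logdir="${1:-/var/log/containers/octavia}"
  # Case-insensitive search across all log files; tolerate a missing dir.
  grep -rhiE 'failover|stale amphora' "$logdir" 2>/dev/null || true
}

# Usage (on a controller node):
#   scan_failover
```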
(In reply to Nir Magnezi from comment #3)
> Noam,
> Can you please attach logs from all Octavia services so we can see what
> Octavia tried to do to recover from this?
>
> I think you can use the SOS report tool to capture everything Octavia
> related.

Which sosreport do you need?

Also reproduced on 13 -p 2018-05-24.2:

[08:34:36.202] (overcloud) [stack@undercloud-0 ~]$ openstack loadbalancer amphora list
+--------------------------------------+--------------------------------------+-----------+------------+---------------+--------------+
| id                                   | loadbalancer_id                      | status    | role       | lb_network_ip | ha_ip        |
+--------------------------------------+--------------------------------------+-----------+------------+---------------+--------------+
| 97683590-bbbf-400b-ac26-b0bcfe432980 | dfd907da-114b-4610-84c3-5875c2942a70 | ALLOCATED | STANDALONE | 172.24.0.14   | 192.168.2.39 |
+--------------------------------------+--------------------------------------+-----------+------------+---------------+--------------+

[08:34:41.431] (overcloud) [stack@undercloud-0 ~]$ openstack loadbalancer list
+--------------------------------------+------+----------------------------------+--------------+---------------------+----------+
| id                                   | name | project_id                       | vip_address  | provisioning_status | provider |
+--------------------------------------+------+----------------------------------+--------------+---------------------+----------+
| dfd907da-114b-4610-84c3-5875c2942a70 | LB   | a27a2e1887e947be8ecd433091282850 | 192.168.2.39 | ACTIVE              | octavia  |
+--------------------------------------+------+----------------------------------+--------------+---------------------+----------+

[08:37:32.187] (overcloud) [stack@undercloud-0 ~]$ openstack server list --all
+--------------------------------------+----------------------------------------------+--------+-----------------------------------------------------------+---------------------------------+--------+
| ID                                   | Name                                         | Status | Networks                                                  | Image                           | Flavor |
+--------------------------------------+----------------------------------------------+--------+-----------------------------------------------------------+---------------------------------+--------+
| dd107506-3aa7-4068-9b3d-05acbb021390 | amphora-97683590-bbbf-400b-ac26-b0bcfe432980 | ACTIVE | int_net=192.168.2.35; lb-mgmt-net=172.24.0.14, 10.0.0.220 | octavia-amphora-13.0-20180524.1 |        |
| 8d59874e-ef2f-4433-9c02-1a96ba1cd893 | vm-rht-2                                     | ACTIVE | int_net=192.168.2.43, 10.0.0.218                          | rhel75                          | rhel7  |
| 504d3190-f94e-4910-aac0-cf5f4efde0c8 | vm-rht-1                                     | ACTIVE | int_net=192.168.2.40, 10.0.0.210                          | rhel75                          | rhel7  |
+--------------------------------------+----------------------------------------------+--------+-----------------------------------------------------------+---------------------------------+--------+

[08:38:19.314] (overcloud) [stack@undercloud-0 ~]$ openstack loadbalancer show LB
+---------------------+--------------------------------------+
| Field               | Value                                |
+---------------------+--------------------------------------+
| admin_state_up      | True                                 |
| created_at          | 2018-06-14T09:52:27                  |
| description         |                                      |
| flavor              |                                      |
| id                  | dfd907da-114b-4610-84c3-5875c2942a70 |
| listeners           | d92fe753-71c6-4974-9e6e-46bb45ecfaac |
| name                | LB                                   |
| operating_status    | ONLINE                               |
| pools               |                                      |
| project_id          | a27a2e1887e947be8ecd433091282850     |
| provider            | octavia                              |
| provisioning_status | ACTIVE                               |
| updated_at          | 2018-06-14T09:55:22                  |
| vip_address         | 192.168.2.39                         |
| vip_network_id      | fa6b2bdc-8f64-49b2-9f60-5215deb3ca2c |
| vip_port_id         | db036f3d-b22c-4784-9899-36004dcfb3bd |
| vip_qos_policy_id   | None                                 |
| vip_subnet_id       | aa1f6061-a47e-4158-b055-ac239442d02e |
+---------------------+--------------------------------------+

[08:39:14.317] (overcloud) [stack@undercloud-0 ~]$ openstack loadbalancer amphora list
[ NO Output ]

[08:40:12.857] (overcloud) [stack@undercloud-0 ~]$ openstack loadbalancer show LB
+---------------------+--------------------------------------+
| Field               | Value                                |
+---------------------+--------------------------------------+
| admin_state_up      | True                                 |
| created_at          | 2018-06-14T09:52:27                  |
| description         |                                      |
| flavor              |                                      |
| id                  | dfd907da-114b-4610-84c3-5875c2942a70 |
| listeners           | d92fe753-71c6-4974-9e6e-46bb45ecfaac |
| name                | LB                                   |
| operating_status    | ONLINE                               |
| pools               |                                      |
| project_id          | a27a2e1887e947be8ecd433091282850     |
| provider            | octavia                              |
| provisioning_status | ERROR                                |
| updated_at          | 2018-06-18T12:39:33                  |
| vip_address         | 192.168.2.39                         |
| vip_network_id      | fa6b2bdc-8f64-49b2-9f60-5215deb3ca2c |
| vip_port_id         | db036f3d-b22c-4784-9899-36004dcfb3bd |
| vip_qos_policy_id   | None                                 |
| vip_subnet_id       | aa1f6061-a47e-4158-b055-ac239442d02e |
+---------------------+--------------------------------------+

[08:40:27.056] (overcloud) [stack@undercloud-0 ~]$ cat /etc/yum.repos.d/latest-installed
13 -p 2018-05-24.2
[08:41:37.569] (overcloud) [stack@undercloud-0 ~]$
Created attachment 1452949 [details]
Amphora shutoff -> loadbalancer ONLINE -> Amphora not restarted

An Amphora VM shutoff shows loadbalancer operating_status ONLINE, but the Amphora instance is not restarted.
Please see the attachment above.

When issuing openstack server stop on the Amphora (which is like an operating-system shutdown, not a poweroff), the loadbalancer shows provisioning_status ERROR (not ONLINE, as previously happened on poweroff). Perhaps that's a good indication.

However, don't we expect Octavia to restart the Amphora after a while?

[09:29:39.498] (overcloud) [stack@undercloud-0 ~]$ openstack loadbalancer show LB-0
+---------------------+--------------------------------------+
| Field               | Value                                |
+---------------------+--------------------------------------+
| admin_state_up      | True                                 |
| created_at          | 2018-06-19T12:06:34                  |
| description         |                                      |
| flavor              |                                      |
| id                  | ccbc588d-ccc0-4f4a-8b82-55de8a941450 |
| listeners           |                                      |
| name                | LB-0                                 |
| operating_status    | ONLINE                               |
| pools               |                                      |
| project_id          | 8ea5e0946ce54d448df01a77d058b2df     |
| provider            | octavia                              |
| provisioning_status | ERROR                                |
| updated_at          | 2018-06-19T13:29:21                  |
| vip_address         | 192.168.2.33                         |
| vip_network_id      | fa6b2bdc-8f64-49b2-9f60-5215deb3ca2c |
| vip_port_id         | 000d2651-aafa-4b5b-96a2-97d59c2e49ae |
| vip_qos_policy_id   | None                                 |
| vip_subnet_id       | aa1f6061-a47e-4158-b055-ac239442d02e |
+---------------------+--------------------------------------+
(In reply to Noam Manos from comment #7)
> Please see the attachment above.
>
> When issuing openstack server stop on the Amphora (which is like an
> operating-system shutdown, not a poweroff), the loadbalancer shows
> provisioning_status ERROR (not ONLINE, as previously happened on poweroff).
> Perhaps that's a good indication.
>
> However, don't we expect Octavia to restart the Amphora after a while?
>
> [quoted 'openstack loadbalancer show LB-0' output snipped; see comment #7]

From what I've tested, Octavia spawned a new Amphora instance and deleted the one that I powered off. If that does not happen in your setup, I'll appreciate an opportunity to log in and have a look at it.

Thanks!
Hi Noam,

I tested this scenario on your environment (thanks for allowing this, btw), and I was not able to reproduce.

Here's what I did:
0. I monitored the Octavia logs for all services on all three controller nodes.
1. I created a loadbalancer and waited for it to become ACTIVE.
2. I manually stopped the running amphora instance via 'openstack server stop <id>'.

Result:
Octavia noticed the stale amphora, so it deleted it and spawned a new one instead.
See the log: http://paste.openstack.org/show/724175/
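The verification in steps 1-2 above can be automated with a polling loop. This is an illustrative sketch: the function name and SLEEP_SECS knob are assumptions, while the `openstack loadbalancer amphora list` invocation and its `-f value -c id` output flags are standard CLI usage:

```shell
# After 'openstack server stop <amphora vm>', wait until the old amphora
# id disappears from the amphora list, i.e. the health manager has
# failed over to a replacement instance.
wait_for_replacement() {
  local old_id="$1" tries="${2:-30}" i=0
  while [ "$i" -lt "$tries" ]; do
    if ! openstack loadbalancer amphora list -f value -c id | grep -q "$old_id"; then
      echo replaced
      return 0
    fi
    i=$((i + 1))
    sleep "${SLEEP_SECS:-10}"
  done
  echo timeout
  return 1
}

# Usage (against a live overcloud):
#   wait_for_replacement 97683590-bbbf-400b-ac26-b0bcfe432980
```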
(In reply to Nir Magnezi from comment #10)
> Hi Noam,
>
> I tested this scenario on your environment (thanks for allowing this, btw),
> and I was not able to reproduce.
>
> Here's what I did:
> 0. I monitored the Octavia logs for all services on all three controller
> nodes.
> 1. I created a loadbalancer and waited for it to become ACTIVE.
> 2. I manually stopped the running amphora instance via 'openstack server
> stop <id>'.
>
> Result:
> Octavia noticed the stale amphora, so it deleted it and spawned a new one
> instead.
> See the log: http://paste.openstack.org/show/724175/

OK, is that also the expected behaviour in a non-HA deployment with a single controller?
(In reply to Noam Manos from comment #11)
> OK, is that also the expected behaviour in a non-HA deployment with a single
> controller?

Yes. The number of controllers does not play a role here. I have tested it on a single controller as well, but that was using the master code, not stable/queens.
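For reference, how quickly a stopped amphora is detected and replaced is governed by the health-manager settings in octavia.conf. The option names below are from the upstream Octavia (Queens-era) configuration reference; verify the defaults against your puddle before relying on them:

```ini
[health_manager]
# Seconds without a heartbeat before an amphora is considered failed
# and a failover is triggered (upstream default: 60).
heartbeat_timeout = 60
# How often, in seconds, the health manager scans for stale amphorae
# (upstream default: 3).
health_check_interval = 3
```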
As a follow-up to comment #10, closing this bug. Feel free to reopen if the problem reproduces.