Bug 1575912 - Amphora VM shutoff does not update loadbalancer and octavia service status
Summary: Amphora VM shutoff does not update loadbalancer and octavia service status
Keywords:
Status: CLOSED WORKSFORME
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-octavia
Version: 13.0 (Queens)
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: ---
Assignee: Nir Magnezi
QA Contact: Noam Manos
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2018-05-08 09:31 UTC by Noam Manos
Modified: 2019-09-10 14:09 UTC (History)
CC List: 9 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-06-26 08:21:52 UTC
Target Upstream Version:
Embargoed:


Attachments
Amphora shutoff -> loadbalancer ONLINE -> Amphora not restarted (25.60 KB, text/plain)
2018-06-19 13:48 UTC, Noam Manos

Description Noam Manos 2018-05-08 09:31:27 UTC
Description of problem:
This happened after the Amphora VMs were shut down (due to a power outage); the shutdown did not update the load balancer or the Octavia service status.

Version-Release number of selected component (if applicable):
OSP: 13   
Puddle: 2018-05-04.1

How reproducible:
Always


Steps to Reproduce:

1) Shut down the Amphora VMs and wait for their status to become SHUTOFF.
2) Check the load balancer status.
3) Check the amphora status via 'openstack loadbalancer amphora list' (see the command sketch below).
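
For reference, a minimal command sketch of these steps; the instance ID placeholder and the load balancer name "LB" are taken from the output below:

  # 1) Stop each amphora instance, then wait for its status to become SHUTOFF
  $ openstack server stop <amphora-instance-id>
  $ watch -n 10 'nova list --all | grep amphora'

  # 2) Check the load balancer operating/provisioning status
  $ openstack loadbalancer show LB

  # 3) Check the amphora records that Octavia tracks
  $ openstack loadbalancer amphora list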


Actual results:

(overcloud) 04:52:09.296 [stack@undercloud-0 ~]$ nova list --all
+--------------------------------------+----------------------------------------------+----------------------------------+---------+------------+-------------+--------------------------------------------------+
| ID                                   | Name                                         | Tenant ID                        | Status  | Task State | Power State | Networks                                         |
+--------------------------------------+----------------------------------------------+----------------------------------+---------+------------+-------------+--------------------------------------------------+
| cf6efbc5-d02f-4af7-9c70-4c488986a23a | amphora-9b983ed7-5a3b-4197-8981-1695cc8a0897 | 4ee04fbdad964bda99e6ac3b16c2398f | SHUTOFF | -          | Shutdown    | int_net=172.16.0.218; lb-mgmt-net=192.168.199.56 |
| c51fdaff-16c3-4989-b291-511d199b05b9 | amphora-f3ad1038-e6ff-4868-a78d-1a6c7fcec3a1 | 4ee04fbdad964bda99e6ac3b16c2398f | SHUTOFF | -          | Shutdown    | int_net=172.16.0.225; lb-mgmt-net=192.168.199.57 |
| 5cff72b9-8b46-4236-bd56-c7163a560849 | vm-rht-1                                     | f421dd896bcb47d28f692036f687fcd8 | SHUTOFF | -          | Shutdown    | int_net=172.16.0.216, 10.0.0.219                 |
| 9cb2f05b-ee90-4940-b191-17dc248dbf94 | vm-rht-2                                     | f421dd896bcb47d28f692036f687fcd8 | SHUTOFF | -          | Shutdown    | int_net=172.16.0.219, 10.0.0.214                 |
+--------------------------------------+----------------------------------------------+----------------------------------+---------+------------+-------------+--------------------------------------------------+

(overcloud) 04:55:11.452 [stack@undercloud-0 ~]$ openstack loadbalancer show LB
+---------------------+--------------------------------------+
| Field               | Value                                |
+---------------------+--------------------------------------+
| admin_state_up      | True                                 |
| created_at          | 2018-05-01T08:30:17                  |
| description         |                                      |
| flavor              |                                      |
| id                  | 88b46da5-1685-4e78-a7d3-85ed8942d183 |
| listeners           | 446539b4-e35c-4e62-81b9-b7f94ba27eb0 |
|                     | ef3681ee-5de2-4546-a35f-5145e16ee235 |
| name                | LB                                   |
| operating_status    | ONLINE                               |
| pools               | a7d69658-ddfe-4087-9665-f6be92ba2ccd |
|                     | d001cc2c-a349-4495-a9b2-13b865676245 |
| project_id          | f421dd896bcb47d28f692036f687fcd8     |
| provider            | octavia                              |
| provisioning_status | ACTIVE                               |
| updated_at          | 2018-05-06T13:57:32                  |
| vip_address         | 172.16.0.220                         |
| vip_network_id      | a898f074-05ad-4882-afd9-1564a08ad18a |
| vip_port_id         | a8b388f2-9b1a-4c7e-a85e-9dc4b0759436 |
| vip_qos_policy_id   | None                                 |
| vip_subnet_id       | 34e5abbf-b084-40c0-8c62-846ae64968e0 |
+---------------------+--------------------------------------+

(overcloud) 04:52:21.700 [stack@undercloud-0 ~]$ openstack loadbalancer amphora list
+--------------------------------------+--------------------------------------+-----------+--------+----------------+--------------+
| id                                   | loadbalancer_id                      | status    | role   | lb_network_ip  | ha_ip        |
+--------------------------------------+--------------------------------------+-----------+--------+----------------+--------------+
| 9b983ed7-5a3b-4197-8981-1695cc8a0897 | 88b46da5-1685-4e78-a7d3-85ed8942d183 | ALLOCATED | MASTER | 192.168.199.56 | 172.16.0.220 |
| f3ad1038-e6ff-4868-a78d-1a6c7fcec3a1 | 88b46da5-1685-4e78-a7d3-85ed8942d183 | ALLOCATED | BACKUP | 192.168.199.57 | 172.16.0.220 |
+--------------------------------------+--------------------------------------+-----------+--------+----------------+--------------+


Expected results:
The load balancer and Octavia services should show an OFFLINE / SHUTDOWN status.

Comment 1 Assaf Muller 2018-05-30 13:53:40 UTC
(On the Network DFG triage call) Noam, can you check whether Octavia spawned another Amphora to replace the one that was shut off?
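
A quick way to check, for example (the same commands used elsewhere in this report), is to compare the amphora records Octavia tracks against the Nova instances:

  $ openstack loadbalancer amphora list
  $ openstack server list --all --name amphora

If Octavia replaced the stopped amphora, a new amphora-<uuid> instance with a fresh ID should appear in both outputs.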

Comment 2 Noam Manos 2018-06-05 15:09:20 UTC
(In reply to Assaf Muller from comment #1)
> (On the Network DFG triage call) Noam, can you check whether Octavia
> spawned another Amphora to replace the one that was shut off?

On puddle 2018-05-15.2, the Octavia operating status after an undercloud reboot is "DEGRADED", and the amphora list is empty:

(overcloud) [stack@undercloud-0 ~]$ openstack loadbalancer show LB
+---------------------+--------------------------------------+
| Field               | Value                                |
+---------------------+--------------------------------------+
| admin_state_up      | True                                 |
| created_at          | 2018-05-27T14:38:59                  |
| description         |                                      |
| flavor              |                                      |
| id                  | 65882eb6-3620-4afb-839e-e855128897bf |
| listeners           | d3570f1e-4ba7-4325-b197-84f9a466b4b7 |
|                     | 8b9722f8-fdd3-4ecb-885c-3aefe3383ff9 |
| name                | LB                                   |
| operating_status    | DEGRADED                             |
| pools               | 0bb0c497-8d18-4016-aa1e-2bf8e5426c7a |
| project_id          | 200299aeef93499b9453a212d3fdd7cd     |
| provider            | octavia                              |
| provisioning_status | ACTIVE                               |
| updated_at          | 2018-06-04T05:47:08                  |
| vip_address         | 192.168.2.35                         |
| vip_network_id      | 986e1796-a83b-473b-bf5b-c9267d66f555 |
| vip_port_id         | 17402591-2db8-4866-88dc-00f1dc270164 |
| vip_qos_policy_id   | None                                 |
| vip_subnet_id       | a2729003-bf3f-48f9-9501-3273dac65048 |
+---------------------+--------------------------------------+
(overcloud) [stack@undercloud-0 ~]$ openstack loadbalancer amphora list

(overcloud) [stack@undercloud-0 ~]$

Comment 3 Nir Magnezi 2018-06-11 13:34:33 UTC
Noam,
Can you please attach logs from all Octavia services so we can see what Octavia tried to do to recover from this?

I think you can use the SOS report tool to capture everything Octavia related.
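
Something along these lines on each controller should capture it (assuming the sos openstack_octavia plugin is available on the node; 'sosreport -l' lists the plugins present):

  # -o restricts collection to the named plugins
  $ sudo sosreport -o openstack_octavia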

Comment 4 Noam Manos 2018-06-18 12:45:39 UTC
(In reply to Nir Magnezi from comment #3)
> Noam,
> Can you please attach logs from all Octavia services so we can see what
> Octavia tried to do to recover from this?
> 
> I think you can use the SOS report tool to capture everything Octavia
> related.

Which sosreport do you need?

Also reproduced on puddle 13 -p 2018-05-24.2:


[08:34:36.202] (overcloud) [stack@undercloud-0 ~]$ openstack loadbalancer amphora list
+--------------------------------------+--------------------------------------+-----------+------------+---------------+--------------+
| id                                   | loadbalancer_id                      | status    | role       | lb_network_ip | ha_ip        |
+--------------------------------------+--------------------------------------+-----------+------------+---------------+--------------+
| 97683590-bbbf-400b-ac26-b0bcfe432980 | dfd907da-114b-4610-84c3-5875c2942a70 | ALLOCATED | STANDALONE | 172.24.0.14   | 192.168.2.39 |
+--------------------------------------+--------------------------------------+-----------+------------+---------------+--------------+

[08:34:41.431] (overcloud) [stack@undercloud-0 ~]$ openstack loadbalancer list
+--------------------------------------+------+----------------------------------+--------------+---------------------+----------+
| id                                   | name | project_id                       | vip_address  | provisioning_status | provider |
+--------------------------------------+------+----------------------------------+--------------+---------------------+----------+
| dfd907da-114b-4610-84c3-5875c2942a70 | LB   | a27a2e1887e947be8ecd433091282850 | 192.168.2.39 | ACTIVE              | octavia  |
+--------------------------------------+------+----------------------------------+--------------+---------------------+----------+

[08:37:32.187] (overcloud) [stack@undercloud-0 ~]$ openstack server list --all
+--------------------------------------+----------------------------------------------+--------+-----------------------------------------------------------+---------------------------------+--------+
| ID                                   | Name                                         | Status | Networks                                                  | Image                           | Flavor |
+--------------------------------------+----------------------------------------------+--------+-----------------------------------------------------------+---------------------------------+--------+
| dd107506-3aa7-4068-9b3d-05acbb021390 | amphora-97683590-bbbf-400b-ac26-b0bcfe432980 | ACTIVE | int_net=192.168.2.35; lb-mgmt-net=172.24.0.14, 10.0.0.220 | octavia-amphora-13.0-20180524.1 |        |
| 8d59874e-ef2f-4433-9c02-1a96ba1cd893 | vm-rht-2                                     | ACTIVE | int_net=192.168.2.43, 10.0.0.218                          | rhel75                          | rhel7  |
| 504d3190-f94e-4910-aac0-cf5f4efde0c8 | vm-rht-1                                     | ACTIVE | int_net=192.168.2.40, 10.0.0.210                          | rhel75                          | rhel7  |
+--------------------------------------+----------------------------------------------+--------+-----------------------------------------------------------+---------------------------------+--------+

[08:38:19.314] (overcloud) [stack@undercloud-0 ~]$ openstack loadbalancer show LB
+---------------------+--------------------------------------+
| Field               | Value                                |
+---------------------+--------------------------------------+
| admin_state_up      | True                                 |
| created_at          | 2018-06-14T09:52:27                  |
| description         |                                      |
| flavor              |                                      |
| id                  | dfd907da-114b-4610-84c3-5875c2942a70 |
| listeners           | d92fe753-71c6-4974-9e6e-46bb45ecfaac |
| name                | LB                                   |
| operating_status    | ONLINE                               |
| pools               |                                      |
| project_id          | a27a2e1887e947be8ecd433091282850     |
| provider            | octavia                              |
| provisioning_status | ACTIVE                               |
| updated_at          | 2018-06-14T09:55:22                  |
| vip_address         | 192.168.2.39                         |
| vip_network_id      | fa6b2bdc-8f64-49b2-9f60-5215deb3ca2c |
| vip_port_id         | db036f3d-b22c-4784-9899-36004dcfb3bd |
| vip_qos_policy_id   | None                                 |
| vip_subnet_id       | aa1f6061-a47e-4158-b055-ac239442d02e |
+---------------------+--------------------------------------+
[08:39:14.317] (overcloud) [stack@undercloud-0 ~]$ openstack loadbalancer amphora list
[ NO Output ]

[08:40:12.857] (overcloud) [stack@undercloud-0 ~]$ openstack loadbalancer show LB
+---------------------+--------------------------------------+
| Field               | Value                                |
+---------------------+--------------------------------------+
| admin_state_up      | True                                 |
| created_at          | 2018-06-14T09:52:27                  |
| description         |                                      |
| flavor              |                                      |
| id                  | dfd907da-114b-4610-84c3-5875c2942a70 |
| listeners           | d92fe753-71c6-4974-9e6e-46bb45ecfaac |
| name                | LB                                   |
| operating_status    | ONLINE                               |
| pools               |                                      |
| project_id          | a27a2e1887e947be8ecd433091282850     |
| provider            | octavia                              |
| provisioning_status | ERROR                                |
| updated_at          | 2018-06-18T12:39:33                  |
| vip_address         | 192.168.2.39                         |
| vip_network_id      | fa6b2bdc-8f64-49b2-9f60-5215deb3ca2c |
| vip_port_id         | db036f3d-b22c-4784-9899-36004dcfb3bd |
| vip_qos_policy_id   | None                                 |
| vip_subnet_id       | aa1f6061-a47e-4158-b055-ac239442d02e |
+---------------------+--------------------------------------+
   
[08:40:27.056] (overcloud) [stack@undercloud-0 ~]$ cat /etc/yum.repos.d/latest-installed 
13   -p 2018-05-24.2

[08:41:37.569] (overcloud) [stack@undercloud-0 ~]$

Comment 6 Noam Manos 2018-06-19 13:48:54 UTC
Created attachment 1452949 [details]
Amphora shutoff -> loadbalancer ONLINE -> Amphora not restarted

After the Amphora VM shutoff, the load balancer still shows operating_status ONLINE, but the Amphora instance is not restarted.

Comment 7 Noam Manos 2018-06-19 13:58:17 UTC
Please see the attachment above.

When issuing 'openstack server stop' to the Amphora (which is like an operating-system shutdown, not a poweroff), the load balancer shows provisioning_status ERROR (not ONLINE, as previously happened on poweroff). Perhaps that's a good indication.

However, don't we expect the loadbalancer to restart the Amphora after a while?
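
For context, as I understand the failover flow, the Octavia health manager should mark an amphora as failed once its heartbeats stop for longer than the configured timeout, and then replace it. A sketch of the relevant octavia.conf options, with what I believe are the upstream defaults:

  [health_manager]
  # seconds without a heartbeat before an amphora is marked failed
  heartbeat_timeout = 60
  # seconds between health manager scans for stale amphorae
  health_check_interval = 3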




[09:29:39.498] (overcloud) [stack@undercloud-0 ~]$ openstack loadbalancer show LB-0
+---------------------+--------------------------------------+
| Field               | Value                                |
+---------------------+--------------------------------------+
| admin_state_up      | True                                 |
| created_at          | 2018-06-19T12:06:34                  |
| description         |                                      |
| flavor              |                                      |
| id                  | ccbc588d-ccc0-4f4a-8b82-55de8a941450 |
| listeners           |                                      |
| name                | LB-0                                 |
| operating_status    | ONLINE                               |
| pools               |                                      |
| project_id          | 8ea5e0946ce54d448df01a77d058b2df     |
| provider            | octavia                              |
| provisioning_status | ERROR                                |
| updated_at          | 2018-06-19T13:29:21                  |
| vip_address         | 192.168.2.33                         |
| vip_network_id      | fa6b2bdc-8f64-49b2-9f60-5215deb3ca2c |
| vip_port_id         | 000d2651-aafa-4b5b-96a2-97d59c2e49ae |
| vip_qos_policy_id   | None                                 |
| vip_subnet_id       | aa1f6061-a47e-4158-b055-ac239442d02e |
+---------------------+--------------------------------------+

Comment 8 Nir Magnezi 2018-06-21 21:00:43 UTC
(In reply to Noam Manos from comment #7)
> Please see attachment above.
> 
> When issuing 'openstack server stop' to the Amphora (which is like an
> operating-system shutdown, not a poweroff), the load balancer shows
> provisioning_status ERROR (not ONLINE, as previously happened on poweroff).
> Perhaps that's a good indication.
> 
> However, don't we expect the loadbalancer to restart the Amphora after a
> while?
> 
> 
> 
> 
> [09:29:39.498] (overcloud) [stack@undercloud-0 ~]$ openstack loadbalancer
> show LB-0
> +---------------------+--------------------------------------+
> | Field               | Value                                |
> +---------------------+--------------------------------------+
> | admin_state_up      | True                                 |
> | created_at          | 2018-06-19T12:06:34                  |
> | description         |                                      |
> | flavor              |                                      |
> | id                  | ccbc588d-ccc0-4f4a-8b82-55de8a941450 |
> | listeners           |                                      |
> | name                | LB-0                                 |
> | operating_status    | ONLINE                               |
> | pools               |                                      |
> | project_id          | 8ea5e0946ce54d448df01a77d058b2df     |
> | provider            | octavia                              |
> | provisioning_status | ERROR                                |
> | updated_at          | 2018-06-19T13:29:21                  |
> | vip_address         | 192.168.2.33                         |
> | vip_network_id      | fa6b2bdc-8f64-49b2-9f60-5215deb3ca2c |
> | vip_port_id         | 000d2651-aafa-4b5b-96a2-97d59c2e49ae |
> | vip_qos_policy_id   | None                                 |
> | vip_subnet_id       | aa1f6061-a47e-4158-b055-ac239442d02e |
> +---------------------+--------------------------------------+

From what I've tested, Octavia spawned a new Amphora instance and deleted the one that I powered off.

If that does not happen in your setup, I'd appreciate an opportunity to log in and have a look at it. Thanks!

Comment 10 Nir Magnezi 2018-06-24 07:44:34 UTC
Hi Noam,

I tested this scenario on your environment (thanks for allowing this btw), and I was not able to reproduce.

Here's what I did:
0. I monitored the Octavia logs for all services on all three controller nodes.
1. I created a loadbalancer and waited for it to become ACTIVE.
2. I manually stopped the running amphora instance via 'openstack server stop <id>'.

Result:
Octavia noticed the stale amphora, so it deleted it and spawned a new one instead.
See the log: http://paste.openstack.org/show/724175/
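
For anyone re-testing: I was watching roughly the following on each controller (the log paths below are what a containerized OSP 13 deployment typically uses and may differ per setup):

  # health manager notices the dead amphora and triggers the failover
  $ sudo tail -f /var/log/containers/octavia/health-manager.log
  # worker builds the replacement amphora
  $ sudo tail -f /var/log/containers/octavia/worker.log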

Comment 11 Noam Manos 2018-06-24 07:54:02 UTC
(In reply to Nir Magnezi from comment #10)
> Hi Noam,
> 
> I tested this scenario on your environment (thanks for allowing this btw),
> and I was not able to reproduce.
> 
> Here's what I did:
> 0. I monitored the Octavia logs for all services on all three controller
> nodes.
> 1. I created a loadbalancer and waited for it to become ACTIVE.
> 2. I manually stopped the running amphora instance via 'openstack server
> stop <id>'.
> 
> Result:
> Octavia noticed the stale amphora, so it deleted it and spawned a new one
> instead.
> See the log: http://paste.openstack.org/show/724175/


OK, is that also the expected behaviour in a non-HA deployment with a single controller?

Comment 12 Nir Magnezi 2018-06-24 08:48:16 UTC
(In reply to Noam Manos from comment #11)
> 
> OK, is that also the expected behaviour in a non-HA deployment with a
> single controller?

Yes.
The number of controllers does not play a role here.
I have tested it on a single controller as well, but that was using the master code, not stable/queens.

Comment 13 Nir Magnezi 2018-06-26 08:21:52 UTC
As a follow-up to comment #10, closing this bug.
Feel free to reopen if the problem reproduces.

