Bug 1814391
| Summary: | [Active/Standby] When rebooting the compute host that hosts the MASTER amphora, master amphora enters ERROR state | | |
|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Bruna Bonguardo <bbonguar> |
| Component: | openstack-octavia | Assignee: | Assaf Muller <amuller> |
| Status: | CLOSED NOTABUG | QA Contact: | Bruna Bonguardo <bbonguar> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 16.0 (Train) | CC: | ihrachys, lpeer, majopela, michjohn, scohen |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2020-03-17 22:51:29 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
The above information is an example of the active/standby feature working correctly: the backup amphora assumed responsibility for the traffic flows and the load balancer continued to serve traffic even though one of the compute instances had failed.

That said, I am guessing your concern is the load balancer and one of the amphorae being in ERROR state. This is expected in your configuration. You have anti-affinity enabled for nova and only two compute hosts in your cloud. While one compute host was down during the reboot, the controller could not rebuild the failed amphora without violating the configured anti-affinity rule, because the only healthy compute host was the one already hosting the healthy, active amphora. The health manager log reflects this in the revert logging when it was unable to rebuild the failed amphora due to nova:

```
raise exception.NoValidHost(reason=reason)

nova.exception.NoValidHost: No valid host was found. There are not enough hosts available.
```

This is the failover flow doing its job, and it is unrelated to the active/standby capability of the amphora, which was successful.
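For reference, a hedged sketch of how this situation is usually checked and recovered once the rebooted compute host is back up. It assumes the standard python-octaviaclient commands and the anti-affinity options in the upstream octavia.conf `[nova]` section (`enable_anti_affinity`, `anti_affinity_policy`); verify names and availability against the deployed OSP 16 version. The UUID below is the load balancer from this report.

```bash
# Confirm the rebooted compute host is reporting up again before retrying anything.
openstack compute service list --service nova-compute

# Trigger a failover so the health manager rebuilds the ERRORed MASTER amphora
# now that a valid host exists for the anti-affinity placement.
openstack loadbalancer failover 234ab679-db41-440c-8ec6-d704ec0a4c48

# Watch the amphorae return to ALLOCATED and the load balancer to ACTIVE.
openstack loadbalancer amphora list --loadbalancer 234ab679-db41-440c-8ec6-d704ec0a4c48
openstack loadbalancer show 234ab679-db41-440c-8ec6-d704ec0a4c48
```

With only two compute hosts, a hard `anti-affinity` policy leaves nova no valid host while either of them is down, which is exactly the NoValidHost error seen in the health manager log above.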
Octavia Active/Backup deployment, with anti_affinity=true.

Version: 16-trunk -p RHOS_TRUNK-16.0-RHEL-8-20200226.n.1

When rebooting the compute host that hosts the MASTER amphora, the load balancer and the MASTER amphora both enter ERROR state.

- Master amphora status turns to ERROR.
- Load balancer's provisioning status turns to ERROR.
- Backup amphora works: traffic works and is load balanced properly.

Master compute host:

```
(overcloud) [stack@undercloud-0 ~]$ openstack server show 9768caee-b618-4cd3-b6fb-e8819b065c62 | grep hypervisor_hostname
| OS-EXT-SRV-ATTR:hypervisor_hostname | compute-0.redhat.local
```

Rebooting compute host:

```
[2020-03-17 14:14:38] (undercloud) [stack@undercloud-0 ~]$ openstack server reboot compute-0
[2020-03-17 14:14:53] (undercloud) [stack@undercloud-0 ~]$
```

After the reboot, the compute-0 node looks fine:

```
(undercloud) [stack@undercloud-0 ~]$ openstack server list | grep compute-0
| ec912669-dd96-4b3f-85b9-4830e2e1a3a9 | compute-0 | ACTIVE | ctlplane=192.168.24.12 | overcloud-full | compute |
```

The MASTER amphora is in ERROR state:

```
[2020-03-17 14:22:36] (overcloud) [stack@undercloud-0 ~]$ openstack loadbalancer amphora list
+--------------------------------------+--------------------------------------+-----------+--------+---------------+-----------+
| id                                   | loadbalancer_id                      | status    | role   | lb_network_ip | ha_ip     |
+--------------------------------------+--------------------------------------+-----------+--------+---------------+-----------+
| f6c2d8c8-cf1b-4129-ba47-fabca029aac1 | 234ab679-db41-440c-8ec6-d704ec0a4c48 | ALLOCATED | BACKUP | 172.24.1.218  | 10.0.1.95 |
| 6bf8a9d9-a777-48a6-b94c-87d5b751d681 | 234ab679-db41-440c-8ec6-d704ec0a4c48 | ERROR     | MASTER | 172.24.2.142  | 10.0.1.95 |
+--------------------------------------+--------------------------------------+-----------+--------+---------------+-----------+

[2020-03-17 14:22:46] (overcloud) [stack@undercloud-0 ~]$ openstack loadbalancer status show lb2
{
    "loadbalancer": {
        "id": "234ab679-db41-440c-8ec6-d704ec0a4c48",
        "name": "lb2",
        "operating_status": "ONLINE",
        "provisioning_status": "ERROR",
        "listeners": [
            {
                "id": "933a3d73-058d-4a5c-9128-190b5dfd996e",
                "name": "listener2",
                "operating_status": "ONLINE",
                "provisioning_status": "ACTIVE",
                "pools": [
                    {
                        "id": "82d83915-4a8c-4153-a6d9-9b12a117b12b",
                        "name": "pool2",
                        "provisioning_status": "ACTIVE",
                        "operating_status": "ONLINE",
                        "health_monitor": {
                            "id": "bbf65f71-142c-4875-bf1b-09147e9a10f5",
                            "name": "",
                            "type": "HTTP",
                            "provisioning_status": "ACTIVE",
                            "operating_status": "ONLINE"
                        },
                        "members": [
                            {
                                "id": "7c87f849-156a-413d-b077-105ab9171280",
                                "name": "",
                                "operating_status": "ONLINE",
                                "provisioning_status": "ACTIVE",
                                "address": "10.0.1.153",
                                "protocol_port": 8080
                            },
                            {
                                "id": "ec502257-652f-49d7-82f0-e6399e1626f1",
                                "name": "",
                                "operating_status": "ONLINE",
                                "provisioning_status": "ACTIVE",
                                "address": "10.0.1.20",
                                "protocol_port": 8080
                            }
```
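For completeness, a hedged sketch of how the "traffic works and is load balanced properly" observation can be checked from a host with access to the VIP network. The VIP 10.0.1.95 is the ha_ip from the amphora listing above; the listener's protocol_port is not visible in the truncated status output, so port 80 is assumed here, and the loop itself is an illustration rather than part of the original report.

```bash
# Poll the VIP while the compute host reboots; responses should keep arriving
# and should alternate between the two pool members behind port 8080.
while true; do
    curl -s --max-time 2 http://10.0.1.95/ || echo "request failed"
    sleep 1
done
```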