Bug 1936529 - [RFE] Mechanism to fence service API endpoints off
Summary: [RFE] Mechanism to fence service API endpoints off
Keywords:
Status: NEW
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo
Version: 17.0 (Wallaby)
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: James Slagle
QA Contact: Joe H. Rahme
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-03-08 17:09 UTC by Carlos Goncalves
Modified: 2023-08-17 01:20 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Bugzilla | 1927169 | 1 | unspecified | CLOSED | DBAPIError exception wrapped from (pymysql.err.InternalError) (1054, u"Unknown column 'pool.tls_certificate_id' in 'fiel... | 2021-03-08 18:16:14 UTC
Red Hat Bugzilla | 1936018 | 1 | urgent | CLOSED | Error Occured During Migration OVS to OVN | 2022-10-03 14:31:02 UTC
Red Hat Issue Tracker | OSP-2736 | 0 | None | None | None | 2023-05-08 14:21:31 UTC

Description Carlos Goncalves 2021-03-08 17:09:33 UTC
Cloud operators may need to fence all or a subset of service API endpoints off from their users.

  * Neutron ML2/OVS to ML2/OVN migration requires Neutron API fencing
    Race conditions can occur when users trigger operations on objects through the Neutron API while the migration tool is creating those resources in OVN. The resulting mismatches cause the migration to fail (see BZ #1936018).

  * Octavia-enabled OSP 13 z12 or older to OSP 13 z13 or newer updates
    Octavia was upgraded to the Train version in OSP 13 z13. The upgrade process requires a database schema update, so users must not be able to perform mutating operations (create, update, delete) on Octavia load balancing resources while it runs. A note to this effect was added to the documentation, but we have had cases where the cloud operator did not fence the Octavia API, either because they were not aware of the note or because they expected Director to handle it seamlessly (see BZ #1927169).

  * Faulty service instances not detected by the service health checks (tripleo-common) will still be in the HAProxy server list and thus traffic will be forwarded to them.

User stories:
  * As a cloud operator, I want to fence all or a subset of service API endpoints off so API users have no access to them.
  * As a cloud operator, I want to fence all or a subset of service API endpoints off so API users have limited access (e.g. read-only).

A possible fencing mechanism would be to adjust the HAProxy configuration.
For example, taking servers out of rotation (user story #1) or allowing requests to HTTP 2xx (user story #2).
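
As a rough illustration of user story #1, the sketch below builds HAProxy runtime API commands that put servers into maintenance mode, and sends them over the admin socket. It is only a sketch under stated assumptions: the socket path, the backend name "neutron" and the server names are hypothetical, and the socket must be configured with admin level. It is not how Director does or would implement this.

```python
import socket


def fence_commands(backend, servers, fenced=True):
    """Build runtime API commands that fence or unfence the given servers."""
    # "maint" takes a server out of rotation; "ready" puts it back.
    state = "maint" if fenced else "ready"
    return [f"set server {backend}/{name} state {state}" for name in servers]


def send_commands(commands, socket_path="/var/lib/haproxy/stats"):
    """Send each command to the HAProxy admin socket and collect the replies.

    socket_path is an assumption; use the path from the "stats socket" line
    of the deployed haproxy.cfg.
    """
    replies = []
    for command in commands:
        # One connection per command: in non-interactive mode HAProxy
        # closes the connection after answering.
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
            sock.connect(socket_path)
            sock.sendall(command.encode() + b"\n")
            chunks = []
            while True:
                data = sock.recv(4096)
                if not data:
                    break
                chunks.append(data)
            replies.append(b"".join(chunks).decode())
    return replies
```

For example, fence_commands("neutron", ["controller-0", "controller-1"]) yields one "set server neutron/<name> state maint" command per server, and passing fenced=False produces the matching "state ready" commands to lift the fence.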

Comment 1 Carlos Goncalves 2021-03-09 09:39:41 UTC
The last sentence in comment #0 should read instead as follows:
  "For example, taking servers out of rotation (user story #1) or allowing GET requests to the API endpoints (user story #2)."
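
To make the read-only case (user story #2) more concrete, here is a minimal sketch that renders an HAProxy rule denying every request whose method is not in an allowed list. The helper names, the default of allowing only GET, and the 503 deny status are assumptions made for illustration; where the rule would be placed in the generated haproxy.cfg is left open.

```python
def read_only_rules(allowed_methods=("GET",), deny_status=503):
    """Render HAProxy config lines that reject all but the allowed methods."""
    methods = " ".join(allowed_methods)
    # Anonymous ACL: the request is denied unless its method matches one
    # of the listed values.
    return [
        f"http-request deny deny_status {deny_status} "
        f"unless {{ method {methods} }}"
    ]


def is_allowed(method, allowed_methods=("GET",)):
    """Mirror of the rule above, handy for checking expectations in tests."""
    return method.upper() in allowed_methods
```

With the defaults this renders a single line, "http-request deny deny_status 503 unless { method GET }". Monitoring tools that also rely on HEAD or OPTIONS could be accommodated by passing allowed_methods=("GET", "HEAD", "OPTIONS").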

Comment 2 Daniel Alvarez Sanchez 2021-03-09 10:01:25 UTC
Echoing the need for this RFE from the Neutron side.

I like the HAProxy way for this fencing mechanism because:

- I believe it's faster to restart haproxy than all the services to honor the fencing.
- Allowing GET operations is probably going to cope better with any potential monitoring mechanisms in place.
- Seems like a global solution for all the endpoints.


+1 from me :)

Another solution that we briefly discussed yesterday is through policies, but I think it is less flexible than through HAProxy.

Comment 3 Sofer Athlan-Guyot 2021-03-15 17:00:46 UTC
Hi,

+1 on HAProxy, but note that changes made to the HAProxy configuration on a live environment will be "reset" to the configured default during an update.

Just let me know if some more information is needed around that use case.

Thanks,

