Bug 1840088

Summary: Unable to delete Load Balancer pool with ERROR status
Product: Red Hat OpenStack
Component: openstack-octavia
Reporter: Carlos Goncalves <cgoncalves>
Assignee: Carlos Goncalves <cgoncalves>
Status: CLOSED ERRATA
QA Contact: Bruna Bonguardo <bbonguar>
Severity: high
Priority: medium
Version: 13.0 (Queens)
Target Milestone: z13
Target Release: 13.0 (Queens)
Keywords: Triaged, ZStream
CC: bbonguar, ihrachys, lpeer, majopela, mdemaced, michjohn, scohen
Hardware: Unspecified
OS: Unspecified
Fixed In Version: openstack-octavia-5.0.2-0.20200626104656.2f7adfd
Clone Of: 1837033
Bug Depends On: 1837033
Last Closed: 2020-10-28 18:33:16 UTC

Description Carlos Goncalves 2020-05-26 11:01:22 UTC
+++ This bug was initially created as a clone of Bug #1837033 +++

Description of problem:

When attempting to delete a load balancer pool in ERROR status, the pool transitions from PENDING_DELETE back to ERROR and is never deleted.
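
For illustration, this is roughly what the behavior looks like from the OpenStack client (the pool name "error-pool" is a placeholder; the statuses are as described above):

$ openstack loadbalancer pool show error-pool -c provisioning_status -f value
ERROR
$ openstack loadbalancer pool delete error-pool
$ openstack loadbalancer pool show error-pool -c provisioning_status -f value
PENDING_DELETE
$ # ...once the delete flow fails on the worker and reverts:
$ openstack loadbalancer pool show error-pool -c provisioning_status -f value
ERROR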

The worker showed the following traceback:

2020-05-18 15:36:14.243 22 ERROR octavia.amphorae.drivers.haproxy.exceptions [req-6ce6dfd2-0fc7-4fef-bdd4-e834b14bd8c1 - f6b6420743ce45ac868c102a523ffde6 - - -] Amphora agent returned unexpected result code 500 with response {'message': 'Error reloading haproxy', 'details': 'Redirecting to /bin/systemctl reload haproxy-6218ba3d-0290-4c25-a5e1-502f374f8b6c.service\nJob for haproxy-6218ba3d-0290-4c25-a5e1-502f374f8b6c.service failed because the control process exited with error code.\nSee "systemctl status haproxy-6218ba3d-0290-4c25-a5e1-502f374f8b6c.service" and "journalctl -xe" for details.\n'}
2020-05-18 15:36:14.246 22 WARNING octavia.controller.worker.v1.controller_worker [-] Task 'octavia.controller.worker.v1.tasks.amphora_driver_tasks.ListenersUpdate' (25cef1bd-5857-4c5b-acdf-6e96278d082b) transitioned into state 'FAILURE' from state 'RUNNING'
5 predecessors (most recent first):
  Atom 'octavia.controller.worker.v1.tasks.model_tasks.DeleteModelObject' {'intention': 'EXECUTE', 'state': 'SUCCESS', 'requires': {'object': <octavia.common.data_models.Pool object at 0x7f260c6fcb38>}, 'provides': None}
  |__Atom 'octavia.controller.worker.v1.tasks.database_tasks.CountPoolChildrenForQuota' {'intention': 'EXECUTE', 'state': 'SUCCESS', 'requires': {'pool': <octavia.common.data_models.Pool object at 0x7f260c6fcb38>}, 'provides': {'HM': 0, 'member': 0}}
     |__Atom 'octavia.controller.worker.v1.tasks.database_tasks.MarkPoolPendingDeleteInDB' {'intention': 'EXECUTE', 'state': 'SUCCESS', 'requires': {'pool': <octavia.common.data_models.Pool object at 0x7f260c6fcb38>}, 'provides': None}
        |__Atom 'octavia.controller.worker.v1.tasks.lifecycle_tasks.PoolToErrorOnRevertTask' {'intention': 'EXECUTE', 'state': 'SUCCESS', 'requires': {'pool': <octavia.common.data_models.Pool object at 0x7f260c6fcb38>, 'listeners': [<octavia.common.data_models.Listener object at 0x7f260c6fcbe0>], 'loadbalancer': <octavia.common.data_models.LoadBalancer object at 0x7f260c6fc198>}, 'provides': None}
           |__Flow 'octavia-delete-pool-flow': octavia.amphorae.drivers.haproxy.exceptions.InternalServerError: Internal Server Error
2020-05-18 15:36:14.246 22 ERROR octavia.controller.worker.v1.controller_worker Traceback (most recent call last):
2020-05-18 15:36:14.246 22 ERROR octavia.controller.worker.v1.controller_worker   File "/usr/lib/python3.6/site-packages/taskflow/engines/action_engine/executor.py", line 53, in _execute_task
2020-05-18 15:36:14.246 22 ERROR octavia.controller.worker.v1.controller_worker     result = task.execute(**arguments)
2020-05-18 15:36:14.246 22 ERROR octavia.controller.worker.v1.controller_worker   File "/usr/lib/python3.6/site-packages/octavia/controller/worker/v1/tasks/amphora_driver_tasks.py", line 76, in execute
2020-05-18 15:36:14.246 22 ERROR octavia.controller.worker.v1.controller_worker     self.amphora_driver.update(loadbalancer)
2020-05-18 15:36:14.246 22 ERROR octavia.controller.worker.v1.controller_worker   File "/usr/lib/python3.6/site-packages/octavia/amphorae/drivers/haproxy/rest_api_driver.py", line 251, in update
2020-05-18 15:36:14.246 22 ERROR octavia.controller.worker.v1.controller_worker     self.update_amphora_listeners(loadbalancer, amphora)
2020-05-18 15:36:14.246 22 ERROR octavia.controller.worker.v1.controller_worker   File "/usr/lib/python3.6/site-packages/octavia/amphorae/drivers/haproxy/rest_api_driver.py", line 224, in update_amphora_listeners
2020-05-18 15:36:14.246 22 ERROR octavia.controller.worker.v1.controller_worker     amphora, loadbalancer.id, timeout_dict=timeout_dict)
2020-05-18 15:36:14.246 22 ERROR octavia.controller.worker.v1.controller_worker   File "/usr/lib/python3.6/site-packages/octavia/amphorae/drivers/haproxy/rest_api_driver.py", line 863, in _action
2020-05-18 15:36:14.246 22 ERROR octavia.controller.worker.v1.controller_worker     return exc.check_exception(r)
2020-05-18 15:36:14.246 22 ERROR octavia.controller.worker.v1.controller_worker   File "/usr/lib/python3.6/site-packages/octavia/amphorae/drivers/haproxy/exceptions.py", line 43, in check_exception
2020-05-18 15:36:14.246 22 ERROR octavia.controller.worker.v1.controller_worker     raise responses[status_code]()
2020-05-18 15:36:14.246 22 ERROR octavia.controller.worker.v1.controller_worker octavia.amphorae.drivers.haproxy.exceptions.InternalServerError: Internal Server Error
2020-05-18 15:36:14.246 22 ERROR octavia.controller.worker.v1.controller_worker 
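
The pool ends up back in ERROR because the delete flow fails at ListenersUpdate and its revert path runs, with PoolToErrorOnRevertTask marking the pool ERROR again. A minimal taskflow sketch of that revert pattern (task names and the dict-based pool are illustrative, not Octavia's actual code):

from taskflow import engines
from taskflow import task
from taskflow.patterns import linear_flow

class PoolToErrorOnRevert(task.Task):
    # Analogous to a lifecycle task: does nothing on execute, but on revert
    # it pushes the object back to ERROR.
    def execute(self, pool):
        pass

    def revert(self, pool, *args, **kwargs):
        pool['provisioning_status'] = 'ERROR'

class FailingListenersUpdate(task.Task):
    # Stand-in for the ListenersUpdate task that fails when the amphora
    # agent cannot reload HAProxy.
    def execute(self, pool):
        raise RuntimeError('Internal Server Error')

flow = linear_flow.Flow('delete-pool-flow').add(
    PoolToErrorOnRevert(),
    FailingListenersUpdate(),
)
pool = {'provisioning_status': 'PENDING_DELETE'}
try:
    engines.run(flow, store={'pool': pool})
except RuntimeError:
    pass
print(pool['provisioning_status'])  # prints: ERROR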


The HAProxy service failed to reload:

[cloud-user@amphora-4db4a3cf-9fef-4057-b1fd-b2afbf7a8a0f ~]$ sudo journalctl -u haproxy-6218ba3d-0290-4c25-a5e1-502f374f8b6c.service 
-- Logs begin at Sat 2020-05-16 19:36:19 EDT, end at Mon 2020-05-18 12:08:44 EDT. --
May 16 19:37:12 amphora-4db4a3cf-9fef-4057-b1fd-b2afbf7a8a0f systemd[1]: Starting HAProxy Load Balancer...
May 16 19:37:12 amphora-4db4a3cf-9fef-4057-b1fd-b2afbf7a8a0f ip[5085]: [WARNING] 136/193712 (5085) : [/usr/sbin/haproxy.main()] Cannot raise FD limit to 2500031, limit is 2097152.
May 16 19:37:12 amphora-4db4a3cf-9fef-4057-b1fd-b2afbf7a8a0f ip[5085]: [WARNING] 136/193712 (5085) : [/usr/sbin/haproxy.main()] FD limit (2097152) too low for maxconn=1000000/maxsock=2500031. Please raise 'ulimit-n' to 2500031 or more to>
May 16 19:37:12 amphora-4db4a3cf-9fef-4057-b1fd-b2afbf7a8a0f systemd[1]: Started HAProxy Load Balancer.
May 16 19:37:16 amphora-4db4a3cf-9fef-4057-b1fd-b2afbf7a8a0f systemd[1]: Reloading HAProxy Load Balancer.
May 16 19:37:16 amphora-4db4a3cf-9fef-4057-b1fd-b2afbf7a8a0f haproxy[5141]: Configuration file is valid
May 16 19:37:16 amphora-4db4a3cf-9fef-4057-b1fd-b2afbf7a8a0f systemd[1]: haproxy-6218ba3d-0290-4c25-a5e1-502f374f8b6c.service: Main process exited, code=exited, status=1/FAILURE
May 16 19:37:16 amphora-4db4a3cf-9fef-4057-b1fd-b2afbf7a8a0f ip[5085]: [WARNING] 136/193712 (5085) : Reexecuting Master process
May 16 19:37:16 amphora-4db4a3cf-9fef-4057-b1fd-b2afbf7a8a0f ip[5085]: Usage : haproxy [-f <cfgfile|cfgdir>]* [ -vdVD ] [ -n <maxconn> ] [ -N <maxpconn> ]
May 16 19:37:16 amphora-4db4a3cf-9fef-4057-b1fd-b2afbf7a8a0f ip[5085]:         [ -p <pidfile> ] [ -m <max megs> ] [ -C <dir> ] [-- <cfgfile>*]
May 16 19:37:16 amphora-4db4a3cf-9fef-4057-b1fd-b2afbf7a8a0f ip[5085]:         -v displays version ; -vv shows known build options.
May 16 19:37:16 amphora-4db4a3cf-9fef-4057-b1fd-b2afbf7a8a0f ip[5085]:         -d enters debug mode ; -db only disables background mode.
May 16 19:37:16 amphora-4db4a3cf-9fef-4057-b1fd-b2afbf7a8a0f ip[5085]:         -dM[<byte>] poisons memory with <byte> (defaults to 0x50)
May 16 19:37:16 amphora-4db4a3cf-9fef-4057-b1fd-b2afbf7a8a0f ip[5085]:         -V enters verbose mode (disables quiet mode)
May 16 19:37:16 amphora-4db4a3cf-9fef-4057-b1fd-b2afbf7a8a0f ip[5085]:         -D goes daemon ; -C changes to <dir> before loading files.
May 16 19:37:16 amphora-4db4a3cf-9fef-4057-b1fd-b2afbf7a8a0f ip[5085]:         -W master-worker mode.
May 16 19:37:16 amphora-4db4a3cf-9fef-4057-b1fd-b2afbf7a8a0f ip[5085]:         -Ws master-worker mode with systemd notify support.
May 16 19:37:16 amphora-4db4a3cf-9fef-4057-b1fd-b2afbf7a8a0f ip[5085]:         -q quiet mode : don't display messages
May 16 19:37:16 amphora-4db4a3cf-9fef-4057-b1fd-b2afbf7a8a0f ip[5085]:         -c check mode : only check config files and exit
May 16 19:37:16 amphora-4db4a3cf-9fef-4057-b1fd-b2afbf7a8a0f ip[5085]:         -n sets the maximum total # of connections (2000)
May 16 19:37:16 amphora-4db4a3cf-9fef-4057-b1fd-b2afbf7a8a0f ip[5085]:         -m limits the usable amount of memory (in MB)
May 16 19:37:16 amphora-4db4a3cf-9fef-4057-b1fd-b2afbf7a8a0f ip[5085]:         -N sets the default, per-proxy maximum # of connections (2000)
May 16 19:37:16 amphora-4db4a3cf-9fef-4057-b1fd-b2afbf7a8a0f ip[5085]:         -L set local peer name (default to hostname)
May 16 19:37:16 amphora-4db4a3cf-9fef-4057-b1fd-b2afbf7a8a0f ip[5085]:         -p writes pids of all children to this file
May 16 19:37:16 amphora-4db4a3cf-9fef-4057-b1fd-b2afbf7a8a0f ip[5085]:         -de disables epoll() usage even when available
May 16 19:37:16 amphora-4db4a3cf-9fef-4057-b1fd-b2afbf7a8a0f ip[5085]:         -dp disables poll() usage even when available
May 16 19:37:16 amphora-4db4a3cf-9fef-4057-b1fd-b2afbf7a8a0f ip[5085]:         -dS disables splice usage (broken on old kernels)
May 16 19:37:16 amphora-4db4a3cf-9fef-4057-b1fd-b2afbf7a8a0f ip[5085]:         -dG disables getaddrinfo() usage
May 16 19:37:16 amphora-4db4a3cf-9fef-4057-b1fd-b2afbf7a8a0f ip[5085]:         -dR disables SO_REUSEPORT usage
May 16 19:37:16 amphora-4db4a3cf-9fef-4057-b1fd-b2afbf7a8a0f ip[5085]:         -dr ignores server address resolution failures
May 16 19:37:16 amphora-4db4a3cf-9fef-4057-b1fd-b2afbf7a8a0f ip[5085]:         -dV disables SSL verify on servers side
May 16 19:37:16 amphora-4db4a3cf-9fef-4057-b1fd-b2afbf7a8a0f ip[5085]:         -sf/-st [pid ]* finishes/terminates old pids.
May 16 19:37:16 amphora-4db4a3cf-9fef-4057-b1fd-b2afbf7a8a0f ip[5085]:         -x <unix_socket> get listening sockets from a unix socket
May 16 19:37:16 amphora-4db4a3cf-9fef-4057-b1fd-b2afbf7a8a0f ip[5085]: HA-Proxy version 1.8.15 2018/12/13
May 16 19:37:16 amphora-4db4a3cf-9fef-4057-b1fd-b2afbf7a8a0f ip[5085]: Copyright 2000-2018 Willy Tarreau <willy>
May 16 19:37:16 amphora-4db4a3cf-9fef-4057-b1fd-b2afbf7a8a0f systemd[1]: haproxy-6218ba3d-0290-4c25-a5e1-502f374f8b6c.service: Killing process 5087 (haproxy) with signal SIGKILL.
May 16 19:37:16 amphora-4db4a3cf-9fef-4057-b1fd-b2afbf7a8a0f systemd[1]: haproxy-6218ba3d-0290-4c25-a5e1-502f374f8b6c.service: Failed with result 'exit-code'.
May 16 19:37:16 amphora-4db4a3cf-9fef-4057-b1fd-b2afbf7a8a0f systemd[1]: Reload failed for HAProxy Load Balancer.
May 16 19:37:17 amphora-4db4a3cf-9fef-4057-b1fd-b2afbf7a8a0f systemd[1]: haproxy-6218ba3d-0290-4c25-a5e1-502f374f8b6c.service: Service RestartSec=100ms expired, scheduling restart.
May 16 19:37:17 amphora-4db4a3cf-9fef-4057-b1fd-b2afbf7a8a0f systemd[1]: haproxy-6218ba3d-0290-4c25-a5e1-502f374f8b6c.service: Scheduled restart job, restart counter is at 1.
May 16 19:37:17 amphora-4db4a3cf-9fef-4057-b1fd-b2afbf7a8a0f systemd[1]: Stopped HAProxy Load Balancer.

The Octavia team mentioned the issue seems to be related to the peer name passed to HAProxy via the '-L' option: the amphora ID is converted to a shorter string to overcome a length limitation on the HAProxy peer name, and in this case the converted value starts with '-', so HAProxy treats it as the '-x' option rather than a peer name.

Process: 25479 ExecStart=/sbin/ip netns exec amphora-haproxy /usr/sbin/haproxy -Ws -f $CONFIG -f $USERCONFIG -p $PIDFILE -L -xqntK8jJ_gE3QEmh-D1-XgCW_E (code=exited, status=1/FAILURE)
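
A minimal sketch of the failure mode and a possible mitigation, assuming the peer name is derived by hashing and base64-encoding the amphora ID (the function below is illustrative, not Octavia's actual implementation):

import base64
import hashlib

def make_peer_name(amphora_id):
    # Illustrative conversion: shorten the amphora UUID to satisfy HAProxy's
    # peer-name length limit by hashing and base64-encoding it ('_' and '-'
    # stand in for the usual '+' and '/' characters).
    digest = hashlib.sha1(amphora_id.encode('utf-8')).digest()
    return base64.b64encode(digest, altchars=b'_-').decode('utf-8').rstrip('=')

peer = make_peer_name('4db4a3cf-9fef-4057-b1fd-b2afbf7a8a0f')

# If the encoded value happens to start with '-', the unit's command line
#   /usr/sbin/haproxy -Ws -f $CONFIG ... -L <peer>
# goes wrong on reload: the master re-executes itself ("Reexecuting Master
# process" in the journal above) and the peer name beginning with '-x' ends
# up being treated as an option, so haproxy prints its usage message and the
# reload fails.

# A hedged mitigation, conceptually what the fix amounts to: never let the
# generated name begin with '-', e.g. by replacing a leading '-' with a letter.
safe_peer = 'x' + peer[1:] if peer.startswith('-') else peer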

Version-Release number of selected component (if applicable):

Red Hat OpenStack Platform release 16.0.2 (Train)

How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:

The pool transitions from PENDING_DELETE back to ERROR and is never deleted.


Expected results:

The pool is deleted even though it was in ERROR status.

Additional info:

Comment 15 errata-xmlrpc 2020-10-28 18:33:16 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (openstack-neutron bug fix advisory), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4397