Bug 1449938 - Octavia active/standby config+ pool with sourceip session persistence configuration- Service is not available and LB is not deleted after test
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-octavia
Version: 11.0 (Ocata)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ga
Target Release: 12.0 (Pike)
Assignee: Nir Magnezi
QA Contact: Alexander Stafeyev
URL:
Whiteboard:
Depends On: 1433537
Blocks: 1433523
 
Reported: 2017-05-11 08:19 UTC by Alexander Stafeyev
Modified: 2019-09-10 14:12 UTC (History)

Fixed In Version: openstack-octavia-1.0.0-0.20170710195317.e37c0c7.el7ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-12-13 21:25:26 UTC
Target Upstream Version:


Attachments
Amphora log (10.87 KB, text/plain)
2017-05-11 14:31 UTC, Alexander Stafeyev
Deeper debugging info (18.24 KB, text/plain)
2017-05-15 12:54 UTC, Alexander Stafeyev


Links
System ID Private Priority Status Summary Last Updated
Launchpad 1681623 0 None None None 2017-07-13 13:54:23 UTC
Launchpad 1690812 0 None None None 2017-05-15 13:00:04 UTC
OpenStack gerrit 455569 0 'None' MERGED Fix error 500 when using SOURCE_IP and APP_COOKIE 2020-05-14 14:02:56 UTC
Red Hat Product Errata RHEA-2017:3462 0 normal SHIPPED_LIVE Red Hat OpenStack Platform 12.0 Enhancement Advisory 2018-02-16 01:43:25 UTC

Description Alexander Stafeyev 2017-05-11 08:19:02 UTC
Description of problem:
I run Octavia (RHEL amphora) in active_standby mode.
The test is an LBaaS scenario test:



Version-Release number of selected component (if applicable):
rhos11

How reproducible:

100%
Steps to Reproduce:
1. Deploy a setup with Octavia support (two compute nodes are better) and run all required post-deployment steps.
2. Run the following neutron tempest test:


Actual results:
The test fails and the LB cannot be deleted (please see the attached log file).

Expected results:
The test should succeed. With a "single" rather than "active_standby" Octavia config it runs well.

Additional info:
Logs will be attached soon

Comment 1 Alexander Stafeyev 2017-05-11 14:31:02 UTC
Created attachment 1277912
Amphora log

Comment 2 Alexander Stafeyev 2017-05-15 12:54:22 UTC
Created attachment 1278975
Deeper debugging info

In this attachment you can find the RHEL amphora debugging results.

Comment 3 Nir Magnezi 2017-06-19 10:46:01 UTC
Debugging shows that this is indeed an issue, probably in how Octavia configures HAProxy.

Looking at the logs inside the amphora instance, I noticed the following:

host-192-168-199-59 haproxy: [ALERT] 134/082747 (11416) : Proxy 'cce34da1-7b4d-4659-bb0b-6cf01ffbcd68': unable to find local peer 'amphora-b8928a25-ca71-4389-8753-6ab3b2fb3d2c.localdomain' in peers section '9c530de5653d474181b73fe70c398ad5_peers'.
host-192-168-199-59 haproxy: [ALERT] 134/082747 (11416) : Fatal errors found in configuration.

Log here: https://launchpadlibrarian.net/319721688/bug2rhelamphora.txt
Upstream Bug: https://bugs.launchpad.net/octavia/+bug/1690812
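The alert above means that HAProxy could not find a "peer" line in the named peers section whose name matches its own local peer name (by default the host name, or the value passed with -L). A minimal sketch of how one might check a rendered configuration for that mismatch follows. It assumes the configuration is available as text; the function names, the sample configuration, and the peer names in it are hypothetical and are not taken from the amphora in this report.

```python
def peers_sections(config_text):
    """Map each 'peers <name>' section to the list of peer names it declares."""
    sections = {}
    current = None
    for raw in config_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        words = line.split()
        if words[0] == "peers" and len(words) >= 2:
            current = words[1]
            sections[current] = []
        elif words[0] in ("global", "defaults", "frontend", "backend", "listen"):
            current = None  # another top-level section starts here
        elif words[0] == "peer" and current is not None and len(words) >= 3:
            sections[current].append(words[1])
    return sections


def missing_local_peer(config_text, local_peer):
    """Return the peers sections that do not declare local_peer."""
    found = peers_sections(config_text)
    return sorted(name for name, peers in found.items() if local_peer not in peers)


# Hypothetical sample configuration, for illustration only.
SAMPLE = """
global
    daemon
peers example_peers
    peer amphora-a 10.0.0.1:1025
    peer amphora-b 10.0.0.2:1025
frontend fe
    bind 10.0.0.10:80
"""

if __name__ == "__main__":
    # A local peer name with an extra domain suffix is not matched.
    print(missing_local_peer(SAMPLE, "amphora-a.localdomain"))
```

If the returned list is non-empty, the local name and the names written into the peers section disagree, which is the condition the ALERT line reports.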

Comment 4 Assaf Muller 2017-07-07 19:30:08 UTC
@Nir, https://review.openstack.org/#/c/455569/ just merged, but I can't tell if that patch is a fix for this bug?

Comment 5 Nir Magnezi 2017-07-11 12:09:28 UTC
(In reply to Assaf Muller from comment #4)
> @Nir, https://review.openstack.org/#/c/455569/ just merged, but I can't tell
> if that patch is a fix for this bug?

@Assaf, just re-tested this with master and sadly it looks like it didn't resolve the issue.
I posted my analysis here https://bugs.launchpad.net/octavia/+bug/1690812/comments/8

I'll follow up with the upstream core team on a resolution for this.

Comment 6 Nir Magnezi 2017-07-13 13:53:28 UTC
Looks like https://review.openstack.org/#/c/455569/ resolved this issue after all.
My testing failed because of a cached amphora image, running an older version of the amphora agent without this fix.

As soon as I generated a new one, test_session_persistence worked with loadbalancer_topology = ACTIVE_STANDBY

{0} neutron_lbaas.tests.tempest.v2.scenario.test_session_persistence.TestSessionPersistence.test_session_persistence [246.013035s] ... ok
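Before re-running the test it can help to confirm which topology the controller is actually configured with. The sketch below reads the loadbalancer_topology option from configuration text. It assumes the option lives in a [controller_worker] section and that the file is simple enough for Python's configparser to read; the helper name and the sample text are hypothetical.

```python
import configparser


def configured_topology(conf_text):
    """Return the loadbalancer_topology value, or None if it is not set."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)
    # Assumption: the option is read from the [controller_worker] section.
    return parser.get("controller_worker", "loadbalancer_topology", fallback=None)


# Hypothetical sample configuration text, for illustration only.
SAMPLE_CONF = """
[controller_worker]
loadbalancer_topology = ACTIVE_STANDBY
"""

if __name__ == "__main__":
    print(configured_topology(SAMPLE_CONF))
```

This only reports the configured value. It does not tell you whether the amphora image in use contains the fix, so a cached image can still produce the failure described above.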

Comment 19 errata-xmlrpc 2017-12-13 21:25:26 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:3462

