1449938 – Octavia active/standby config + pool with sourceip session persistence configuration - Service is not available and LB is not deleted after test

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat OpenStack
Classification:	Red Hat
Component:	openstack-octavia
Sub Component:
Version:	11.0 (Ocata)
Hardware:	Unspecified
OS:	Unspecified
Priority:	high
Severity:	high
Target Milestone:	ga
Target Release:	12.0 (Pike)
Assignee:	Nir Magnezi
QA Contact:	Alexander Stafeyev
Docs Contact:
URL:
Whiteboard:
Depends On:	1433537
Blocks:	1433523

Reported:	2017-05-11 08:19 UTC by Alexander Stafeyev
Modified:	2019-09-10 14:12 UTC
CC List:	6 users
Fixed In Version:	openstack-octavia-1.0.0-0.20170710195317.e37c0c7.el7ost
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:	2017-12-13 21:25:26 UTC
Target Upstream Version:
Embargoed:

Attachments
Amphora log (10.87 KB, text/plain) 2017-05-11 14:31 UTC, Alexander Stafeyev
Deeper debugging info (18.24 KB, text/plain) 2017-05-15 12:54 UTC, Alexander Stafeyev

Links
System	ID	Priority	Status	Summary	Last Updated
Launchpad	1681623	None	None	None	2017-07-13 13:54:23 UTC
Launchpad	1690812	None	None	None	2017-05-15 13:00:04 UTC
OpenStack gerrit	455569	'None'	MERGED	Fix error 500 when using SOURCE_IP and APP_COOKIE	2020-05-14 14:02:56 UTC
Red Hat Product Errata	RHEA-2017:3462	normal	SHIPPED_LIVE	Red Hat OpenStack Platform 12.0 Enhancement Advisory	2018-02-16 01:43:25 UTC

Description Alexander Stafeyev 2017-05-11 08:19:02 UTC

Description of problem:
I run Octavia (RHEL amphora) in active_standby mode.
The failing test is an LBaaS scenario test:



Version-Release number of selected component (if applicable):
rhos11

How reproducible:
100%

Steps to Reproduce:
1. Deploy a setup with Octavia support (preferably with 2 compute nodes) and run all required post-deployment steps.
2. Run the following neutron tempest test:


Actual results:
The test fails and the LB cannot be deleted (please see the attached log file).

Expected results:
The test should pass. With a "single" (rather than "active_standby") Octavia config it runs fine.

Additional info:
Logs will be attached soon

Comment 1 Alexander Stafeyev 2017-05-11 14:31:02 UTC

Created attachment 1277912
Amphora log

Comment 2 Alexander Stafeyev 2017-05-15 12:54:22 UTC

Created attachment 1278975
Deeper debugging info

This attachment contains the RHEL amphora debugging results.

Comment 3 Nir Magnezi 2017-06-19 10:46:01 UTC

Debugging shows that this is indeed an issue, probably in how Octavia configures HAProxy.

Looking at the logs inside the amphora instance, I noticed the following:

host-192-168-199-59 haproxy: [ALERT] 134/082747 (11416) : Proxy 'cce34da1-7b4d-4659-bb0b-6cf01ffbcd68': unable to find local peer 'amphora-b8928a25-ca71-4389-8753-6ab3b2fb3d2c.localdomain' in peers section '9c530de5653d474181b73fe70c398ad5_peers'.
host-192-168-199-59 haproxy: [ALERT] 134/082747 (11416) : Fatal errors found in configuration.

Log here: https://launchpadlibrarian.net/319721688/bug2rhelamphora.txt
Upstream Bug: https://bugs.launchpad.net/octavia/+bug/1690812
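
Editor's sketch (not part of the original comment): the ALERT above means HAProxy looked for a `peer` entry matching its local peer name in the named `peers` section and did not find one. HAProxy takes the local peer name from the hostname unless it is overridden with `-L`. The Python below is a minimal, illustrative check for that condition. The sample configuration, its peer names, and its addresses are hypothetical; the configuration Octavia actually generated is not shown in this report. Only the section name, proxy ID, and local peer name are taken from the log lines.

```python
# Keywords that start a new HAProxy configuration section (not exhaustive).
SECTION_KEYWORDS = {
    "global", "defaults", "frontend", "backend", "listen",
    "peers", "userlist", "resolvers", "mailers",
}


def peers_in_section(config_text, section):
    """Return the peer names declared in the `peers <section>` block."""
    names = []
    in_section = False
    for raw in config_text.splitlines():
        tokens = raw.split("#", 1)[0].split()  # drop comments, then tokenize
        if not tokens:
            continue
        if tokens[0] in SECTION_KEYWORDS:
            # Entering a new section; remember whether it is the one we want.
            in_section = tokens[:2] == ["peers", section]
            continue
        if in_section and tokens[0] == "peer" and len(tokens) >= 3:
            names.append(tokens[1])
    return names


def local_peer_is_declared(config_text, section, local_peer):
    """True if `local_peer` has a `peer` line in the given peers section."""
    return local_peer in peers_in_section(config_text, section)


# Values copied from the ALERT line in this comment.
SECTION = "9c530de5653d474181b73fe70c398ad5_peers"
LOCAL_PEER = "amphora-b8928a25-ca71-4389-8753-6ab3b2fb3d2c.localdomain"

# Hypothetical configuration: peer names and addresses are made up.
SAMPLE_CONFIG = """\
peers 9c530de5653d474181b73fe70c398ad5_peers
    peer hypothetical_peer_1 10.0.0.5:1025
    peer hypothetical_peer_2 10.0.0.6:1025

frontend cce34da1-7b4d-4659-bb0b-6cf01ffbcd68
    bind 10.0.0.7:80
"""

if __name__ == "__main__":
    if not local_peer_is_declared(SAMPLE_CONFIG, SECTION, LOCAL_PEER):
        print("local peer %r is missing from peers section %r" % (LOCAL_PEER, SECTION))
```

On a real amphora, one could pass the contents of the generated HAProxy configuration file and `socket.gethostname()` instead of the sample values to see whether the two disagree.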

Comment 4 Assaf Muller 2017-07-07 19:30:08 UTC

@Nir, https://review.openstack.org/#/c/455569/ just merged, but I can't tell if that patch is a fix for this bug?

Comment 5 Nir Magnezi 2017-07-11 12:09:28 UTC

(In reply to Assaf Muller from comment #4)
> @Nir, https://review.openstack.org/#/c/455569/ just merged, but I can't tell
> if that patch is a fix for this bug?

@Assaf, just re-tested this with master and sadly it looks like it didn't resolve the issue.
I posted my analysis here https://bugs.launchpad.net/octavia/+bug/1690812/comments/8

I'll follow up with the upstream core team on a resolution for this.

Comment 6 Nir Magnezi 2017-07-13 13:53:28 UTC

Looks like https://review.openstack.org/#/c/455569/ resolved this issue after all.
My testing failed because of a cached amphora image, running an older version of the amphora agent without this fix.

As soon as I generated a new one, test_session_persistence worked with loadbalancer_topology = ACTIVE_STANDBY

{0} neutron_lbaas.tests.tempest.v2.scenario.test_session_persistence.TestSessionPersistence.test_session_persistence [246.013035s] ... ok
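
Editor's sketch (not part of the original comment): when reproducing this, it can help to confirm which topology the Octavia controller is actually configured with before running the test. The Python below reads `loadbalancer_topology` from configuration text, assuming the option lives in the `[controller_worker]` section of `octavia.conf` and falls back to `SINGLE` when unset. The sample text and the helper name `get_topology` are hypothetical.

```python
import configparser


def get_topology(conf_text, default="SINGLE"):
    """Return the configured load balancer topology, upper-cased."""
    # interpolation=None: treat '%' literally; strict=False: tolerate duplicates.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)
    value = parser.get("controller_worker", "loadbalancer_topology", fallback=default)
    return value.strip().upper()


# Hypothetical excerpt of an octavia.conf file.
SAMPLE_CONF = """\
[controller_worker]
loadbalancer_topology = ACTIVE_STANDBY
"""

if __name__ == "__main__":
    print(get_topology(SAMPLE_CONF))
```

As this comment notes, the topology setting alone was not enough here: the amphora image also had to be rebuilt so that it carried the fixed amphora agent.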

Comment 19 errata-xmlrpc 2017-12-13 21:25:26 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:3462
