Bug 1449938

Summary: Octavia active/standby config + pool with sourceip session persistence configuration - Service is not available and LB is not deleted after test
Product: Red Hat OpenStack Reporter: Alexander Stafeyev <astafeye>
Component: openstack-octavia Assignee: Nir Magnezi <nmagnezi>
Status: CLOSED ERRATA QA Contact: Alexander Stafeyev <astafeye>
Severity: high Docs Contact:
Priority: high    
Version: 11.0 (Ocata) CC: amuller, ihrachys, jschluet, lpeer, majopela, nyechiel
Target Milestone: ga Keywords: AutomationBlocker, Triaged
Target Release: 12.0 (Pike)   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: openstack-octavia-1.0.0-0.20170710195317.e37c0c7.el7ost Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-12-13 21:25:26 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1433537    
Bug Blocks: 1433523    
Attachments:
Description Flags
Amphora log none
Deeper debugging info none

Description Alexander Stafeyev 2017-05-11 08:19:02 UTC
Description of problem:
I run Octavia (RHEL amphora) in active_standby mode.
The test is an LBaaS scenario test:



Version-Release number of selected component (if applicable):
rhos11

How reproducible:

100%
Steps to Reproduce:
1. Deploy a setup with Octavia support (preferably with 2 compute nodes) and run all required post-deployment steps.
2. Run the following neutron tempest test:


Actual results:
The test fails and the LB cannot be deleted (please see the attached log file).

Expected results:
The test should pass. With a "single" rather than an "active_backup" Octavia configuration, it runs well.

Additional info:
Logs will be attached soon

Comment 1 Alexander Stafeyev 2017-05-11 14:31:02 UTC
Created attachment 1277912 [details]
Amphora log

Comment 2 Alexander Stafeyev 2017-05-15 12:54:22 UTC
Created attachment 1278975 [details]
Deeper debugging info

This attachment contains the RHEL amphora debugging results.

Comment 3 Nir Magnezi 2017-06-19 10:46:01 UTC
Debugging shows that this is indeed an issue, probably in how Octavia configures HAProxy.

Looking at the logs inside the amphora instance, I noticed the following:

host-192-168-199-59 haproxy: [ALERT] 134/082747 (11416) : Proxy 'cce34da1-7b4d-4659-bb0b-6cf01ffbcd68': unable to find local peer 'amphora-b8928a25-ca71-4389-8753-6ab3b2fb3d2c.localdomain' in peers section '9c530de5653d474181b73fe70c398ad5_peers'.
host-192-168-199-59 haproxy: [ALERT] 134/082747 (11416) : Fatal errors found in configuration.

Log here: https://launchpadlibrarian.net/319721688/bug2rhelamphora.txt
Upstream Bug: https://bugs.launchpad.net/octavia/+bug/1690812
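The ALERT above means the name HAProxy uses for its local peer does not appear among the `peer` entries of the referenced `peers` section. By default HAProxy takes that name from the host name, unless one is passed with `-L`. The following is only a rough sketch of how such a mismatch could be checked offline, not how Octavia or HAProxy does it. The config fragment, the peer names, the addresses, and the helper names `peer_names` and `local_peer_present` are all hypothetical, and the parser is deliberately simplified.

```python
# Simplified set of keywords that open a new section in haproxy.cfg.
SECTION_KEYWORDS = {
    "global", "defaults", "frontend", "backend", "listen",
    "peers", "userlist", "resolvers", "mailers",
}

# Hypothetical config fragment: peer names and addresses are made up.
SAMPLE_CFG = """\
peers 9c530de5653d474181b73fe70c398ad5_peers
    peer amphora-a 10.0.0.5:1025
    peer amphora-b 10.0.0.6:1025

backend example_pool
    stick-table type ip size 10k peers 9c530de5653d474181b73fe70c398ad5_peers
"""


def peer_names(cfg_text, section):
    """Return the peer names declared in the named 'peers' section."""
    names = []
    in_section = False
    for raw in cfg_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        tokens = line.split()
        if tokens[0] in SECTION_KEYWORDS:
            # A section keyword closes the previous section and opens a new one.
            in_section = (
                tokens[0] == "peers" and len(tokens) > 1 and tokens[1] == section
            )
            continue
        if in_section and tokens[0] == "peer" and len(tokens) >= 3:
            names.append(tokens[1])
    return names


def local_peer_present(cfg_text, section, local_name):
    """True if local_name is declared as a peer in the given section."""
    return local_name in peer_names(cfg_text, section)


if __name__ == "__main__":
    section = "9c530de5653d474181b73fe70c398ad5_peers"
    local = "amphora-b8928a25-ca71-4389-8753-6ab3b2fb3d2c.localdomain"
    print(peer_names(SAMPLE_CFG, section))
    print(local_peer_present(SAMPLE_CFG, section, local))
```

Run against the real haproxy.cfg from the amphora, a check like this would show whether the host name reported in the ALERT is missing from the peers section.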

Comment 4 Assaf Muller 2017-07-07 19:30:08 UTC
@Nir, https://review.openstack.org/#/c/455569/ just merged, but I can't tell if that patch is a fix for this bug?

Comment 5 Nir Magnezi 2017-07-11 12:09:28 UTC
(In reply to Assaf Muller from comment #4)
> @Nir, https://review.openstack.org/#/c/455569/ just merged, but I can't tell
> if that patch is a fix for this bug?

@Assaf, just re-tested this with master and sadly it looks like it didn't resolve the issue.
I posted my analysis here https://bugs.launchpad.net/octavia/+bug/1690812/comments/8

I'll follow up with the upstream core team on a resolution for this.

Comment 6 Nir Magnezi 2017-07-13 13:53:28 UTC
Looks like https://review.openstack.org/#/c/455569/ resolved this issue after all.
My testing failed because of a cached amphora image that was running an older version of the amphora agent without this fix.

As soon as I generated a new one, test_session_persistence worked with loadbalancer_topology = ACTIVE_STANDBY

{0} neutron_lbaas.tests.tempest.v2.scenario.test_session_persistence.TestSessionPersistence.test_session_persistence [246.013035s] ... ok
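For anyone re-running this, it may help to confirm which topology the controller is actually configured with before testing. A minimal sketch follows, assuming the option lives in the `[controller_worker]` section of octavia.conf and that Python's `configparser` can read the file (oslo.config files are INI-like, but this is not the parser Octavia itself uses). The helper name `get_topology` and the sample text are hypothetical.

```python
import configparser

# Hypothetical octavia.conf excerpt used only for illustration.
SAMPLE_CONF = """\
[controller_worker]
loadbalancer_topology = ACTIVE_STANDBY
"""


def get_topology(conf_text):
    """Return the configured load balancer topology, or SINGLE if unset."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    return parser.get(
        "controller_worker", "loadbalancer_topology", fallback="SINGLE"
    )


if __name__ == "__main__":
    print(get_topology(SAMPLE_CONF))
```

This only reports the configured value. It does not tell you whether the amphora image in use contains the fixed agent, so regenerating the image is still needed.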

Comment 19 errata-xmlrpc 2017-12-13 21:25:26 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:3462