Bug 1879389

Summary: When creating a service of type "LoadBalancer" (Kuryr, OVN), communication through this load balancer fails after 2-5 minutes.
Product: Red Hat OpenStack Reporter: Luis Tomas Bolivar <ltomasbo>
Component: python-networking-ovn Assignee: ffernand <ffernand>
Status: CLOSED ERRATA QA Contact: GenadiC <gcheresh>
Severity: medium Docs Contact:
Priority: high    
Version: 16.1 (Train) CC: apevec, dalvarez, gcheresh, jaeichle, jlibosva, lhh, majopela, rheinzma, rlobillo, scohen, svmichel
Target Milestone: z3 Keywords: Triaged
Target Release: 16.1 (Train on RHEL 8.2)   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: python-networking-ovn-7.3.1-1.20200902233413.el8ost Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1875806 Environment:
Last Closed: 2020-12-15 18:36:32 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1875806    
Attachments:
Description Flags
log from controllers when fip vip gets removed none

Comment 2 ffernand 2020-09-16 09:26:57 UTC
Can you please provide sos reports? We need to understand why the OVN NB DB is changing from:


WORKING:
external_ids        : {enabled=True, listener_33d808ee-781b-4d69-842c-6d4a3b44f7fd="443:pool_0c7e84f6-9da4-4839-a0b6-a720730de6c5", listener_7f388eab-ebf4-46e5-8142-2ada37478be4="80:pool_ea99e8e9-5148-40ca-872f-b9e1d53104fc", lr_ref=neutron-86c3b602-eafa-4cb3-991a-1c953716350a, ls_refs="{\"neutron-8741ffd1-120f-4c1b-b84d-22c14b28078d\": 1, \"neutron-392cea35-4fcd-46ce-95ee-826a9afd796e\": 4}", "neutron:vip"="172.30.36.185", "neutron:vip_fip"="10.46.44.122", "neutron:vip_port_id"="463de45f-b5f5-47a6-8c5f-56f391eb27b1", pool_0c7e84f6-9da4-4839-a0b6-a720730de6c5="member_2dc01275-b79b-460b-b2c4-6dbc34ba852b_10.197.1.137:443_4f2034dc-1357-4939-856b-f155c52bb856,member_ea53999e-5543-4b91-bdb0-6fa2dfd223e3_10.197.2.29:443_4f2034dc-1357-4939-856b-f155c52bb856", pool_ea99e8e9-5148-40ca-872f-b9e1d53104fc="member_7a37ea3a-c461-46fa-b36d-a6c26ac3c1ae_10.197.1.137:80_4f2034dc-1357-4939-856b-f155c52bb856,member_d396fcf0-8409-4333-9f72-4389f647c509_10.197.2.29:80_4f2034dc-1357-4939-856b-f155c52bb856"}
vips                : {"10.46.44.122:443"="10.197.1.137:443,10.197.2.29:443", "10.46.44.122:80"="10.197.1.137:80,10.197.2.29:80", "172.30.36.185:443"="10.197.1.137:443,10.197.2.29:443", "172.30.36.185:80"="10.197.1.137:80,10.197.2.29:80"}

TO:

After some time, it stops working:
external_ids        : {enabled=True, listener_33d808ee-781b-4d69-842c-6d4a3b44f7fd="443:pool_0c7e84f6-9da4-4839-a0b6-a720730de6c5", listener_7f388eab-ebf4-46e5-8142-2ada37478be4="80:pool_ea99e8e9-5148-40ca-872f-b9e1d53104fc", lr_ref=neutron-86c3b602-eafa-4cb3-991a-1c953716350a, ls_refs="{\"neutron-8741ffd1-120f-4c1b-b84d-22c14b28078d\": 1, \"neutron-392cea35-4fcd-46ce-95ee-826a9afd796e\": 4}", "neutron:vip"="172.30.36.185", "neutron:vip_port_id"="463de45f-b5f5-47a6-8c5f-56f391eb27b1", pool_0c7e84f6-9da4-4839-a0b6-a720730de6c5="member_2dc01275-b79b-460b-b2c4-6dbc34ba852b_10.197.1.137:443_4f2034dc-1357-4939-856b-f155c52bb856,member_ea53999e-5543-4b91-bdb0-6fa2dfd223e3_10.197.2.29:443_4f2034dc-1357-4939-856b-f155c52bb856", pool_ea99e8e9-5148-40ca-872f-b9e1d53104fc="member_7a37ea3a-c461-46fa-b36d-a6c26ac3c1ae_10.197.1.137:80_4f2034dc-1357-4939-856b-f155c52bb856,member_d396fcf0-8409-4333-9f72-4389f647c509_10.197.2.29:80_4f2034dc-1357-4939-856b-f155c52bb856"}
vips                : {"172.30.36.185:443"="10.197.1.137:443,10.197.2.29:443", "172.30.36.185:80"="10.197.1.137:80,10.197.2.29:80"}
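
The difference between the two states can be checked mechanically. The following is a minimal sketch, not part of networking-ovn: the helper name `removed_vips` is hypothetical, and the two dictionaries are copied by hand from the `vips` columns shown above. It lists the VIP entries that disappear between the working and the broken state.

```python
def removed_vips(before, after):
    """Return the VIP keys present in `before` but missing from `after`."""
    return sorted(set(before) - set(after))


# `vips` column in the WORKING state (copied from the output above).
working = {
    "10.46.44.122:443": "10.197.1.137:443,10.197.2.29:443",
    "10.46.44.122:80": "10.197.1.137:80,10.197.2.29:80",
    "172.30.36.185:443": "10.197.1.137:443,10.197.2.29:443",
    "172.30.36.185:80": "10.197.1.137:80,10.197.2.29:80",
}

# `vips` column after the load balancer stops working.
broken = {
    "172.30.36.185:443": "10.197.1.137:443,10.197.2.29:443",
    "172.30.36.185:80": "10.197.1.137:80,10.197.2.29:80",
}

lost = removed_vips(working, broken)
# Strip the port to see which addresses lost all their entries.
lost_addresses = {key.rsplit(":", 1)[0] for key in lost}

print(lost)
print(lost_addresses)
```

With the data above, the only address that loses its entries is 10.46.44.122, which is the value of "neutron:vip_fip" in the working external_ids and is absent from the broken one.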

Comment 6 ffernand 2020-09-23 21:46:36 UTC
Created attachment 1716150 [details]
log from controllers when fip vip gets removed

Log with events on controllers that depict this issue.

2020-09-23 08:40:37.828 [N2] 49 DEBUG networking_ovn.common.maintenance [req-ece4077c-7621-4c18-b22d-f96b6d1e3921 - - - - -] Maintenance task: Fixing resource 32be228a-3700-42cc-83ec-1bfc59315709 (type: ports)

to

2020-09-23 08:40:38.081 [O0] 26 INFO networking_ovn.octavia.ovn_driver [-] XXX net_ovn vip_port_update_handler lb_id 9da5245e-d9dd-474b-9573-9376f254a0c1 port_name ovn-lb-vip-9da5245e-d9dd-474b-9573-9376f254a0c1 fip None lb_vip_fip 10.46.44.95 removing ext_id OVN_PORT_FIP_EXT_ID_KEY
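
To find other occurrences in the attached log, the lines can be filtered with a small script. This is only a sketch under assumptions: the regular expression matches the ad hoc "XXX net_ovn vip_port_update_handler" debug line quoted above, which is not a stable log format, and the helper name `find_fip_removals` is hypothetical. It reports load balancers for which the handler saw `fip None` while the load balancer still had a `lb_vip_fip` set.

```python
import re

# Matches the debug line quoted above; the field order is taken from that line.
PATTERN = re.compile(
    r"vip_port_update_handler lb_id (?P<lb_id>\S+) "
    r"port_name (?P<port_name>\S+) "
    r"fip (?P<fip>\S+) lb_vip_fip (?P<lb_vip_fip>\S+)"
)


def find_fip_removals(lines):
    """Return (lb_id, lb_vip_fip) pairs where fip is None but lb_vip_fip is set."""
    hits = []
    for line in lines:
        match = PATTERN.search(line)
        if match is None:
            continue
        if match.group("fip") == "None" and match.group("lb_vip_fip") != "None":
            hits.append((match.group("lb_id"), match.group("lb_vip_fip")))
    return hits


sample = (
    "2020-09-23 08:40:38.081 [O0] 26 INFO networking_ovn.octavia.ovn_driver [-] "
    "XXX net_ovn vip_port_update_handler "
    "lb_id 9da5245e-d9dd-474b-9573-9376f254a0c1 "
    "port_name ovn-lb-vip-9da5245e-d9dd-474b-9573-9376f254a0c1 "
    "fip None lb_vip_fip 10.46.44.95 removing ext_id OVN_PORT_FIP_EXT_ID_KEY"
)

hits = find_fip_removals([sample])
print(hits)
```

In practice the list of lines would come from the attached controller log rather than from a single sample string.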

Comment 7 ffernand 2020-09-25 13:17:25 UTC
Posted at https://review.opendev.org/#/c/753833/
To be backported to stable/train.

Comment 9 ffernand 2020-10-12 22:42:10 UTC
Fixed in python-networking-ovn-7.3.1-1.20200902233413.el8ost

https://brewweb.engineering.redhat.com/brew/buildinfo?buildID=1338151

Comment 13 rlobillo 2020-11-23 07:34:31 UTC
Verified on OSP16.1 with OVN octavia (compose: RHOS-16.1-RHEL-8-20201110.n.1).

Please refer to the verification procedure described in https://bugzilla.redhat.com/show_bug.cgi?id=1875806

Comment 21 errata-xmlrpc 2020-12-15 18:36:32 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat OpenStack Platform 16.1.3 bug fix and enhancement advisory), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2020:5413