Bug 2125843 - Connections to the OVN databases are unstable
Summary: Connections to the OVN databases are unstable
Keywords:
Status: CLOSED DUPLICATE of bug 2128914
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: python-ovsdbapp
Version: 16.1 (Train)
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: ---
Assignee: Terry Wilson
QA Contact: Toni Freger
URL:
Whiteboard:
Duplicates: 2125868
Depends On:
Blocks:
Reported: 2022-09-11 01:14 UTC by Jakub Libosvar
Modified: 2023-01-20 10:52 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-09-27 18:29:36 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Issue Tracker | OSP-18650 | 0 | None | None | None | 2022-09-11 01:26:37 UTC

Description Jakub Libosvar 2022-09-11 01:14:44 UTC
Description of problem:
The server often disconnects from the client because inactivity probes are not being replied to. Load on the server is not the cause, since the server is the side waiting for the reply. It happens with both the NB and SB databases, and only on connections to the neutron server.

2022-09-11T00:01:36.310Z|132111|reconnect|ERR|tcp:10.0.37.88:39878: no response to inactivity probe after 361 seconds, disconnecting
2022-09-11T00:02:09.779Z|132127|reconnect|ERR|tcp:10.0.37.88:45798: no response to inactivity probe after 360 seconds, disconnecting
2022-09-11T00:02:48.809Z|132148|reconnect|ERR|tcp:10.0.37.88:45630: no response to inactivity probe after 361 seconds, disconnecting
2022-09-11T00:05:16.381Z|132186|reconnect|ERR|tcp:10.0.37.88:50510: no response to inactivity probe after 436 seconds, disconnecting
2022-09-11T00:07:14.981Z|132238|reconnect|ERR|tcp:10.0.37.88:58688: no response to inactivity probe after 370 seconds, disconnecting
2022-09-11T00:07:38.006Z|132245|reconnect|ERR|tcp:10.0.37.88:33770: no response to inactivity probe after 360 seconds, disconnecting
2022-09-11T00:08:25.866Z|132279|reconnect|ERR|tcp:10.0.37.88:37142: no response to inactivity probe after 365 seconds, disconnecting
2022-09-11T00:08:50.498Z|132280|reconnect|ERR|tcp:10.0.37.88:38800: no response to inactivity probe after 360 seconds, disconnecting
2022-09-11T00:09:31.489Z|132281|reconnect|ERR|tcp:10.0.37.88:37446: no response to inactivity probe after 361 seconds, disconnecting
2022-09-11T00:11:16.994Z|132321|reconnect|ERR|tcp:10.0.37.88:44330: no response to inactivity probe after 361 seconds, disconnecting
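
To quantify the disconnects, a minimal Python sketch along these lines could summarize log lines like the ones above. The regular expression is derived only from the excerpt in this report, and the helper name `summarize_probe_timeouts` is hypothetical, not part of ovsdbapp or OVS.

```python
import re
from collections import Counter

# Pattern built from the log excerpt in this report (an assumption about the
# general format): timestamp|sequence|reconnect|ERR|tcp:IP:PORT: message
PROBE_RE = re.compile(
    r"^(?P<ts>[^|]+)\|\d+\|reconnect\|ERR\|tcp:(?P<ip>[\d.]+):(?P<port>\d+): "
    r"no response to inactivity probe after (?P<secs>\d+) seconds, disconnecting"
)


def summarize_probe_timeouts(lines):
    """Return (disconnect count per client IP, list of timeouts in seconds)."""
    per_client = Counter()
    durations = []
    for line in lines:
        match = PROBE_RE.match(line.strip())
        if match is None:
            continue  # not a probe-timeout line
        per_client[match.group("ip")] += 1
        durations.append(int(match.group("secs")))
    return per_client, durations


if __name__ == "__main__":
    sample = [
        "2022-09-11T00:01:36.310Z|132111|reconnect|ERR|tcp:10.0.37.88:39878: "
        "no response to inactivity probe after 361 seconds, disconnecting",
        "2022-09-11T00:05:16.381Z|132186|reconnect|ERR|tcp:10.0.37.88:50510: "
        "no response to inactivity probe after 436 seconds, disconnecting",
    ]
    clients, secs = summarize_probe_timeouts(sample)
    print(dict(clients), min(secs), max(secs))
```

Grouping by client IP makes it easy to confirm that all timeouts come from the same neutron server host.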

It doesn't happen when the neutron_api container is downgraded to version 16.1.7. There is also a fairly large number of TRY_AGAIN messages:

[ctrl-net-d-01.infra.prod.upshift.rdu2.redhat.com] [01:08:21 AM]
[root@ctrl-net-d-01 jlibosva_logs]# grep TRY_AGAIN server.log.1 -c
2069103


Version-Release number of selected component (if applicable):
16.1.8

How reproducible:
Always

Steps to Reproduce:
1. Start neutron server
2.
3.

Actual results:


Expected results:


Additional info:

Comment 2 ldenny 2022-09-12 17:40:44 UTC
*** Bug 2125868 has been marked as a duplicate of this bug. ***

Comment 3 ldenny 2022-09-12 17:44:59 UTC
Setting the inactivity probe interval (ovsdb_probe_interval) to 0 in the [ovn] section of the neutron OVN config file on all 3 controllers, then restarting neutron_api, works around this issue for now.

```
[root@controller-0 ~]# grep "ovsdb_probe_interval" /var/lib/config-data/puppet-generated/neutron/etc/neutron/plugins/ml2/ml2_conf.ini
ovsdb_probe_interval = 0
```
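
To verify the workaround on each controller without relying on grep, a small Python sketch like the following could read the option from the INI text. The helper name `probe_interval_ms` is hypothetical. It assumes the option lives in the `[ovn]` section as described above, that the value is in milliseconds, and that 0 disables the probe.

```python
import configparser


def probe_interval_ms(ini_text, section="ovn", option="ovsdb_probe_interval"):
    """Return the configured probe interval, or None if the option is unset.

    The unit is assumed to be milliseconds, with 0 meaning "probe disabled".
    """
    # strict=False tolerates duplicate sections/options in generated files;
    # interpolation=None keeps any '%' characters in other values literal.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(ini_text)
    if not parser.has_option(section, option):
        return None
    return parser.getint(section, option)


if __name__ == "__main__":
    text = "[ovn]\novsdb_probe_interval = 0\n"
    value = probe_interval_ms(text)
    print("probe disabled" if value == 0 else f"probe interval: {value}")
```

In practice the text would come from reading the ml2_conf.ini path shown above on each controller.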

Comment 9 Jakub Libosvar 2022-09-27 18:29:36 UTC

*** This bug has been marked as a duplicate of bug 2128914 ***

