Bug 1593804 - ovn-controller: report when was the most recent successful communication with central
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: openvswitch
Version: 7.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assignee: lorenzo bianconi
QA Contact: haidong li
URL:
Whiteboard:
Depends On:
Blocks: 1216991
Reported: 2018-06-21 15:31 UTC by Dan Kenigsberg
Modified: 2020-01-14 22:30 UTC (History)

Fixed In Version: openvswitch-2.9.0-58.el7fdn
Doc Type: Enhancement
Doc Text:
With this update, the connection-status command has been added to the ovs-appctl utility. The command makes it possible to monitor the connection status between a hypervisor (HV) and the southbound database (SBDB). Layered products can now check whether the ovn-controller is properly connected to a central node.
Clone Of:
Environment:
Last Closed: 2018-11-05 14:59:03 UTC
Target Upstream Version:
Embargoed:



Description Dan Kenigsberg 2018-06-21 15:31:54 UTC
RHV would like to know whether an ovn chassis is properly connected to ovn-central.

One way to achieve that is if ovn-controller touches a file after a successful communication with central. RHV would consider it an error condition if the file is older than X minutes.

Other implementations are welcome, too.

Comment 1 lorenzo bianconi 2018-07-04 15:16:05 UTC
If a CMS wants to know whether a configuration change has been applied or whether a controller is reachable, it will be possible to use the nb_cfg/sb_cfg/hv_cfg columns in the NB_Global table of the NB db.
According to the ovn-architecture man page, when the CMS updates the configuration in the northbound database, as part of the same transaction, it can increment the value of the nb_cfg column in the NB_Global table. ovn-northd copies nb_cfg from the NB db to the SB_Global table of the SB db as part of the same transaction. Each ovn-controller updates ovs-vswitchd configuration, and the nb_cfg value is propagated to the corresponding columns in the Chassis table of the SB db. Moreover, ovn-northd monitors the nb_cfg column in all of the Chassis records in the southbound database. It keeps track of the minimum value among all the records and copies it into the hv_cfg column in the northbound NB_Global table. The CMS or another observer can thus determine when all of the hypervisors have caught up to the northbound configuration and whether they are reachable.
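
As a rough sketch of the check described above, an observer that can reach the northbound database could compare the two counters. The helper name `hv_caught_up` is hypothetical, and the `ovn-nbctl get` calls in the comments are shown only as one assumed way to read the columns; this is an illustration, not a tested procedure.

```shell
# Hypothetical helper: succeeds when every hypervisor has caught up,
# i.e. hv_cfg (minimum over all chassis) has reached nb_cfg.
hv_caught_up() {
    nb_cfg=$1
    hv_cfg=$2
    [ "$hv_cfg" -ge "$nb_cfg" ]
}

# Example use on a host that can reach the NB db (assumed, not run here):
#   nb=$(ovn-nbctl get NB_Global . nb_cfg)
#   hv=$(ovn-nbctl get NB_Global . hv_cfg)
#   hv_caught_up "$nb" "$hv" && echo "all chassis up to date"
```

Note that this only works from a node with access to the NB db, which is the limitation raised in the following comments.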

Comment 2 Marcin Mirecki 2018-07-11 13:12:17 UTC
Connecting from the host where ovn-controller is located to SB to check if the installation/configuration succeeded would be somewhat problematic. 

Is it possible to validate that the ovn-controller connected and registered itself correctly with OVN central, without accessing SB, but only by looking at ovn-controller resources available on the host?

>  Each ovn-controller updates ovs-vswitchd configuration
what is updated? Could we make use of this?

Is external_ids:ovn-chassis-id generated by ovn-controller, or is this value generated on SB, and returned back to ovn-controller? If on SB, we could use that as an indicator.

Comment 3 Marcin Mirecki 2018-07-11 13:49:49 UTC
After a talk with Lorenzo, let me narrow down the question:

We need to know from the ovn-controller, if the ovn-controller ever successfully connected to SB.

By saying "from ovn-controller", I mean from the host where ovn-controller is installed, either by querying the local ovs db or ovn-controller directly.

Comment 4 Dan Kenigsberg 2018-07-12 15:27:52 UTC
What we'd like to have is basically option 2 of https://mail.openvswitch.org/pipermail/ovs-discuss/2018-July/047025.html

Comment 5 Mark Michelson 2018-08-13 12:24:12 UTC
Dan asked me to comment about the change here. Lorenzo added a new command:

`ovs-appctl -t ovn-controller connection-status`

If ovn-controller is connected to the southbound database, then the command returns "connected". Otherwise, it returns "not connected".
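
A caller such as a monitoring script could map the two documented replies to exit codes. The sketch below is only an illustration: the function name `classify_status` and the exit-code convention are assumptions, not part of the change itself.

```shell
# Hypothetical helper: translate the reply of
#   ovs-appctl -t ovn-controller connection-status
# into an exit code.
#   0 = "connected", 1 = "not connected", 2 = any other reply
classify_status() {
    case "$1" in
        "connected")     return 0 ;;
        "not connected") return 1 ;;
        *)               return 2 ;;
    esac
}

# Example use (assumed, not run here):
#   reply=$(ovs-appctl -t ovn-controller connection-status)
#   classify_status "$reply" || echo "ovn-controller is not connected to SB"
```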

Comment 7 haidong li 2018-09-30 09:16:43 UTC
I have tested this command on the latest ovs version. The command works well if the ovn-controller is connected to SB. Then I changed the address in "external-ids:ovn-remote" to a nonexistent address, so the ovn-controller couldn't connect to the SB, and the "ovs-appctl -t" command displayed "not connected" as expected. But after that I restarted the ovn-controller, and the "ovs-appctl -t" command hung and didn't print anything. Is this expected?

[root@hp-dl580g7-01 ~]# uname -a
Linux hp-dl580g7-01.rhts.eng.pek2.redhat.com 3.10.0-954.el7.x86_64.debug #1 SMP Mon Sep 24 16:24:23 EDT 2018 x86_64 x86_64 x86_64 GNU/Linux
[root@hp-dl580g7-01 ~]# rpm -qa | grep openvswitch
openvswitch-ovn-central-2.9.0-70.el7fdp.x86_64
openvswitch-ovn-common-2.9.0-70.el7fdp.x86_64
openvswitch-ovn-host-2.9.0-70.el7fdp.x86_64
openvswitch-selinux-extra-policy-1.0-3.el7fdp.noarch
openvswitch-2.9.0-70.el7fdp.x86_64
[root@hp-dl580g7-01 ~]#
[root@hp-dl580g7-01 ~]# ovs-vsctl set Open_vSwitch . external-ids:ovn-remote=tcp:20.0.0.25:6642
[root@hp-dl580g7-01 ~]# systemctl restart ovn-controller
[root@hp-dl580g7-01 ~]# ovs-appctl -t ovn-controller connection-status
connected
[root@hp-dl580g7-01 ~]# ovs-vsctl set Open_vSwitch . external-ids:ovn-remote=tcp:20.0.0.28:6642
[root@hp-dl580g7-01 ~]# ovs-appctl -t ovn-controller connection-status
not connected
[root@hp-dl580g7-01 ~]# systemctl restart ovn-controller
[root@hp-dl580g7-01 ~]# ovs-appctl -t ovn-controller connection-status
^C2018-09-30T09:10:11Z|00001|fatal_signal|WARN|terminating with signal 2 (Interrupt)                                              <----hang there

[root@hp-dl580g7-01 ~]#

Comment 8 lorenzo bianconi 2018-10-04 15:07:06 UTC
Hi,

Yes, that is the expected behavior, since at bootstrap ovn-controller tries to get an initial snapshot of sbdb and blocks until it gets it (forever if the ovn remote is invalid). The following series has been proposed by Ben Pfaff to address this limitation, but it has not been merged yet:
- https://patchwork.ozlabs.org/patch/931134/
- https://patchwork.ozlabs.org/patch/931135/

I double-checked that this series fixes the reported issue.
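
Independently of that series, a caller that probes the status could bound the wait itself so that a blocked ovn-controller does not stall the check. This is a minimal sketch, assuming `timeout` from coreutils is available; the function name `run_with_deadline` and the 5-second value are illustrative only.

```shell
# Hypothetical wrapper: run a command but give up after a deadline.
# timeout(1) exits with status 124 when the deadline expires.
run_with_deadline() {
    seconds=$1
    shift
    timeout "$seconds" "$@"
}

# Example use (assumed, not run here):
#   run_with_deadline 5 ovs-appctl -t ovn-controller connection-status \
#       || echo "no usable answer from ovn-controller"
```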

Comment 9 haidong li 2018-10-08 06:36:13 UTC
Changing the status to VERIFIED according to comment 7 and comment 8.

Comment 15 errata-xmlrpc 2018-11-05 14:59:03 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:3500

