Bug 2078793 - Unable to delete an LB member after OSP16.1->16.2 upgrade
Summary: Unable to delete an LB member after OSP16.1->16.2 upgrade
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: documentation
Version: 16.2 (Train)
Hardware: All
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Assignee: kgilliga
QA Contact: RHOS Documentation Team
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2022-04-26 08:39 UTC by Michał Dulko
Modified: 2024-01-31 16:22 UTC (History)
26 users

Fixed In Version:
Doc Type: Known Issue
Doc Text:
Do not alter OVN DB entries or the OVN DB schema during an update from RHOSP 16.1 to RHOSP 16.2. Doing so can result in misconfiguration and data loss.

If you alter the OVN DB during an update of an environment with OpenShift, Kuryr, and the Load-balancing service (octavia), you might not be able to delete Load-balancing members.

*Workaround:* If you altered the OVN DB during an update of an environment with OpenShift, Kuryr, and the Load-balancing service and you cannot delete Load-balancing members, perform the following steps:

1. Access the mysql octavia DB.
2. Change the member's provisioning_status to DELETED.

If you experience other networking issues after altering the OVN DB during an update, run the `neutron-db-sync` tool.
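A minimal sketch of the documented workaround, assuming the default galera-bundle container name and the standard octavia `member` table; the member UUID is a placeholder:

    sudo podman exec galera-bundle-podman-0 mysql octavia \
        -e "UPDATE member SET provisioning_status='DELETED' WHERE id='<member-uuid>';"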
Clone Of:
Environment:
Last Closed: 2024-01-31 16:22:05 UTC
Target Upstream Version:
Embargoed:
mdemaced: needinfo-




Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker OSP-14872 0 None None None 2022-04-26 08:47:49 UTC

Internal Links: 2100704

Description Michał Dulko 2022-04-26 08:39:39 UTC
Description of problem:
After running the OSP 16.1->16.2 upgrade we get one LB member that is undeletable. The delete request succeeds, but the member stays in ERROR state. The following log is seen on octavia-api:

2022-04-26 08:23:11.574 14 DEBUG networking_ovn.octavia.ovn_driver [-] Handling request handle_member_dvr with info {'id': '5e519f2b-1c2f-491c-a7b9-6f2f7418d5fe', 'address': '10.128.68.36', 'pool_id': 'fdc8f9a6-84a3-4480-9cb9-e1f2415ac9ee', 'subnet_id': 'ea99cfc8-cd8e-40b3-ad23-4ac6b7961368', 'action': 'member_deleted'} request_handler /usr/lib/python3.6/site-packages/networking_ovn/octavia/ovn_driver.py:478
2022-04-26 08:23:11.582 14 DEBUG networking_ovn.octavia.ovn_driver [-] LB ffb12823-2ad6-4383-88fc-af7dba53bb4c has no FIP on VIP configured. There is no need to centralize member 5e519f2b-1c2f-491c-a7b9-6f2f7418d5fe traffic. handle_member_dvr /usr/lib/python3.6/site-packages/networking_ovn/octavia/ovn_driver.py:1912
2022-04-26 08:23:36.330 15 INFO octavia.api.v2.controllers.member [req-9c897022-3f2b-4d22-8d89-6a5e33b47c77 - a4e841133a324128b1ab429cac4be135 - default default] Sending delete Member 5e519f2b-1c2f-491c-a7b9-6f2f7418d5fe to provider ovn
2022-04-26 08:23:36.331 15 DEBUG networking_ovn.octavia.ovn_driver [-] Handling request member_delete with info {'id': '5e519f2b-1c2f-491c-a7b9-6f2f7418d5fe', 'address': '10.128.68.36', 'protocol_port': 8443, 'pool_id': 'fdc8f9a6-84a3-4480-9cb9-e1f2415ac9ee', 'subnet_id': 'ea99cfc8-cd8e-40b3-ad23-4ac6b7961368'} request_handler /usr/lib/python3.6/site-packages/networking_ovn/octavia/ovn_driver.py:478
2022-04-26 08:23:36.340 15 ERROR networking_ovn.octavia.ovn_driver [-] Exception occurred during deletion of member: octavia_lib.api.drivers.exceptions.DriverError: Member 5e519f2b-1c2f-491c-a7b9-6f2f7418d5fe not found in the pool
2022-04-26 08:23:36.340 15 ERROR networking_ovn.octavia.ovn_driver octavia_lib.api.drivers.exceptions.DriverError: Member 5e519f2b-1c2f-491c-a7b9-6f2f7418d5fe not found in the pool
2022-04-26 08:23:36.341 15 DEBUG networking_ovn.octavia.ovn_driver [-] Updating status to octavia: {'pools': [{'id': 'fdc8f9a6-84a3-4480-9cb9-e1f2415ac9ee', 'provisioning_status': 'ACTIVE'}], 'members': [{'id': '5e519f2b-1c2f-491c-a7b9-6f2f7418d5fe', 'provisioning_status': 'ERROR'}], 'loadbalancers': [{'id': '7cdaf995-0ddb-4116-aefc-9611c4a3311f', 'provisioning_status': 'ACTIVE'}]} _update_status_to_octavia /usr/lib/python3.6/site-packages/networking_ovn/octavia/ovn_driver.py:505

Version-Release number of selected component (if applicable):
16.2, but upgraded from 16.1. OVN is used as an Octavia provider.

How reproducible:
Most likely random; the steps we followed when we encountered it are provided below.

Steps to Reproduce:
1. Deploy OSP 16.1.
2. Deploy OCP 4.10 with Kuryr on top.
3. Upgrade underlying OSP to 16.2.
4. Run Kuryr tempest tests.

Actual results:
kuryr-controller keeps restarting because it is unable to delete a member in ERROR state. The delete request is successful, but the member stays.
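The behaviour can be shown with the Load-balancing service CLI (a hedged sketch; the pool and member IDs are placeholders):

    openstack loadbalancer member delete <pool-id> <member-id>   # returns without an error
    openstack loadbalancer member show <pool-id> <member-id>     # member is still there, provisioning_status ERROR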

Expected results:
Member is deletable.

Additional info:

Comment 15 Sofer Athlan-Guyot 2022-06-06 13:43:51 UTC
Hi,

Sofer from Update squad.

> 04-25 16:12 c0 (backup, schema v0), c1 (backup, schema v0), c2 (active, schema v0)
> 04-25 20:14 c1 (backup, schema v0), c2 (active, schema v0) // c0 shutdown - update process
> 04-25 20:14 c1 (backup, schema v0), c2 (active, schema v0) // LB member under analysis created, c1 replicate from c2
> 04-25 20:35 c0 (backup (stale db), schema v1), c1 (backup, schema v0), c2 (active, schema v0) // c0 alive, it was upgraded to schema v1 but it can not replicate from c2 (mismatch schema)
> 04-25 20:57 c0 (backup (stale db), schema v1), c2 (active, schema v0) // c1 shutdown - update process
> 04-25 21:15 c0 (backup (stale db), schema v1), c1 (backup (stale db), schema v1), c2 (active, schema v0) // c1 alive, it was upgraded to schema v1 but it can not replicate from c2 (mismatch schema db) 
> 04-25 21:31 c0 (active (stale db), schema v1), c1 (backup (stale db), schema v1) // c2 shutdown - update process - PROBLEM!!! c0 becomes active with a big stale DB
> 04-25 21:48 c0 (active (stale db), schema v1), c1 (backup (stale db), schema v1), c2 (backup, schema v1) // c2 alive, all nodes with same schema version
> 04-26 00:55 c1 (backup (stale db), schema v1), c2 (active, schema v1) // c0 shutdown - reboot process, c2 becomes active
> 04-26 00:56 c0 (backup (stale db), schema v1), c1 (backup (stale db), schema v1), c2 (active, schema v1) // c0 alive
> 04-26 00:57 c0 (backup (stale db), schema v1), c2 (active, schema v1) // c1 shutdown - reboot process, c2 becomes active
> 04-26 00:58 c0 (backup (stale db), schema v1), c1 (backup (stale db), schema v1), c2 (active, schema v1) // c1 alive
> 04-26 00:59 c0 (active (stale db), schema v1), c1 (backup (stale db), schema v1) // c2 shutdown - reboot process, c0 becomes active - PROBLEM!!! c0 becomes active with a big stale DB
> 04-26 01:00 c0 (active (stale db), schema v1), c1 (backup (stale db), schema v1), c2 (backup, schema v1) // c2 alive

Here is my understanding of the sequence of events. Controllers were
updated in that order:
 - c2 master
 - c0 updated; c2 master, c0 cannot replicate from c2
 - c1 updated; c2 master, c1 cannot replicate from c2
 - c2 updated; c0 becomes master with a stale db. Why doesn't it see the new data in c2? Is it some sort of split-brain situation where ovs cannot tell whether c0 or c2 is the right database?
 - c0 is master with a stale db ... new changes land there ... will c2 receive those new changes, or is the sync broken altogether now?

Running

    [1] command 'ovsdb-client convert <DB> <SCHEMA>' will upgrade the schema version

"could reduce the likelihood of the issue."

I'm unclear when this should be run. For each controller we can trigger the command:
 a. before cluster shutdown (update_tasks, step <= 1, rpm not yet updated)
 b. after cluster restart (update_tasks, step > 5, rpm updated, ovndb updated, ovs updated but not yet restarted)
 c. after the node has been entirely updated (post_update_tasks, ovndb updated, ovs updated but not yet restarted)

Looking at those places, I don't see where we should run that command.

Note, we also have a "new" hook for 16.X versions related to the OVN
update: before the entire update starts for the overcloud, we run a
command for the ovn-controller update. But it doesn't seem to help for
this.

Finally, I've been through the update documentation from 2.14 to
2.15. Do we need another issue for trying to implement this during the
update?  I can see some challenges in implementing this in the tripleo
framework context.

Overall, let me know if you need help from our DFG.

Thanks,

Comment 20 Sofer Athlan-Guyot 2022-06-17 13:09:58 UTC
Hi,

Before jumping in, two comments. First, I'm moving myself out of
ownership of this issue, but I'll stay for support in testing and
finding the best way to solve this in the update process.

Second, it has been mentioned that "the only possible solution for this
issue would be to make RAFT available in OSP 16.2?"[0]. Does that
solution already have an issue tracker? It seems that whatever we
come up with here will be a crutch, not a complete solution, as far as I
understood - correct me if I'm wrong.

With that out of the way here are details of the update process in
director.

> 
> If I'm missing something related to the upgrade process that would not
> be able to follow those steps (most than likely) feel free to correct
> me

Oki, so here are the main hooks of the update process[1]:

1. before we have updated anything on the overcloud but with undercloud updated;

    We can run a command on the whole overcloud at that time. It adds
    complexity for the customer so it's not desirable, but it's an
    option. That's what we do to update ovn-controller.
    
    The state of the overcloud at that time is:
    
    - repos are updated for the oc hosts but yum upgrade hasn't run, so the old
      binaries (ovs) are running;

    - image definitions are updated in the playbooks but the old
      containers are still running, except for the ovn-controller
      containers, which have been updated on the entire overcloud
      (controller and compute roles)

    - ovn-northdb is thus running on the old container definition
    
2. Then, we run "update run --limit controller": this will update
   packages and containers on the nodes that have the Controller
   role. Those are the relevant steps for the issue at hand (naming can
   vary depending on the role definition, but let's not dwell on this)

During that stage, we do a rolling update of the Controller role's
nodes, one after another, to ensure continuity of service. The steps are
roughly:

    I. "update_steps" from the tripleo heat templates:
    
        a. remove the current node from pacemaker control (pcs cluster stop)
        
        b. pull in the new images for all containers under pacemaker
           control and tag them properly (pcmklatest) to be able to
           update them properly.
    
       c. run yum upgrade with special handling for ovs to avoid cutting
          the network;
    
       d. start pacemaker back: at that moment ovn-northdb is running
          with the new image, ie it has been updated on that node.
    
    II. on the same node we continue with "common deploy steps": update
        the containers managed using paunch or paunch api (through ansible role)
    
       a. Pacemaker resources are an exception. A restart of pacemaker
          container resources (so ovn-northdb) can happen during that
          time if configuration has changed, but it's handled
          specially by specific commands (ie not paunch)
    
       b. calculate a new hash of the configuration for every container;

       c. destroy/recreate a container if its configuration hash has changed
          or if its image has changed: this is when all containers not
          under pacemaker control are updated;
    
    This happens one node at a time for controllers (update_serial = 1)
    and we *cannot* predict which node the process will start with, ie we
    don't know whether ovn-northdb is master or not on the starting node.
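For reference, a hedged sketch of the director commands behind hooks 1 and 2 above (the stack and role names are assumptions for a default deployment):

    openstack overcloud external-update run --stack overcloud --tags ovn   # hook 1: ovn-controller update
    openstack overcloud update run --stack overcloud --limit Controller    # hook 2: rolling Controller update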
    
With this context about update, let's go back to your need:

> 1 - Determine active node
> 2 - Determine backup nodes
> 3 - Run command ovsdb-client to update the db schema on backup nodes

This cannot happen during "2. update run --limit controller" as there
is currently no constraint on which node the process starts with. The only
place would be "1. before we have...", ie an extra step before the update. But
ovs hasn't been updated, nor has the ovn-northdb container. Only ovn-controller has.

Could this be enough to run your command?

Note that we could lessen the burden on the customer by adding the
necessary commands to their ansible with the ovn-container special
stage: code here [2] and documentation there [3].

Could you describe more precisely what is expected for this command to
do its job, and whether it fits the above constraints? Should it be run in
a container or from the host?

> 4 - Run command ovsdb-client to update the db schema on active node

Same questions there.

> 5 - Run the actual upgrade process 

hope this helps.

[0] https://bugzilla.redhat.com/show_bug.cgi?id=2078793#c9
[1] more details there from a developer point of view: https://opendev.org/openstack/tripleo-heat-templates/src/branch/stable/train/deployment/README.rst#L158
[2] https://opendev.org/openstack/tripleo-heat-templates/src/branch/stable/train/deployment/ovn/ovn-controller-container-puppet.yaml#L381-L428
[3] https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/16.2/html-single/keeping_red_hat_openstack_platform_updated/index#proc_updating-ovn-controller-container_updating-overcloud

Comment 22 Fernando Royo 2022-06-17 16:12:55 UTC
Hi Sofer, 

Just to reply to your questions and to thank you for your elaborate response; this is a bit outside my expertise, so a colleague could probably validate my inline comments:

(In reply to Sofer Athlan-Guyot from comment #20)
> Hi,
> 
> Before jumping in, two comments. First, I'm moving myself out of
> ownership of this issue, but I'll stay for support in testing and
> finding the best way to solve this in the update process.
> 
> Second, it has been mentionned that "the only possible solution for this
> issue would be to make RAFT available in OSP 16.2?"[0]. Is that
> solution has an issue tracker already as it seems that whatever we
> come up with here will be a crutch not a complete solution as far as I
> understood - correct me if I'm wrong -.

Not exactly. Moving to a clustered (RAFT) ovsdb-server will be implemented for versions > 17.0, so we need a workaround for 16.X versions, with the main focus on the upgrade from 16.1 to 16.2, where this situation is more than likely to occur. I totally agree that this reduces the probability of problems but is not a totally safe solution: if we have a failover during the process we could hit the issue again. But, as said before, we are reducing the likelihood, and this is a workaround.

> 
> With that out of the way here are details of the update process in
> director.
> 
> > 
> > If I'm missing something related to the upgrade process that would not
> > be able to follow those steps (most than likely) feel free to correct
> > me
> 
> Oki, so here are the main hooks of the update process[1]:
> 
> 1. before we have updated anything on the overcloud but with undercloud
> updated;
> 
>     We can run command on all overcloud at that time. It adds
>     complexity for the customer so it's not desirable, but it's a
>     option. That's what we do to update ovn-controller.
>     
>     The state of the overcloud at that time is:
>     
>     - repos are updated for the oc hosts but yum upgrade hasn't run, so old
>       binaries (ovs) are running;
>     
>     - images definition are updated in the playbooks but old
>       containers are still running, except for the ovn-controller
>       containers which have been updated on the entire overcloud
>       (controller and compute roles)
>     
>     - ovn-northdb is thus running on old container definition
>     
> 2. Then, we run "update run --limit controller": this will update
>    packages and containers on the nodes having the Controller
>    role. That's the relevant steps for the issue at end (naming can
>    vary depending on the role definition, but let's not dwell on this)
> 
> During that stage, we do a rolling update of the Controller's role
> nodes one after another to ensure continuity of service. The steps are
> roughly:
> 
>     I. "update_steps" from the tripleo heat templates:
>     
>         a. remove the current node from pacemaker control (pcs cluster stop)
>         
>         b. pull in the new images for all containers under pacemaker
>            controller and tag them properly (pcmklatest) to be able to
>            update then properly.
>     
>        c. run yum upgrade with special handling for ovs to avoid cut
>           in network;
>     
>        d. start pacemaker back: at that moment ovn-northdb is running
>           with the new image, ie it has been updated on that node.
>     
>     II. on the same node we continue with "common deploy steps": update
>         the containers managed using paunch or paunch api (through ansible
> role)
>     
>        a. Pacemaker resources are an exception. A restart of pacemaker
>           container resources (so ovn-northdb) can happen during that
>           time if configuration has changed, but it's handled
>           specially by specific commands (ie not paunch)
>     
>        b. calculate new hash of the configuration for every container;
> 
>        b. destroy/recreate container if configuration hash has changed
>           or if image has changed: this is when all containers not
>           under pacemaker control are updated;
>     
>     This happens one node at a time for controllers (update_serial = 1)
>     and we *cannot* predict by which node the process will start, ie we
>     don't know if ovn-northdb is master or not on the starting node.
>     
> With this context about update, let's go back to your need:
> 
> > 1 - Determine active node
> > 2 - Determina backup nodes
> > 3 - Run command ovsdb-client to update the db schema on backup nodes
> 
> This cannot happen during "2. update run --limit controller" as there 
> is currently no constraint around the starting node.  The only place 
> would be "1. before we have...", ie an extra step before update. But
> ovs hasn't been updated nor has ovn-northdb container. Only ovn-controller
> has. 
> 
> Could this be enough to run your command?

I think so, but after reading your well explained lines above, I think we need three tasks:

Task 1 - We need to get the new schema files somehow
Task 2 - Update schema over backup servers (still need confirmation if we can easily identify via ovsdb-client)
Task 3 - Update schema over backup servers (still need confirmation if we can easily identify via ovsdb-client)

Tasks 2 and 3 could run on all controllers independently of the role (backup or active); the "script" will ensure that Task 2 only acts on the backup ones and Task 3 on the active one.

> 
> Note that we could lessen the burden to the customer by adding the
> necessary commands in their ansible with the ovn-container special
> stage: code here [2] and documentation there [3].
> 
> Could you describe more precisely what is expected for this command to
> do its job and would it fit the above constraints?  Should it be run in
> a container or from the host ?

Basically we can make the process (Task 2 and Task 3) more robust by using "ovsdb-client needs-conversion <server> <schema.json>" to decide whether to run the "ovsdb-client convert <server> <schema.json>" one. It connects to the server and requests conversion of the database to the schema specified in the schema file (obtained in Task 1). This conversion can be done online: connected clients receive a notification that a reconnection is required, or, if they don't support that notification, they are disconnected directly and detect the change after reconnecting. More details are shown in [1]. As with any DB change, it is strongly recommended to take a backup beforehand.
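A hedged sketch of that check-then-convert flow for the NB DB (the socket path, database name, and schema location are assumptions; it has to be run where the ovsdb socket is reachable, e.g. inside the ovn-dbs container):

    SERVER=unix:/var/run/ovn/ovnnb_db.sock
    SCHEMA=/tmp/ovn/new_ovn-nb.ovsschema                                       # file obtained in Task 1
    ovsdb-client backup "$SERVER" OVN_Northbound > /var/tmp/ovn-nb-db.backup   # recommended backup first
    if [ "$(ovsdb-client needs-conversion "$SERVER" "$SCHEMA")" = "yes" ]; then
        ovsdb-client convert "$SERVER" "$SCHEMA"
    fi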

> 
> > 4 - Run command ovsdb-client to update the db schema on active node
> 
> Same questions there.
> 
> > 5 - Run the actual upgrade process 
> 

Here, if I understand well, we will continue with "update run --limit controller", where several failovers may happen, but if we already have the schema updated on all nodes, they do not worry us too much.
 
> hope this helps.

Sure, in any case I will join #rhos-upgrades for further details.

> 
> [0] https://bugzilla.redhat.com/show_bug.cgi?id=2078793#c9
> [1] more details there from a developer point of view: https://opendev.org/openstack/tripleo-heat-templates/src/branch/stable/train/deployment/README.rst#L158
> [2] https://opendev.org/openstack/tripleo-heat-templates/src/branch/stable/train/deployment/ovn/ovn-controller-container-puppet.yaml#L381-L428
> [3] https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/16.2/html-single/keeping_red_hat_openstack_platform_updated/index#proc_updating-ovn-controller-container_updating-overcloud

[1] https://man7.org/linux/man-pages/man1/ovsdb-client.1.html

Comment 23 Sofer Athlan-Guyot 2022-06-20 15:17:05 UTC
Hi,

I've edited the replies for clarity.

(In reply to Fernando Royo from comment #22)
> Hi Sofer, 
> 
> > Second, it has been mentionned that "the only possible solution for this
> > issue would be to make RAFT available in OSP 16.2?"[0]. Is that
> > solution has an issue tracker already as it seems that whatever we
> > come up with here will be a crutch not a complete solution as far as I
> > understood - correct me if I'm wrong -.
> 
> Not exactly. Moving to an ovsdb-server clustered (RAFT) version will be
> implemented for versions > 17.0, so we need a workaround for 16.X versions,
> with the main focus on the upgrade from 16.1 to 16.2, where this situation
> is more than likely to occur, but totally agree that this should reduce the
> probability of problems but not a totally safe solution. Because if we have
> a failover during the process we could have the issue again, but as said
> before, we are reducing likehood, and this is a workaround.

Noted, so we won't expect a backport of the RAFT db. As a side note, in
the 16.2 code we don't really know whether we are running a 16.1->16.2 update
or an update of 16.2. Ideally the code would run equally well for both
scenarios.

> > Could this be enough to run your command?
> 
> I think so, but after reading your well explained lines above, I think we
> should need three tasks:

So we would run it before the update starts, so that all db schemas are
updated before ovs/ovndb are updated, correct?

> Task 1 - We should need to get the new schema files somehow

Oki, so on the undercloud (which is already updated) we have the files:

/usr/share/openvswitch/vswitch.ovsschema and
/usr/share/openvswitch/vtep.ovsschema. I guess those are the needed
files?

Through ansible we could send them to the Overcloud's nodes and use
them.

> Task 2 - Update schema over backup servers (still need confirmation if we
> can easily identify via ovsdb-client)
> Task 3 - Update schema over backup servers (still need confirmation if we
> can easily identify via ovsdb-client)

hum ... it's twice "backup" :) But later it becomes clear that it's
first backup and then active.

At that time the ovs binary will not have been updated; is it ok to
update the schema without the new ovs binary?

Same question knowing that ovn-northdb would not be updated either at
that time.

Now a question about the sequencing.  The order (whatever it is) seems
important and the backup nodes need to be done first.

That means running ansible with serial 1 over the controller
nodes. Here is an ansible prototype:

 - 0. send the schema(s?) file over to the controller nodes from the undercloud;
 - 1. collect info over all controllers to get the status of ovndb there (active or backup) somehow.
 - 2. find the active one
 - 3. run the command with serial 1 (one after the other) on the other nodes to update their ovs database, while ovs and ovn-northdb aren't yet updated, but ovn-controller is.
 - 4. run the same command over the active node
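A hedged shell sketch of steps 1 and 2 (finding out which node is active), run from the undercloud; the host names, the container name filter, and the ctl socket path are assumptions for a default deployment:

    for host in controller-0 controller-1 controller-2; do
        # ask the ovn-dbs container on each node whether its NB ovsdb-server is active or backup
        state=$(ssh heat-admin@"$host" \
            'sudo podman exec "$(sudo podman ps -q -f name=ovn-dbs-bundle)" ovs-appctl -t /var/run/ovn/ovnnb_db.ctl ovsdb-server/sync-status' \
            | awk '/state:/ {print $2}')
        echo "$host ovnnb_db: $state"    # expect one "active" and the rest "backup"
    done

Steps 3 and 4 would then run 'ovsdb-client convert' on the backup nodes first and on the active node last.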

Should step 3 "block" somehow, waiting for something, or is just
running the command ok so we can move on to the other nodes? Or
can we do all backups at the same time?
  
> Task 2 and 3 could run in all controllers indepentently of the role (backup
> or active), the "script" will ensure that Task 2 just works over the backup
> and Task 3 over the active ones.

Oki, I'm a bit lost here.  Does it mean that the order doesn't matter, or
that the commands needed on the backup nodes are different from the one
needed on the active node, and that the script should take that into
account?

> > > 5 - Run the actual upgrade process 
> > 
> 
> Here, if I understand well, will continue with "update run --limit
> controller" where several failover may happen but if we already have the
> schema updated in all nodes, they do not worry us too much.

Ok, nice. So when we actually update the ovn-northdb container it
won't complain and will allow synchronisation across the board, because the
schema version will be the same everywhere, correct?

I emphasize again that the ovs rpm is updated but not restarted during that
process, ie the previous binary is still running in memory until
reboot.

As an "out of this bugzilla context" question, we used to need that
"non reboot ovs" state because ovs would cut the connections if we
restarted it, is that still the case, do you know?

Thanks,

Comment 24 Fernando Royo 2022-06-21 13:58:30 UTC
Hi, 

Basically to confirm points.

(In reply to Sofer Athlan-Guyot from comment #23)
> Hi,
> 
> I've edited the replies for clarity.
> 
> (In reply to Fernando Royo from comment #22)
> > Hi Sofer, 
> > 
> > > Second, it has been mentionned that "the only possible solution for this
> > > issue would be to make RAFT available in OSP 16.2?"[0]. Is that
> > > solution has an issue tracker already as it seems that whatever we
> > > come up with here will be a crutch not a complete solution as far as I
> > > understood - correct me if I'm wrong -.
> > 
> > Not exactly. Moving to an ovsdb-server clustered (RAFT) version will be
> > implemented for versions > 17.0, so we need a workaround for 16.X versions,
> > with the main focus on the upgrade from 16.1 to 16.2, where this situation
> > is more than likely to occur, but totally agree that this should reduce the
> > probability of problems but not a totally safe solution. Because if we have
> > a failover during the process we could have the issue again, but as said
> > before, we are reducing likehood, and this is a workaround.
> 
> Noted, so we won't expect a backport of the RAFT db. As a side note we
> don't really know in the 16.2 code if we running a 16.1->16.2 update
> or an update of 16.2. Ideally the code would run equally well for both
> scenarios.
> 

Yeah.

> > > Could this be enough to run your command?
> > 
> > I think so, but after reading your well explained lines above, I think we
> > should need three tasks:
> 
> So we would run it before the update starts so that all db schema are
> updated before ovs/ovndb are updated, correct?
> 
> > Task 1 - We should need to get the new schema files somehow
> 
> Oki, so on the undercloud (which is already updated) we have the file:
> 
> /usr/share/openvswitch/vswitch.ovsschema and
> /usr/share/openvswitch/vtep.ovsschema. I guess those are the needed
> files?

No, OVN NB DB is linked to /usr/share/ovn/ovn-nb.ovsschema and OVN SB DB to /usr/share/ovn/ovn-sb.ovsschema, are those files also available on undercloud?

> 
> Through ansible we could send them to the Overcloud's nodes and use
> them.
> 
> > Task 2 - Update schema over backup servers (still need confirmation if we
> > can easily identify via ovsdb-client)
> > Task 3 - Update schema over backup servers (still need confirmation if we
> > can easily identify via ovsdb-client)
> 
> hum ... it's twice "backup" :) But latter it becomes clear that it's
> first backup and then active.
> 

ouch, copy-paste ...

> At that time the ovs binary will not have been updated, is that ok to
> update the schema without the new ovs binary ?
> 
> Same question knowing that ovn-northdb would not be updated neither at
> that time.
> 

I think so, in any case I will confirm with @dceara 

> Now a question about the sequencing.  The order (whatever it is) seems
> important and backup needs to be done first.
> 
> That mean running ansible with a serial 1 over the controller
> nodes. Here some ansible prototype:
> 
>  - 0. send the schema(s?) file over to the controller's nodes from the
> undercloud;
>  - 1. collect info over all controllers to get the status of ovndb there
> (active or backup) somehow.
>  - 2. find the active one
>  - 3. run the command with serial 1 (one after the other) on the other nodes
> to update its ovs database, while ovs and ovn-northdb aren't yet updated,
> but ovn-controller is.
>  - 4. run the same command over the active node
> 
> Should the step 3. "block" somehow waiting for something or just
> running the command is ok and we can move on to the other nodes, or
> can we do all backup at the same time ?

I think it is not necessary; I think we can run step 3 in parallel on all backup nodes (the ones other than the node found in step 2), but @dceara could confirm this point :D

>   
> > Task 2 and 3 could run in all controllers indepentently of the role (backup
> > or active), the "script" will ensure that Task 2 just works over the backup
> > and Task 3 over the active ones.
> 
> Oki, I'm a bit lost here.  Does it mean that order doesn't matter or
> that the command needed on the backup node are different from the one
> needed on the active node and that the script should take that into
> account ?

Sorry for the mix here. I was thinking of having the same script file run on all controllers, i.e. "update_db_schema.sh /tmp/ovn/new_ovn-nb.ovsschema backup", so that just the backup ones do the magic, and after that running "update_db_schema.sh /tmp/ovn/new_ovn-nb.ovsschema active" to finish on the active one. But I prefer your option of first determining the active/backup ones and running according to that, just to be covered against a failover during steps 3 and 4, where we could end up with one node doing it twice and one not doing it at all.
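For what it's worth, here is a hedged sketch of what such an update_db_schema.sh could look like for the NB DB (the script name is the hypothetical one from this discussion, and the socket/ctl paths are assumptions):

    #!/bin/bash
    # usage: update_db_schema.sh <schema.json> <backup|active>
    schema=$1
    role=$2
    server=unix:/var/run/ovn/ovnnb_db.sock

    # Only act when this node's ovsdb-server currently has the requested role.
    state=$(ovs-appctl -t /var/run/ovn/ovnnb_db.ctl ovsdb-server/sync-status | awk '/state:/ {print $2}')
    if [ "$state" != "$role" ]; then
        echo "this node is '$state', not '$role'; nothing to do"
        exit 0
    fi

    # Convert only when the running DB does not already match the new schema.
    if [ "$(ovsdb-client needs-conversion "$server" "$schema")" = "yes" ]; then
        ovsdb-client convert "$server" "$schema"
    fi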

> 
> > > > 5 - Run the actual upgrade process 
> > > 
> > 
> > Here, if I understand well, will continue with "update run --limit
> > controller" where several failover may happen but if we already have the
> > schema updated in all nodes, they do not worry us too much.
> 
> Ok, nice. So when we actually update the ovn-northdb container it
> won't complain and allow synchronisation over the board because the
> schema version will be the same everywhere, correct?

happy to see that I was understood, right!!!

> 
> I emphasis again that ovs rpm is updated but not restarted during that
> process, ie the previous binary is still running in memory until
> reboot.

ok! let's just see if @dceara can confirm that it is not an issue.

> 
> As an "out of this bugzilla context" question, we used to need that
> "non reboot ovs" state because ovs would cut the connections if we
> restarted it, is that still the case, do you know?
> 

No; another point where @dceara could give us some light.

> Thanks,

Comment 25 Sofer Athlan-Guyot 2022-06-21 15:56:41 UTC
Hi,

> No, OVN NB DB is linked to /usr/share/ovn/ovn-nb.ovsschema and OVN
> SB DB to /usr/share/ovn/ovn-sb.ovsschema, are those files also
> available on undercloud?

No, ovn is not installed by default on the undercloud, either rpm or
container.

Furthermore the ovn-controller container doesn't have the necessary
rpm:

    The rpm is ovn22.03-central-22.03.0-52.el9fdp.x86_64 :
    
    [heat-admin@controller-0 ~]$ sudo podman exec -ti ovn-dbs-bundle-podman-1 rpm -ql ovn22.03-central-22.03.0-52.el9fdp.x86_64 |grep ovn-nb.ovsschema
    /usr/share/ovn/ovn-nb.ovsschema
    
It's not in ovn_controller (which is updated at that time):

    [heat-admin@controller-0 ~]$ sudo podman exec -ti ovn_controller rpm -qa |grep  'ovn.*central'
    => nothing
    
List of ovn related rpm in that updated container just to make sure:

    [heat-admin@controller-0 ~]$ sudo podman exec -ti ovn_controller rpm -qa |grep  'ovn'
    ovn22.03-22.03.0-52.el9fdp.x86_64
    rhosp-ovn-22.03-5.el9ost.noarch
    ovn22.03-host-22.03.0-52.el9fdp.x86_64
    rhosp-ovn-host-22.03-5.el9ost.noarch

That means that the script will have to get the file somehow. Here is a
"somehow" that could work:

    mkdir /var/tmp/ovn-db-schema
    cd /var/tmp/ovn-db-schema
    dnf download ovn22.03-central-22.03.0-52.el9fdp.x86_64
    rpm2cpio ovn22.03-central-22.03.0-52.el9fdp.x86_64.rpm | cpio -vimd
    
The main problem seems to be getting the correct package name... it seems the
version is embedded in the name.

Furthermore, in the testing env I've got:

    dnf whatprovides /usr/share/ovn/ovn-nb.ovsschema

returns different naming scheme:

    ovn-2021-central-21.12.0-30.el9fdp.x86_64 : Open Virtual Network support
    Repo        : rhosp-rhel-9.0-fdp-nightly
    Matched from:
    Filename    : /usr/share/ovn/ovn-nb.ovsschema
    
    ovn22.03-central-22.03.0-22.el9fdp.x86_64 : Open Virtual Network support
    Repo        : rhosp-rhel-9.0-fdp-cdn
    Matched from:
    Filename    : /usr/share/ovn/ovn-nb.ovsschema
    
    ovn22.03-central-22.03.0-52.el9fdp.x86_64 : Open Virtual Network support
    Repo        : rhosp-rhel-9.0-fdp-nightly
    Matched from:
    Filename    : /usr/share/ovn/ovn-nb.ovsschema
    
Not sure how it looks on a customer env.

So getting the right rpm might be an issue.  If that's too complicated
then we will have to pull the image of the
rhosp17-openstack-ovn-northd container (container repos are updated at
that time) and get it out of there. But that's a complicated and maybe
risky solution (just a feeling at this point). I'd rather get the
right rpm name somehow and take the file from there.
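A hedged way to avoid hard-coding the rpm name would be to ask dnf which package provides the schema file and download whatever it returns (the tail -1 is a crude pick for when several streams match, as in the output above):

    pkg=$(dnf -q repoquery --latest-limit 1 \
          --qf '%{name}-%{version}-%{release}.%{arch}' \
          --whatprovides /usr/share/ovn/ovn-nb.ovsschema | tail -1)
    dnf download "$pkg"
    rpm2cpio "$pkg.rpm" | cpio -ivmd ./usr/share/ovn/ovn-nb.ovsschema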

Comment 26 Dumitru Ceara 2022-06-22 13:36:14 UTC
(In reply to Fernando Royo from comment #24)
> Hi, 
> 
> Basically to confirm points.
> 
> (In reply to Sofer Athlan-Guyot from comment #23)
> > Hi,
> > 
> > I've edited the replies for clarity.
> > 
> > (In reply to Fernando Royo from comment #22)
> > > Hi Sofer, 
> > > 
> > > > Second, it has been mentionned that "the only possible solution for this
> > > > issue would be to make RAFT available in OSP 16.2?"[0]. Is that
> > > > solution has an issue tracker already as it seems that whatever we
> > > > come up with here will be a crutch not a complete solution as far as I
> > > > understood - correct me if I'm wrong -.
> > > 
> > > Not exactly. Moving to an ovsdb-server clustered (RAFT) version will be
> > > implemented for versions > 17.0, so we need a workaround for 16.X versions,
> > > with the main focus on the upgrade from 16.1 to 16.2, where this situation
> > > is more than likely to occur, but totally agree that this should reduce the
> > > probability of problems but not a totally safe solution. Because if we have
> > > a failover during the process we could have the issue again, but as said
> > > before, we are reducing likehood, and this is a workaround.
> > 
> > Noted, so we won't expect a backport of the RAFT db. As a side note we
> > don't really know in the 16.2 code if we running a 16.1->16.2 update
> > or an update of 16.2. Ideally the code would run equally well for both
> > scenarios.
> > 
> 
> Yeah.
> 
> > > > Could this be enough to run your command?
> > > 
> > > I think so, but after reading your well explained lines above, I think we
> > > should need three tasks:
> > 
> > So we would run it before the update starts so that all db schema are
> > updated before ovs/ovndb are updated, correct?
> > 
> > > Task 1 - We should need to get the new schema files somehow
> > 
> > Oki, so on the undercloud (which is already updated) we have the file:
> > 
> > /usr/share/openvswitch/vswitch.ovsschema and
> > /usr/share/openvswitch/vtep.ovsschema. I guess those are the needed
> > files?
> 
> No, OVN NB DB is linked to /usr/share/ovn/ovn-nb.ovsschema and OVN SB DB to
> /usr/share/ovn/ovn-sb.ovsschema, are those files also available on
> undercloud?
> 
> > 
> > Through ansible we could send them to the Overcloud's nodes and use
> > them.
> > 
> > > Task 2 - Update schema over backup servers (still need confirmation if we
> > > can easily identify via ovsdb-client)
> > > Task 3 - Update schema over backup servers (still need confirmation if we
> > > can easily identify via ovsdb-client)
> > 
> > hum ... it's twice "backup" :) But latter it becomes clear that it's
> > first backup and then active.
> > 
> 
> ouch, copy-paste ...
> 
> > At that time the ovs binary will not have been updated, is that ok to
> > update the schema without the new ovs binary ?
> > 
> > Same question knowing that ovn-northdb would not be updated neither at
> > that time.
> > 
> 
> I think so, in any case I will confirm with @dceara 
> 

It *should* be fine.  Schema changes in OVN should be backwards
compatible. 

> > Now a question about the sequencing.  The order (whatever it is) seems
> > important and backup needs to be done first.
> > 
> > That mean running ansible with a serial 1 over the controller
> > nodes. Here some ansible prototype:
> > 
> >  - 0. send the schema(s?) file over to the controller's nodes from the
> > undercloud;
> >  - 1. collect info over all controllers to get the status of ovndb there
> > (active or backup) somehow.
> >  - 2. find the active one
> >  - 3. run the command with serial 1 (one after the other) on the other nodes
> > to update its ovs database, while ovs and ovn-northdb aren't yet updated,
> > but ovn-controller is.
> >  - 4. run the same command over the active node
> > 
> > Should the step 3. "block" somehow waiting for something or just
> > running the command is ok and we can move on to the other nodes, or
> > can we do all backup at the same time ?
> 
> I think is not neccesary, I think that we can run step 3 in paralell in all
> backups nodes (different ones to the found in step 2), but @dceara could
> confirm this point :D
> 

Running the schema upgrade on all backups in parallel is fine.

> >   
> > > Task 2 and 3 could run in all controllers indepentently of the role (backup
> > > or active), the "script" will ensure that Task 2 just works over the backup
> > > and Task 3 over the active ones.
> > 
> > Oki, I'm a bit lost here.  Does it mean that order doesn't matter or
> > that the command needed on the backup node are different from the one
> > needed on the active node and that the script should take that into
> > account ?
> 
> Sorry for the mix here. I was thinking to have the same script file to be
> run on all controllers, i.e "update_db_schema.sh
> /tmp/ovn/new_ovn-nb.ovsschema backup" and just the backups ones will do the
> magic, and after that run "update_db_schema.sh /tmp/ovn/new_ovn-nb.ovsschema
> active" to run finally on the active one, but I prefer your options to first
> determine active/backup ones and run according to that, just to be cover for
> a failover during the steps 3 and 4 and finally we get one node did it twice
> and one did not do it at all
> 
> > 
> > > > > 5 - Run the actual upgrade process 
> > > > 
> > > 
> > > Here, if I understand well, will continue with "update run --limit
> > > controller" where several failover may happen but if we already have the
> > > schema updated in all nodes, they do not worry us too much.
> > 
> > Ok, nice. So when we actually update the ovn-northdb container it
> > won't complain and allow synchronisation over the board because the
> > schema version will be the same everywhere, correct?
> 
> happy to see that I was understood, right!!!
> 
> > 
> > I emphasis again that ovs rpm is updated but not restarted during that
> > process, ie the previous binary is still running in memory until
> > reboot.
> 
> ok! just let see if @dceara could confirm that it is not an issue.
> 

Should be fine.

> > 
> > As an "out of this bugzilla context" question, we used to need that
> > "non reboot ovs" state because ovs would cut the connections if we
> > restarted it, is that still the case, do you know?
> > 
> 
> No, another point that @dceara could give us some light
> 

I don't know the history behind this either.  But it really depends on
how ovs is restarted.  It might be worth double-checking with the ovs team.

> > Thanks,

(In reply to Sofer Athlan-Guyot from comment #25)
> Hi,
> 
> > No, OVN NB DB is linked to /usr/share/ovn/ovn-nb.ovsschema and OVN
> > SB DB to /usr/share/ovn/ovn-sb.ovsschema, are those files also
> > available on undercloud?
> 
> No, ovn is not installed by default on the undercloud, either rpm or
> container.
> 
> Furthermore the ovn-controller container doesn't have the necessary
> rpm:
> 
>     The rpm is ovn22.03-central-22.03.0-52.el9fdp.x86_64 :
>     
>     [heat-admin@controller-0 ~]$ sudo podman exec -ti
> ovn-dbs-bundle-podman-1 rpm -ql ovn22.03-central-22.03.0-52.el9fdp.x86_64
> |grep ovn-nb.ovsschema
>     /usr/share/ovn/ovn-nb.ovsschema
>     
> It's not in ovn_controller (which is updated at that time):
> 
>     [heat-admin@controller-0 ~]$ sudo podman exec -ti ovn_controller rpm -qa
> |grep  'ovn.*central'
>     => nothing
>     
> List of ovn related rpm in that updated container just to make sure:
> 
>     [heat-admin@controller-0 ~]$ sudo podman exec -ti ovn_controller rpm -qa
> |grep  'ovn'
>     ovn22.03-22.03.0-52.el9fdp.x86_64
>     rhosp-ovn-22.03-5.el9ost.noarch
>     ovn22.03-host-22.03.0-52.el9fdp.x86_64
>     rhosp-ovn-host-22.03-5.el9ost.noarch
> 
> That means that the script will have to get the file somehow. Here a
> "somehow" that could work:
> 
>     mkdir /var/tmp/ovn-db-schema
>     cd /var/tmp/ovn-db-schema
>     dnf download dnf download ovn22.03-central-22.03.0-52.el9fdp.x86_64
>     rpm2cpio dnf download ovn22.03-central-22.03.0-52.el9fdp.x86_64|cpio
> -vimd
>     
> The main problem seems to have the correct name... it seems the
> version is embedded in the name.
> 
> Furthermore, in the testing env I've got:
> 
>     dnf whatprovides /usr/share/ovn/ovn-nb.ovsschema
> 
> returns different naming scheme:
> 
>     ovn-2021-central-21.12.0-30.el9fdp.x86_64 : Open Virtual Network support
>     Repo        : rhosp-rhel-9.0-fdp-nightly
>     Matched from:
>     Filename    : /usr/share/ovn/ovn-nb.ovsschema
>     
>     ovn22.03-central-22.03.0-22.el9fdp.x86_64 : Open Virtual Network support
>     Repo        : rhosp-rhel-9.0-fdp-cdn
>     Matched from:
>     Filename    : /usr/share/ovn/ovn-nb.ovsschema
>     
>     ovn22.03-central-22.03.0-52.el9fdp.x86_64 : Open Virtual Network support
>     Repo        : rhosp-rhel-9.0-fdp-nightly
>     Matched from:
>     Filename    : /usr/share/ovn/ovn-nb.ovsschema
>     
> Not sure how it looks on a customer env.

The naming is not ideal in OVN, we're working on that.

The ovn-2021 package has 4 streams (one for every quarterly upstream release):
ovn-2021-central-21.03.0-...
ovn-2021-central-21.06.0-...
ovn-2021-central-21.09.0-...
ovn-2021-central-21.12.0-...

Starting with 2022 we moved to a package-per-upstream-release approach:

ovn22.03 rpms are based on upstream v22.03.0 release
ovn22.06 rpms are based on upstream v22.06.0 release
and so on..

> 
> So getting the right rpm might be an issue.  If that's too complicated
> then we will have to pull the image of the
> rhosp17-openstack-ovn-northd container (container repo are updated at
> that time) and get it out of there. But that's a complicated solution
> and maybe risky one (just a feeling at that point). I'd rather have
> the right rpm name somehow and get it from there.

I don't know the details about how OSP decides which rpms to use, but
there's definitely a mapping somewhere.  Maybe Daniel or Jakub can help
out with answering this?

Also, cc-ing Numan, just to have another set of eyes on the discussion
about live NB/SB schema upgrade before upgrading the OVN central components.

Thanks,
Dumitru

Comment 27 Jakub Libosvar 2022-06-22 18:05:44 UTC
(In reply to Dumitru Ceara from comment #26)
> 
> I don't know the details about how OSP decides which rpms to use but
> there's definitely a mapping somewhere.  Maybe Daniel or Jakub can help
> out with answering this?
> 

There is a rhosp-ovn wrapper package that has a dependency on the major OVN version (e.g. rhosp-ovn-2021 requires ovn-2021). It doesn't pin minor versions: with the current state of the spec file you can't request ovn-2021-21.03, for example, only ovn-2021. It always picks the latest available.
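A hedged way to inspect that dependency on a node where the wrapper is installed:

    rpm -q --requires rhosp-ovn | grep -i ovn
    # or, without it installed locally:
    dnf repoquery --requires rhosp-ovn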

Comment 28 ldenny 2022-06-24 02:59:02 UTC
Hi Team, 

I would like to add another customer case to this issue: 03245762

We are seeing the same errors as in comment #2, but outside any upgrade operation, with the same initial symptom of kuryr-controller restarting repeatedly.

