Bug 1963948
Summary: | [RHEL7] [RFE] Transfer RAFT leadership during snapshot writing | |
---|---|---|---
Product: | Red Hat Enterprise Linux Fast Datapath | Reporter: | Ilya Maximets <i.maximets>
Component: | ovsdb2.13 | Assignee: | Ilya Maximets <i.maximets>
Status: | CLOSED ERRATA | QA Contact: | Zhiqiang Fang <zfang>
Severity: | high | Docs Contact: |
Priority: | high | |
Version: | RHEL 8.0 | CC: | ctrautma, jhsiao, jishi, kfida, ovs-qe, ovs-team, ralongi, tredaelli, trozet
Target Milestone: | --- | |
Target Release: | --- | |
Hardware: | Unspecified | |
OS: | Unspecified | |
Whiteboard: | | |
Fixed In Version: | openvswitch2.13-2.13.0-94.el7fdp | Doc Type: | If docs needed, set a value
Doc Text: | | Story Points: | ---
Clone Of: | 1960391 | Environment: |
Last Closed: | 2021-06-21 14:44:07 UTC | Type: | ---
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Bug Depends On: | 1960391 | |
Bug Blocks: | 1943631 | |
Description
Ilya Maximets
2021-05-24 13:12:13 UTC
Verified this RFE on openvswitch2.13-2.13.0-95.el7fdp.x86_64, using the same method as in BZ#1964573.

Test bed: a 3-host OVN RAFT cluster and one OVN chassis (a host with ovn-controller installed).

Method to trigger a snapshot:
The rule is that the database must have grown by more than 50% and must be larger than 10 MB. After 10-20 minutes, ovsdb-server checks these conditions and decides whether to compact the database, i.e. create a snapshot. In this test, the database was grown by adding 3000 logical switch ports (lsp) in a short period of time.

RPMs used:

[root@wsfd-advnetlab35 ~]# rpm -aq | egrep "ovn|openv"
openvswitch-selinux-extra-policy-1.0-18.el7fdp.noarch
ovn2.13-central-20.12.0-135.el7fdp.x86_64
ovn2.13-host-20.12.0-135.el7fdp.x86_64
openvswitch2.13-2.13.0-95.el7fdp.x86_64
ovn2.13-20.12.0-135.el7fdp.x86_64
[root@wsfd-advnetlab35 ~]#

OVN_Southbound db leadership transfer:

# cat /var/log/ovn/ovsdb-server-sb.log
...
2021-06-16T19:59:30.812Z|00041|raft|INFO|Transferring leadership to write a snapshot.
2021-06-16T19:59:30.812Z|00042|raft|INFO|rejected append_reply (not leader)
...
2021-06-16T19:59:31.860Z|00060|raft|INFO|rejected append_reply (not leader)
2021-06-16T19:59:31.860Z|00061|raft|INFO|server 037c is leader for term 2
2021-06-16T20:02:36.933Z|00062|raft|INFO|received leadership transfer from 037c in term 2
2021-06-16T20:02:36.933Z|00063|raft|INFO|term 3: starting election
2021-06-16T20:02:36.934Z|00064|raft|INFO|term 3: elected leader by 2+ of 3 servers
...
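The snapshot trigger rule quoted in the test method above can be written down as a small predicate. This is only a sketch of the rule as stated in this comment (growth of more than 50% and a size above 10 MB), not the ovsdb-server implementation; the function and constant names are hypothetical, and the 10-20 minute check interval is deliberately left out.

```python
# Hypothetical sketch of the snapshot rule as described in this comment.
# It is NOT taken from the ovsdb-server sources; names are illustrative.

MIN_SNAPSHOT_SIZE = 10 * 1024 * 1024  # "at least more than 10MB"


def should_snapshot(current_size: int, size_after_last_snapshot: int) -> bool:
    """Return True when both size conditions from the comment hold.

    current_size             -- database size in bytes right now
    size_after_last_snapshot -- database size in bytes after the previous
                                snapshot (assumed baseline for "growth")
    """
    # "grow more than 50%": current > 1.5 * baseline, kept in integers.
    grew_enough = current_size * 2 > size_after_last_snapshot * 3
    big_enough = current_size > MIN_SNAPSHOT_SIZE
    return grew_enough and big_enough


if __name__ == "__main__":
    mb = 1024 * 1024
    print(should_snapshot(16 * mb, 10 * mb))
    print(should_snapshot(12 * mb, 10 * mb))
```

In the test described here, adding 3000 logical switch ports is what pushes the database past both thresholds so that the periodic check decides to compact.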
# ovs-appctl -t /var/run/ovn/ovnsb_db.ctl cluster/status OVN_Southbound
##############
Wed Jun 16 15:59:30 EDT 2021
079b
Name: OVN_Southbound
Cluster ID: d908 (d9082509-96e1-4d97-b777-7fa7cc472cd1)
Server ID: 079b (079b530e-1030-4204-90e0-3413beac73df)
Address: tcp:wsfd-advnetlab35.xyz:6644
Status: cluster member
Role: leader
Term: 1
Leader: self
Vote: self
Election timer: 1000
Log: [2, 3002]
Entries not yet committed: 0
Entries not yet applied: 0
Connections: <-037c ->037c <-1774 ->1774
Servers:
    037c (037c at tcp:netqe5.xyz:6644) next_index=3002 match_index=3001
    079b (079b at tcp:wsfd-advnetlab35.xyz:6644) (self) next_index=2 match_index=3001
    1774 (1774 at tcp:netqe6.xyz:6644) next_index=3002 match_index=3001
##############
Wed Jun 16 15:59:30 EDT 2021
079b
Name: OVN_Southbound
Cluster ID: d908 (d9082509-96e1-4d97-b777-7fa7cc472cd1)
Server ID: 079b (079b530e-1030-4204-90e0-3413beac73df)
Address: tcp:wsfd-advnetlab35.xyz:6644
Status: cluster member
Role: follower
Term: 1
Leader: unknown
Vote: self
Election timer: 1000
Log: [3002, 3002]
Entries not yet committed: 0
Entries not yet applied: 0
Connections: <-037c ->037c <-1774 ->1774
Servers:
    037c (037c at tcp:netqe5.xyz:6644)
    079b (079b at tcp:wsfd-advnetlab35.xyz:6644) (self)
    1774 (1774 at tcp:netqe6.xyz:6644)
##############
Wed Jun 16 15:59:32 EDT 2021
079b
Name: OVN_Southbound
Cluster ID: d908 (d9082509-96e1-4d97-b777-7fa7cc472cd1)
Server ID: 079b (079b530e-1030-4204-90e0-3413beac73df)
Address: tcp:wsfd-advnetlab35.xyz:6644
Status: cluster member
Role: follower
Term: 2
Leader: 037c
Vote: 037c
Election timer: 1000
Log: [3002, 3003]
Entries not yet committed: 0
Entries not yet applied: 0
Connections: <-037c ->037c <-1774 ->1774
Servers:
    037c (037c at tcp:netqe5.xyz:6644)
    079b (079b at tcp:wsfd-advnetlab35.xyz:6644) (self)
    1774 (1774 at tcp:netqe6.xyz:6644)
##############

OVN_Northbound db leadership transfer:

# cat /var/log/ovn/ovsdb-server-nb.log
...
2021-06-16T20:04:19.189Z|00224|raft|INFO|Transferring leadership to write a snapshot.
2021-06-16T20:04:19.190Z|00225|raft|INFO|rejected append_reply (not leader)
2021-06-16T20:04:19.703Z|00226|raft|INFO|rejected append_reply (not leader)
2021-06-16T20:04:19.705Z|00227|raft|INFO|server 2e70 is leader for term 2
...

# ovs-appctl -t /var/run/ovn/ovnnb_db.ctl cluster/status OVN_Northbound
##############
Wed Jun 16 16:04:18 EDT 2021
3793
Name: OVN_Northbound
Cluster ID: d0d5 (d0d58acf-16c4-4a5b-b646-3cd9c2961a16)
Server ID: 3793 (3793f874-4202-4ddb-871d-0544671483df)
Address: tcp:wsfd-advnetlab35.xyz:6643
Status: cluster member
Role: leader
Term: 1
Leader: self
Vote: self
Election timer: 1000
Log: [2, 5999]
Entries not yet committed: 0
Entries not yet applied: 0
Connections: <-7f27 ->7f27 <-2e70 ->2e70
Servers:
    2e70 (2e70 at tcp:netqe6.xyz:6643) next_index=5999 match_index=5998
    3793 (3793 at tcp:wsfd-advnetlab35.xyz:6643) (self) next_index=2 match_index=5998
    7f27 (7f27 at tcp:netqe5.xyz:6643) next_index=5999 match_index=5998
##############
Wed Jun 16 16:04:19 EDT 2021
3793
Name: OVN_Northbound
Cluster ID: d0d5 (d0d58acf-16c4-4a5b-b646-3cd9c2961a16)
Server ID: 3793 (3793f874-4202-4ddb-871d-0544671483df)
Address: tcp:wsfd-advnetlab35.xyz:6643
Status: cluster member
Role: follower
Term: 1
Leader: unknown
Vote: self
Election timer: 1000
Log: [5999, 5999]
Entries not yet committed: 0
Entries not yet applied: 0
Connections: <-7f27 ->7f27 <-2e70 ->2e70
Servers:
    2e70 (2e70 at tcp:netqe6.xyz:6643)
    3793 (3793 at tcp:wsfd-advnetlab35.xyz:6643) (self)
    7f27 (7f27 at tcp:netqe5.xyz:6643)
##############
Wed Jun 16 16:04:20 EDT 2021
3793
Name: OVN_Northbound
Cluster ID: d0d5 (d0d58acf-16c4-4a5b-b646-3cd9c2961a16)
Server ID: 3793 (3793f874-4202-4ddb-871d-0544671483df)
Address: tcp:wsfd-advnetlab35.xyz:6643
Status: cluster member
Role: follower
Term: 2
Leader: 2e70
Vote: 2e70
Election timer: 1000
Log: [5999, 6000]
Entries not yet committed: 0
Entries not yet applied: 0
Connections: <-7f27 ->7f27 <-2e70 ->2e70
Servers:
    2e70 (2e70 at tcp:netqe6.xyz:6643)
    3793 (3793 at tcp:wsfd-advnetlab35.xyz:6643) (self)
    7f27 (7f27 at tcp:netqe5.xyz:6643)
##############

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (openvswitch2.13 bug fix and enhancement update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:2506
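The cluster/status snapshots in the verification comment were compared by eye. A rough way to automate the same check is sketched below: it parses the "Key: value" lines of two snapshots and reports whether the server handed over leadership (leader before, follower with a known leader and a higher term after). The helper names are hypothetical, the parser assumes the one-field-per-line layout that cluster/status prints, and the sample text is abbreviated from the OVN_Southbound output above.

```python
# Hypothetical helpers for comparing two "cluster/status" snapshots.
# Assumes one "Key: value" field per line, as printed by ovs-appctl.

def parse_cluster_status(text: str) -> dict:
    """Collect top-level 'Key: value' fields from one status snapshot."""
    fields = {}
    for line in text.splitlines():
        if line.startswith(" "):
            continue  # indented lines belong to the "Servers:" list
        key, sep, value = line.partition(": ")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def handed_over_leadership(before: dict, after: dict) -> bool:
    """True if the server went from leader to follower of a known leader."""
    return (
        before.get("Role") == "leader"
        and after.get("Role") == "follower"
        and int(after["Term"]) > int(before["Term"])
        and after.get("Leader") not in ("self", "unknown")
    )


# Abbreviated from the OVN_Southbound snapshots in this report.
BEFORE = """Name: OVN_Southbound
Role: leader
Term: 1
Leader: self
"""

AFTER = """Name: OVN_Southbound
Role: follower
Term: 2
Leader: 037c
"""

if __name__ == "__main__":
    print(handed_over_leadership(parse_cluster_status(BEFORE),
                                 parse_cluster_status(AFTER)))
```

The intermediate snapshot (Role: follower, Term: 1, Leader: unknown) is intentionally not counted as a completed handover, since no new leader had been elected at that point.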