Bug 2070363 - Failed to read database with dns hostname address
Summary: Failed to read database with dns hostname address
Keywords:
Status: MODIFIED
Alias: None
Product: Red Hat Enterprise Linux Fast Datapath
Classification: Red Hat
Component: openvswitch2.13
Version: FDP 22.A
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: unspecified
Target Milestone: ---
Target Release: ---
Assignee: Ilya Maximets
QA Contact: Zhiqiang Fang
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2022-03-30 22:45 UTC by OvS team
Modified: 2023-07-13 07:25 UTC (History)

Fixed In Version: openvswitch2.13-2.13.0-137.el7fdp
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Target Upstream Version:
Embargoed:


Links
System                  ID       Private  Priority  Status  Summary  Last Updated
Red Hat Issue Tracker   FD-1869  0        None      None    None     2022-03-30 22:48:41 UTC

Description OvS team 2022-03-30 22:45:28 UTC
+++ This bug was initially created as a clone of Bug #2055097 +++

Description of problem:

Version-Release number of selected component (if applicable):
ovn21.12-central-21.12.0-25.el8fdp.x86_64
ovn21.12-vtep-21.12.0-25.el8fdp.x86_64
ovn21.12-21.12.0-25.el8fdp.x86_64
ovn21.12-host-21.12.0-25.el8fdp.x86_64

Context: HyperShift OVN, with the OVN nbdb and sbdb running as a StatefulSet.

Assume the ovndb StatefulSet pods ovnkube-master-guest-0/1/2 have formed a quorum and guest-1 is the nb leader. Delete both the guest-0 and guest-1 pods; guest-2 becomes the leader.

Because a StatefulSet is used, guest-0 is re-created first: guest-1 must wait until guest-0 is ready, and a guest pod's DNS hostname is resolvable only while the pod is running. Guest-0 finds the new leader, guest-2, and then starts nb with the following command (local=guest-0, remote=guest-2):


###
+ echo 'Cluster already exists for DB: nb'
+ initial_raft_create=false
+ wait 71
+ exec /usr/share/ovn/scripts/ovn-ctl --db-nb-cluster-local-port=9643 --db-nb-cluster-local-addr=ovnkube-master-guest-0.ovnkube-master-guest.hypershift-ovn.svc.cluster.local --no-monitor --db-nb-cluster-local-proto=ssl --ovn-nb-db-ssl-key=/ovn-cert/tls.key --ovn-nb-db-ssl-cert=/ovn-cert/tls.crt --ovn-nb-db-ssl-ca-cert=/ovn-ca/ca-bundle.crt --db-nb-cluster-remote-port=9643 --db-nb-cluster-remote-addr=ovnkube-master-guest-2.ovnkube-master-guest.hypershift-ovn.svc.cluster.local --db-nb-cluster-remote-proto=ssl '--ovn-nb-log=-vconsole:dbg -vfile:off -vPATTERN:console:%D{%Y-%m-%dT%H:%M:%S.###Z}|%05N|%c%T|%p|%m' --db-nb-election-timer=10000 run_nb_ovsdb
2022-02-16T03:05:25.330Z|00001|vlog|INFO|opened log file /var/log/ovn/ovsdb-server-nb.log
ovsdb-server: ovsdb error: error reading record 12 from OVN_Northbound log: ssl:ovnkube-master-guest-1.ovnkube-master-guest.hypershift-ovn.svc.cluster.local:9643: syntax error in address
[1]+  Exit 1                  exec /usr/share/ovn/scripts/ovn-ctl ${OVN_ARGS} --db-nb-cluster-remote-port=9643 --db-nb-cluster-remote-addr=${init_ip} --db-nb-cluster-remote-proto=ssl --ovn-nb-log="-vconsole:${OVN_LOG_LEVEL} -vfile:off -vPATTERN:console:${OVN_LOG_PATTERN_CONSOLE}" ${election_timer} run_nb_ovsdb
###

Guest-0 fails because the guest-1 hostname is not resolvable; ovsdb-server reports this as "syntax error in address" while reading record 12 of the OVN_Northbound log.
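
The error above suggests that an address stored in the database log is rejected when its host name cannot be resolved at read time. The following is a minimal Python sketch of the alternative, assuming (this report does not confirm it) that the desired behavior is a purely syntactic check when the log is read, with name resolution deferred to connection time. The names parse_address and resolve_or_none are hypothetical, and this is not the OVS implementation.

```python
import re
import socket

# Hypothetical helpers for illustration only; not the OVS implementation.
_LABEL = r"[A-Za-z0-9]([A-Za-z0-9-]{0,61}[A-Za-z0-9])?"
_HOSTNAME_RE = re.compile(r"^(?=.{1,253}$)" + _LABEL + r"(\." + _LABEL + r")*$")


def parse_address(addr):
    """Split 'proto:host:port' and validate syntax only (no DNS lookup)."""
    proto, sep, rest = addr.partition(":")
    if not sep or proto not in ("tcp", "ssl"):
        raise ValueError("unsupported or missing protocol: %r" % addr)

    host, sep, port_str = rest.rpartition(":")
    if not sep or not (port_str.isascii() and port_str.isdigit()):
        raise ValueError("missing or non-numeric port: %r" % addr)
    port = int(port_str)
    if not 1 <= port <= 65535:
        raise ValueError("port out of range: %r" % addr)

    if host.startswith("[") and host.endswith("]"):
        # Bracketed IPv6 literal; inet_pton parses it without any lookup.
        try:
            socket.inet_pton(socket.AF_INET6, host[1:-1])
        except OSError:
            raise ValueError("bad IPv6 literal: %r" % addr)
    elif not _HOSTNAME_RE.match(host):
        raise ValueError("bad host name: %r" % addr)
    return proto, host, port


def resolve_or_none(host, port):
    """Resolve at connection time; None means 'not resolvable yet, retry'."""
    try:
        return socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)[0][4][0]
    except socket.gaierror:
        return None
```

With this split, an address such as the guest-1 one in the log would pass parse_address even while the pod is down, and a caller could retry resolve_or_none until the pod is running.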


Expected results:

Guest-0 starts successfully and keeps retrying the connection to guest-1 until guest-1 is running.

Comment 1 OvS team 2022-03-30 22:45:31 UTC
* Wed Mar 30 2022 Open vSwitch CI <ovs-ci> - 2.13.0-137
- Merging upstream branch-2.13 [RH git: b8990f68eb]
    Commit list:
    3ceb5dbe92 ovsdb: raft: Fix inability to read the database with DNS host names. (#2055097)

