Bug 1974364 - [must-gather] ovs/ovn database should be exported or dumped, not compacted and copied
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.9
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: low
Target Milestone: ---
Target Release: 4.9.0
Assignee: Nadia Pinaeva
QA Contact: Ross Brattain
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-06-21 13:24 UTC by Adrián Moreno
Modified: 2021-10-18 17:36 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-10-18 17:35:54 UTC
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Github openshift must-gather pull 245 0 None open Bug 1974364: Change the way of gathering ovn db 2021-06-30 12:37:25 UTC
Red Hat Product Errata RHSA-2021:3759 0 None None None 2021-10-18 17:36:15 UTC

Description Adrián Moreno 2021-06-21 13:24:06 UTC
Description of problem:

Currently must-gather performs the following tasks to pick up the OVN or vswitchd database:

- store the size of the .db file
- compact the .db
- store the size of the .db file
- copy the .db file

This has some problems:

- copying the .db file might pick up partial data
- even after compacting, there is no guarantee that the file contains no further transactions, since they can be appended after the compaction and before the copy. This makes the file difficult to read for parsers other than ovsdb-server, e.g. Insights.
- compacting the db while ovsdb-server is running can cause db corruption
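
To illustrate the second point, here is a minimal sketch for checking whether a copied file picked up extra transactions. It assumes the standalone OVSDB file format, in which every record starts with a header line of the form "OVSDB JSON <length> <sha1>"; clustered databases use a different header and are not covered. The function name is hypothetical. A freshly compacted standalone file would be expected to hold two records (the schema plus one snapshot transaction), so a higher count suggests that transactions were appended before the copy.

```shell
#!/usr/bin/env bash
# Sketch only: assumes a standalone-format OVSDB file (not a clustered one).

# Print the number of log records found in the given .db file.
count_ovsdb_records() {
  local db_file="$1"
  # Each record is introduced by a header line starting with "OVSDB JSON ".
  grep -c '^OVSDB JSON ' "$db_file"
}

# Example (hypothetical file name):
# count_ovsdb_records copied.db
```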


I think we would achieve the same result by using "ovsdb-client backup" or even "ovsdb-client dump".

I am raising this BZ to find out, before I send a PR to use "backup", whether there is a strong reason I might be missing for why "compact" + "cp" was added in the first place.
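
The "ovsdb-client backup" approach suggested above could look roughly like the sketch below. It is not the must-gather implementation: the helper names, the database socket path and the output file name are assumptions for illustration. The idea is that the running ovsdb-server produces the snapshot itself, so the file is never copied while it is being written.

```shell
#!/usr/bin/env bash
# Sketch only: the socket path, output file name and helper names are
# assumptions for illustration, not taken from the must-gather scripts.

# Build the command that asks the running ovsdb-server for a snapshot.
backup_cmd() {
  local socket="$1" database="$2"
  printf 'ovsdb-client backup unix:%s %s' "$socket" "$database"
}

# Run the backup inside a pod and store the snapshot in a local file.
gather_db() {
  local pod="$1" container="$2" socket="$3" database="$4" out="$5"
  oc exec -n openshift-ovn-kubernetes "$pod" -c "$container" -- \
    bash -c "$(backup_cmd "$socket" "$database")" > "$out"
}

# Example (assumed socket path and output name):
# gather_db ovnkube-master-c78z2 nbdb /var/run/ovn/ovnnb_db.sock \
#   OVN_Northbound ovnkube-master-c78z2_nbdb
```

"ovsdb-client dump" could be substituted in backup_cmd if human-readable tables are preferred over a restorable database file.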

Comment 1 Adrián Moreno 2021-06-23 09:16:37 UTC
OK, I thought it was using ovsdb-tool to compact, but it's using ovs-appctl/ovn-appctl, so at least we don't have the risk of corruption.

Comment 2 Maciej Szulik 2021-06-23 13:25:51 UTC
Sending to ovn team who owns this bit.

Comment 3 Nadia Pinaeva 2021-06-28 17:34:33 UTC
There is no special reason to use "compact" + "cp", and we agree that "ovsdb-client backup" will solve some of the issues with copying.
The only downside is that it will leave out ephemeral columns (https://github.com/ovn-org/ovn/blob/master/ovn-nb.ovsschema#L467); you may want to leave some comments on that.

Comment 4 Nadia Pinaeva 2021-06-29 16:39:21 UTC
I created a PR for this bug, https://github.com/openshift/must-gather/pull/245, please take a look.

Comment 5 Adrián Moreno 2021-06-30 13:25:04 UTC
Thanks Nadia, I will follow the discussion on your PR.

Comment 7 Ross Brattain 2021-07-19 15:37:02 UTC
Verified on 4.9.0-0.ci-2021-07-16-112407

[must-gather-fhkq5] POD 2021-07-16T21:24:17.494562344Z + for OVNKUBE_MASTER_POD in ${OVNKUBE_MASTER_PODS[@]}
[must-gather-fhkq5] POD 2021-07-16T21:24:17.494673097Z + oc cp openshift-ovn-kubernetes/ovnkube-master-c78z2:/etc/ovn/ovnnb_db.db -c nbdb must-gather/network_logs/ovnkube-master-c78z2_nbdb
[must-gather-fhkq5] POD 2021-07-16T21:24:17.495197064Z + oc -n openshift-ovn-kubernetes exec -c ovnkube-master ovnkube-master-c78z2 -- bash -c 'ovn-nbctl --db=ssl:10.0.223.231:9641,ssl:10.0.133.148:9641,ssl:10.0.163.124:9641   -p /ovn-cert/tls.key -c /ovn-cert/tls.crt -C /ovn-ca/ca-bundle.crt list Logical_Switch_Port'
[must-gather-fhkq5] POD 2021-07-16T21:24:17.496050302Z + oc -n openshift-ovn-kubernetes exec -c ovnkube-master ovnkube-master-c78z2 -- bash -c 'ovn-nbctl --db=ssl:10.0.223.231:9641,ssl:10.0.133.148:9641,ssl:10.0.163.124:9641   -p /ovn-cert/tls.key -c /ovn-cert/tls.crt -C /ovn-ca/ca-bundle.crt list Load_Balancer'
[must-gather-fhkq5] POD 2021-07-16T21:24:17.496763670Z + oc -n openshift-ovn-kubernetes exec -c ovnkube-master ovnkube-master-c78z2 -- bash -c 'ovn-sbctl --db=ssl:10.0.223.231:9642,ssl:10.0.133.148:9642,ssl:10.0.163.124:9642   -p /ovn-cert/tls.key -c /ovn-cert/tls.crt -C /ovn-ca/ca-bundle.crt show'
[must-gather-fhkq5] POD 2021-07-16T21:24:18.053535936Z tar: Removing leading `/' from member names
[must-gather-fhkq5] POD 2021-07-16T21:24:18.071258828Z + oc cp openshift-ovn-kubernetes/ovnkube-master-c78z2:/etc/ovn/ovnsb_db.db -c sbdb must-gather/network_logs/ovnkube-master-c78z2_sbdb
[must-gather-fhkq5] POD 2021-07-16T21:24:18.382628779Z tar: Removing leading `/' from member names
[must-gather-fhkq5] POD 2021-07-16T21:24:18.406525928Z + PIDS+=($!)
[must-gather-fhkq5] POD 2021-07-16T21:24:18.406583250Z + gzip must-gather/network_logs/ovnkube-master-c78z2_nbdb
[must-gather-fhkq5] POD 2021-07-16T21:24:18.409488930Z + oc exec -n openshift-ovn-kubernetes ovnkube-master-c78z2 -c sbdb -- bash -c 'ps -eo nlwp'
[must-gather-fhkq5] POD 2021-07-16T21:24:18.409981461Z + PIDS+=($!)
[must-gather-fhkq5] POD 2021-07-16T21:24:18.410068395Z + for OVNKUBE_MASTER_POD in ${OVNKUBE_MASTER_PODS[@]}
[must-gather-fhkq5] POD 2021-07-16T21:24:18.410143784Z + oc cp openshift-ovn-kubernetes/ovnkube-master-f5s48:/etc/ovn/ovnnb_db.db -c nbdb must-gather/network_logs/ovnkube-master-f5s48_nbdb
[must-gather-fhkq5] POD 2021-07-16T21:24:18.411006709Z + oc exec -n openshift-ovn-kubernetes ovnkube-master-c78z2 -c sbdb -- bash -c 'cat /proc/sys/kernel/threads-max'
[must-gather-fhkq5] POD 2021-07-16T21:24:18.411006709Z + oc exec -n openshift-ovn-kubernetes ovnkube-master-c78z2 -c nbdb -- bash -c 'ovn-appctl -t /var/run/ovn/ovnnb_db.ctl cluster/status     OVN_Northbound'
[must-gather-fhkq5] POD 2021-07-16T21:24:18.428092560Z + gzip must-gather/network_logs/ovnkube-master-c78z2_sbdb
[must-gather-fhkq5] POD 2021-07-16T21:24:18.428452496Z + oc exec -n openshift-ovn-kubernetes ovnkube-master-c78z2 -c sbdb -- bash -c 'ovn-appctl -t /var/run/ovn/ovnsb_db.ctl cluster/status     OVN_Southbound'
[must-gather-fhkq5] POD 2021-07-16T21:24:18.688684104Z tar: Removing leading `/' from member names
[must-gather-fhkq5] POD 2021-07-16T21:24:18.701251236Z + oc cp openshift-ovn-kubernetes/ovnkube-master-f5s48:/etc/ovn/ovnsb_db.db -c sbdb must-gather/network_logs/ovnkube-master-f5s48_sbdb
[must-gather-fhkq5] POD 2021-07-16T21:24:19.215677683Z tar: Removing leading `/' from member names
[must-gather-fhkq5] POD 2021-07-16T21:24:19.233398062Z + PIDS+=($!)
[must-gather-fhkq5] POD 2021-07-16T21:24:19.233484741Z + gzip must-gather/network_logs/ovnkube-master-f5s48_nbdb
[must-gather-fhkq5] POD 2021-07-16T21:24:19.235252831Z + PIDS+=($!)
[must-gather-fhkq5] POD 2021-07-16T21:24:19.235252831Z + oc exec -n openshift-ovn-kubernetes ovnkube-master-f5s48 -c nbdb -- bash -c 'ovn-appctl -t /var/run/ovn/ovnnb_db.ctl cluster/status     OVN_Northbound'
[must-gather-fhkq5] POD 2021-07-16T21:24:19.235252831Z + PIDS+=($!)
[must-gather-fhkq5] POD 2021-07-16T21:24:19.235252831Z + oc exec -n openshift-ovn-kubernetes ovnkube-master-f5s48 -c sbdb -- bash -c 'ovn-appctl -t /var/run/ovn/ovnsb_db.ctl cluster/status     OVN_Southbound'
[must-gather-fhkq5] POD 2021-07-16T21:24:19.236019961Z + PIDS+=($!)
[must-gather-fhkq5] POD 2021-07-16T21:24:19.236019961Z + for OVNKUBE_MASTER_POD in ${OVNKUBE_MASTER_PODS[@]}
[must-gather-fhkq5] POD 2021-07-16T21:24:19.236019961Z + oc cp openshift-ovn-kubernetes/ovnkube-master-xcwts:/etc/ovn/ovnnb_db.db -c nbdb must-gather/network_logs/ovnkube-master-xcwts_nbdb
[must-gather-fhkq5] POD 2021-07-16T21:24:19.236755881Z + gzip must-gather/network_logs/ovnkube-master-f5s48_sbdb
[must-gather-fhkq5] POD 2021-07-16T21:24:19.237195971Z + oc exec -n openshift-ovn-kubernetes ovnkube-master-f5s48 -c sbdb -- bash -c 'cat /proc/sys/kernel/threads-max'
[must-gather-fhkq5] POD 2021-07-16T21:24:19.241021634Z + oc exec -n openshift-ovn-kubernetes ovnkube-master-f5s48 -c sbdb -- bash -c 'ps -eo nlwp'
[must-gather-fhkq5] POD 2021-07-16T21:24:19.614633161Z tar: Removing leading `/' from member names
[must-gather-fhkq5] POD 2021-07-16T21:24:19.631991798Z + oc cp openshift-ovn-kubernetes/ovnkube-master-xcwts:/etc/ovn/ovnsb_db.db -c sbdb must-gather/network_logs/ovnkube-master-xcwts_sbdb
[must-gather-fhkq5] POD 2021-07-16T21:24:19.823091605Z tar: Removing leading `/' from member names
[must-gather-fhkq5] POD 2021-07-16T21:24:19.846578224Z + PIDS+=($!)

Comment 10 errata-xmlrpc 2021-10-18 17:35:54 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.9.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:3759

