Bug 2118848
| Summary: | Backport: [ovs-dev] netdev-linux: skip some internal kernel stats gathering | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux Fast Datapath | Reporter: | Jonathan Maxwell <jmaxwell> |
| Component: | openvswitch2.16 | Assignee: | Aaron Conole <aconole> |
| Status: | CLOSED ERRATA | QA Contact: | Hekai Wang <hewang> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | RHEL 8.0 | CC: | aconole, ctrautma, eglottma, fbaudin, fleitner, hnhan, jhsiao, ovs-qe, ralongi, tredaelli, xzhou |
| Target Milestone: | --- | Flags: | hewang: needinfo- |
| Target Release: | --- | ||
| Hardware: | All | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | openvswitch2.16-2.16.0-103.el8fdp | Doc Type: | If docs needed, set a value |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2022-11-03 00:30:52 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
|
Comment 10
hnhan
2022-09-29 12:38:41 UTC
Patch backported: https://gitlab.cee.redhat.com/nst/openvswitch/openvswitch2.16/-/commit/ce553c99e2f9b8b3784ac66a759c363528c67c8b
* Tue Oct 11 2022 Aaron Conole <aconole> - 2.16.0-103
- netdev-linux: Skip some internal kernel stats gathering. [RH git: ce553c99e2] (#2118848)
For netdev_linux_update_via_netlink(), hint to the kernel that
we do not need it to gather netlink internal stats when we want
to update the netlink flags, as those stats are not rendered
within OVS.
Background:
ovs-vswitchd can spend quite a bit of time blocked by the kernel
during netlink calls, especially on systems with many cores. This
time is dominated by the kernel-side internal stats gathering
mechanism in netlink, specifically:
inet6_fill_link_af
inet6_fill_ifla6_attrs
__snmp6_fill_stats64
In Linux 4.4+, there exists a hint for netlink requests to not
trigger the ipv6 stats gathering mechanism, which greatly reduces
the amount of time that ovs-vswitchd is on CPU.
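As a rough illustration of the hint described above: on Linux 4.4 and later, an RTM_GETLINK request can carry an IFLA_EXT_MASK attribute with the RTEXT_FILTER_SKIP_STATS bit set, which asks the kernel to leave the per-address-family statistics out of its reply. The commit message does not name the attribute, so treat this as an assumption about the mechanism. The helper build_getlink_skip_stats() below is hypothetical and is not OVS code; OVS builds its requests with its own netlink helpers.

```c
#include <linux/if_link.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/* Fallback for pre-4.4 headers; the value matches the Linux UAPI define. */
#ifndef RTEXT_FILTER_SKIP_STATS
#define RTEXT_FILTER_SKIP_STATS (1 << 3)
#endif

/* Hypothetical helper, not part of OVS: writes an RTM_GETLINK request for
 * 'ifindex' into 'buf' and appends IFLA_EXT_MASK = RTEXT_FILTER_SKIP_STATS.
 * 'buf' must be at least 4-byte aligned.  Returns the message length in
 * bytes, or 0 if 'size' is too small. */
static size_t
build_getlink_skip_stats(void *buf, size_t size, int ifindex)
{
    struct nlmsghdr *nlh = buf;
    struct ifinfomsg *ifi;
    struct rtattr *rta;
    uint32_t mask = RTEXT_FILTER_SKIP_STATS;
    size_t attr_ofs = NLMSG_ALIGN(NLMSG_LENGTH(sizeof *ifi));
    size_t total = attr_ofs + RTA_LENGTH(sizeof mask);

    if (size < total) {
        return 0;
    }
    memset(buf, 0, total);

    /* Netlink header: a plain (non-dump) link query. */
    nlh->nlmsg_len = (uint32_t) total;
    nlh->nlmsg_type = RTM_GETLINK;
    nlh->nlmsg_flags = NLM_F_REQUEST;

    /* Link selector: identify the device by interface index. */
    ifi = NLMSG_DATA(nlh);
    ifi->ifi_family = AF_UNSPEC;
    ifi->ifi_index = ifindex;

    /* The hint itself: ask the kernel to skip stats in the reply. */
    rta = (struct rtattr *) ((char *) buf + attr_ofs);
    rta->rta_type = IFLA_EXT_MASK;
    rta->rta_len = RTA_LENGTH(sizeof mask);
    memcpy(RTA_DATA(rta), &mask, sizeof mask);

    return total;
}
```

Sending the resulting buffer over an AF_NETLINK socket opened with NETLINK_ROUTE should return the link flags without the kernel walking the IPv6 SNMP counters. Kernels older than 4.4 are expected to ignore the unknown mask bit and reply with the full statistics.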
Testing and Results:
Tested by booting 320 VMs and measuring OVS utilization with perf
record, then visualizing the samples as a flamegraph, using a patched
version of OVS 2.14.2. Calls under bridge_run() appear to be hit the
worst by this issue.
Before bridge_run() == 11.3% of samples
After bridge_run() == 3.4% of samples
Note that there are at least two observed netlink calls under
bridge_run that are still kernel stats heavy after this patch:
Call 1:
bridge_run -> netdev_run -> route_table_run -> route_table_reset ->
ovs_router_insert -> ovs_router_insert__ -> get_src_addr ->
netdev_get_addr_list -> netdev_linux_get_addr_list -> getifaddrs
Since the actual netlink call is coming from getifaddrs() in glibc,
fixing this would likely involve either duplicating glibc code in the
OVS source or patching glibc.
Call 2:
bridge_run -> iface_refresh_stats -> netdev_get_stats ->
netdev_linux_get_stats -> get_stats_via_netlink
This does use netlink-based stats; however, it isn't immediately
clear whether just dropping the stats from inet6_fill_link_af would
impact anything. Given that this call is more intermittent, it's
of lesser concern.
Acked-by: Greg Smith <gasmith>
Signed-off-by: Jon Kohler <jon>
Signed-off-by: Ilya Maximets <i.maximets>
Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2118848
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (openvswitch2.16 bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2022:7390