Bug 2100544 - [4.10] [Webscale] High OVS cpu usage causing performance issues
Summary: [4.10] [Webscale] High OVS cpu usage causing performance issues
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Performance Addon Operator
Version: 4.9
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: high
Target Milestone: ---
Target Release: 4.10.z
Assignee: Yanir Quinn
QA Contact: Niranjan Mallapadi Raghavender
URL:
Whiteboard:
Duplicates: 2096703 2108217
Depends On: 2081852
Blocks: 2108556 2108557
Reported: 2022-06-23 16:16 UTC by Yanir Quinn
Modified: 2023-09-18 04:39 UTC
CC List: 34 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 2081852
Environment:
Last Closed: 2022-08-09 02:52:06 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift-kni performance-addon-operators pull 899 | 0 | None | Merged | Bug 2100544: Set rps for virtual interfaces only in crio hook | 2022-07-27 11:55:08 UTC
Github | openshift-kni performance-addon-operators pull 916 | 0 | None | Merged | [release 4.10] Bug 2100544: Fix RPS default physical and virtual settings | 2022-07-27 11:55:09 UTC
Red Hat Product Errata | RHEA-2022:5929 | 0 | None | None | None | 2022-08-09 02:52:20 UTC

Comment 1 Yanir Quinn 2022-06-23 16:18:35 UTC
*** Bug 2096703 has been marked as a duplicate of this bug. ***

Comment 5 Chen 2022-08-03 06:21:06 UTC
*** Bug 2108217 has been marked as a duplicate of this bug. ***

Comment 6 Shereen Haj Makhoul 2022-08-03 14:21:39 UTC
Verification:

Versions:
OCP: 4.10.25
PAO: 4.10.6-3

Steps:

Apply the PAO performance profile:

apiVersion: performance.openshift.io/v2
kind: PerformanceProfile
metadata:
  name: manual
spec:
  cpu:
    isolated: "0-2"
    reserved: "3"
  realTimeKernel:
    enabled: true
  nodeSelector:
    node-role.kubernetes.io/workercnf: ""

After the nodes are ready, apply the guaranteed (gu) pod:

apiVersion: v1
kind: Pod
metadata:
  name: test2
  annotations:
    irq-load-balancing.crio.io: "disable"
    cpu-quota.crio.io: "disable"
spec:
  containers:
  - name: test
    image: registry.redhat.io/openshift4/performance-addon-rhel8-operator@sha256:9be84526676476c26d5b20728580a67f202ea078ccb7044431f2b5fd6a3b22c8
    imagePullPolicy: IfNotPresent
    command: ["/bin/sh", "-c"]
    args: [ "while true; do sleep 100000; done;" ]
    resources:
      requests:
        cpu: 2
        memory: "200M"
      limits:
        cpu: 2
        memory: "200M"
  nodeSelector:
    node-role.kubernetes.io/workercnf: ""
  runtimeClassName: performance-manual

- Connect to the pod and check that the RPS mask is set for the veth devices and matches the reserved CPUs in the performance profile (PP):

[root@registry ~]# oc rsh test2
sh-4.4#  find /sys/devices/virtual/ -name rps_cpus -printf '%p\n' -exec cat {} \;
/sys/devices/virtual/net/lo/queues/rx-0/rps_cpus
0000,00000000,00000008
/sys/devices/virtual/net/eth0/queues/rx-0/rps_cpus
0000,00000000,00000008
sh-4.4# find /sys/devices/ -name rps_cpus -printf '%p\n' -exec cat {} \;
/sys/devices/virtual/net/lo/queues/rx-0/rps_cpus
0000,00000000,00000008
/sys/devices/virtual/net/eth0/queues/rx-0/rps_cpus
0000,00000000,00000008
sh-4.4# 

As can be seen, both commands show only the virtual devices.
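The mask above can be cross-checked against the profile by hand: 0x8 has only bit 3 set, which corresponds to the reserved CPU "3". A minimal sketch of that check, assuming the comma-separated hexadecimal format shown in the rps_cpus output; the helper names `parse_cpu_mask` and `parse_cpu_list` are illustrative and not part of PAO:

```python
from typing import List


def parse_cpu_mask(mask: str) -> List[int]:
    """Decode a comma-separated hex CPU mask (as read from rps_cpus) into CPU ids."""
    value = int(mask.strip().replace(",", ""), 16)
    return [cpu for cpu in range(value.bit_length()) if (value >> cpu) & 1]


def parse_cpu_list(spec: str) -> List[int]:
    """Expand a CPU list such as "0-2" or "0,3-4" into sorted CPU ids."""
    cpus = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-")
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


if __name__ == "__main__":
    reserved = parse_cpu_list("3")  # spec.cpu.reserved from the profile
    observed = parse_cpu_mask("0000,00000000,00000008")  # value read in the pod
    print(observed, observed == reserved)  # prints: [3] True
```

The same helpers can be applied to the isolated set ("0-2") to confirm that none of those CPUs appear in the mask.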

- Verify that the new annotation works properly:

When the profile does not set the annotation, systemctl lists update-rps units for the virtual devices only:

sh-4.4# systemctl list-units -all | grep update-rps@
  update-rps  loaded inactive dead  Sets network devices RPS mask
  update-rps  loaded inactive dead  Sets network devices RPS mask
  update-rps  loaded inactive dead  Sets network devices RPS mask
  update-rps  loaded inactive dead  Sets network devices RPS mask
  update-rps  loaded inactive dead  Sets network devices RPS mask
  update-rps  loaded inactive dead  Sets network devices RPS mask
  update-rps  loaded inactive dead  Sets network devices RPS mask
sh-4.4# 

When the new annotation is added to the profile, as below:


apiVersion: performance.openshift.io/v2
kind: PerformanceProfile
metadata:
  annotations:
    performance.openshift.io/enable-physical-dev-rps: "true"
  name: manual
...

units for all devices are now listed (the additional entries are marked with <---):

sh-4.4# systemctl list-units -all | grep update-rps@
  update-rps  loaded inactive dead  Sets network devices RPS mask
  update-rps  loaded inactive dead  Sets network devices RPS mask
  update-rps  <---  loaded inactive dead  Sets network devices RPS mask
  update-rps  <---  loaded inactive dead  Sets network devices RPS mask
  update-rps  <---  loaded inactive dead  Sets network devices RPS mask
  update-rps  <---  loaded inactive dead  Sets network devices RPS mask
  update-rps  <---  loaded inactive dead  Sets network devices RPS mask
  update-rps  <---  loaded inactive dead  Sets network devices RPS mask
  update-rps  <---  loaded inactive dead  Sets network devices RPS mask
  update-rps  <---  loaded inactive dead  Sets network devices RPS mask
  update-rps  <---  loaded inactive dead  Sets network devices RPS mask
  update-rps  <---  loaded inactive dead  Sets network devices RPS mask
  update-rps  loaded inactive dead  Sets network devices RPS mask
  update-rps  loaded inactive dead  Sets network devices RPS mask
  update-rps  loaded inactive dead  Sets network devices RPS mask
  update-rps  loaded inactive dead  Sets network devices RPS mask
  update-rps  loaded inactive dead  Sets network devices RPS mask
sh-4.4# 
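To tell which devices on a node are virtual and which are physical when reading such a listing, the sysfs layout can be inspected directly: virtual devices resolve under /sys/devices/virtual/net, as the find output earlier in this comment shows. A rough sketch under that assumption; `classify_net_devices` is a hypothetical helper for illustration and is not how PAO itself selects devices:

```python
import os
from typing import Dict


def classify_net_devices(class_net: str = "/sys/class/net") -> Dict[str, str]:
    """Map each network device name to "virtual" or "physical"."""
    kinds: Dict[str, str] = {}
    for name in sorted(os.listdir(class_net)):
        link = os.path.join(class_net, name)
        # Skip plain files such as bonding_masters; devices are directories/symlinks.
        if not os.path.isdir(link):
            continue
        parts = os.path.realpath(link).split(os.sep)
        # Virtual devices resolve to .../devices/virtual/net/<name>.
        is_virtual = parts[-3:-1] == ["virtual", "net"]
        kinds[name] = "virtual" if is_virtual else "physical"
    return kinds


if __name__ == "__main__":
    for dev, kind in classify_net_devices().items():
        print(f"{dev}: {kind}")
```

Run on the node rather than inside the pod, since the pod's network namespace only exposes its own virtual interfaces.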

Bug verified.

Comment 8 errata-xmlrpc 2022-08-09 02:52:06 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.10.26 low-latency extras update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2022:5929

Comment 10 Red Hat Bugzilla 2023-09-18 04:39:59 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days