Bug 1989600
| Summary: | Registry server RSS and CPU utilization too high during normal operation | ||
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Ben Luddy <bluddy> |
| Component: | OLM | Assignee: | Ben Luddy <bluddy> |
| OLM sub component: | OLM | QA Contact: | Tom Buskey <tbuskey> |
| Status: | CLOSED ERRATA | Docs Contact: | |
| Severity: | medium | ||
| Priority: | medium | CC: | bluddy, tflannag |
| Version: | 4.9 | Keywords: | Triaged |
| Target Milestone: | --- | ||
| Target Release: | 4.9.0 | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2021-10-18 17:44:13 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
On my machine, following the repro steps using opm built with the proposed changes and using an identical "redhat.db", top shows:
top - 10:50:31 up 52 days, 17:33, 1 user, load average: 0.76, 0.73, 0.93
Tasks: 1 total, 0 running, 1 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.8 us, 0.8 sy, 0.0 ni, 97.5 id, 0.0 wa, 0.8 hi, 0.0 si, 0.0 st
MiB Mem : 31948.9 total, 2480.5 free, 7505.3 used, 21963.1 buff/cache
MiB Swap: 24264.0 total, 22392.4 free, 1871.6 used. 20876.9 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
2470165 bluddy 20 0 1633064 42164 27476 S 0.0 0.1 0:00.29 opm
Reducing RES from 197988KiB to 42164KiB and TIME+ from 3.87s to 0.29s.
Thank you for the detailed directions!
opm version
Version: version.Version{OpmVersion:"v1.17.5-21-g5fc0f422", GitCommit:"5fc0f422", BuildDate:"2021-08-03T14:52:16Z", GoOs:"linux", GoArch:"amd64"}
opm registry serve -d redhat.db
WARN[0000] unable to set termination log path error="open /dev/termination-log: permission denied"
INFO[0000] Keeping server open for infinite seconds database=redhat.db port=50051
INFO[0000] serving registry database=redhat.db port=50051
top -b -n1 -p $(ps aux | grep 'opm registry serve -d redhat.db' | grep -v grep | awk '{print $2}')
top - 11:21:55 up 34 days, 1:21, 2 users, load average: 4.25, 4.41, 3.73
Tasks: 1 total, 0 running, 1 sleeping, 0 stopped, 0 zombie
%Cpu(s): 24.6 us, 5.7 sy, 0.0 ni, 68.0 id, 0.0 wa, 0.8 hi, 0.8 si, 0.0 st
MiB Mem : 31949.7 total, 4738.5 free, 17990.3 used, 9221.0 buff/cache
MiB Swap: 20168.0 total, 17991.2 free, 2176.8 used. 11935.5 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
2857413 tbuskey 20 0 1633132 43948 27588 S 0.0 0.1 0:00.45 opm
20 minutes later:
top - 12:48:23 up 34 days, 2:47, 2 users, load average: 3.24, 3.66, 3.69
Tasks: 1 total, 0 running, 1 sleeping, 0 stopped, 0 zombie
%Cpu(s): 6.7 us, 2.5 sy, 0.0 ni, 89.1 id, 0.0 wa, 0.8 hi, 0.8 si, 0.0 st
MiB Mem : 31949.7 total, 2111.8 free, 18705.1 used, 11132.9 buff/cache
MiB Swap: 20168.0 total, 17994.7 free, 2173.2 used. 10911.5 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
2857413 tbuskey 20 0 1633132 40812 27636 S 0.0 0.1 0:00.99 opm
/lgtm
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.9.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:3759 |
The registry server binary that exposes catalog contents to cluster components (including catalog-operator and packageserver) at runtime tends to balloon its memory footprint and waste CPU cycles when transferring all catalog entries to catalog-operator via its ListBundles API. Steps to reproduce: 1. Extract a registry database file from redhat-operator-index: $ docker pull registry.redhat.io/redhat/redhat-operator-index:v4.8 $ docker cp $(docker create registry.redhat.io/redhat/redhat-operator-index:v4.8):/database/index.db redhat.db 2. Serve the registry API using opm. This will block, you will need to either background the job or open a second terminal after this step: $ opm registry serve -d redhat.db 3. Query the ListBundles endpoint: $ grpcurl -plaintext localhost:50051 api.Registry/ListBundles > /dev/null 4. Measure the resident set size (the column RES in the output of 'top' below) and CPU time (TIME+): $ top -b -n1 -p $(ps aux | grep 'opm registry serve -d redhat.db' | grep -v grep | awk '{print $2}') top - 10:48:43 up 52 days, 17:31, 1 user, load average: 0.82, 0.71, 0.95 Tasks: 1 total, 0 running, 1 sleeping, 0 stopped, 0 zombie %Cpu(s): 4.1 us, 0.8 sy, 0.0 ni, 94.2 id, 0.0 wa, 0.8 hi, 0.0 si, 0.0 st MiB Mem : 31948.9 total, 2276.9 free, 7766.4 used, 21905.6 buff/cache MiB Swap: 24264.0 total, 22392.7 free, 1871.3 used. 20637.0 avail Mem PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 2469916 bluddy 20 0 1843444 197988 28524 S 0.0 0.6 0:03.87 opm