The registry server binary that exposes catalog contents to cluster components (including catalog-operator and packageserver) at runtime tends to balloon its memory footprint and waste CPU cycles when transferring all catalog entries to catalog-operator via its ListBundles API. Steps to reproduce: 1. Extract a registry database file from redhat-operator-index: $ docker pull registry.redhat.io/redhat/redhat-operator-index:v4.8 $ docker cp $(docker create registry.redhat.io/redhat/redhat-operator-index:v4.8):/database/index.db redhat.db 2. Serve the registry API using opm. This will block, you will need to either background the job or open a second terminal after this step: $ opm registry serve -d redhat.db 3. Query the ListBundles endpoint: $ grpcurl -plaintext localhost:50051 api.Registry/ListBundles > /dev/null 4. Measure the resident set size (the column RES in the output of 'top' below) and CPU time (TIME+): $ top -b -n1 -p $(ps aux | grep 'opm registry serve -d redhat.db' | grep -v grep | awk '{print $2}') top - 10:48:43 up 52 days, 17:31, 1 user, load average: 0.82, 0.71, 0.95 Tasks: 1 total, 0 running, 1 sleeping, 0 stopped, 0 zombie %Cpu(s): 4.1 us, 0.8 sy, 0.0 ni, 94.2 id, 0.0 wa, 0.8 hi, 0.0 si, 0.0 st MiB Mem : 31948.9 total, 2276.9 free, 7766.4 used, 21905.6 buff/cache MiB Swap: 24264.0 total, 22392.7 free, 1871.3 used. 20637.0 avail Mem PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 2469916 bluddy 20 0 1843444 197988 28524 S 0.0 0.6 0:03.87 opm
On my machine, following the repro steps using opm built with the proposed changes and using an identical "redhat.db", top shows: top - 10:50:31 up 52 days, 17:33, 1 user, load average: 0.76, 0.73, 0.93 Tasks: 1 total, 0 running, 1 sleeping, 0 stopped, 0 zombie %Cpu(s): 0.8 us, 0.8 sy, 0.0 ni, 97.5 id, 0.0 wa, 0.8 hi, 0.0 si, 0.0 st MiB Mem : 31948.9 total, 2480.5 free, 7505.3 used, 21963.1 buff/cache MiB Swap: 24264.0 total, 22392.4 free, 1871.6 used. 20876.9 avail Mem PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 2470165 bluddy 20 0 1633064 42164 27476 S 0.0 0.1 0:00.29 opm Reducing RES from 197988KiB to 42164KiB and TIME+ from 3.87s to 0.29s.
Thank you for the detailed directions! opm version Version: version.Version{OpmVersion:"v1.17.5-21-g5fc0f422", GitCommit:"5fc0f422", BuildDate:"2021-08-03T14:52:16Z", GoOs:"linux", GoArch:"amd64"} opm registry serve -d redhat.db WARN[0000] unable to set termination log path error="open /dev/termination-log: permission denied" INFO[0000] Keeping server open for infinite seconds database=redhat.db port=50051 INFO[0000] serving registry database=redhat.db port=50051 top -b -n1 -p $(ps aux | grep 'opm registry serve -d redhat.db' | grep -v grep | awk '{print $2}') top - 11:21:55 up 34 days, 1:21, 2 users, load average: 4.25, 4.41, 3.73 Tasks: 1 total, 0 running, 1 sleeping, 0 stopped, 0 zombie %Cpu(s): 24.6 us, 5.7 sy, 0.0 ni, 68.0 id, 0.0 wa, 0.8 hi, 0.8 si, 0.0 st MiB Mem : 31949.7 total, 4738.5 free, 17990.3 used, 9221.0 buff/cache MiB Swap: 20168.0 total, 17991.2 free, 2176.8 used. 11935.5 avail Mem PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 2857413 tbuskey 20 0 1633132 43948 27588 S 0.0 0.1 0:00.45 opm 20 minutes later: top - 12:48:23 up 34 days, 2:47, 2 users, load average: 3.24, 3.66, 3.69 Tasks: 1 total, 0 running, 1 sleeping, 0 stopped, 0 zombie %Cpu(s): 6.7 us, 2.5 sy, 0.0 ni, 89.1 id, 0.0 wa, 0.8 hi, 0.8 si, 0.0 st MiB Mem : 31949.7 total, 2111.8 free, 18705.1 used, 11132.9 buff/cache MiB Swap: 20168.0 total, 17994.7 free, 2173.2 used. 10911.5 avail Mem PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 2857413 tbuskey 20 0 1633132 40812 27636 S 0.0 0.1 0:00.99 opm /lgtm
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.9.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:3759