Bug 2009660

Summary: cyclictest run missing -m argument
Product: OpenShift Container Platform
Component: CNF Platform Validation
Version: 4.9
Status: CLOSED ERRATA
Severity: high
Priority: unspecified
Reporter: Marius Cornea <mcornea>
Assignee: Federico Paolinelli <fpaoline>
QA Contact: Nikita <nkononov>
CC: aos-bugs, dgonyier, titzhak
Target Release: 4.10.0
Hardware: Unspecified
OS: Unspecified
Fixed In Version: cnf-tests-container-v4.10.0-43
Type: Bug
Last Closed: 2022-03-10 19:34:25 UTC
Bug Blocks: 2020931

Description Marius Cornea 2021-10-01 07:57:05 UTC
Description of problem:

Currently, cyclictest is run without the -m argument, which is important because it locks the process memory and prevents it from being paged out during the measurement.
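
For context, -m (long form --mlockall) asks cyclictest to lock its current and future memory allocations. One rough way to see whether a process holds locked memory is the VmLck field of its /proc/<pid>/status file. The sketch below is illustrative only: the helper name locked_kb is hypothetical, and it assumes a Linux procfs layout.

```shell
# Hypothetical helper: print the locked-memory size (in kB) recorded in a
# procfs status file whose path is passed as the first argument.
locked_kb() {
  # The relevant line looks like: "VmLck:      1234 kB"
  awk '/^VmLck:/ { print $2 }' "$1"
}

# Usage (assumption: the PID of the running cyclictest process is known):
#   locked_kb "/proc/<pid>/status"
```

A non-zero value suggests that the process has locked pages, which is what one would expect to see while cyclictest runs with -m.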

Version-Release number of selected component (if applicable):
                "vcs-ref": "b46a47f7c4d3007b8150551c771cc8460fba9dfa",
                "vcs-type": "git",
                "vcs-url": "https://github.com/openshift-kni/cnf-features-deploy",


How reproducible:
100%

Steps to Reproduce:
podman run --name cnf-container-tests  \
  --net=host  \
  -v /home/kni/cnf_tests_mcornea:/kubeconfig:Z  \
  -e KUBECONFIG=/kubeconfig/kubeconfig \
  -e IMAGE_REGISTRY=registry.kni-qe-0.lab.eng.rdu2.redhat.com:5000/ \
  -e CNF_TESTS_IMAGE=cnf-tests \
  -e LATENCY_TEST_RUN=true \
  -e LATENCY_TEST_RUNTIME=3600 \
  -e LATENCY_TEST_CPUS=16 \
  -e OSLAT_MAXIMUM_LATENCY=10 \
  -e CYCLICTEST_MAXIMUM_LATENCY=20 \
  -e HWLATDETECT_MAXIMUM_LATENCY=10 \
  -e ROLE_WORKER_CNF=master \
  -e PERF_TEST_PROFILE=openshift-node-performance-profile \
  -e DISCOVERY_MODE=true \
  quay.io/openshift-kni/cnf-tests \
  /usr/bin/test-run.sh  \
  -ginkgo.focus="\[performance\]\[config\]|\[performance\]\ Latency\ Test\ with\ the\ cyclictest" \
  --junit /junit -ginkgo.v


Actual results:

running the cyclictest command with arguments [-D 3600 -p 1 -t 16 -a 2-9,26-33 -h 30 -i 1000 --quiet]

Expected results:
-m argument passed to cyclictest command
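
A captured runner log can be checked for the expected flag by inspecting the logged argument list. This is a minimal sketch, not part of the test suite: the function name has_mlockall is hypothetical, and the two sample lines are copied from the logs in this report.

```shell
# Hypothetical helper: succeed if a logged cyclictest argument list
# contains -m or --mlockall as a separate argument.
has_mlockall() {
  # Extract the text between the square brackets, then look for the flag.
  printf '%s\n' "$1" \
    | sed -n 's/.*\[\(.*\)\].*/\1/p' \
    | grep -Eq -e '(^| )(-m|--mlockall)( |$)'
}

before='running the cyclictest command with arguments [-D 3600 -p 1 -t 16 -a 2-9,26-33 -h 30 -i 1000 --quiet]'
after='running the cyclictest command with arguments [-D 120 -p 95 -t 2 -a 1-2 -h 30 -i 1000 --mlockall --quiet]'

has_mlockall "$before" && echo "before: flag present" || echo "before: flag missing"
has_mlockall "$after" && echo "after: flag present" || echo "after: flag missing"
```

Matching on whole arguments avoids false positives from other options whose values happen to contain "-m".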

Additional info:

Comment 3 Dwaine Gonyier 2022-02-02 19:29:06 UTC
Note: The bug environment appears to be an SNO cluster.

Comment 4 Dwaine Gonyier 2022-02-02 19:31:00 UTC
Verified the fix on a normal (virtual) cluster.
The --mlockall parameter is now added to the cyclictest command line.


### OCP version
[root@ocp-edge52 performance]# oc version
Client Version: 4.10.0-fc.3
Server Version: 4.10.0-fc.3
Kubernetes Version: v1.23.0+2135ac2


### PAO build
oc describe -n openshift-performance-addon-operator               pod/performance-operator-5f87f8d694-fjbdq  | grep sha256
                registry.redhat.io/openshift4/performance-addon-rhel8-operator@sha256:f8cbad44f91f5bfeb285ce7daa4e61dffc2500a11983861b6911f2b6d2f19abf
    Image:         registry.redhat.io/openshift4/performance-addon-rhel8-operator@sha256:f8cbad44f91f5bfeb285ce7daa4e61dffc2500a11983861b6911f2b6d2f19abf


[root@ocp-edge52 performance]#  skopeo inspect --tls-verify=false docker://registry-proxy.engineering.redhat.com/rh-osbs/openshift4-performance-addon-rhel8-operator@sha256:f8cbad44f91f5bfeb285ce7daa4e61dffc2500a11983861b6911f2b6d2f19abf | grep '"Digest"\|"url"'
    "Digest": "sha256:f8cbad44f91f5bfeb285ce7daa4e61dffc2500a11983861b6911f2b6d2f19abf",
        "url": "https://access.redhat.com/containers/#/registry.access.redhat.com/openshift4/performance-addon-rhel8-operator/images/v4.10.0-27",
### cnf-test build

openshift4-cnf-tests:v4.10.0-43 

### podman command 

[root@ocp-edge52 performance]# cat ./cyclic_alt.sh 
podman container rm cnf-container-tests
podman run --name cnf-container-tests  \
  --net=host  \
  -v /root/.kcli/clusters/ocp410/auth:/kubeconfig:Z  \
  -v /tmp/junit:/junit:Z \
  -v /tmp/report:/report:Z \
  -e KUBECONFIG=/kubeconfig/kubeconfig \
  -e IMAGE_REGISTRY=registry-proxy.engineering.redhat.com/rh-osbs/ \
  -e CNF_TESTS_IMAGE=openshift4-cnf-tests:v4.10.0-43 \
  -e LATENCY_TEST_RUN=true \
  -e LATENCY_TEST_RUNTIME=120 \
  -e LATENCY_TEST_CPUS=2 \
  -e OSLAT_MAXIMUM_LATENCY=3000 \
  -e LATENCY_TEST_DELAY=65 \
  -e CYCLICTEST_MAXIMUM_LATENCY=5200 \
  -e HWLATDETECT_MAXIMUM_LATENCY=5300 \
  -e ROLE_WORKER_CNF=worker-cnf \
  -e PERF_TEST_PROFILE=performance \
  -e DISCOVERY_MODE=true \
  registry-proxy.engineering.redhat.com/rh-osbs/openshift4-cnf-tests:v4.10.0-43 \
  /usr/bin/test-run.sh  \
  -ginkgo.focus="\[performance\]\ Latency\ Test" \
  --junit /junit -ginkgo.v
  # You may replace the -ginkgo.focus value with \[performance\]\ Latency\ Test to
  # execute all the tools, since hwlatdetect had the same bug:
  # -ginkgo.focus="cyclictest" -ginkgo.v


### Run latency tests on normal cluster (virtual)

[root@ocp-edge52 performance]# time bash ./cyclic_alt.sh 
a05523f7e37bb30791819979dbca962154536d4b2ac8ac4846c9341a3e5197bf
running /usr/bin/validationsuite -ginkgo.focus=\[performance\]\ Latency\ Test --junit /junit -ginkgo.v
I0202 17:43:05.978463       8 request.go:668] Waited for 1.044139781s due to client-side throttling, not priority and fairness, request: GET:https://api.ocp410.demo.lab.dgonyier:6443/apis/cloudcredential.openshift.io/v1?timeout=32s
Running Suite: CNF Features e2e validation
==========================================
Random Seed: 1643823784
Will run 0 of 48 specs

SSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSS
JUnit report was created: /junit/validation_junit.xml

Ran 0 of 48 Specs in 0.000 seconds
SUCCESS! -- 0 Passed | 0 Failed | 0 Pending | 48 Skipped
PASS
You're using deprecated Ginkgo functionality:
=============================================
Ginkgo 2.0 is under active development and will introduce (a small number of) breaking changes.
To learn more, view the migration guide at https://github.com/onsi/ginkgo/blob/v2/docs/MIGRATING_TO_V2.md
To comment, chime in at https://github.com/onsi/ginkgo/issues/711

  You are using a custom reporter.  Support for custom reporters will likely be removed in V2.  Most users were using them to generate junit or teamcity reports and this functionality will be merged into the core reporter.  In addition, Ginkgo 2.0 will support emitting a JSON-formatted report that users can then manipulate to generate custom reports.

  If this change will be impactful to you please leave a comment on https://github.com/onsi/ginkgo/issues/711
  Learn more at: https://github.com/onsi/ginkgo/blob/v2/docs/MIGRATING_TO_V2.md#removed-custom-reporters

To silence deprecations that can be silenced set the following environment variable:
  ACK_GINKGO_DEPRECATIONS=1.16.4

Discovery mode enabled, skipping setup
running /usr/bin/cnftests -ginkgo.focus=\[performance\]\ Latency\ Test --junit /junit -ginkgo.v
I0202 17:43:08.697027      44 request.go:668] Waited for 1.042978123s due to client-side throttling, not priority and fairness, request: GET:https://api.ocp410.demo.lab.dgonyier:6443/apis/network.openshift.io/v1?timeout=32s
Running Suite: CNF Features e2e integration tests
=================================================
Random Seed: 1643823787
Will run 3 of 157 specs

SSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSS
------------------------------
[performance] Latency Test with the oslat image 
  should succeed
  /remote-source/app/vendor/github.com/openshift-kni/performance-addon-operators/functests/4_latency/latency.go:153
STEP: Waiting two minutes to download the latencyTest image
STEP: Checking actual CPUs number for the running pod
STEP: Waiting another two minutes to give enough time for the cluster to move the pod to Succeeded phase
Feb  2 17:46:28.762: [INFO]: found mcd machine-config-daemon-hh8xd for node ocp410-worker-0.demo.lab.dgonyier
Feb  2 17:46:29.621: [INFO]: found mcd machine-config-daemon-hh8xd for node ocp410-worker-0.demo.lab.dgonyier
Feb  2 17:46:29.830: [INFO]: found mcd machine-config-daemon-hh8xd for node ocp410-worker-0.demo.lab.dgonyier
Log file created at: 2022/02/02 17:43:21
Running on machine: oslat-2f7dg
Binary: Built with gc go1.16.6 for linux/amd64
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
I0202 17:43:21.563413       1 node.go:37] Environment information: /proc/cmdline: BOOT_IMAGE=(hd0,gpt3)/ostree/rhcos-769653a88b3fa5277f7ee22aa792f3130a868aa89527e2666bd517a9ebd50598/vmlinuz-4.18.0-305.34.2.rt7.107.el8_4.x86_64 random.trust_cpu=on console=tty0 console=ttyS0,115200n8 ignition.platform.id=qemu ostree=/ostree/boot.1/rhcos/769653a88b3fa5277f7ee22aa792f3130a868aa89527e2666bd517a9ebd50598/0 root=UUID=43fee01c-54ae-4c3c-be5d-a47a6904dbba rw rootflags=prjquota boot=UUID=65c2f6bd-6a74-495e-8cc1-4c14a2619f5c skew_tick=1 nohz=on rcu_nocbs=1-3 tuned.non_isolcpus=00000ff1 intel_pstate=disable nosoftlockup tsc=nowatchdog intel_iommu=on iommu=pt isolcpus=managed_irq,1-3 systemd.cpu_affinity=0,4,5,6,7,8,9,10,11 default_hugepagesz=1G hugepagesz=1G hugepages=1 +
I0202 17:43:21.563708       1 node.go:44] Environment information: kernel version 4.18.0-305.34.2.rt7.107.el8_4.x86_64
I0202 17:44:26.564121       1 main.go:53] Running the oslat command with arguments [--duration 120 --rtprio 1 --cpu-list 2 --cpu-main-thread 1]
I0202 17:46:27.633822       1 main.go:59] Succeeded to run the oslat command: oslat V 2.10
Total runtime: 		120 seconds
Thread priority: 	SCHED_FIFO:1
CPU list: 		2
CPU for main thread: 	1
Workload: 		no
Workload mem: 		0 (KiB)
Preheat cores: 		1

Pre-heat for 1 seconds...
Test starts...
Test completed.

        Core:	 2
    CPU Freq:	 2096 (Mhz)
    001 (us):	 3877192008
    002 (us):	 23084
    003 (us):	 4934
    004 (us):	 63506
    005 (us):	 52065
    006 (us):	 68939
    007 (us):	 37551
    008 (us):	 7931
    009 (us):	 1720
    010 (us):	 539
    011 (us):	 27897
    012 (us):	 11267
    013 (us):	 8881
    014 (us):	 222
    015 (us):	 259
    016 (us):	 39414
    017 (us):	 8547
    018 (us):	 1062
    019 (us):	 123
    020 (us):	 182
    021 (us):	 378
    022 (us):	 298
    023 (us):	 100
    024 (us):	 24
    025 (us):	 16
    026 (us):	 9
    027 (us):	 9
    028 (us):	 9
    029 (us):	 6
    030 (us):	 4
    031 (us):	 6
    032 (us):	 18 (including overflows)
     Minimum:	 1 (us)
     Average:	 1.001 (us)
     Maximum:	 1052 (us)
     Max-Min:	 1051 (us)
    Duration:	 119.941 (sec)

Feb  2 17:46:30.008: [INFO]: found mcd machine-config-daemon-hh8xd for node ocp410-worker-0.demo.lab.dgonyier
Feb  2 17:47:30.517: [ERROR]: timed out waiting for the condition

• [SLOW TEST:258.650 seconds]
[performance] Latency Test
/remote-source/app/vendor/github.com/openshift-kni/performance-addon-operators/functests/4_latency/latency.go:62
  with the oslat image
  /remote-source/app/vendor/github.com/openshift-kni/performance-addon-operators/functests/4_latency/latency.go:128
    should succeed
    /remote-source/app/vendor/github.com/openshift-kni/performance-addon-operators/functests/4_latency/latency.go:153
------------------------------
[performance] Latency Test with the cyclictest image 
  should succeed
  /remote-source/app/vendor/github.com/openshift-kni/performance-addon-operators/functests/4_latency/latency.go:197
STEP: Waiting two minutes to download the latencyTest image
STEP: Checking actual CPUs number for the running pod
STEP: Waiting another two minutes to give enough time for the cluster to move the pod to Succeeded phase
Feb  2 17:50:43.012: [INFO]: found mcd machine-config-daemon-hh8xd for node ocp410-worker-0.demo.lab.dgonyier
Feb  2 17:50:43.186: [INFO]: found mcd machine-config-daemon-hh8xd for node ocp410-worker-0.demo.lab.dgonyier
Feb  2 17:50:43.430: [INFO]: found mcd machine-config-daemon-hh8xd for node ocp410-worker-0.demo.lab.dgonyier
Log file created at: 2022/02/02 17:47:35
Running on machine: cyclictest-c8svn
Binary: Built with gc go1.16.6 for linux/amd64
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
I0202 17:47:35.180137       1 node.go:37] Environment information: /proc/cmdline: BOOT_IMAGE=(hd0,gpt3)/ostree/rhcos-769653a88b3fa5277f7ee22aa792f3130a868aa89527e2666bd517a9ebd50598/vmlinuz-4.18.0-305.34.2.rt7.107.el8_4.x86_64 random.trust_cpu=on console=tty0 console=ttyS0,115200n8 ignition.platform.id=qemu ostree=/ostree/boot.1/rhcos/769653a88b3fa5277f7ee22aa792f3130a868aa89527e2666bd517a9ebd50598/0 root=UUID=43fee01c-54ae-4c3c-be5d-a47a6904dbba rw rootflags=prjquota boot=UUID=65c2f6bd-6a74-495e-8cc1-4c14a2619f5c skew_tick=1 nohz=on rcu_nocbs=1-3 tuned.non_isolcpus=00000ff1 intel_pstate=disable nosoftlockup tsc=nowatchdog intel_iommu=on iommu=pt isolcpus=managed_irq,1-3 systemd.cpu_affinity=0,4,5,6,7,8,9,10,11 default_hugepagesz=1G hugepagesz=1G hugepages=1 +
I0202 17:47:35.180373       1 node.go:44] Environment information: kernel version 4.18.0-305.34.2.rt7.107.el8_4.x86_64
I0202 17:48:40.181339       1 main.go:51] running the cyclictest command with arguments [-D 120 -p 95 -t 2 -a 1-2 -h 30 -i 1000 --mlockall --quiet]
I0202 17:50:40.785066       1 main.go:57] succeeded to run the cyclictest command: # /dev/cpu_dma_latency set to 0us
# Histogram
000000 000000	000000
000001 000000	000000
000002 000000	000000
000003 000000	000000
000004 000007	000000
000005 000074	000000
000006 000183	000000
000007 000272	000003
000008 000306	000015
000009 000335	000042
000010 000310	000082
000011 000332	000146
000012 000481	000177
000013 000488	000187
000014 000518	000202
000015 000604	000245
000016 000611	000272
000017 000626	000298
000018 000652	000330
000019 000683	000398
000020 000637	000414
000021 000658	000345
000022 000725	000431
000023 000723	000422
000024 000694	000482
000025 000704	000508
000026 000764	000450
000027 001055	000431
000028 002831	000414
000029 007397	000424
# Total: 000022670 000006718
# Min Latencies: 00004 00007
# Avg Latencies: 00033 00054
# Max Latencies: 04823 04966
# Histogram Overflows: 97316 113269
# Histogram Overflow at cycle number:
# Thread 0: 00014 00015 00018 00019 00020 00021 00022 00023 00024 00025 00026 00027 00028 00029 00030 00031 00032 00033 00034 00035 00036 00037 00038 00039 00041 00042 00043 00044 00045 00046 # 97286 others
# Thread 1: 00000 00001 00002 00003 00004 00005 00006 00007 00008 00009 00010 00011 00012 00013 00014 00015 00016 00017 00018 00019 00020 00021 00022 00023 00024 00025 00026 00027 00028 00029 # 113239 others

Feb  2 17:50:43.667: [INFO]: found mcd machine-config-daemon-hh8xd for node ocp410-worker-0.demo.lab.dgonyier
Feb  2 17:51:44.380: [ERROR]: timed out waiting for the condition

• [SLOW TEST:253.289 seconds]
[performance] Latency Test
/remote-source/app/vendor/github.com/openshift-kni/performance-addon-operators/functests/4_latency/latency.go:62
  with the cyclictest image
  /remote-source/app/vendor/github.com/openshift-kni/performance-addon-operators/functests/4_latency/latency.go:185
    should succeed
    /remote-source/app/vendor/github.com/openshift-kni/performance-addon-operators/functests/4_latency/latency.go:197
------------------------------
[performance] Latency Test with the hwlatdetect image 
  should succeed
  /remote-source/app/vendor/github.com/openshift-kni/performance-addon-operators/functests/4_latency/latency.go:236
STEP: Waiting two minutes to download the latencyTest image
STEP: Checking actual CPUs number for the running pod
STEP: Waiting another two minutes to give enough time for the cluster to move the pod to Succeeded phase
Feb  2 17:54:59.376: [INFO]: found mcd machine-config-daemon-hh8xd for node ocp410-worker-0.demo.lab.dgonyier
Feb  2 17:54:59.605: [INFO]: found mcd machine-config-daemon-hh8xd for node ocp410-worker-0.demo.lab.dgonyier
Log file created at: 2022/02/02 17:51:48
Running on machine: hwlatdetect-vdq87
Binary: Built with gc go1.16.6 for linux/amd64
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
I0202 17:51:48.525040       1 node.go:37] Environment information: /proc/cmdline: BOOT_IMAGE=(hd0,gpt3)/ostree/rhcos-769653a88b3fa5277f7ee22aa792f3130a868aa89527e2666bd517a9ebd50598/vmlinuz-4.18.0-305.34.2.rt7.107.el8_4.x86_64 random.trust_cpu=on console=tty0 console=ttyS0,115200n8 ignition.platform.id=qemu ostree=/ostree/boot.1/rhcos/769653a88b3fa5277f7ee22aa792f3130a868aa89527e2666bd517a9ebd50598/0 root=UUID=43fee01c-54ae-4c3c-be5d-a47a6904dbba rw rootflags=prjquota boot=UUID=65c2f6bd-6a74-495e-8cc1-4c14a2619f5c skew_tick=1 nohz=on rcu_nocbs=1-3 tuned.non_isolcpus=00000ff1 intel_pstate=disable nosoftlockup tsc=nowatchdog intel_iommu=on iommu=pt isolcpus=managed_irq,1-3 systemd.cpu_affinity=0,4,5,6,7,8,9,10,11 default_hugepagesz=1G hugepagesz=1G hugepages=1 +
I0202 17:51:48.525670       1 node.go:44] Environment information: kernel version 4.18.0-305.34.2.rt7.107.el8_4.x86_64
I0202 17:52:53.526113       1 main.go:50] running the hwlatdetect command with arguments [/usr/bin/hwlatdetect --threshold 5300 --hardlimit 5300 --duration 120 --window 10000000us --width 950000us]
I0202 17:54:56.786751       1 main.go:56] succeeded to run the hwlatdetect command: hwlatdetect:  test duration 120 seconds
   detector: tracer
   parameters:
        Latency threshold: 5300us
        Sample window:     10000000us
        Sample width:      950000us
     Non-sampling period:  9050000us
        Output File:       None

Starting test
test finished
Max Latency: Below threshold
Samples recorded: 0
Samples exceeding threshold: 0
Feb  2 17:54:59.778: [INFO]: found mcd machine-config-daemon-hh8xd for node ocp410-worker-0.demo.lab.dgonyier
Feb  2 17:56:00.268: [ERROR]: timed out waiting for the condition

• [SLOW TEST:256.151 seconds]
[performance] Latency Test
/remote-source/app/vendor/github.com/openshift-kni/performance-addon-operators/functests/4_latency/latency.go:62
  with the hwlatdetect image
  /remote-source/app/vendor/github.com/openshift-kni/performance-addon-operators/functests/4_latency/latency.go:228
    should succeed
    /remote-source/app/vendor/github.com/openshift-kni/performance-addon-operators/functests/4_latency/latency.go:236
------------------------------
SSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSS
I0202 17:56:28.561846      44 request.go:668] Waited for 1.156106585s due to client-side throttling, not priority and fairness, request: GET:https://api.ocp410.demo.lab.dgonyier:6443/apis/operator.openshift.io/v1alpha1?timeout=32s
I0202 17:56:38.761896      44 request.go:668] Waited for 11.356020753s due to client-side throttling, not priority and fairness, request: GET:https://api.ocp410.demo.lab.dgonyier:6443/apis/operators.coreos.com/v2?timeout=32s

JUnit report was created: /junit/cnftests-junit.xml

Ran 3 of 157 Specs in 839.672 seconds
SUCCESS! -- 3 Passed | 0 Failed | 0 Pending | 154 Skipped
PASS

real	14m6.618s
user	0m1.774s
sys	0m0.956s
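
The "# Max Latencies:" summary line in the cyclictest output above can be compared against the configured limit (CYCLICTEST_MAXIMUM_LATENCY=5200 in this run). The sketch below is an assumption about how such a check could be scripted, not the logic the cnf-tests suite itself uses. The name max_latency_ok is hypothetical, and the values are assumed to be in microseconds.

```shell
# Hypothetical helper: succeed only if every per-thread value on the
# "# Max Latencies:" line is at or below the given limit.
#   $1    = limit
#   stdin = cyclictest output
max_latency_ok() {
  awk -v limit="$1" '
    /^# Max Latencies:/ {
      found = 1
      # Fields 1-3 are "#", "Max" and "Latencies:"; values start at field 4.
      for (i = 4; i <= NF; i++)
        if ($i + 0 > limit + 0) bad = 1   # "+ 0" forces a numeric comparison
    }
    END {
      if (found && !bad) exit 0
      exit 1
    }
  '
}

# Sample summary lines copied from the cyclictest run in this comment.
sample='# Min Latencies: 00004 00007
# Avg Latencies: 00033 00054
# Max Latencies: 04823 04966'

printf '%s\n' "$sample" | max_latency_ok 5200 && echo "within limit"
```

The comparison is done in awk rather than in shell arithmetic because the zero-padded values (for example 04823) would otherwise be read as octal numbers.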

Comment 6 errata-xmlrpc 2022-03-10 19:34:25 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.10 low-latency extras update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2022:0640