OCP cluster upgraded from 4.7.11 to 4.8.0-0.nightly-ppc64le-2021-05-26-112202. The cluster is installed on libvirt.

The e2e failure seen is:

[sig-arch] Managed cluster should ensure platform components have system-* priority class associated [Suite:openshift/conformance/parallel]

Log of the above failure:

[BeforeEach] [Top Level]
  github.com/openshift/origin/test/extended/util/framework.go:1450
[BeforeEach] [Top Level]
  github.com/openshift/origin/test/extended/util/framework.go:1450
[BeforeEach] [Top Level]
  github.com/openshift/origin/test/extended/util/test.go:59
[BeforeEach] [sig-arch] Managed cluster should
  github.com/openshift/origin/test/extended/util/client.go:142
STEP: Creating a kubernetes client
[It] ensure platform components have system-* priority class associated [Suite:openshift/conformance/parallel]
  github.com/openshift/origin/test/extended/pods/priorityclasses.go:20
May 28 11:52:38.383: INFO: Workloads with outstanding bugs:
Component downloads has a bug associated already: https://bugzilla.redhat.com/show_bug.cgi?id=1954866
Component image-pruner has a bug associated already: https://bugzilla.redhat.com/show_bug.cgi?id=1954891
Component ingress-canary has a bug associated already: https://bugzilla.redhat.com/show_bug.cgi?id=1954892
Component migrator has a bug associated already: https://bugzilla.redhat.com/show_bug.cgi?id=1954868
Component network-check-source has a bug associated already: https://bugzilla.redhat.com/show_bug.cgi?id=1954870
Component network-check-target has a bug associated already: https://bugzilla.redhat.com/show_bug.cgi?id=1954870
May 28 11:52:38.384: FAIL: 23 pods found with invalid priority class (should be openshift-user-critical or begin with system-):
openshift-local-storage/busybox-lso (currently "")
openshift-local-storage/diskmaker-manager-4br57 (currently "")
openshift-local-storage/diskmaker-manager-zwxw2 (currently "")
openshift-local-storage/local-storage-operator-776bb985f-2lv9k (currently "")
openshift-local-storage/local-storage-operator-776bb985f-44n8r (currently "")
openshift-local-storage/local-storage-operator-776bb985f-4hkfn (currently "")
openshift-local-storage/local-storage-operator-776bb985f-66zqv (currently "")
openshift-local-storage/local-storage-operator-776bb985f-8xgsr (currently "")
openshift-local-storage/local-storage-operator-776bb985f-9c78z (currently "")
openshift-local-storage/local-storage-operator-776bb985f-9s6md (currently "")
openshift-local-storage/local-storage-operator-776bb985f-cfjk5 (currently "")
openshift-local-storage/local-storage-operator-776bb985f-dfx77 (currently "")
openshift-local-storage/local-storage-operator-776bb985f-dqq4s (currently "")
openshift-local-storage/local-storage-operator-776bb985f-fsb8m (currently "")
openshift-local-storage/local-storage-operator-776bb985f-fsxtn (currently "")
openshift-local-storage/local-storage-operator-776bb985f-jc947 (currently "")
openshift-local-storage/local-storage-operator-776bb985f-nfh7m (currently "")
openshift-local-storage/local-storage-operator-776bb985f-p6r6m (currently "")
openshift-local-storage/local-storage-operator-776bb985f-tlvnz (currently "")
openshift-local-storage/local-storage-operator-776bb985f-x9d9j (currently "")
openshift-local-storage/local-storage-operator-776bb985f-xjknb (currently "")
openshift-local-storage/local-storage-operator-776bb985f-xkdz2 (currently "")
openshift-marketplace/pre-48-operator-catalog-vc52c (currently "")

Full Stack Trace
github.com/onsi/ginkgo/internal/leafnodes.(*runner).runSync(0xc00195db00, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
  github.com/onsi/ginkgo.0-origin.0+incompatible/internal/leafnodes/runner.go:113 +0xb8
github.com/onsi/ginkgo/internal/leafnodes.(*runner).run(0xc00195db00, 0xc00285d8c0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
  github.com/onsi/ginkgo.0-origin.0+incompatible/internal/leafnodes/runner.go:64 +0x180
github.com/onsi/ginkgo/internal/leafnodes.(*ItNode).Run(0xc001941f00, 0x1266c9188, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
  github.com/onsi/ginkgo.0-origin.0+incompatible/internal/leafnodes/it_node.go:26 +0x98
github.com/onsi/ginkgo/internal/spec.(*Spec).runSample(0xc001c22960, 0x0, 0x1266c9188, 0xc0003ea040)
  github.com/onsi/ginkgo.0-origin.0+incompatible/internal/spec/spec.go:215 +0x22c
github.com/onsi/ginkgo/internal/spec.(*Spec).Run(0xc001c22960, 0x1266c9188, 0xc0003ea040)
  github.com/onsi/ginkgo.0-origin.0+incompatible/internal/spec/spec.go:138 +0x110
github.com/onsi/ginkgo/internal/specrunner.(*SpecRunner).runSpec(0xc0015b8dc0, 0xc001c22960, 0x0)
  github.com/onsi/ginkgo.0-origin.0+incompatible/internal/specrunner/spec_runner.go:200 +0x100
github.com/onsi/ginkgo/internal/specrunner.(*SpecRunner).runSpecs(0xc0015b8dc0, 0x1)
  github.com/onsi/ginkgo.0-origin.0+incompatible/internal/specrunner/spec_runner.go:170 +0x148
github.com/onsi/ginkgo/internal/specrunner.(*SpecRunner).Run(0xc0015b8dc0, 0xc0020c6b48)
  github.com/onsi/ginkgo.0-origin.0+incompatible/internal/specrunner/spec_runner.go:66 +0x118
github.com/onsi/ginkgo/internal/suite.(*Suite).Run(0xc000279ef0, 0x1266c9448, 0xc0022907d0, 0x0, 0x0, 0xc001ba6160, 0x1, 0x1, 0x12679f9f8, 0xc0003ea040, ...)
  github.com/onsi/ginkgo.0-origin.0+incompatible/internal/suite/suite.go:62 +0x378
github.com/openshift/origin/pkg/test/ginkgo.(*TestOptions).Run(0xc0015a6d80, 0xc001455b30, 0x1, 0x1, 0x128f17f00, 0x122778eb0)
  github.com/openshift/origin/pkg/test/ginkgo/cmd_runtest.go:61 +0x390
main.newRunTestCommand.func1.1()
  github.com/openshift/origin/cmd/openshift-tests/openshift-tests.go:426 +0x60
github.com/openshift/origin/test/extended/util.WithCleanup(0xc001f1fbb0)
  github.com/openshift/origin/test/extended/util/test.go:167 +0x80
main.newRunTestCommand.func1(0xc000902b00, 0xc001455b30, 0x1, 0x1, 0x0, 0x0)
  github.com/openshift/origin/cmd/openshift-tests/openshift-tests.go:426 +0x2d4
github.com/spf13/cobra.(*Command).execute(0xc000902b00, 0xc001455b00, 0x1, 0x1, 0xc000902b00, 0xc001455b00)
  github.com/spf13/cobra.1/command.go:850 +0x3d0
github.com/spf13/cobra.(*Command).ExecuteC(0xc000902000, 0x0, 0x1266d0d48, 0x129333458)
  github.com/spf13/cobra.1/command.go:958 +0x2b4
github.com/spf13/cobra.(*Command).Execute(...)
  github.com/spf13/cobra.1/command.go:895
main.main.func1(0xc000902000, 0x0, 0x0)
  github.com/openshift/origin/cmd/openshift-tests/openshift-tests.go:84 +0xa0
main.main()
  github.com/openshift/origin/cmd/openshift-tests/openshift-tests.go:85 +0x3b4
[AfterEach] [sig-arch] Managed cluster should
  github.com/openshift/origin/test/extended/util/client.go:140
[AfterEach] [sig-arch] Managed cluster should
  github.com/openshift/origin/test/extended/util/client.go:141
fail [github.com/onsi/ginkgo.0-origin.0+incompatible/internal/leafnodes/runner.go:113]: May 28 11:52:38.384: 23 pods found with invalid priority class (should be openshift-user-critical or begin with system-):
openshift-local-storage/busybox-lso (currently "")
openshift-local-storage/diskmaker-manager-4br57 (currently "")
openshift-local-storage/diskmaker-manager-zwxw2 (currently "")
openshift-local-storage/local-storage-operator-776bb985f-2lv9k (currently "")
openshift-local-storage/local-storage-operator-776bb985f-44n8r (currently "")
openshift-local-storage/local-storage-operator-776bb985f-4hkfn (currently "")
openshift-local-storage/local-storage-operator-776bb985f-66zqv (currently "")
openshift-local-storage/local-storage-operator-776bb985f-8xgsr (currently "")
openshift-local-storage/local-storage-operator-776bb985f-9c78z (currently "")
openshift-local-storage/local-storage-operator-776bb985f-9s6md (currently "")
openshift-local-storage/local-storage-operator-776bb985f-cfjk5 (currently "")
openshift-local-storage/local-storage-operator-776bb985f-dfx77 (currently "")
openshift-local-storage/local-storage-operator-776bb985f-dqq4s (currently "")
openshift-local-storage/local-storage-operator-776bb985f-fsb8m (currently "")
openshift-local-storage/local-storage-operator-776bb985f-fsxtn (currently "")
openshift-local-storage/local-storage-operator-776bb985f-jc947 (currently "")
openshift-local-storage/local-storage-operator-776bb985f-nfh7m (currently "")
openshift-local-storage/local-storage-operator-776bb985f-p6r6m (currently "")
openshift-local-storage/local-storage-operator-776bb985f-tlvnz (currently "")
openshift-local-storage/local-storage-operator-776bb985f-x9d9j (currently "")
openshift-local-storage/local-storage-operator-776bb985f-xjknb (currently "")
openshift-local-storage/local-storage-operator-776bb985f-xkdz2 (currently "")
openshift-marketplace/pre-48-operator-catalog-vc52c (currently "")

failed: (800ms) 2021-05-28T15:52:38 "[sig-arch] Managed cluster should ensure platform components have system-* priority class associated [Suite:openshift/conformance/parallel]"
The cluster is installed on ppc64le.

# arch
ppc64le
openshift-user-critical sounds like the right one for the LSO Deployment; system-node-critical should be used in our DaemonSets.
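A minimal sketch of what that split could look like in the LSO manifests. These are illustrative fragments, not the operator's actual manifests: the workload names follow the pod names in the logs above, the image references are placeholders, and only the priorityClassName placement is the point.

```yaml
# Illustrative only: images and most spec fields are placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: local-storage-operator
  namespace: openshift-local-storage
spec:
  selector:
    matchLabels: {name: local-storage-operator}
  template:
    metadata:
      labels: {name: local-storage-operator}
    spec:
      # Operator Deployment: user-workload-level criticality.
      priorityClassName: openshift-user-critical
      containers:
      - name: local-storage-operator
        image: example.com/local-storage-operator:placeholder
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: diskmaker-manager
  namespace: openshift-local-storage
spec:
  selector:
    matchLabels: {name: diskmaker-manager}
  template:
    metadata:
      labels: {name: diskmaker-manager}
    spec:
      # Per-node DaemonSet: must survive node pressure, so node-critical.
      priorityClassName: system-node-critical
      containers:
      - name: diskmaker
        image: example.com/local-diskmaker:placeholder
```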
Passed with local-storage-operator.4.8.0-202106091727.

# oc get pods
NAME                                      READY   STATUS    RESTARTS   AGE
diskmaker-manager-8nrw6                   1/1     Running   0          2m10s
diskmaker-manager-nxlfw                   1/1     Running   0          2m10s
diskmaker-manager-rfhr8                   1/1     Running   0          2m10s
local-storage-operator-6c698798bd-lkr5z   1/1     Running   0          106m

1 pods found with invalid priority class (should be openshift-user-critical or begin with system-):
openshift-marketplace/qe-app-registry-fvj6j (currently "")
failed: (8s) 2021-06-11T05:19:42 "[sig-arch] Managed cluster should ensure platform components have system-* priority class associated [Suite:openshift/conformance/parallel]"
The above failure is still seen with LSO version 4.8.0-202106291913 provided by Red Hat. OCP 4.8.0 is installed on a libvirt environment (ppc64le).

# oc get pods -n openshift-local-storage
NAME                                      READY   STATUS    RESTARTS   AGE
busybox-lso                               1/1     Running   7          76m
diskmaker-manager-26fpd                   1/1     Running   0          77m
diskmaker-manager-bjzm2                   1/1     Running   0          77m
local-storage-operator-7bff9f47f7-9r2w6   1/1     Running   0          128m

started: (0/1/1) "[sig-arch] Managed cluster should ensure platform components have system-* priority class associated [Suite:openshift/conformance/parallel]"
[BeforeEach] [Top Level]
  github.com/openshift/origin/test/extended/util/framework.go:1450
[BeforeEach] [Top Level]
  github.com/openshift/origin/test/extended/util/framework.go:1450
[BeforeEach] [Top Level]
  github.com/openshift/origin/test/extended/util/test.go:59
[BeforeEach] [sig-arch] Managed cluster should
  github.com/openshift/origin/test/extended/util/client.go:142
STEP: Creating a kubernetes client
[It] ensure platform components have system-* priority class associated [Suite:openshift/conformance/parallel]
  github.com/openshift/origin/test/extended/pods/priorityclasses.go:20
Jul 15 08:48:08.373: INFO: Workloads with outstanding bugs:
Component downloads has a bug associated already: https://bugzilla.redhat.com/show_bug.cgi?id=1954866
Component ingress-canary has a bug associated already: https://bugzilla.redhat.com/show_bug.cgi?id=1954892
Component migrator has a bug associated already: https://bugzilla.redhat.com/show_bug.cgi?id=1954868
Component network-check-source has a bug associated already: https://bugzilla.redhat.com/show_bug.cgi?id=1954870
Component network-check-target has a bug associated already: https://bugzilla.redhat.com/show_bug.cgi?id=1954870
Jul 15 08:48:08.373: FAIL: 2 pods found with invalid priority class (should be openshift-user-critical or begin with system-):
openshift-local-storage/busybox-lso (currently "")
openshift-marketplace/rc-48-operator-catalog-w6j7k (currently "")

Full Stack Trace
github.com/onsi/ginkgo/internal/leafnodes.(*runner).runSync(0xc0013c3ec0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
  github.com/onsi/ginkgo.0-origin.0+incompatible/internal/leafnodes/runner.go:113 +0xa0
github.com/onsi/ginkgo/internal/leafnodes.(*runner).run(0xc0013c3ec0, 0xc0017bf200, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
  github.com/onsi/ginkgo.0-origin.0+incompatible/internal/leafnodes/runner.go:64 +0x170
github.com/onsi/ginkgo/internal/leafnodes.(*ItNode).Run(0xc0013331a0, 0x17e31598, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
  github.com/onsi/ginkgo.0-origin.0+incompatible/internal/leafnodes/it_node.go:26 +0x88
github.com/onsi/ginkgo/internal/spec.(*Spec).runSample(0xc00289d1d0, 0x0, 0x17e31598, 0xc000468040)
  github.com/onsi/ginkgo.0-origin.0+incompatible/internal/spec/spec.go:215 +0x21c
github.com/onsi/ginkgo/internal/spec.(*Spec).Run(0xc00289d1d0, 0x17e31598, 0xc000468040)
  github.com/onsi/ginkgo.0-origin.0+incompatible/internal/spec/spec.go:138 +0xf8
github.com/onsi/ginkgo/internal/specrunner.(*SpecRunner).runSpec(0xc0021d0140, 0xc00289d1d0, 0x0)
  github.com/onsi/ginkgo.0-origin.0+incompatible/internal/specrunner/spec_runner.go:200 +0xec
github.com/onsi/ginkgo/internal/specrunner.(*SpecRunner).runSpecs(0xc0021d0140, 0x1)
  github.com/onsi/ginkgo.0-origin.0+incompatible/internal/specrunner/spec_runner.go:170 +0x130
github.com/onsi/ginkgo/internal/specrunner.(*SpecRunner).Run(0xc0021d0140, 0xc002c24e00)
  github.com/onsi/ginkgo.0-origin.0+incompatible/internal/specrunner/spec_runner.go:66 +0x104
github.com/onsi/ginkgo/internal/suite.(*Suite).Run(0xc000279ef0, 0x17e31858, 0xc002180b40, 0x0, 0x0, 0xc0022fe990, 0x1, 0x1, 0x17f06c60, 0xc000468040, ...)
  github.com/onsi/ginkgo.0-origin.0+incompatible/internal/suite/suite.go:62 +0x364
github.com/openshift/origin/pkg/test/ginkgo.(*TestOptions).Run(0xc00119d980, 0xc000ee5490, 0x1, 0x1, 0x0, 0x14082ce0)
  github.com/openshift/origin/pkg/test/ginkgo/cmd_runtest.go:61 +0x37c
main.newRunTestCommand.func1.1()
  github.com/openshift/origin/cmd/openshift-tests/openshift-tests.go:426 +0x4c
github.com/openshift/origin/test/extended/util.WithCleanup(0xc0016bfbb0)
  github.com/openshift/origin/test/extended/util/test.go:167 +0x6c
main.newRunTestCommand.func1(0xc0014e8b00, 0xc000ee5490, 0x1, 0x1, 0x0, 0x0)
  github.com/openshift/origin/cmd/openshift-tests/openshift-tests.go:426 +0x2c0
github.com/spf13/cobra.(*Command).execute(0xc0014e8b00, 0xc000ee53d0, 0x1, 0x1, 0xc0014e8b00, 0xc000ee53d0)
  github.com/spf13/cobra.1/command.go:850 +0x3a4
github.com/spf13/cobra.(*Command).ExecuteC(0xc0014e8000, 0x0, 0x17e39138, 0x1ab74428)
  github.com/spf13/cobra.1/command.go:958 +0x29c
github.com/spf13/cobra.(*Command).Execute(...)
  github.com/spf13/cobra.1/command.go:895
main.main.func1(0xc0014e8000, 0x0, 0x0)
  github.com/openshift/origin/cmd/openshift-tests/openshift-tests.go:84 +0x8c
main.main()
  github.com/openshift/origin/cmd/openshift-tests/openshift-tests.go:85 +0x3a0
[AfterEach] [sig-arch] Managed cluster should
  github.com/openshift/origin/test/extended/util/client.go:140
[AfterEach] [sig-arch] Managed cluster should
  github.com/openshift/origin/test/extended/util/client.go:141
fail [github.com/onsi/ginkgo.0-origin.0+incompatible/internal/leafnodes/runner.go:113]: Jul 15 08:48:08.373: 2 pods found with invalid priority class (should be openshift-user-critical or begin with system-):
openshift-local-storage/busybox-lso (currently "")
openshift-marketplace/rc-48-operator-catalog-w6j7k (currently "")

failed: (700ms) 2021-07-15T12:48:08 "[sig-arch] Managed cluster should ensure platform components have system-* priority class associated [Suite:openshift/conformance/parallel]"

Timeline:
Jul 15 12:48:07.697 I e2e-test/"[sig-arch] Managed cluster should ensure platform components have system-* priority class associated [Suite:openshift/conformance/parallel]" started
Jul 15 12:48:07.697 - 689ms E e2e-test/"[sig-arch] Managed cluster should ensure platform components have system-* priority class associated [Suite:openshift/conformance/parallel]" e2e test finished As "Failed"
Jul 15 12:48:08.387 E e2e-test/"[sig-arch] Managed cluster should ensure platform components have system-* priority class associated [Suite:openshift/conformance/parallel]" finishedStatus/Failed

Flaky invariants: [sig-arch] Monitor cluster while tests execute
Failing tests: [sig-arch] Managed cluster should ensure platform components have system-* priority class associated [Suite:openshift/conformance/parallel]
Writing JUnit report to /tmp/conformance-parallel/junit_e2e_20210715-124808.xml
error: 1 fail, 0 pass, 0 skip (700ms)
Hi @Mustafa, per Alisha's comment above, it seems that we have hit this bug once again during 4.8 testing. Can we re-open this bug so that your team can investigate the cause of the recurrence?
Hi Den, please feel free to reopen and assign to aos-storage. Thanks.
Thank you. Re-opening (back to "NEW") based on the failed testing mentioned in Comment 6, and changing the assignee back to the default.
Can you point us to the code that is creating the busybox-lso pod? None of our deploy scripts create such a pod.
Making Hemant's comment un-private, as Alisha is a PE and cannot see private comments. Hi Alisha, please see Hemant's request in Comment 10.
The busybox-lso pod is created manually with the help of the below YAML, which creates a PVC that points to the LSO-backed storage and a pod that mounts the block device. The busybox-lso pod is in the Running state.

# cat pvc-pod.yaml
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: test-local-claim
  namespace: openshift-local-storage
spec:
  accessModes:
  - ReadWriteOnce
  volumeMode: Block
  resources:
    requests:
      storage: 100Gi
  storageClassName: localblock-sc
---
apiVersion: v1
kind: Pod
metadata:
  name: busybox-lso
  namespace: openshift-local-storage
spec:
  volumes:
  - name: local-data
    persistentVolumeClaim:
      claimName: test-local-claim
  containers:
  - name: busybox
    image: quay.io/powercloud/busybox:ubi
    command: ['sh', '-c', 'echo "Hello, Kubernetes!" && sleep 600']
    volumeDevices:
    - devicePath: "/dev/lso"
      name: local-data

[root@rc-47-bastion ~]# oc apply -f pvc-pod.yaml
persistentvolumeclaim/test-local-claim created
pod/busybox-lso created

[root@rc-47-bastion ~]# oc get pvc -n openshift-local-storage
NAME               STATUS   VOLUME              CAPACITY   ACCESS MODES   STORAGECLASS    AGE
test-local-claim   Bound    local-pv-facd92d8   200Gi      RWO            localblock-sc   29s

# oc get pods -n openshift-local-storage
NAME                                      READY   STATUS    RESTARTS   AGE
busybox-lso                               1/1     Running   0          7m6s
diskmaker-manager-2krr7                   1/1     Running   0          50m
diskmaker-manager-b67js                   1/1     Running   0          50m
local-storage-operator-864b957d58-b7h6c   1/1     Running   0          104m
The last 2 pods with the wrong class are:

> 2 pods found with invalid priority class (should be openshift-user-critical or begin with system-):
> openshift-local-storage/busybox-lso (currently "")
> openshift-marketplace/rc-48-operator-catalog-w6j7k (currently "")

And busybox-lso was created manually; it's not part of OCP:

> The busybox-lso pod is created manually with the help of the below YAML, which creates a PVC that points to the LSO-backed storage and a pod that mounts the block device.

So the test did not catch any LSO error in OCP. Please create the busybox-lso pod in a different namespace, so it's not caught by the test.

Moving this BZ back to VERIFIED (assuming you have another BZ for openshift-marketplace/rc-48-operator-catalog-w6j7k).
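One way to keep the verification pod out of the test's scope is to run the same PVC/pod pair in a non-platform namespace. This is a sketch based on the pvc-pod.yaml from Comment 9: the namespace name "lso-test" is illustrative, and it assumes the e2e test only scans platform (openshift-*) namespaces. Since PVs are cluster-scoped, a PVC in another namespace can still bind the localblock-sc volume.

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: lso-test   # illustrative name; the point is it is not an openshift-* namespace
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: test-local-claim
  namespace: lso-test
spec:
  accessModes:
  - ReadWriteOnce
  volumeMode: Block
  resources:
    requests:
      storage: 100Gi
  storageClassName: localblock-sc
---
apiVersion: v1
kind: Pod
metadata:
  name: busybox-lso
  namespace: lso-test
spec:
  volumes:
  - name: local-data
    persistentVolumeClaim:
      claimName: test-local-claim
  containers:
  - name: busybox
    image: quay.io/powercloud/busybox:ubi
    command: ['sh', '-c', 'echo "Hello, Kubernetes!" && sleep 600']
    volumeDevices:
    - devicePath: "/dev/lso"
      name: local-data
```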
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:2438