Bug 2093339

Summary: [rebase v1.24] Only known images used by tests
Product: OpenShift Container Platform Reporter: Abu Kashem <akashem>
Component: Storage Assignee: Fabio Bertinatto <fbertina>
Storage sub component: Storage QA Contact: Wei Duan <wduan>
Status: CLOSED ERRATA Docs Contact:
Severity: urgent    
Priority: medium CC: jsafrane, kenzhang, tsedovic
Version: 4.11 Keywords: Rebase
Target Milestone: ---   
Target Release: 4.13.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2023-05-17 22:46:53 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Abu Kashem 2022-06-03 14:03:49 UTC
The following test is failing for https://github.com/openshift/origin/pull/27181


: [sig-arch] Only known images used by tests
{  Cluster accessed images that were not mirrored to the testing repository or already part of the cluster, see test/extended/util/image/README.md in the openshift/origin repo:

k8s.gcr.io/sig-storage/hello-populator:v1.0.1 from pods:
  ns/e2e-provisioning-3131-pop-5996 pod/populate-9c2fdb01-5efe-4cd3-ab75-e063bda26379 node/ip-10-0-171-18.us-west-1.compute.internal
}
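
For readers unfamiliar with this check, the failure above can be pictured as a set difference: every image pulled during the run must appear in a list of mirrored or cluster-provided images. The sketch below is only an illustration under that assumption. The function name, the allowed-image map, and the placeholder quay.io entry are hypothetical and are not taken from openshift/origin.

```go
package main

import (
	"fmt"
	"sort"
)

// unknownImages returns the images that pods pulled but that are not in the
// allowed set (mirrored or already part of the cluster). Hypothetical helper,
// not the actual implementation in openshift/origin.
func unknownImages(pulled []string, allowed map[string]bool) []string {
	seen := map[string]bool{}
	var unknown []string
	for _, image := range pulled {
		if allowed[image] || seen[image] {
			continue // known image, or already reported once
		}
		seen[image] = true
		unknown = append(unknown, image)
	}
	sort.Strings(unknown)
	return unknown
}

func main() {
	allowed := map[string]bool{
		"quay.io/example/mirrored-image:1.0": true, // placeholder entry
	}
	pulled := []string{
		"quay.io/example/mirrored-image:1.0",
		"k8s.gcr.io/sig-storage/hello-populator:v1.0.1",
	}
	for _, image := range unknownImages(pulled, allowed) {
		fmt.Println("unknown image:", image)
	}
}
```

With these inputs only the hello-populator image is reported, which mirrors the shape of the failure message above.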


hello-populator-deployment.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-populator
  namespace: hello
spec:
  selector:
    matchLabels:
      app: hello
  template:
    metadata:
      labels:
        app: hello
    spec:
      serviceAccount: hello-account
      containers:
        - name: hello
          image: k8s.gcr.io/sig-storage/hello-populator:v1.0.1
          imagePullPolicy: IfNotPresent
          args:
            - --mode=controller
            - --image-name=k8s.gcr.io/sig-storage/hello-populator:v1.0.1
            - --http-endpoint=:8080
          ports:
            - containerPort: 8080
              name: http-endpoint
              protocol: TCP


The image name 'hello-populator:v1.0.1' is also passed as an argument to hello-populator:
> - --image-name=k8s.gcr.io/sig-storage/hello-populator:v1.0.1

If I am looking at the correct source code, hello-populator creates a Pod
- https://github.com/NetApp/hello-populator/blob/master/main.go#L59
- https://github.com/NetApp/hello-populator/blob/master/pkg/populator-machinery/controller.go#L519

Comment 1 Abu Kashem 2022-06-03 14:29:45 UTC
I need to mark this BZ as blocker+ so that the test is skipped until a certain threshold, which allows the origin PR https://github.com/openshift/origin/pull/27181 to merge.
Feel free to set its blocker flag (+/-) as you see fit when you triage the BZ.

Comment 2 Fabio Bertinatto 2022-06-03 18:44:40 UTC
The hello-populator binary has two modes of operation: controller and populator. The Deployment above rolls out a pod/container running the binary in controller mode, which in turn creates another pod/container running the binary in populator mode.
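
To make the two-mode relationship concrete, here is a minimal sketch of what controller mode does with the image name it receives. It assumes the controller simply reuses the value of --image-name as the image of the pod it creates. The podSpec type, the populatorPod function, and the "--mode=populate" argument are hypothetical simplifications, not code from hello-populator.

```go
package main

import "fmt"

// podSpec is a simplified stand-in for a Kubernetes Pod spec.
type podSpec struct {
	Name  string
	Image string
	Args  []string
}

// populatorPod sketches the controller's side of the handoff: the second pod
// runs the image given via --image-name, with the same binary in populator
// mode. The "--mode=populate" value is an assumption.
func populatorPod(name, imageName string) podSpec {
	return podSpec{
		Name:  name,
		Image: imageName,
		Args:  []string{"--mode=populate"},
	}
}

func main() {
	pod := populatorPod("populate-example", "k8s.gcr.io/sig-storage/hello-populator:v1.0.1")
	fmt.Printf("%s uses image %s with args %v\n", pod.Name, pod.Image, pod.Args)
}
```

Under this assumption, changing only the Deployment's image field would not change the image of the second pod, because that image comes from the argument.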

According to this job run, what appears to be failing is the second pod (called *populate*, started by the controller):

> [sig-arch] unknown image: k8s.gcr.io/sig-storage/hello-populator:v1.0.1 (container/populate reason/Pulled duration/2.533s image/k8s.gcr.io/sig-storage/hello-populator:v1.0.1)

---

That being said, I looked at how this error is triggered. To skip this verification, we need to add an exception here:

https://github.com/openshift/origin/blob/master/cmd/openshift-tests/images.go#L190

I'm currently testing a fix here:

https://github.com/openshift/origin/pull/27215

Comment 3 Jan Safranek 2022-06-06 08:27:57 UTC
IMO we should not just skip the check; the test then won't work in disconnected mode.

We should update the tests so that the populator deployment command line [1] uses the mirrored image, probably somewhere in [2]. There is a function, ReplaceRegistryInImageURL, that can translate images from registry.k8s.io/sig-storage to the mirrored location, but I'm not sure how it can deal with the "--image-name=" prefix.

1: https://github.com/kubernetes/kubernetes/blob/2d7dcf928c3e0e8dd4c29c421893a299e1a1b857/test/e2e/testing-manifests/storage-csi/any-volume-datasource/hello-populator-deploy.yaml#L63
2: https://github.com/kubernetes/kubernetes/blob/2d7dcf928c3e0e8dd4c29c421893a299e1a1b857/test/e2e/storage/testsuites/provisioning.go#L297
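
One possible way to handle the "--image-name=" prefix, sketched here under stated assumptions, is to strip the prefix, translate only the image part, and then put the prefix back. In the sketch, replaceRegistry is a hypothetical stand-in for ReplaceRegistryInImageURL (its real signature and behavior are not reproduced here), rewriteImageArgs is an illustrative helper, and mirror.example.com/e2e is a made-up mirror location.

```go
package main

import (
	"fmt"
	"strings"
)

const imageNameFlag = "--image-name="

// replaceRegistry is a hypothetical stand-in for ReplaceRegistryInImageURL.
// It swaps one upstream registry prefix for a mirror prefix and leaves any
// other image untouched.
func replaceRegistry(image, upstream, mirror string) string {
	if strings.HasPrefix(image, upstream+"/") {
		return mirror + strings.TrimPrefix(image, upstream)
	}
	return image
}

// rewriteImageArgs returns a copy of args in which the value of every
// --image-name= argument has been passed through replaceRegistry.
func rewriteImageArgs(args []string, upstream, mirror string) []string {
	out := make([]string, 0, len(args))
	for _, arg := range args {
		if strings.HasPrefix(arg, imageNameFlag) {
			value := strings.TrimPrefix(arg, imageNameFlag)
			arg = imageNameFlag + replaceRegistry(value, upstream, mirror)
		}
		out = append(out, arg)
	}
	return out
}

func main() {
	args := []string{
		"--mode=controller",
		"--image-name=k8s.gcr.io/sig-storage/hello-populator:v1.0.1",
		"--http-endpoint=:8080",
	}
	// mirror.example.com/e2e is a made-up mirror location.
	for _, arg := range rewriteImageArgs(args, "k8s.gcr.io/sig-storage", "mirror.example.com/e2e") {
		fmt.Println(arg)
	}
}
```

Arguments without the prefix pass through unchanged, so the same helper could be applied to the whole args list of the container.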

Comment 4 Fabio Bertinatto 2022-06-07 18:56:52 UTC
Ack. I updated the WIP PR [1] to replace the populator image instead. Testing it ATM.

[1] https://github.com/openshift/origin/pull/27215

Comment 5 Fabio Bertinatto 2022-06-13 14:33:43 UTC
Upstream PR to fix this issue has been merged: https://github.com/kubernetes/kubernetes/pull/110465

Another PR to cherry-pick this fix into 1.24 has been created: https://github.com/kubernetes/kubernetes/pull/110541. Once merged, we'll get the fix into openshift/origin with the next rebase (i.e., 1.24.2).

Comment 6 Jan Safranek 2022-06-14 14:33:01 UTC
Clearing blocker; we don't announce volume populators as an OCP feature. We will get the fix once the backport above is merged and released in 1.24.z.

Comment 9 Wei Duan 2022-06-22 12:16:38 UTC
Checked with Fabio; we need to wait for the rebase. Changing status to POST.

Comment 10 Fabio Bertinatto 2022-07-07 19:12:28 UTC
Cherry-pick to Kubernetes 1.24 has merged: https://github.com/kubernetes/kubernetes/pull/110541

Comment 14 Wei Duan 2022-12-02 12:14:17 UTC
Verified. Cases are enabled although most of them are skipped in 4.13.

Comment 17 errata-xmlrpc 2023-05-17 22:46:53 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: OpenShift Container Platform 4.13.0 security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:1326