1915912 – sig-storage-csi-snapshotter image not available

Summary: sig-storage-csi-snapshotter image not available

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	Storage
Sub Component:
Version:	4.7
Hardware:	Unspecified
OS:	Unspecified
Priority:	medium
Severity:	high
Target Milestone:	---
Target Release:	4.7.0
Assignee:	Jan Safranek
QA Contact:	Qin Ping
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2021-01-13 17:45 UTC by jamo luhrsen
Modified:	2021-02-24 15:53 UTC
CC List:	4 users
Fixed In Version:
Doc Type:	No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed:	2021-02-24 15:53:00 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Github	openshift origin pull 25862	0	None	open	Bug 1915912: Fix CSI snapshotter image version	2021-02-05 11:05:58 UTC
Red Hat Product Errata	RHSA-2020:5633	0	None	None	None	2021-02-24 15:53:24 UTC

Description jamo luhrsen 2021-01-13 17:45:20 UTC

While debugging a new permafailing test case (Prometheus when installed on the cluster shouldn't report
any alerts in firing state apart from Watchdog and AlertmanagerReceiversNotConfigured) in a rather unhealthy
vsphere job [0], I noticed the following error in the build-log for this failing case:

  Jan 13 16:05:43.128 W ns/e2e-volume-expand-4934-7689 pod/csi-hostpath-snapshotter-0 node/ci-op-zjvhpwx5-56515-jwgvs-worker-zrnr6 reason/Failed Failed to pull image "quay.io/openshift/community-e2e-images:e2e-k8s-gcr-io-sig-storage-csi-snapshotter-v3-0-2-xM4zaRecqro9vBr7": rpc error: code = Unknown desc = Error reading manifest e2e-k8s-gcr-io-sig-storage-csi-snapshotter-v3-0-2-xM4zaRecqro9vBr7 in quay.io/openshift/community-e2e-images: manifest unknown: manifest unknown
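
To see which image references are affected across a whole build log, the failed pull events can be filtered out of it. This is only a minimal sketch: it assumes every relevant line has the same `Failed to pull image "..."` shape as the event above, and the function name `failed_pull_images` is hypothetical.

```shell
# Print each distinct image reference that appears in a
# 'Failed to pull image "<ref>"' event read from stdin.
# failed_pull_images is a hypothetical helper name.
failed_pull_images() {
  grep -o 'Failed to pull image *"[^"]*"' \
    | sed -e 's/^[^"]*"//' -e 's/"$//' \
    | sort -u
}

# Usage (build-log.txt is a placeholder for a downloaded build log):
#   failed_pull_images < build-log.txt
```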


I am not sure whether fixing this error will fix the failing test case, but it's a starting point.

example failing job [1]


[0] https://testgrid.k8s.io/redhat-openshift-ocp-release-4.7-informing#periodic-ci-openshift-release-master-ocp-4.7-e2e-vsphere&grid=old
[1] https://prow.ci.openshift.org/view/gcs/origin-ci-test/logs/periodic-ci-openshift-release-master-ocp-4.7-e2e-vsphere/1349375486697934848

Comment 1 jamo luhrsen 2021-01-13 18:50:51 UTC

I also see this same error in the frequently failing test case ([sig-imageregistry][Feature:ImageTriggers] Annotation trigger reconciles after the image is overwritten [Suite:openshift/conformance/parallel])
in the same job.

Also, I noticed that the (Prometheus when installed ...) test case is also failing in other non-vsphere jobs, like this one:
  https://testgrid.k8s.io/redhat-openshift-ocp-release-4.7-informing#release-openshift-origin-installer-e2e-gcp-4.7&grid=old

Comment 2 jamo luhrsen 2021-01-14 23:09:33 UTC

This may be a dup of https://bugzilla.redhat.com/show_bug.cgi?id=1891068. Marking it that way; I will re-open if not.

*** This bug has been marked as a duplicate of bug 1891068 ***

Comment 3 Vadim Rutkovsky 2021-02-04 23:02:10 UTC

Not a dupe: some e2e tests (the upgrade-conformance suite at least) start pods which end up in ImagePullBackOff:

```
$ oc -n e2e-provisioning-3072-7881 get pod/csi-hostpath-snapshotter-0 -o yaml
...
Back-off pulling image "quay.io/openshift/community-e2e-images:e2e-k8s-gcr-io-sig-storage-csi-snapshotter-v3-0-2-xM4zaRecqro9vBr7"
```
This image doesn't exist:

```
$ skopeo inspect docker://quay.io/openshift/community-e2e-images:e2e-k8s-gcr-io-sig-storage-csi-snapshotter-v3-0-2-xM4zaRecqro9vBr7
FATA[0002] Error parsing image name "docker://quay.io/openshift/community-e2e-images:e2e-k8s-gcr-io-sig-storage-csi-snapshotter-v3-0-2-xM4zaRecqro9vBr7": Error reading manifest e2e-k8s-gcr-io-sig-storage-csi-snapshotter-v3-0-2-xM4zaRecqro9vBr7 in quay.io/openshift/community-e2e-images: manifest unknown: manifest unknown
```
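
The same check can be scripted, since `skopeo inspect` exits non-zero when the manifest cannot be read. A minimal sketch, assuming `skopeo` is on the PATH and the registry allows anonymous reads; the helper name `image_exists` is hypothetical.

```shell
# Return 0 if the image manifest can be read, non-zero otherwise.
# image_exists is a hypothetical helper name; $1 is an image
# reference without the docker:// transport prefix.
image_exists() {
  skopeo inspect "docker://$1" >/dev/null 2>&1
}

# Usage (needs network access to the registry):
#   image=quay.io/openshift/community-e2e-images:e2e-k8s-gcr-io-sig-storage-csi-snapshotter-v3-0-2-xM4zaRecqro9vBr7
#   image_exists "$image" || echo "missing: $image"
```

Note that a non-zero exit can also mean a network or authentication failure, so a "missing" result is worth confirming by reading the actual error message.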

So either the image needs to be uploaded or the tests should use the correct image.

Comment 7 Qin Ping 2021-02-07 03:37:00 UTC

Hi Jan,

The test image issue is fixed with this PR, but the related e2e test cases still seem to be excluded from the OCP e2e tests because of the `[Disabled:Broken]` keyword.

echo '"[sig-storage] CSI Volumes [Driver: csi-hostpath] [Testpattern: Dynamic PV (default fs)] provisioning should provision storage with snapshot data source [Feature:VolumeSnapshotDataSource] [Disabled:Broken] [Suite:k8s]"'|openshift-tests run -f -

1 pass, 0 skip (2m30s)
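
Because the label is part of the test name itself, a list of test names can be checked for it with a plain text filter. A minimal sketch, assuming the names are supplied one per line on stdin; `disabled_tests` is a hypothetical helper and not part of `openshift-tests`.

```shell
# Print only the test names that carry a [Disabled:...] label.
# disabled_tests is a hypothetical helper name; grep -F matches the
# bracket literally instead of treating it as a regex character class.
disabled_tests() {
  grep -F '[Disabled:'
}

# Usage (test-names.txt is a placeholder file with one test name per line):
#   disabled_tests < test-names.txt
```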

I'll change the bug status to assigned first. If you want to fix this issue in another bug, feel free to mark this one as verified. Thanks!

Comment 8 Qin Ping 2021-02-07 03:37:35 UTC

Checked with version: 4.7.0-0.nightly-2021-02-06-084550

Comment 9 Jan Safranek 2021-02-08 08:35:25 UTC

Qin, I filed https://bugzilla.redhat.com/show_bug.cgi?id=1925493 to enable the tests.

Comment 12 errata-xmlrpc 2021-02-24 15:53:00 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5633