Bug 1852637 - Kubelet sets incorrect image names in node status images section
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Node
Version: 4.4
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 4.8.0
Assignee: Sascha Grunert
QA Contact: MinLi
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-07-01 01:07 UTC by Clayton Coleman
Modified: 2021-07-27 22:32 UTC (History)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-07-27 22:32:27 UTC
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2021:2438 0 None None None 2021-07-27 22:32:43 UTC

Description Clayton Coleman 2020-07-01 01:07:30 UTC
A 4.4 cluster is reporting:

```
  images:
  - names:
    - image-registry.openshift-image-registry.svc:5000/ci-op-641wm07t/pipeline@sha256:0319298f692327262118f79cab90dc32e56811744da8d24fb062a80e8c7dea1a
    - image-registry.openshift-image-registry.svc:<none>
    sizeBytes: 5909507506
  - names:
    - image-registry.openshift-image-registry.svc:5000/ci-op-d37tk3tj/pipeline@sha256:0b87d902528f8726b5faea8d5578610337788bb4ef0b520397f7e2a76319c232
    - image-registry.openshift-image-registry.svc:<none>
    sizeBytes: 5870381982
  - names:
    - image-registry.openshift-image-registry.svc:5000/ci-op-gzx95rim/pipeline@sha256:f0c298ecde6adafcb7a7a4225b2713ff2cac668ecfd371dea7007f9e27caf174
    - image-registry.openshift-image-registry.svc:<none>
    sizeBytes: 5845514815
```

in the node.status.images array. The <none> entries should not be returned: the list exists so that the scheduler can match incoming pods to nodes that may already have the image, and a <none> image is not targetable. The list is *not* for estimating the size of images on disk (we have metrics for that).

The "names" documentation gives a reasonable summary: it is the name by which the image is known, and you can't "know" multiple images by the same name.

DESCRIPTION:
     List of container images on this node

     Describe a container image

FIELDS:
   names	<[]string> -required-
     Names by which this image is known. e.g. ["k8s.gcr.io/hyperkube:v1.0.7",
     "dockerhub.io/google_containers/hyperkube:v1.0.7"]

Comment 3 Seth Jennings 2020-08-31 22:00:57 UTC
Triaging notes

https://github.com/kubernetes/kubernetes/blob/4db3a096ce8ac730b2280494422e1c4cf5fe875e/pkg/kubelet/nodestatus/setters.go#L446-L447

		for _, image := range containerImages {
			// make a copy to avoid modifying slice members of the image items in the list
			names := append([]string{}, image.RepoDigests...)
			names = append(names, image.RepoTags...)
...
			imagesOnNode = append(imagesOnNode, v1.ContainerImage{
				Names:     names,
				SizeBytes: image.Size,
			})
		}

Not sure if the extra name is coming in on RepoDigests or RepoTags.
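
One way to answer that question is to rebuild the name list while recording which field each entry came from. The following is only a minimal sketch, not kubelet code: the `image` struct, the `traceNames` helper, and the sample names are hypothetical stand-ins, and putting the `<none>` entry in RepoTags is an assumption made purely for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// image is a hypothetical stand-in for the runtime's image record; only the
// two fields used by the loop quoted above are modeled.
type image struct {
	RepoDigests []string
	RepoTags    []string
}

// namedSource pairs a reported name with the field it was read from.
type namedSource struct {
	Name   string
	Source string
}

// traceNames follows the same append order as the quoted loop (digests
// first, then tags) and remembers the origin of every name.
func traceNames(img image) []namedSource {
	out := make([]namedSource, 0, len(img.RepoDigests)+len(img.RepoTags))
	for _, d := range img.RepoDigests {
		out = append(out, namedSource{Name: d, Source: "RepoDigests"})
	}
	for _, t := range img.RepoTags {
		out = append(out, namedSource{Name: t, Source: "RepoTags"})
	}
	return out
}

func main() {
	// Hypothetical sample data; the placement of the <none> entry is assumed.
	img := image{
		RepoDigests: []string{"registry.example/app@sha256:abc123"},
		RepoTags:    []string{"registry.example/app:<none>"},
	}
	for _, n := range traceNames(img) {
		marker := ""
		if strings.Contains(n.Name, "<none>") {
			marker = "  <-- placeholder"
		}
		fmt.Printf("%-12s %s%s\n", n.Source, n.Name, marker)
	}
}
```

Running the same kind of trace against real runtime data would show directly which of the two fields carries the extra name.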

Comment 4 Urvashi Mohnani 2020-09-15 02:43:49 UTC
The <none> tags and digests were introduced by https://github.com/cri-o/cri-o/pull/2455/files and later modified by https://github.com/cri-o/cri-o/pull/3002. Basically, if an image pulled to the node has no tag, or loses its tag because a newer version of the image now carries the same tag, cri-o still displays the name of the image but sets the tag to <none>. cri-o also returns the repoDigest of the image, which is why we see two entries in the snippet above: one is the repoDigest and the other is the repoTag entry, which reads <none> because the image either never had a tag or was untagged.

Examples from a cluster:

```
- names:                                                                                                                                                    
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:93eb2ef0fe444efa2b58b28d90b50515f70cb95fcea901e8bbd62a7a22003f0b                                  
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<none>                                                                                            
    sizeBytes: 489293028
```

And when we inspect the image, we see that the repoTags is an empty list, hence the <none>:

```
sh-4.4# crictl inspecti ef6181be8837f
{
  "status": {
    "id": "ef6181be8837f5a49151efcbc612af59b440f87bd3a9795a5455e9a1dd10b985",
    "repoTags": [],
    "repoDigests": [
      "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:93eb2ef0fe444efa2b58b28d90b50515f70cb95fcea901e8bbd62a7a22003f0b"
    ],
    "size": "489293028",
```

Now an image that has both repoTags and repoDigests:
```
- names:                                                                                                                                                    
    - registry.redhat.io/redhat/certified-operator-index@sha256:2c6a74b70cfa6eb19fc51a9dc0e61d10408d7c6ac8ad428862c494e4b65a070c                              
    - registry.redhat.io/redhat/certified-operator-index@sha256:3655bc5373be539609de9fb8822e64ecb3bdea7f4d26839235ade918bec56d54                              
    - registry.redhat.io/redhat/certified-operator-index:v4.6                                                                                                 
    sizeBytes: 485553854
```

And when we inspect the image, we see the repoTags and repoDigests for the image:

```
sh-4.4# crictl inspecti a872dc0cbc364
{
  "status": {
    "id": "a872dc0cbc3647ee8769cb2612fad1e1a2101acd7e783f342f05e402cb3b3d4d",
    "repoTags": [
      "registry.redhat.io/redhat/certified-operator-index:v4.6"
    ],
    "repoDigests": [
      "registry.redhat.io/redhat/certified-operator-index@sha256:2c6a74b70cfa6eb19fc51a9dc0e61d10408d7c6ac8ad428862c494e4b65a070c",
      "registry.redhat.io/redhat/certified-operator-index@sha256:3655bc5373be539609de9fb8822e64ecb3bdea7f4d26839235ade918bec56d54"
    ],
    "size": "485553854",
```

There is no loss of information happening here. The node.status.images array is up to date with the images on the node, so the scheduler should be matching incoming pods to the nodes correctly. This is more of a semantic issue as the <none> can be misleading. We could potentially just not return any repoTags if there are no tags available.
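
As a rough illustration of that last idea, the sketch below drops placeholder entries from a name list before it would be reported. It is only a sketch under stated assumptions: `dropPlaceholders` is a hypothetical helper, matching on the literal substring `<none>` is an assumed heuristic, and none of this is taken from CRI-O or the kubelet.

```go
package main

import (
	"fmt"
	"strings"
)

// placeholder is the literal marker seen in the node status output above.
const placeholder = "<none>"

// dropPlaceholders returns a copy of names without the entries that contain
// the placeholder marker. The helper and the substring check are assumptions
// made for illustration only.
func dropPlaceholders(names []string) []string {
	kept := make([]string, 0, len(names))
	for _, n := range names {
		if strings.Contains(n, placeholder) {
			continue // not a name a pod spec could reference
		}
		kept = append(kept, n)
	}
	return kept
}

func main() {
	// Names copied from the first cluster example in this comment.
	names := []string{
		"quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:93eb2ef0fe444efa2b58b28d90b50515f70cb95fcea901e8bbd62a7a22003f0b",
		"quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<none>",
	}
	for _, n := range dropPlaceholders(names) {
		fmt.Println(n)
	}
}
```

With this kind of filtering the digest entry is still reported, so an image without a tag remains discoverable by digest.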

This is not a blocker, so deferring to 4.7 for now.

Comment 8 Sascha Grunert 2021-03-09 12:06:46 UTC
(In reply to Urvashi Mohnani from comment #4)
> There is no loss of information happening here. The node.status.images array
> is up to date with the images on the node, so the scheduler should be
> matching incoming pods to the nodes correctly. This is more of a semantic
> issue as the <none> can be misleading.

I agree. We do not omit any information within CRI-O because other tools like
crictl may rely on the output. If I build a container image locally on a node
two times in a row:

$ cat Containerfile
FROM scratch

$ sudo podman build -t test .
STEP 1: FROM scratch
STEP 2: COMMIT test
--> 910a4e3636c
910a4e3636c43cf62cdd4296ce495c4a02089b0781a8f34753a27f0dbfa18f2b

$ sudo podman build --no-cache -t test .
STEP 1: FROM scratch
STEP 2: COMMIT test
--> f7ce05b93a7
f7ce05b93a7aed95cc870718aaf57481ecdf051ffa3a00a5010544148373a217

Then crictl images uses the ListImages response from CRI-O
to display the local images like this:

$ sudo crictl images
IMAGE                              TAG                 IMAGE ID            SIZE
localhost/test                     <none>              910a4e3636c43       2.11kB
localhost/test                     latest              f7ce05b93a7ae       2.1kB

And the kubelet gets that information as well:

$ kubectl get node 127.0.0.1 -o json | jq .status.images
…
  {
    "names": [
      "<none>@<none>",
      "localhost/test:<none>"
    ],
    "sizeBytes": 2106
  },
  {
    "names": [
      "localhost/test@sha256:753f1becc3d8d2482e0f9aa9c6d2edf3efc0eef279661e5abfeb206c7403ff99",
      "localhost/test:latest"
    ],
    "sizeBytes": 2105
  }
…
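
To spot such entries without reading the JSON by eye, the array can be decoded and scanned. This is a minimal sketch that assumes the input is shaped like the `.status.images` array above; `containerImage` and `placeholderNames` are hypothetical names introduced here, and only the `names` and `sizeBytes` fields are modeled.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// containerImage models one element of the .status.images array; only the
// two fields visible in the output above are included.
type containerImage struct {
	Names     []string `json:"names"`
	SizeBytes int64    `json:"sizeBytes"`
}

// placeholderNames decodes a JSON array of images and returns every name
// that contains the "<none>" marker, in input order.
func placeholderNames(raw []byte) ([]string, error) {
	var images []containerImage
	if err := json.Unmarshal(raw, &images); err != nil {
		return nil, err
	}
	var found []string
	for _, img := range images {
		for _, n := range img.Names {
			if strings.Contains(n, "<none>") {
				found = append(found, n)
			}
		}
	}
	return found, nil
}

func main() {
	// Sample input abbreviated from the jq output above.
	raw := []byte(`[
	  {"names": ["<none>@<none>", "localhost/test:<none>"], "sizeBytes": 2106},
	  {"names": ["localhost/test:latest"], "sizeBytes": 2105}
	]`)
	found, err := placeholderNames(raw)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	for _, n := range found {
		fmt.Println(n)
	}
}
```

In practice the input would be the output of the jq command above rather than an inline literal.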

> We could potentially just not return any repoTags if there are no tags available.

If we stripped that information out, users might not be able to clean up their layers
using tools like crictl.

---

It may be possible to fix this within the scheduler by having it ignore such image names.

Clayton, what do you think?

Comment 9 Sascha Grunert 2021-03-16 10:58:49 UTC
Found a possible way to improve this, which should fix the issue described in this bug:
https://github.com/cri-o/cri-o/pull/4662

Comment 10 Sascha Grunert 2021-03-18 10:07:12 UTC
Another approach, for simplification: https://github.com/cri-o/cri-o/pull/4673

Comment 11 Sascha Grunert 2021-03-23 08:20:11 UTC
The PR https://github.com/cri-o/cri-o/pull/4673 got merged and this issue should be fixed in CRI-O v1.21.0 (no backport planned yet).

Comment 13 MinLi 2021-03-31 09:53:07 UTC
Verified on version: 4.8.0-0.nightly-2021-03-30-181828

  - names:
    - registry.redhat.io/redhat/certified-operator-index@sha256:d3c3d7de7e003d55e01312ecee362efa36f6e9d8ae88c3c20f776756618a04ff
    - registry.redhat.io/redhat/certified-operator-index@sha256:de4d42e1170477ec42d371a3eacb1887c4c8cf909ce56b579956c407fcc6d7c5
    - registry.redhat.io/redhat/certified-operator-index:v4.7
    sizeBytes: 646668616
  - names:
    - registry.redhat.io/redhat/redhat-operator-index@sha256:0fccbae42e76a32c00fbacf61b06639727c723e0ee97576664458396c6c3a820
    - registry.redhat.io/redhat/redhat-operator-index@sha256:f1785169e7c268bb9e7b5fb4abc23b01a1f25538b6dbee6bdd8b6b829d79b0eb
    - registry.redhat.io/redhat/redhat-operator-index:v4.7
    sizeBytes: 646500680
  - names:
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:f78361b4c62dcd2f81b88b435dcbd3327791c3cbac002fdc3eac0730e124862d
    sizeBytes: 259234568
  - names:
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:ca82d82d73933e1e2e7451a93e00f02886f0cc9503d70f4f83415f08bb194dfd
    sizeBytes: 541226611

sh-4.4# crictl inspecti 21d6474497e1f
{
  "status": {
    "id": "21d6474497e1f60ae7784a1ec36fc454b5525b6a3c9c2b6d888dd202c899984a",
    "repoTags": [
      "registry.redhat.io/redhat/certified-operator-index:v4.7"
    ],
    "repoDigests": [
      "registry.redhat.io/redhat/certified-operator-index@sha256:d3c3d7de7e003d55e01312ecee362efa36f6e9d8ae88c3c20f776756618a04ff",
      "registry.redhat.io/redhat/certified-operator-index@sha256:de4d42e1170477ec42d371a3eacb1887c4c8cf909ce56b579956c407fcc6d7c5"
    ],


sh-4.4# crictl inspecti 41ed539c1d66a
{
  "status": {
    "id": "41ed539c1d66a97667a2651cdf1b38e4ad5cbba774ba0400909f08f7df62ba19",
    "repoTags": [],
    "repoDigests": [
      "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:f78361b4c62dcd2f81b88b435dcbd3327791c3cbac002fdc3eac0730e124862d"
    ],
    "size": "259234568",

Comment 16 errata-xmlrpc 2021-07-27 22:32:27 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438

