Bug 1683491 - There are duplicate availableUpdates images in CV config
Summary: There are duplicate availableUpdates images in CV config
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Cluster Version Operator
Version: 4.1.0
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: low
Target Milestone: ---
Target Release: 4.1.0
Assignee: Stefan Junker
QA Contact: liujia
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-02-27 01:38 UTC by liujia
Modified: 2019-07-25 05:32 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: Cincinnati allowed duplicate edges to be added via Quay labels.
Consequence: Duplicate edges resulted in duplicate available updates in the cluster.
Fix: Skip pre-existing edges when processing the edge labels from Quay in Cincinnati.
Result: No more duplicate edges are available in the cluster.
Clone Of:
Environment:
Last Closed: 2019-07-25 05:32:32 UTC
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2019:1809 0 None None None 2019-07-25 05:32:37 UTC

Description liujia 2019-02-27 01:38:24 UTC
Description of problem:
After triggering the CVO to sync available updates from upstream, the sync result contains duplicate images in the availableUpdates list.

[root@preserve-jliu-worker 20190226]# oc get clusterversion -o json|jq ".items[0].status.availableUpdates"
[
  {
    "image": "registry.svc.ci.openshift.org/ocp/release:4.0.0-0.nightly-2019-02-25-234632",
    "version": "4.0.0-0.nightly-2019-02-25-234632"
  },
  {
    "image": "registry.svc.ci.openshift.org/ocp/release:4.0.0-0.nightly-2019-02-26-054336",
    "version": "4.0.0-0.nightly-2019-02-26-054336"
  },
  {
    "image": "registry.svc.ci.openshift.org/ocp/release:4.0.0-0.nightly-2019-02-25-234632",
    "version": "4.0.0-0.nightly-2019-02-25-234632"
  },
  {
    "image": "registry.svc.ci.openshift.org/ocp/release:4.0.0-0.nightly-2019-02-26-054336",
    "version": "4.0.0-0.nightly-2019-02-26-054336"
  }
]



Version-Release number of the following components:
sh-4.2# cluster-version-operator version
ClusterVersionOperator v4.0.0-0.185.0.0-dirty

How reproducible:
always

Steps to Reproduce:
1. Install cluster with nightly build 4.0.0-0.nightly-2019-02-25-194625
2. Edit clusterversion config to specify upstream to https://openshift-release.svc.ci.openshift.org/graph
# oc get clusterversion -o json|jq ".items[0].spec"
{
  "channel": "stable-4.0",
  "clusterID": "29263f55-7e56-41bd-a77c-dd2173814e29",
  "upstream": "https://openshift-release.svc.ci.openshift.org/graph"
}

3. Check available update


Actual results:
There are duplicate availableUpdates images in CV config.

Expected results:
There should not be duplicated images.
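
The check described above can be sketched in Python as follows. This is only an illustration: the helper name find_duplicate_updates is hypothetical, and the sample data is copied from the availableUpdates output in this report.

```python
import json
from collections import Counter


def find_duplicate_updates(available_updates):
    """Return the (version, image) pairs that occur more than once."""
    counts = Counter((u["version"], u["image"]) for u in available_updates)
    return [key for key, n in counts.items() if n > 1]


# Sample shaped like `.items[0].status.availableUpdates` from this report.
sample = json.loads("""
[
  {"image": "registry.svc.ci.openshift.org/ocp/release:4.0.0-0.nightly-2019-02-25-234632",
   "version": "4.0.0-0.nightly-2019-02-25-234632"},
  {"image": "registry.svc.ci.openshift.org/ocp/release:4.0.0-0.nightly-2019-02-26-054336",
   "version": "4.0.0-0.nightly-2019-02-26-054336"},
  {"image": "registry.svc.ci.openshift.org/ocp/release:4.0.0-0.nightly-2019-02-25-234632",
   "version": "4.0.0-0.nightly-2019-02-25-234632"},
  {"image": "registry.svc.ci.openshift.org/ocp/release:4.0.0-0.nightly-2019-02-26-054336",
   "version": "4.0.0-0.nightly-2019-02-26-054336"}
]
""")

duplicates = find_duplicate_updates(sample)
for version, image in duplicates:
    print(f"duplicate update: {version} ({image})")
```

An empty result would match the expected behaviour.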

Additional info:
Please attach logs from ansible-playbook with the -vvv flag

Comment 1 W. Trevor King 2019-03-05 20:25:49 UTC
Stale?  I see no dups (or any nightlies at all?) now, even when I supply a nightly current version:

$ curl -sH 'Accept: application/json' 'https://api.openshift.com/api/upgrades_info/v1/graph?channel=does-not-exist&version=4.0.0-0.nightly-2019-02-20-194410' | jq -r '.nodes[].version'
4.0.0-5
4.0.0-4
4.0.0-6
4.0.0-7
4.0.0-8
4.0.0-9
4.0.0-0.1
4.0.0-0.okd-0
4.0.0-0.2
4.0.0-0.3
4.0.0-0.4
4.0.0-0.5
4.0.0-0.6
4.0.0-0.7

Comment 2 liujia 2019-03-06 03:02:24 UTC
(In reply to W. Trevor King from comment #1)
> Stale?  I see no dups (or any nightlies at all?) now, even when I supply a
> nightly current version:
> 
> $ curl -sH 'Accept: application/json'
> 'https://api.openshift.com/api/upgrades_info/v1/graph?channel=does-not-
> exist&version=4.0.0-0.nightly-2019-02-20-194410' | jq -r '.nodes[].version'
> 4.0.0-5
> 4.0.0-4
> 4.0.0-6
> 4.0.0-7
> 4.0.0-8
> 4.0.0-9
> 4.0.0-0.1
> 4.0.0-0.okd-0
> 4.0.0-0.2
> 4.0.0-0.3
> 4.0.0-0.4
> 4.0.0-0.5
> 4.0.0-0.6
> 4.0.0-0.7

Maybe you should use the https://openshift-release.svc.ci.openshift.org/graph address for nightly builds, and the https://api.openshift.com/api/upgrades_info/v1/graph address for release builds.

Also, the responses differ between my try against 4.0.0-0.nightly-2019-02-27-154505 on Monday and my try against 4.0.0-0.nightly-2019-03-04-180718 yesterday.

# curl --silent --header 'Accept:application/json' https://openshift-release.svc.ci.openshift.org/graph | jq '. as $graph | $graph.nodes | map(.version == "4.0.0-0.nightly-2019-02-27-154505") | index(true) as $orig | $graph.edges | map(select(.[0] == $orig)[1]) | map($graph.nodes[.])'
[
  {
    "version": "4.0.0-0.nightly-2019-02-27-172930",
    "payload": "registry.svc.ci.openshift.org/ocp/release:4.0.0-0.nightly-2019-02-27-172930"
  },
  {
    "version": "4.0.0-0.nightly-2019-02-27-190649",
    "payload": "registry.svc.ci.openshift.org/ocp/release:4.0.0-0.nightly-2019-02-27-190649"
  },
  {
    "version": "4.0.0-0.nightly-2019-02-27-172930",
    "payload": "registry.svc.ci.openshift.org/ocp/release:4.0.0-0.nightly-2019-02-27-172930"
  },
  {
    "version": "4.0.0-0.nightly-2019-02-27-190649",
    "payload": "registry.svc.ci.openshift.org/ocp/release:4.0.0-0.nightly-2019-02-27-190649"
  }
]

# curl --silent --header 'Accept:application/json' https://openshift-release.svc.ci.openshift.org/graph | jq '. as $graph | $graph.nodes | map(.version == "4.0.0-0.nightly-2019-03-04-180718") | index(true) as $orig | $graph.edges | map(select(.[0] == $orig)[1]) | map($graph.nodes[.])'
[
  {
    "version": "4.0.0-0.nightly-2019-03-04-234414",
    "payload": "registry.svc.ci.openshift.org/ocp/release:4.0.0-0.nightly-2019-03-04-234414"
  },
  {
    "version": "4.0.0-0.nightly-2019-03-05-034154",
    "payload": "registry.svc.ci.openshift.org/ocp/release:4.0.0-0.nightly-2019-03-05-034154"
  }
]

I wonder if the policy on server end changed this.
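
For readers less familiar with jq, the filter used above can be sketched in Python. The function name successors is hypothetical, and the edge list in the sample graph is an assumption: it repeats each edge once, which is one way the first response above could arise, but the real edge data from the endpoint is not shown in this thread.

```python
def successors(graph, version):
    """Mirror the jq filter: list the target node of every edge leaving `version`."""
    versions = [node["version"] for node in graph["nodes"]]
    orig = versions.index(version)  # raises ValueError if the version is unknown
    # Edges are [source_index, target_index] pairs into graph["nodes"].
    return [graph["nodes"][dst] for src, dst in graph["edges"] if src == orig]


def node(version):
    return {
        "version": version,
        "payload": "registry.svc.ci.openshift.org/ocp/release:" + version,
    }


# Hypothetical graph: three nodes, each outgoing edge listed twice.
graph = {
    "nodes": [
        node("4.0.0-0.nightly-2019-02-27-154505"),
        node("4.0.0-0.nightly-2019-02-27-172930"),
        node("4.0.0-0.nightly-2019-02-27-190649"),
    ],
    "edges": [[0, 1], [0, 2], [0, 1], [0, 2]],
}

updates = successors(graph, "4.0.0-0.nightly-2019-02-27-154505")
for update in updates:
    print(update["version"])
```

Because the filter maps over edges rather than over distinct targets, every repeated edge produces a repeated entry in the result.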

Comment 3 Alex Crawford 2019-03-06 17:40:23 UTC
https://openshift-release.svc.ci.openshift.org/graph is not a Cincinnati stack. It's just a hack to easily get the nightly images into a graph. Are the duplicate entries problematic?

Comment 4 liujia 2019-03-07 03:43:59 UTC
> Are the duplicate entries problematic?
No big problem. There is just no need to list duplicate available updates, since there are no two identical nodes in the nightly build list, and it is very confusing. Removing the duplicates would make it better.

Comment 5 Stefan Junker 2019-03-08 16:48:49 UTC
My understanding is that this graph is handed out by the release controller [1] and I don't know what the general direction is with this endpoint. We could either run a Cincinnati instance for this endpoint or incorporate the nightly releases into the instance running at https://api.{,staging.}openshift.com:443/api/upgrades_info/v1/graph. That would effectively result in having two disjoint graphs in the result, which should just work.

[1]: https://github.com/openshift/release-controller/blob/ecc8b22e46952c1585502c28b992559ade39233e/cmd/release-controller/graph.go#L23-L78

Comment 7 liujia 2019-03-15 10:36:16 UTC
Hi Stefan

I tried again but still get duplicate output from the

Comment 8 Stefan Junker 2019-03-18 17:14:43 UTC
Hey Liujia, 

no action has been taken to prevent the duplicate from showing up. AFAIU we don't consider this a problem on the release controller's graph endpoint since it's meant for nightlies only. These versions aren't supported in any way.
In Cincinnati this will not happen since there are safety checks [1] which guard against duplicate versions when adding releases to the graph.

I think we can close this ticket since it's not an issue in practice. Any objections to that?

[1]: https://github.com/openshift/cincinnati/blob/f595336f5570fa07609036b4b97342260353a42a/cincinnati/src/lib.rs#L105-L109
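
The kind of guard meant here can be sketched roughly as below. This is not a transcription of the linked Rust code; the class ReleaseGraph and the method add_release are hypothetical names, and the sketch only assumes that a second release with an already known version is rejected.

```python
class ReleaseGraph:
    """Toy graph that refuses to hold two releases with the same version."""

    def __init__(self):
        self.nodes = []                # list of {"version", "payload"} dicts
        self._index_by_version = {}    # version -> index into self.nodes

    def add_release(self, version, payload):
        """Add a release and return its node index; reject duplicate versions."""
        if version in self._index_by_version:
            raise ValueError(f"release {version} is already part of the graph")
        self.nodes.append({"version": version, "payload": payload})
        self._index_by_version[version] = len(self.nodes) - 1
        return self._index_by_version[version]


example = ReleaseGraph()
example.add_release("4.0.0-5", "example.invalid/release:4.0.0-5")  # placeholder payload
try:
    example.add_release("4.0.0-5", "example.invalid/release:4.0.0-5")
except ValueError as err:
    print("rejected:", err)
```

Note that a guard like this covers duplicate nodes only; it says nothing about duplicate edges between existing nodes.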

Comment 9 liujia 2019-03-19 07:12:20 UTC
That's OK with me. Since https://openshift-release.svc.ci.openshift.org/graph is not considered a supported graph endpoint, I'll close the bug.

Comment 10 liujia 2019-06-03 03:37:26 UTC
Tested with rc.7 from Quay. Still hit the issue even with a supported graph endpoint (the default one), so I'm re-opening this bug to keep tracking the issue.

# ./oc get clusterversion
NAME      VERSION      AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.1.0-rc.7   True        False         43s     Cluster version is 4.1.0-rc.7

# ./oc get clusterversion -o json|jq ".items[0].spec"
{
  "channel": "prerelease-4.1",
  "clusterID": "bff6de88-7500-4925-b85d-013fcf4649e5",
  "upstream": "https://api.openshift.com/api/upgrades_info/v1/graph"
}

# ./oc adm upgrade
Cluster version is 4.1.0-rc.7

Updates:

VERSION    IMAGE
4.1.0-rc.8 quay.io/openshift-release-dev/ocp-release@sha256:8250bbf79d4f567e24a7b85795aba97f9e75a9df18738891a0cb6ba42e422584
4.1.0-rc.8 quay.io/openshift-release-dev/ocp-release@sha256:8250bbf79d4f567e24a7b85795aba97f9e75a9df18738891a0cb6ba42e422584
4.1.0-rc.9 quay.io/openshift-release-dev/ocp-release@sha256:49c4b6bf70061e522e3525aed534d087c9abfba7c39cbcbdd1bd770ab096bf9e

Comment 11 Stefan Junker 2019-06-04 16:02:09 UTC
Thanks for reporting this new occurrence of the double edge. It has been discussed on Slack and the reason is known. The edge was specified in the release-metadata of the payload as well as on Quay via an image tag label.
There wasn't clear consensus if we want the edge removed by Cincinnati, since it would be easier to remove it on quay where it was falsely specified as a duplicate.

Comment 12 W. Trevor King 2019-06-04 18:02:00 UTC
> There wasn't clear consensus if we want the edge removed by Cincinnati, since it would be easier to remove it on quay where it was falsely specified as a duplicate.

People tagging in Quay can make mistakes.  I think we want to dedup in Cincinnati and alert the admins maintaining the datasource (Quay labels), to avoid confusing downstream users.  I'm also fine moving the datasource away from Quay labels and guarding against admin mistakes with some sort of preflight-testing of admin changes to the datasource.  But I think we want to move on at least one of those.
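
A minimal sketch of the "dedup and report" idea, assuming edges are represented as (from_version, to_version) pairs. The function name add_label_edges is hypothetical, and which source supplied which edge in the sample data is an assumption; the thread only states that the rc.7 to rc.8 edge was specified both in the release metadata and in a Quay label.

```python
def add_label_edges(existing_edges, label_edges):
    """Merge label-derived edges into the graph, skipping pre-existing ones.

    Returns (merged, skipped); `skipped` is what could be reported to
    whoever maintains the labels.
    """
    seen = set(existing_edges)
    merged = list(existing_edges)
    skipped = []
    for edge in label_edges:
        if edge in seen:
            skipped.append(edge)   # duplicate: keep the graph unchanged
            continue
        seen.add(edge)
        merged.append(edge)
    return merged, skipped


# Assumed split between the two sources, for illustration only.
metadata_edges = [("4.1.0-rc.7", "4.1.0-rc.8")]
quay_label_edges = [("4.1.0-rc.7", "4.1.0-rc.8"), ("4.1.0-rc.7", "4.1.0-rc.9")]

merged, skipped = add_label_edges(metadata_edges, quay_label_edges)
for src, dst in skipped:
    print(f"warning: edge {src} -> {dst} was already present, skipping")
```

Returning the skipped edges rather than silently dropping them leaves the alerting question open without losing the information.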

Comment 13 Stefan Junker 2019-06-04 19:09:54 UTC
> I think we want to dedup in Cincinnati and alert the admins maintaining the datasource (Quay labels), to avoid confusing downstream users.

Ironically the hard part is the alerting here ;-) Who are the admins and how can Cincinnati alert them?

Comment 14 Stefan Junker 2019-07-10 08:55:34 UTC
Liujia, we merged a fix to production yesterday. Can you please verify that the bug is gone?

Comment 15 liujia 2019-07-11 03:01:05 UTC
[root@preserve-jliu-worker 20190711_14830]# ./oc get clusterversion
NAME      VERSION      AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.1.0-rc.7   True        False         5m25s   Cluster version is 4.1.0-rc.7

[root@preserve-jliu-worker 20190711_14830]# ./oc get clusterversion -o json|jq ".items[0].spec"
{
  "channel": "prerelease-4.1",
  "clusterID": "fe69dbb3-cc92-4cdc-82c0-439543fff7da",
  "upstream": "https://api.openshift.com/api/upgrades_info/v1/graph"
}

[root@preserve-jliu-worker 20190711_14830]# ./oc adm upgrade
Cluster version is 4.1.0-rc.7

Updates:

VERSION    IMAGE
4.1.0-rc.9 quay.io/openshift-release-dev/ocp-release@sha256:49c4b6bf70061e522e3525aed534d087c9abfba7c39cbcbdd1bd770ab096bf9e
4.1.0-rc.8 quay.io/openshift-release-dev/ocp-release@sha256:8250bbf79d4f567e24a7b85795aba97f9e75a9df18738891a0cb6ba42e422584

Moreover, I checked all upgrade paths for both the prerelease-4.1 and stable-4.1 channels from the Cincinnati stack and found no duplicate paths. Bug verified.
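
A whole-graph check of this kind could look roughly like the sketch below. It assumes a graph document with "nodes" and index-pair "edges", as in the jq query from comment 2; the function name duplicate_edges is hypothetical, and the sample graph is made up to show a positive hit rather than taken from the production service.

```python
from collections import Counter


def duplicate_edges(graph):
    """Return (from_version, to_version) for every edge listed more than once."""
    versions = [node["version"] for node in graph["nodes"]]
    counts = Counter((src, dst) for src, dst in graph["edges"])
    return [(versions[src], versions[dst])
            for (src, dst), n in counts.items() if n > 1]


# Made-up graph resembling the state reported in comment 10.
broken_graph = {
    "nodes": [
        {"version": "4.1.0-rc.7"},
        {"version": "4.1.0-rc.8"},
        {"version": "4.1.0-rc.9"},
    ],
    "edges": [[0, 1], [0, 1], [0, 2]],
}

for src, dst in duplicate_edges(broken_graph):
    print(f"duplicate path: {src} -> {dst}")
```

Running the same function over the graph returned for each channel and getting an empty list would correspond to the "no dup path found" result.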

Comment 19 errata-xmlrpc 2019-07-25 05:32:32 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:1809

