Bug 1768879 - 4.1 to 4.2 upgrades get stuck when insecureRegistries is configured
Summary: 4.1 to 4.2 upgrades get stuck when insecureRegistries is configured
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Node
Version: 4.2.z
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 4.1.z
Assignee: Colin Walters
QA Contact: Sunil Choudhary
URL:
Whiteboard:
Duplicates: 1760484
Depends On:
Blocks: 1186913
Reported: 2019-11-05 13:14 UTC by Jessica Forrester
Modified: 2023-09-07 20:56 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-11-21 09:17:53 UTC
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2019:3875 0 None None None 2019-11-21 09:18:03 UTC

Description Jessica Forrester 2019-11-05 13:14:25 UTC
1) Install a 4.1.21 cluster
2) In the config.openshift.io/images Global Config file modify it to set an insecureRegistry. Example spec:

spec:
  registrySources:
    insecureRegistries:
      - quay-enterprise.example.openshift.com

3) Wait for the registry config change to roll out to all of the nodes.

4) Switch cluster to the fast-4.2 channel and upgrade to 4.2.2

The cluster upgrade gets stuck at 88%, failing to update the first node in each MachineConfigPool. The MCP is marked Degraded with the message "failed to run pivot: failed to start pivot.service: exit status 1".

From the MCD logs we can see:

Error initializing source docker://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:9c2f247fa92936266fc771eb946e0cb84d95279060954fe7e21f5d3f08ec43f3: error loading registries: invalid URL: cannot be empty
2019-11-03T09:39:23.730784095Z W1103 09:39:23.529578   98790 run.go:40] podman failed: exit status 125; retrying...
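To check whether a node is hitting the same failure, the MCD log can be scanned for the error string shown in the excerpt above. This is only a minimal sketch: the function name is hypothetical, and the marker text is taken verbatim from the log lines in this report rather than from any documented podman or MCD interface.

```python
def find_registries_conf_errors(log_text: str) -> list:
    """Return the log lines that report a registries.conf loading failure.

    The marker string is copied from the MCD log excerpt in this bug;
    other podman versions may word the error differently (assumption).
    """
    marker = "error loading registries"
    return [line for line in log_text.splitlines() if marker in line]


if __name__ == "__main__":
    sample = (
        "Error initializing source docker://quay.io/openshift-release-dev/"
        "ocp-v4.0-art-dev@sha256:9c2f247fa92936266fc771eb946e0cb84d95279060954fe7e21f5d3f08ec43f3: "
        "error loading registries: invalid URL: cannot be empty\n"
        "2019-11-03T09:39:23.730784095Z W1103 09:39:23.529578   98790 run.go:40] "
        "podman failed: exit status 125; retrying...\n"
    )
    for hit in find_registries_conf_errors(sample):
        print(hit)
```

A non-empty result only suggests that podman rejected the registries.conf on that node. It does not by itself confirm the format mismatch described in this report.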

Looking at the MachineConfig for registries I can see it has already been updated to the new format for 4.2:

unqualified-search-registries = ["registry.access.redhat.com", "docker.io"]

[[registry]]
  location = "quay-enterprise.example.openshift.com"
  insecure = true
  blocked = false
  mirror-by-digest-only = false
  prefix = ""


However, the node has not yet pivoted into the new version of RHCOS, so podman on that node is still the version shipped with 4.1.

According to another BZ, https://bugzilla.redhat.com/show_bug.cgi?id=1737043, that registries.conf format is incompatible with that version of podman.


I suspect we have the same problem when allowedRegistries or blockedRegistries are configured, but I didn't confirm those.

Comment 1 Miloslav Trmač 2019-11-05 15:01:05 UTC
FWIW:

- ~All versions of podman can support the old V1 format:
  > [registries.insecure]
  > registries = ['...']
- Some versions of podman support the pre-release version of V2 that requires "URL", not "Location"
- podman ≥ 1.4.0 (better is 1.4.1) supports the released version of V2 that requires "Location", not "URL".

- machine-config-operator was generating V1 before 58db21b6e360404c52bf66b9345609bd2c40eca9 , and has been generating V2 since. It never generated the pre-release version of V2.
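The compatibility notes above can be sketched as a small format selector. This is an illustration only, not the machine-config-operator's actual logic: all function names are hypothetical, the 1.4.0 threshold is the one stated in this comment, and the pre-release V2 variant is left out because the podman versions that accept it are not specified here.

```python
import re


def parse_version(version: str) -> tuple:
    """Turn a string such as '1.4.1' or '1.4.2-dev' into (1, 4, 1) or (1, 4, 2)."""
    numbers = [int(p) for p in re.findall(r"\d+", version.split("-")[0])[:3]]
    numbers += [0] * (3 - len(numbers))  # pad '1.4' to (1, 4, 0)
    return tuple(numbers)


def supports_released_v2(podman_version: str) -> bool:
    """True if podman is at least 1.4.0, the threshold given in this comment."""
    return parse_version(podman_version) >= (1, 4, 0)


def render_v1_insecure(registries) -> str:
    """Render insecure registries in the old V1 registries.conf layout."""
    quoted = ", ".join(f"'{r}'" for r in registries)
    return f"[registries.insecure]\nregistries = [{quoted}]\n"


def render_v2_insecure(registries) -> str:
    """Render insecure registries in the released V2 layout ("location" key)."""
    blocks = [
        f'[[registry]]\n  location = "{r}"\n  insecure = true\n'
        for r in registries
    ]
    return "\n".join(blocks)


def render_for(podman_version: str, registries) -> str:
    """Pick a layout that the given podman version is expected to parse."""
    if supports_released_v2(podman_version):
        return render_v2_insecure(registries)
    return render_v1_insecure(registries)


if __name__ == "__main__":
    regs = ["quay-enterprise.example.openshift.com"]
    print(render_for("1.0.2", regs))  # hypothetical older podman version
    print(render_for("1.4.1", regs))
```

Falling back to V1 for anything older than 1.4.0 is a conservative choice here, since V1 is the one layout described above as working with nearly all podman versions.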

Comment 2 Brent Baude 2019-11-05 15:52:31 UTC
Not much to add here from the Podman side. Miloslav is spot on.

Comment 16 W. Trevor King 2019-11-12 23:08:17 UTC
Moving back to target 4.1.z, because this bug is already ON_QA.  I dunno what 4.2.z and 4.3.0 parents for an "under these conditions 4.1.z->4.2.z upgrades break, so we need to fix 4.1.z" issue would look like anyway.

Comment 20 errata-xmlrpc 2019-11-21 09:17:53 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:3875

Comment 23 Colin Walters 2020-01-31 19:27:34 UTC
*** Bug 1760484 has been marked as a duplicate of this bug. ***

