Bug 1865806

Summary: Etcd upgrade fails with DNS clash
Product: OpenShift Container Platform
Reporter: Mangirdas Judeikis <mjudeiki>
Component: Etcd Operator
Assignee: Dan Mace <dmace>
Status: CLOSED ERRATA
QA Contact: Mike Fiedler <mifiedle>
Severity: high
Priority: high
Docs Contact:
Version: 4.4
CC: amcdermo, awestbro, bbennett, dhansen, dmace, ehashman, geliu, jhunter, jmalde, jminter, mgahagan, mifiedle, mjudeiki, mmasters, sdodson, wking
Target Milestone: ---
Keywords: Upgrades
Target Release: 4.5.z
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1869681 (view as bug list)
Environment:
Last Closed: 2020-09-08 10:54:03 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1869681

Description Mangirdas Judeikis 2020-08-04 08:46:38 UTC
Description of problem:

A customer in Azure configures custom hostnames, which results in custom DNS records being created in the Azure private DNS zone. If this hostname clashes with the customer's forwarded DNS zones, the cluster and etcd end up in an unhealthy state.

System components should not rely on CoreDNS for zone resolution; they should rely on the system-provided DNS settings on the nodes.
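
As a rough illustration (a sketch only; the node name is a placeholder), the node-level DNS settings the components could fall back to can be inspected with:

   oc debug node/<master-node> -- chroot /host cat /etc/resolv.conf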

Detailed situation description:

Part 1. Install.
A customer with the hostname shared-cluster.osadev.cloud creates a cluster in Azure, providing their domain to the installer.

The _etcd-server-ssl._tcp SRV record points to etcd-0.shared-cluster.osadev.cloud.
All other records correspond to IP addresses.

The customer configures the root DNS zone with records for shared-cluster.osadev.cloud pointing to the child zone.

The cluster installs and is in a healthy state.
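
A sanity check at this point (a sketch only, assuming dig is available on a node or another VM in the same virtual network, whose resolver serves the Azure private zone) is to query the installer-created records directly:

   dig +short SRV _etcd-server-ssl._tcp.shared-cluster.osadev.cloud
   dig +short A etcd-0.shared-cluster.osadev.cloud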

Part 2. Day 2.

The customer wants their custom domain to be forwarded to a custom DNS server, so they configure CoreDNS with:
   servers:
    - forwardPlugin:
        upstreams:
        - 10.x.y.4
      name: external-dns
      zones:
      - osadev.cloud 

Customer pods are healthy and can resolve external hostnames.

Part 3. Upgrade.

Once the cluster upgrade was initiated, the etcd upgrade failed with errors:

EtcdMemberIPMigratorDegraded: unable to locate a node for peerURL=https://etcd-2.shared-cluster.osadev.cloud:2380, dnsErrors=lookup _etcd-server-ssl._tcp.shared-cluster.osadev.cloud on 172.30.0.10:53: no such host
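
The failing lookup can be reproduced from any pod that uses the cluster DNS service (172.30.0.10 is the DNS ClusterIP from the error above); a sketch, assuming dig is available in the pod image:

   dig @172.30.0.10 SRV _etcd-server-ssl._tcp.shared-cluster.osadev.cloud
   # the query is forwarded to 10.x.y.4, which has no such record, so it returns no answer / no such host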

I think system components should not rely on the CoreDNS forward configuration, because it might break existing DNS patterns.

Version-Release number of selected component (if applicable):

4.3+


Actual results:

The upgrade fails and gets stuck.


Expected results:

Upgrade successful.


Additional info:

I believe all Azure clusters with a custom hostname set will be impacted.

Comment 2 Dan Mace 2020-08-04 20:23:59 UTC
A quick update.

During the 4.4 upgrade, OpenShift is trying to migrate etcd away from relying on
DNS, and in doing so tries to look up etcd DNS records created in the cluster's
private DNS zone to find the IPs to replace the DNS addresses. However, the DNS
lookup flows through CoreDNS, which in these clusters is configured to forward
the requests to an external nameserver which is missing those DNS records present
in the private zone.
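
To make the flow concrete (a sketch only; the cluster domain is a placeholder), the same SRV query fails through CoreDNS but succeeds when resolved the way the nodes do, via the node's resolv.conf, which on Azure typically points at the platform resolver that serves the private zone:

      # through the cluster DNS service (forwarded to the external nameserver): no answer
      dig @172.30.0.10 SRV _etcd-server-ssl._tcp.<cluster-domain>
      # from a node, using /etc/resolv.conf: answered from the private zone
      dig SRV _etcd-server-ssl._tcp.<cluster-domain>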

Transparent immediate fixes include:

1. Configure the forwarded nameserver to delegate to the private zone containing
the records.
2. Copy the records from the private zone to the forwarded zone.
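
Either way, a quick check (sketch only; 10.x.y.4 is the forwarded nameserver from the description) is whether the upstream nameserver can now answer the private-zone records:

      dig @10.x.y.4 SRV _etcd-server-ssl._tcp.shared-cluster.osadev.cloud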

Semantically this seems reasonable because the forwarded nameserver declares
authority for the domain and so should be able to answer those records.

Practically speaking it may not be reasonable for the end-users to make these
changes.

We're currently exploring some plausible solutions that don't require a manual
intervention and hope to have an update as soon as possible.

Comment 5 Mangirdas Judeikis 2020-08-07 08:30:19 UTC
I have a different concern here.
What do we need to do to make sure any other component, starting from 4.4, does not fall into this pitfall moving forward?

Customer configuration:
* The cluster uses the custom domain redhat.com (an Azure privateDNS zone is created with these records and served by Azure native DNS at the host level)
* The customer is using forwardPlugin for redhat.com to 1.1.1.1 (or any other custom DNS)
* Any component running in the cluster and using cluster DNS will potentially not be able to call routes, the API, etc. These components should either use the .cluster.local reference or use hostNetwork/host DNS (see the sketch below).
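
A sketch of the second option (the deployment name and namespace are hypothetical): switch a workload to the node's resolver instead of cluster DNS, or address in-cluster services via their *.svc.cluster.local names, which are not forwarded:

   oc patch deployment/my-app -n my-namespace --type=merge \
     -p '{"spec":{"template":{"spec":{"dnsPolicy":"Default"}}}}'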

How would it be best to track this work item? @sdodson, @dmace, your input would be great so I can do what is needed.

Comment 6 Dan Mace 2020-08-07 14:15:24 UTC
 (In reply to Mangirdas Judeikis from comment #5)
 > I have different concern here. 
 > What we need to do to make sure any other component starting from 4.4 will
 > not fall over this pitfall moving forward.
 > 
 > Customer configuration:
 > * Cluster using custom Domain redhat.com (azure privateDNS zone is being
 > created with these records and being served by Azure native DNS on Host
 > level)
 > * Customer is using forwardPluging for redhat.com to 1.1.1.1 (or any other
 > custom dns)
 > * Any component, running in the cluster and using cluster DNS will pontially
 > will not be able to call routes, API, etc. These components should either
 > use .cluster reference or use hostNetwork/DNS.
 > 
 > How would be best to track this work item? @sdobson, @dmace, your input
 > would be great so I can do the what is needed.
 
CC'ing the Network Edge folks.
 
My first intuition is that Ben's earlier assessment applies here: if the administrator sets up a global forward for a domain that's declared to be managed by openshift (e.g. the ingress domain), that upstream nameserver had better actually be authoritative for the domain. Perhaps there's a documentation issue in this regard — the nameserver behind such a rule would need to delegate to the OpenShift-managed zones for which authority is declared, because OpenShift can't manage records in the opaque user-defined upstream.
 
At a glance this seems like a fairly reasonable expectation, but I'm sure it's possible there's more nuance here I'm failing to consider.
 
At the very least I would hope we can use the documentation to warn users about the potential footguns associated with DNS forwarding. I'm not sure if an alert of some kind would be appropriate.
 
Curious to get some feedback from the NE folks on this. It's a reasonable line of questioning and it does seem clear there are implications to DNS forwarding that weren't considered when the feature was originally designed, and I'm glad the feature is getting used and appreciate the feedback. This bugzilla might not be the best place to hash out the details, but it's a start; if there's a better discussion venue I'm happy to take the conversation there.

Comment 7 Miciah Dashiel Butler Masters 2020-08-07 16:56:13 UTC
I'd expect the same problem to apply to the API; would it not?  Did the customer copy the api-int records to the private zone and not the etcd records?  If so, this would imply that the customer knew to follow (and possibly invent) some process, and the process had a step to create the api-int records but not to create the etcd records, so one solution would be to make sure this process is explicitly documented and complete.

If an arbitrary component needs a DNS record in the private zone, it would make sense to have an alert for that component if the name on the record doesn't resolve.  The alert could suggest checking the DNS forwarding configuration.

In retrospect, it might have been best if we'd prevented overriding name resolution for the cluster domain, but it's too late to change that now.

Comment 8 Daneyon Hansen 2020-08-07 17:10:25 UTC
https://bugzilla.redhat.com/show_bug.cgi?id=1867205 created to update DNS forwarding docs.

Comment 9 Dan Mace 2020-08-10 18:42:34 UTC
Please note that the scope of this fix is limited to etcd errors during upgrade. If etcd upgrades, the fix works, and any follow-on issues should be treated separately unless there's a reason to believe the issues are caused by this fix.

To test, I used the following procedures.

### Verify the problem

1. Launch a 4.3.31 IPI cluster on Azure.

2. Edit `dnses.operator.openshift.io/default` with the following `spec` field:

      servers:
      - forwardPlugin:
          upstreams:
          - 1.1.1.1
        name: external-dns
        zones:
        - ci-ln-2pmxhgk-002ac.ci.azure.devcluster.openshift.com
  
   The `zones` field should match the cluster domain.
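
   One way to apply the change (a sketch; the short resource name should resolve to `dnses.operator.openshift.io`):

      oc edit dns.operator/default
      # or, non-interactively, an equivalent oc patch with --type=merge on .spec.servers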

3. Upgrade the cluster to a stable 4.4 release:

      oc adm upgrade --force --allow-upgrade-with-warnings --allow-explicit-upgrade --to-image quay.io/openshift-release-dev/ocp-release:4.4.16-x86_64

4. Verify the upgrade fails due to the etcd operator becoming degraded due to DNS lookup failures:

      $ oc get clusterversion/version
      NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
      version   4.3.31    True        True          10m     Unable to apply 4.4.16: the cluster operator etcd is degraded

      $ oc get clusteroperators/etcd -o yaml
      apiVersion: config.openshift.io/v1
      kind: ClusterOperator
      metadata:
        creationTimestamp: "2020-08-10T14:39:34Z"
        generation: 1
        name: etcd
        resourceVersion: "25429"
        selfLink: /apis/config.openshift.io/v1/clusteroperators/etcd
        uid: 91438145-ab1a-4717-800f-57007cda0a72
      spec: {}
      status:
        conditions:
        - lastTransitionTime: "2020-08-10T14:41:34Z"
          message: 'EtcdMemberIPMigratorDegraded: unable to locate a node for peerURL=https://etcd-1.ci-ln-2pmxhgk-002ac.ci.azure.devcluster.openshift.com:2380,dnsErrors=lookup _etcd-server-ssl._tcp.ci-ln-2pmxhgk-002ac.ci.azure.devcluster.openshift.com on 172.30.0.10:53: no such host'
          reason: EtcdMemberIPMigrator_Error
          status: "True"
          type: Degraded


### Verify the fix

1. Launch a 4.3.31 IPI cluster on Azure.

2. Edit `dnses.operator.openshift.io/default` with the following `spec` field:

      servers:
      - forwardPlugin:
          upstreams:
          - 1.1.1.1
        name: external-dns
        zones:
        - ci-ln-gk8wrgt-002ac.ci.azure.devcluster.openshift.com
  
   The `zones` field should match the cluster domain.

3. Upgrade the cluster to a 4.4 release image containing the fix:

      oc adm upgrade --force --allow-upgrade-with-warnings --allow-explicit-upgrade --to-image registry.svc.ci.openshift.org/ci-ln-m17603t/release:latest

4. Verify that etcd successfully upgrades.

      $ oc get clusteroperators/etcd -o yaml
      apiVersion: config.openshift.io/v1
      kind: ClusterOperator
      metadata:
        creationTimestamp: "2020-08-07T15:27:54Z"
        generation: 1
        name: etcd
        resourceVersion: "28214"
        selfLink: /apis/config.openshift.io/v1/clusteroperators/etcd
        uid: c1ef4c59-4a5a-49fd-85d7-7787f1a3f058
      spec: {}
      status:
        conditions:
        - lastTransitionTime: "2020-08-07T15:32:18Z"
          message: |-
            NodeControllerDegraded: All master nodes are ready
            EtcdMembersDegraded: No unhealthy members found
          reason: AsExpected
          status: "False"
          type: Degraded
        - lastTransitionTime: "2020-08-07T15:32:42Z"
          message: |-
            NodeInstallerProgressing: 3 nodes are at revision 2
            EtcdMembersProgressing: No unstarted etcd members found
          reason: AsExpected
          status: "False"
          type: Progressing
        - lastTransitionTime: "2020-08-07T15:29:53Z"
          message: |-
            StaticPodsAvailable: 3 nodes are active; 3 nodes are at revision 2
            EtcdMembersAvailable: 3 members are available
          reason: AsExpected
          status: "True"
          type: Available
        - lastTransitionTime: "2020-08-07T15:27:54Z"
          reason: AsExpected
          status: "True"
          type: Upgradeable

5. Double-check that the expected events were produced indicating the new fallback logic was executed:

        101s        Warning   MemberIPLookupFailed                   deployment/etcd-operator                         member "etcd-member-ci-ln-gk8wrgt-002ac-dmd9h-master-0" IP couldn't be determined via DNS: unable to locate a node for peerURL=https://etcd-0.ci-ln-gk8wrgt-002ac.ci.azure.devcluster.openshift.com:2380, dnsErrors=lookup _etcd-server-ssl._tcp.ci-ln-gk8wrgt-002ac.ci.azure.devcluster.openshift.com on 172.30.0.10:53: no such host; will attempt a fallback lookup
        101s        Normal    MemberSettingIPPeer                    deployment/etcd-operator                         member "etcd-member-ci-ln-gk8wrgt-002ac-dmd9h-master-0"; new peer list https://10.0.0.6:2380
        101s        Normal    MemberUpdate                           deployment/etcd-operator                         updating member "etcd-member-ci-ln-gk8wrgt-002ac-dmd9h-master-0" with peers https://10.0.0.6:2380
        101s        Normal    MemberMissingIPPeer                    deployment/etcd-operator                         member "etcd-member-ci-ln-gk8wrgt-002ac-dmd9h-master-1" is missing an IP in the peer list
        88s         Warning   MemberIPLookupFailed                   deployment/etcd-operator                         member "etcd-member-ci-ln-gk8wrgt-002ac-dmd9h-master-1" IP couldn't be determined via DNS: unable to locate a node for peerURL=https://etcd-1.ci-ln-gk8wrgt-002ac.ci.azure.devcluster.openshift.com:2380, dnsErrors=lookup _etcd-server-ssl._tcp.ci-ln-gk8wrgt-002ac.ci.azure.devcluster.openshift.com on 172.30.0.10:53: no such host; will attempt a fallback lookup
        88s         Normal    MemberSettingIPPeer                    deployment/etcd-operator                         member "etcd-member-ci-ln-gk8wrgt-002ac-dmd9h-master-1"; new peer list https://10.0.0.5:2380
        88s         Normal    MemberUpdate                           deployment/etcd-operator                         updating member "etcd-member-ci-ln-gk8wrgt-002ac-dmd9h-master-1" with peers https://10.0.0.5:2380
        88s         Normal    MemberMissingIPPeer                    deployment/etcd-operator                         member "etcd-member-ci-ln-gk8wrgt-002ac-dmd9h-master-2" is missing an IP in the peer list
        76s         Warning   MemberIPLookupFailed                   deployment/etcd-operator                         member "etcd-member-ci-ln-gk8wrgt-002ac-dmd9h-master-2" IP couldn't be determined via DNS: unable to locate a node for peerURL=https://etcd-2.ci-ln-gk8wrgt-002ac.ci.azure.devcluster.openshift.com:2380, dnsErrors=lookup _etcd-server-ssl._tcp.ci-ln-gk8wrgt-002ac.ci.azure.devcluster.openshift.com on 172.30.0.10:53: no such host; will attempt a fallback lookup
        76s         Normal    MemberSettingIPPeer                    deployment/etcd-operator                         member "etcd-member-ci-ln-gk8wrgt-002ac-dmd9h-master-2"; new peer list https://10.0.0.7:2380
        76s         Normal    MemberUpdate                           deployment/etcd-operator                         updating member "etcd-member-ci-ln-gk8wrgt-002ac-dmd9h-master-2" with peers https://10.0.0.7:2380

   The `MemberIPLookupFailed` errors are what trigger the fallback logic (notice the "will attempt a fallback lookup" message). The `MemberSettingIPPeer` and `MemberUpdate` events are signals that the DNS workaround was successful and new IPs were assigned to the members.
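
   For reference, the events above can be listed with something like the following (the namespace is an assumption based on where the etcd-operator deployment runs, as shown in the operator logs later in this bug):

      oc get events -n openshift-etcd-operator --sort-by=.lastTimestamp | grep -E 'MemberIPLookupFailed|MemberSettingIPPeer|MemberUpdate'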

Comment 10 Dan Mace 2020-08-10 19:01:24 UTC
Mangirdas,

Can your team test this independently? It's easy for us to build a custom 4.4 image containing the fix.

Comment 11 Dan Mace 2020-08-10 19:05:56 UTC
*** Bug 1864436 has been marked as a duplicate of this bug. ***

Comment 12 Mangirdas Judeikis 2020-08-12 06:38:15 UTC
@Dan,

We can try if this is still needed. Is this a new release image or an etcd-operator image?

MJ

Comment 13 Dan Mace 2020-08-12 14:09:26 UTC
(In reply to Mangirdas Judeikis from comment #12)
> @Dan,
> 
> We can try if this still needed. Is this new release image or etcd-operator
> image? 
> 
> MJ

You can use @cluster-bot to build an image from https://github.com/openshift/cluster-etcd-operator/pull/419 to which you can upgrade a cluster. Please feel free to reach out directly if you need any help working through that process.

Comment 14 Mangirdas Judeikis 2020-08-14 11:02:35 UTC
I'm a bit puzzled.
I tested both the old behaviour and the new one, and the upgrade succeeded. But I was not able to observe MemberIPLookupFailed or MemberSettingIPPeer events.
The result is positive, but I'm not sure where the events went...

Standard upgrade:
1 event.go:278] Event(v1.ObjectReference{Kind:"Deployment", Namespace:"openshift-etcd-operator", Name:"etcd-operator", UID:"720dfeac-a1a0-4dd1-bcdf-c37d3669d9cc", APIVersion:"apps/v1", ResourceVersion:"", FieldPath:""}): type: 'Normal' reason: 'MemberMissingIPPeer' member "etcd-member-mjudeikis-j269c-master-2" is missing an IP in the peer list
E0814 10:20:10.823407       1 etcdmemberipmigrator.go:314] key failed with : unable to locate a node for peerURL=https://etcd-2.xw1rnv4j.v4-eastus.osadev.cloud:2380, dnsErrors=lookup _etcd-server-ssl._tcp.xw1rnv4j.v4-eastus.osadev.cloud on 172.30.0.10:53: no such host
I0814 10:20:11.928081       1 request.go:621] Throttling request took 1.181303068s, request: GET:https://172.30.0.1:443/api/v1/namespaces/openshift-etcd/pods/installer-2-mjudeikis-j269c-master-2
I0814 10:20:12.713978       1 etcdcli.go:96] service/host-etcd-2 is missing annotation alpha.installer.openshift.io/etcd-bootstrap
I0814 10:20:12.744864       1 event.go:278] Event(v1.ObjectReference{Kind:"Deployment", Namespace:"openshift-etcd-operator", Name:"etcd-operator", UID:"720dfeac-a1a0-4dd1-bcdf-c37d3669d9cc", APIVersion:"apps/v1", ResourceVersion:"", FieldPath:""}): type: 'Normal' reason: 'PodCreated' Created Pod/installer-2-mjudeikis-j269c-master-2 -n openshift-etcd because it was missing
I0814 10:20:12.745975       1 etcdcli.go:96] service/host-etcd-2 is missing annotation alpha.installer.openshift.io/etc

Events from successful upgrade: 
0s          Normal    OperatorStatusChanged                   deployment/etcd-operator              Status for clusteroperator/etcd changed: Degraded message changed from "EtcdMemberIPMigratorDegraded: unable to locate a node for peerURL=https://etcd-2.xw1rnv4j.v4-eastus.osadev.cloud:2380, dnsErrors=lookup _etcd-server-ssl._tcp.xw1rnv4j.v4-eastus.osadev.cloud on 172.30.0.10:53: no such host" to "EtcdMembersControllerDegraded: node lister not synced\nEtcdMemberIPMigratorDegraded: unable to locate a node for peerURL=https://etcd-2.xw1rnv4j.v4-eastus.osadev.cloud:2380, dnsErrors=lookup _etcd-server-ssl._tcp.xw1rnv4j.v4-eastus.osadev.cloud on 172.30.0.10:53: no such host"
90s         Normal    OperatorStatusChanged                   deployment/etcd-operator              Status for clusteroperator/etcd changed: Degraded message changed from "EtcdMembersControllerDegraded: node lister not synced\nEtcdMemberIPMigratorDegraded: unable to locate a node for peerURL=https://etcd-2.xw1rnv4j.v4-eastus.osadev.cloud:2380, dnsErrors=lookup _etcd-server-ssl._tcp.xw1rnv4j.v4-eastus.osadev.cloud on 172.30.0.10:53: no such host" to "EtcdMembersControllerDegraded: node lister not synced\nBootstrapTeardownDegraded: node lister not synced\nEtcdMemberIPMigratorDegraded: unable to locate a node for peerURL=https://etcd-2.xw1rnv4j.v4-eastus.osadev.cloud:2380, dnsErrors=lookup _etcd-server-ssl._tcp.xw1rnv4j.v4-eastus.osadev.cloud on 172.30.0.10:53: no such host"
89s         Normal    OperatorStatusChanged                   deployment/etcd-operator              Status for clusteroperator/etcd changed: Degraded message changed from "EtcdMembersControllerDegraded: node lister not synced\nBootstrapTeardownDegraded: node lister not synced\nEtcdMemberIPMigratorDegraded: unable to locate a node for peerURL=https://etcd-2.xw1rnv4j.v4-eastus.osadev.cloud:2380, dnsErrors=lookup _etcd-server-ssl._tcp.xw1rnv4j.v4-eastus.osadev.cloud on 172.30.0.10:53: no such host" to "EtcdMembersControllerDegraded: node lister not synced\nBootstrapTeardownDegraded: node lister not synced\nEtcdMemberIPMigratorDegraded: unable to locate a node for peerURL=https://etcd-2.xw1rnv4j.v4-eastus.osadev.cloud:2380, dnsErrors=lookup _etcd-server-ssl._tcp.xw1rnv4j.v4-eastus.osadev.cloud on 172.30.0.10:53: no such host\nClusterMemberControllerDegraded: node lister not synced"
89s         Normal    OperatorStatusChanged                   deployment/etcd-operator              Status for clusteroperator/etcd changed: Degraded changed from True to False ("EtcdMembersControllerDegraded: node lister not synced\nBootstrapTeardownDegraded: node lister not synced\nNodeControllerDegraded: All master nodes are ready\nClusterMemberControllerDegraded: node lister not synced\nEtcdMembersDegraded: No unhealthy members found")
88s         Normal    OperatorStatusChanged                   deployment/etcd-operator              Status for clusteroperator/etcd changed: Degraded message changed from "EtcdMembersControllerDegraded: node lister not synced\nBootstrapTeardownDegraded: node lister not synced\nNodeControllerDegraded: All master nodes are ready\nClusterMemberControllerDegraded: node lister not synced\nEtcdMembersDegraded: No unhealthy members found" to "EtcdMembersControllerDegraded: node lister not synced\nBootstrapTeardownDegraded: node lister not synced\nNodeControllerDegraded: All master nodes are ready\nEtcdMembersDegraded: No unhealthy members found"
88s         Normal    OperatorStatusChanged                   deployment/etcd-operator              Status for clusteroperator/etcd changed: Degraded message changed from "EtcdMembersControllerDegraded: node lister not synced\nBootstrapTeardownDegraded: node lister not synced\nNodeControllerDegraded: All master nodes are ready\nEtcdMembersDegraded: No unhealthy members found" to "EtcdMembersControllerDegraded: node lister not synced\nNodeControllerDegraded: All master nodes are ready\nEtcdMembersDegraded: No unhealthy members found"
88s         Normal    OperatorStatusChanged                   deployment/etcd-operator              Status for clusteroperator/etcd changed: Degraded message changed from "EtcdMembersControllerDegraded: node lister not synced\nNodeControllerDegraded: All master nodes are ready\nEtcdMembersDegraded: No unhealthy members found" to "NodeControllerDegraded: All master nodes are ready\nEtcdMembersDegraded: No unhealthy members found"
84s         Normal    ConfigMapUpdated                        deployment/etcd-operator              Updated ConfigMap/etcd-pod -n openshift-etcd:
cause by changes in data.pod.yaml,data.version
84s         Normal    RevisionTriggered                       deployment/etcd-operator              new revision 3 triggered by "configmap/etcd-pod has changed"

Comment 15 Mangirdas Judeikis 2020-08-17 18:03:38 UTC
Did some more testing. All good.

82s         Warning   MemberIPLookupFailed                    deployment/etcd-operator              member "mjudeikis-ggpdf-master-1" IP couldn't be determined via DNS: unable to locate a node for peerURL=https://etcd-1.iy5kl2c0.v4-westeurope.osadev.cloud:2380, dnsErrors=lookup _etcd-server-ssl._tcp.iy5kl2c0.v4-westeurope.osadev.cloud on 172.30.0.10:53: no such host; will attempt a fallback lookup
82s         Normal    MemberSettingIPPeer                     deployment/etcd-operator              member "mjudeikis-ggpdf-master-1"; new peer list https://10.5.0.8:2380

And result:
sh-4.2# etcdctl member list
7c1c361f137ec111, started, mjudeikis-ggpdf-master-2, https://10.5.0.9:2380, https://10.5.0.9:2379
8099625059b34b7a, started, mjudeikis-ggpdf-master-0, https://10.5.0.7:2380, https://10.5.0.7:2379
aa431bcae9995c87, started, mjudeikis-ggpdf-master-1, https://10.5.0.8:2380, https://10.5.0.8:2379

Comment 16 Dan Mace 2020-08-18 13:33:46 UTC
This bug doesn't affect 4.5+ upgrades; I cloned https://bugzilla.redhat.com/show_bug.cgi?id=1869681 to track the 4.4.z work.

Comment 19 Mike Fiedler 2020-08-25 18:09:17 UTC
@dmace @mj  This bug moved to POST->MODIFIED->ON_QA, but I don't see a PR attached for 4.5. Does QE need to verify this for 4.5.z, or just CLOSE it? It is currently blocking the merge of https://github.com/openshift/cluster-etcd-operator/pull/419

Comment 20 Dan Mace 2020-08-25 18:19:05 UTC
The bug only applies to 4.3 -> 4.4 upgrades. There is no bug to fix in the 4.5 release, and so there will be no 4.5 PR. This bug exists only to allow the 4.4 PR to merge and satisfy the overall process.

Comment 21 Dan Mace 2020-08-25 18:20:02 UTC
(In reply to Dan Mace from comment #20)
> This bug exists only to allow the 4.4 PR to merge and satisfy the overall process.

I mean that the 4.5 bug only exists to satisfy process. This bug, for 4.4, makes sense because the fix is delivered in a PR against the 4.4 branch. Hope that clarifies!

Comment 27 errata-xmlrpc 2020-09-08 10:54:03 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.5.8 bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:3510