Bug 1981975 - Master Machine Config Pool degraded at install time
Summary: Master Machine Config Pool degraded at install time
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Routing
Version: 4.8
Hardware: x86_64
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.9.0
Assignee: Luigi Mario Zuccarelli
QA Contact: Hongan Li
URL:
Whiteboard:
Depends On:
Blocks: 1985588
 
Reported: 2021-07-13 20:22 UTC by Daniel Del Ciancio
Modified: 2021-10-18 17:53 UTC
CC List: 7 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: The config drift occurred when the CNO attempted to sanitize the proxy configuration (specifically the no_proxy configuration), inserting only the first service network entry. Consequence: On dual-stack clusters, the IPv6 service network CIDR was missing from the noProxy list, leaving the master Machine Config Pool degraded at install time. Fix: Implement logic that inserts both the IPv4 and IPv6 service network entries in all scenarios. Result: The fix was verified with 4.9.0-0.nightly-2021-07-25-125326.
Clone Of:
Environment:
Last Closed: 2021-10-18 17:39:52 UTC
Target Upstream Version:




Links
System ID Private Priority Status Summary Last Updated
Github openshift cluster-network-operator pull 1155 0 None open Bug 1981975: Update service network status to reflect dual stack entries 2021-07-19 07:43:04 UTC
Red Hat Product Errata RHSA-2021:3759 0 None None None 2021-10-18 17:53:09 UTC

Comment 4 Miciah Dashiel Butler Masters 2021-07-15 16:15:38 UTC
Based on comment 1, it looks like we need to change this code to account for an install-config with multiple service networks defined, as is the case with dual-stack:

	if len(network.Status.ServiceNetwork) > 0 {
		set.Insert(network.Status.ServiceNetwork[0])
	} else {
		return "", fmt.Errorf("serviceNetwork missing from network '%s' status", network.Name)
	}

https://github.com/openshift/cluster-network-operator/blob/18c4ad6453fe4e247d1af6326dfcbdb8ccfdfbca/pkg/util/proxyconfig/no_proxy.go#L77-L81
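A minimal sketch of the shape of that change (function and variable names here are illustrative, not the actual CNO helpers; the real code uses apimachinery's string set): insert every entry of ServiceNetwork rather than only element 0, so that on dual-stack clusters both the IPv4 and IPv6 service CIDRs land in no_proxy.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// insertServiceNetworks inserts all service network CIDRs into the set,
// instead of only ServiceNetwork[0] as the buggy code did. It still errors
// when the status carries no service networks at all.
func insertServiceNetworks(set map[string]struct{}, serviceNetwork []string) error {
	if len(serviceNetwork) == 0 {
		return fmt.Errorf("serviceNetwork missing from network status")
	}
	for _, cidr := range serviceNetwork {
		set[cidr] = struct{}{}
	}
	return nil
}

// joinSorted renders the set the way no_proxy is emitted: a sorted,
// comma-separated string.
func joinSorted(set map[string]struct{}) string {
	keys := make([]string, 0, len(set))
	for k := range set {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return strings.Join(keys, ",")
}

func main() {
	set := map[string]struct{}{}
	if err := insertServiceNetworks(set, []string{"172.30.0.0/16", "fd02::/112"}); err != nil {
		panic(err)
	}
	fmt.Println(joinSorted(set)) // 172.30.0.0/16,fd02::/112
}
```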

We'll fix this in 4.9.0 and then evaluate whether we need to backport the fix.

Comment 6 Daniel Del Ciancio 2021-07-16 18:00:09 UTC
The customer was able to add the IPv6 serviceNetwork CIDR manually to the noProxy configuration; the MCO is no longer in a degraded state, and the MachineConfig update completed successfully on the master nodes.
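For reference, the manual workaround amounts to appending the missing CIDR to the cluster Proxy's spec (a hedged sketch using the standard config.openshift.io/v1 Proxy fields; the proxy URLs are placeholders and the CIDR is this cluster's IPv6 service network from the status shown elsewhere in this bug):

```yaml
apiVersion: config.openshift.io/v1
kind: Proxy
metadata:
  name: cluster
spec:
  httpProxy: http://proxy.example.com:3128
  httpsProxy: http://proxy.example.com:3128
  # Workaround: add the IPv6 service network CIDR that the CNO fails to
  # inject on dual-stack clusters.
  noProxy: fd02::/112
```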

Could we expect a fix for 4.8? If so, when?

Thanks!

Comment 10 Hongan Li 2021-07-26 06:12:13 UTC
Verified with 4.9.0-0.nightly-2021-07-25-125326; the issue has been fixed.

# oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.9.0-0.nightly-2021-07-25-125326   True        False         19m     Cluster version is 4.9.0-0.nightly-2021-07-25-125326

# oc get network/cluster -oyaml
<---snip---->
status:
  clusterNetwork:
  - cidr: 10.128.0.0/14
    hostPrefix: 23
  - cidr: fd01::/48
    hostPrefix: 64
  clusterNetworkMTU: 1400
  networkType: OVNKubernetes
  serviceNetwork:
  - 172.30.0.0/16
  - fd02::/112

# oc get proxies.config.openshift.io cluster -oyaml
<---snip---->
status:
  httpProxy: http://xxx.redhat.com:xxx
  httpsProxy: http://xxx.redhat.com:xxx
  noProxy: .cluster.local,.svc,10.128.0.0/14,10.73.116.0/23,10.73.a.b,127.0.0.1,172.30.0.0/16,2620:52:0:4974::/64,api-int.bm2-zzhao.qe.devcluster.openshift.com,bm2-zzhao.qe.devcluster.openshift.com,fd01::/48,fd02::/112,localhost


### In the old 4.8 version we can see:
status:
  httpProxy: http://xxx.redhat.com:xxx
  httpsProxy: http://xxx.redhat.com:xxx
  noProxy: .cluster.local,.svc,10.128.0.0/14,10.73.116.0/23,10.73.a.b,127.0.0.1,172.30.0.0/16,2620:52:0:4974::/64,api-int.bm2-zzhao.qe.devcluster.openshift.com,bm2-zzhao.qe.devcluster.openshift.com,fd01::/48,localhost

Comment 11 Daniel Del Ciancio 2021-08-17 14:36:19 UTC
Are there plans to backport this to 4.8?

Comment 12 Daniel Del Ciancio 2021-08-18 19:39:47 UTC
My customer tested on 4.8.3 and the IPv6 service network CIDR did not appear in the noProxy list.

Do we have an ETA as to when we could expect this fix to land in 4.8.z?

Comment 13 Miciah Dashiel Butler Masters 2021-08-19 21:43:29 UTC
Daniel, please see bug 1985588, which is tracking the 4.8.z backport.  It's currently blocked on CI.  Once it passes CI, it can get cherry-pick approval and merge.  Once a backport merges, it generally will ship a week or two later in the next z-stream release.

Comment 15 Daniel Del Ciancio 2021-09-08 14:08:01 UTC
The customer tested 4.8.10 and the proxy configuration looks good. They tested both a fresh cluster install and making the change after cluster install; both tests yielded successful results. The issue can be closed.

Comment 16 Luigi Mario Zuccarelli 2021-09-08 14:48:33 UTC
@ddelcian@redhat.com  - Thanks for the feedback, happy that the fix worked.

Comment 18 errata-xmlrpc 2021-10-18 17:39:52 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.9.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:3759


