Description of problem:

Performing an OCP 4.6 installation in a restricted network on zVM fails.

Version-Release number of selected component (if applicable):

RHCOS 4.6.0-0.nightly-s390x-2020-09-10-112115
OCP 4.6.0-0.nightly-s390x-2020-09-22-223822

How reproducible:

Consistently

Steps to Reproduce:
1. Follow the steps to configure the mirror host on the bastion: https://docs.openshift.com/container-platform/4.5/installing/install_config/installing-restricted-networks-preparations.html
2. Install the cluster using the restricted network steps: https://docs.openshift.com/container-platform/4.5/installing/installing_bare_metal/installing-restricted-networks-bare-metal.html#installing-restricted-networks-bare-metal
3. IPL the bootstrap and cluster nodes.

Actual results:

The bootstrap, master, and worker nodes all start. However, the master nodes never become Ready, which prevents the worker nodes from starting:

[root@OSPAMGR2 ~]# oc get nodes
NAME                                   STATUS     ROLES    AGE     VERSION
master-0.ospamgr2-sep22.zvmocp.notld   NotReady   master   4h1m    v1.19.0+8a39924
master-1.ospamgr2-sep22.zvmocp.notld   NotReady   master   3h56m   v1.19.0+8a39924
master-2.ospamgr2-sep22.zvmocp.notld   NotReady   master   3h48m   v1.19.0+8a39924
The bootkube.service reports this:

Sep 23 23:02:41 bootstrap-0.ospamgr2-sep22.zvmocp.notld bootkube.sh[19435]: E0923 23:02:41.319432       1 reflector.go:251] github.com/openshift/cluster-bootstrap/pkg/start/status.go:66: Failed to watch *v1.Pod: Get "https://localhost:6443/api/v1/pods?watch=true": dial tcp [::1]:6443: connect: connection refused
Sep 23 23:02:42 bootstrap-0.ospamgr2-sep22.zvmocp.notld bootkube.sh[19435]: E0923 23:02:42.325119       1 reflector.go:134] github.com/openshift/cluster-bootstrap/pkg/start/status.go:66: Failed to list *v1.Pod: Get "https://localhost:6443/api/v1/pods": dial tcp [::1]:6443: connect: connection refused
Sep 23 23:02:43 bootstrap-0.ospamgr2-sep22.zvmocp.notld bootkube.sh[19435]: E0923 23:02:43.327963       1 reflector.go:134] github.com/openshift/cluster-bootstrap/pkg/start/status.go:66: Failed to list *v1.Pod: Get "https://localhost:6443/api/v1/pods": dial tcp [::1]:6443: connect: connection refused
Sep 23 23:02:44 bootstrap-0.ospamgr2-sep22.zvmocp.notld bootkube.sh[19435]: E0923 23:02:44.332599       1 reflector.go:134] github.com/openshift/cluster-bootstrap/pkg/start/status.go:66: Failed to list *v1.Pod: Get "https://localhost:6443/api/v1/pods": dial tcp [::1]:6443: connect: connection refused

Expected results:

Master and worker nodes start successfully.

Additional info:
I believe there are differences between the bare metal and Z installations on a restricted network. These are the instructions for Z on a restricted network: https://docs.openshift.com/container-platform/4.5/installing/installing_ibm_z/installing-restricted-networks-ibm-z.html

Can you confirm you did everything necessary according to the set of instructions for Z? Also, can you do an 'oc adm must-gather' and provide the entire bootkube.log and logs for the masters?
Hi Carvel,

My mistake, let me clarify. We do follow the specific instructions for a Z restricted network install. These are the same installation steps that we've performed previously for releases such as OCP 4.4 and 4.5.

I cannot run 'oc adm must-gather' because the cluster is not up:

[root@OSPAMGR2 ~]# oc adm must-gather
[must-gather      ] OUT the server could not find the requested resource (get imagestreams.image.openshift.io must-gather)
[must-gather      ] OUT
[must-gather      ] OUT Using must-gather plugin-in image: quay.io/openshift/origin-must-gather:latest
[must-gather      ] OUT namespace/openshift-must-gather-9fq6f created
[must-gather      ] OUT clusterrolebinding.rbac.authorization.k8s.io/must-gather-cdvrq created
[must-gather      ] OUT pod for plug-in image quay.io/openshift/origin-must-gather:latest created
[must-gather-f9gkr] OUT gather did not start: timed out waiting for the condition
[must-gather      ] OUT clusterrolebinding.rbac.authorization.k8s.io/must-gather-cdvrq deleted
[must-gather      ] OUT namespace/openshift-must-gather-9fq6f deleted
error: gather did not start for pod must-gather-f9gkr: timed out waiting for the condition

I gathered and will attach the logs for bootkube and the masters. Note, the master logs are very large and will be a problem with the upload limits. I broke them into 10,000-line tail segments. Please let me know if you need more.

Thank you,
-Phil
Created attachment 1716365 [details] bootkube.log
Created attachment 1716366 [details] master-0.kubelet.service.log
Created attachment 1716367 [details] master-1.kubelet.service.log
Created attachment 1716368 [details] master-2.kubelet.service.log
I see this happening over and over in the bootstrap:

Sep 23 22:20:57 bootstrap-0.ospamgr2-sep22.zvmocp.notld bootkube.sh[2333]: https://localhost:2379 is healthy: successfully committed proposal: took = 21.833623ms
Sep 23 22:20:57 bootstrap-0.ospamgr2-sep22.zvmocp.notld bootkube.sh[2333]: Starting cluster-bootstrap...
Sep 23 22:21:00 bootstrap-0.ospamgr2-sep22.zvmocp.notld bootkube.sh[2333]: Starting temporary bootstrap control plane...
Sep 23 22:21:00 bootstrap-0.ospamgr2-sep22.zvmocp.notld bootkube.sh[2333]: E0923 22:21:00.084126       1 reflector.go:134] github.com/openshift/cluster-bootstrap/pkg/start/status.go:66: Failed to list *v1.Pod: Get "https://localhost:6443/api/v1/pods": dial tcp [::1]:6443: connect: connection refused
Sep 23 22:21:00 bootstrap-0.ospamgr2-sep22.zvmocp.notld bootkube.sh[2333]: [#1] failed to fetch discovery: Get "https://localhost:6443/api?timeout=32s": dial tcp [::1]:6443: connect: connection refused
Sep 23 22:21:00 bootstrap-0.ospamgr2-sep22.zvmocp.notld bootkube.sh[2333]: [#2] failed to fetch discovery: Get "https://localhost:6443/api?timeout=32s": dial tcp [::1]:6443: connect: connection refused
Sep 23 22:21:00 bootstrap-0.ospamgr2-sep22.zvmocp.notld bootkube.sh[2333]: [#3] failed to fetch discovery: Get "https://localhost:6443/api?timeout=32s": dial tcp [::1]:6443: connect: connection refused
Sep 23 22:21:00 bootstrap-0.ospamgr2-sep22.zvmocp.notld bootkube.sh[2333]: [#4] failed to fetch discovery: Get "https://localhost:6443/api?timeout=32s": dial tcp [::1]:6443: connect: connection refused
Sep 23 22:21:00 bootstrap-0.ospamgr2-sep22.zvmocp.notld bootkube.sh[2333]: [#5] failed to fetch discovery: Get "https://localhost:6443/api?timeout=32s": dial tcp [::1]:6443: connect: connection refused

What's interesting is that localhost:2379 is reachable, but localhost:6443 (the apiserver) is not. The temporary control plane is never able to come up.
My first thought is that this is a network configuration issue (firewall on the bootstrap?), but it is not clear from the bootkube.log. Can you take a look at the bootstrap machine and see if there are any other indicators of why 6443 is getting refused?
Adding Needinfo for Phil per Carvel's Comment 7 as the original needinfo may not have triggered a notification
Hi,

We've taken a closer look at the bootstrap node, and unfortunately we could not determine why the connections are refused. However, through netstat we do see that port 6443 is up and connections have been established:

[root@OSPAMGR2 ~]# netstat -an | grep 6443
tcp        0      0 10.20.116.2:6443        0.0.0.0:*               LISTEN
tcp        0      0 9.12.23.25:6443         0.0.0.0:*               LISTEN
tcp        0      0 10.20.116.2:6443        10.20.116.11:40606      ESTABLISHED
tcp        0      0 10.20.116.2:52100       10.20.116.10:6443       ESTABLISHED
tcp        0      0 10.20.116.2:6443        10.20.116.12:39364      ESTABLISHED
tcp        0      0 10.20.116.2:6443        10.20.116.13:57002      ESTABLISHED
tcp        0      0 10.20.116.2:52096       10.20.116.10:6443       ESTABLISHED
tcp        0      0 10.20.116.2:52098       10.20.116.10:6443       ESTABLISHED

Also, we successfully installed OCP 4.5.11 and 4.5.12 disconnected installs on this same cluster (same z/VM hosted bastion, master, and worker nodes) within the last week and a half. Using the same disconnected install procedure, there seems to be an issue specific to OCP 4.6. Note that we have also seen the same error with an OCP 4.6 disconnected install on z/KVM.

If there are any other logs or traces that we can generate and provide, please let me know.

Thank you,
-Phil
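For what it's worth, the netstat above was captured on the bastion, while the refusals in bootkube.log are against [::1]:6443 on the bootstrap itself. A minimal diagnostic sketch to run on the bootstrap node (assuming the standard RHCOS tooling of ss, curl, and crictl is available; this is a suggestion, not a step from the install docs):

```shell
# Is anything listening on 6443 on this host?
ss -ltn | grep ':6443' || echo "nothing listening on 6443"

# Does the temporary apiserver answer locally?
curl -ks --max-time 5 https://localhost:6443/healthz || echo "no local apiserver response"

# Did the bootstrap control-plane containers start at all?
crictl ps -a 2>/dev/null | grep -i apiserver || echo "no apiserver container found"
```

If nothing is listening at all, the interesting question becomes why the static pod for the bootstrap apiserver never started, which is what the kubelet and crio journals would show.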
I recently saw the same error in a cluster that did not have enough CPUs/memory. Could this be related to the increased resource usage of some operators in https://bugzilla.redhat.com/show_bug.cgi?id=1878770 ?
Could we get a list of memory and cpu configured for each machine?
Hi,

For the previous disconnected installs on zVM and zKVM, we had 8 CPUs and 32GB of memory for zVM, and 4 CPUs and 32GB of memory for zKVM. This was defined for the cluster nodes: bootstrap, masters, and workers. Today, we increased all the cluster guests for both platforms to the following:

zVM (18 CPUs and 64GB memory) for bootstrap, masters, and workers:

$ cat /proc/sysinfo
...
VM00 CPUs Total:      18
VM00 CPUs Configured: 18
VM00 CPUs Standby:    0
VM00 CPUs Reserved:   0

$ free -h
              total        used        free      shared  buff/cache   available
Mem:           62Gi       571Mi        62Gi       1.0Mi       371Mi        61Gi
Swap:            0B          0B          0B

zKVM (8 CPUs and 64GB memory) for bootstrap, masters, and workers:

$ cat /proc/sysinfo
...
VM00 CPUs Total:      8
VM00 CPUs Configured: 8
VM00 CPUs Standby:    0
VM00 CPUs Reserved:   0
VM00 Extended Name:   bootstrap-0
VM00 UUID:            d42fea3a-6919-46a5-8976-104ecd023fb0

$ free -h
              total        used        free      shared  buff/cache   available
Mem:           62Gi       1.6Gi        59Gi       4.0Mi       2.2Gi        60Gi
Swap:            0B          0B          0B

We re-tested the disconnected install on both platforms and there is no change; the worker nodes are still unable to connect.

-Phil
Hi Carvel,

I'm attaching additional logs from the bootstrap node as requested. We attempted a disconnected install earlier today using the CI nightly build 4.6.0-0.nightly-s390x-2020-09-30-122156. Please let me know if there are any additional logs you may need.

Thank you,
-Phil
Created attachment 1717972 [details] bootkube.service.journalctl.log
Created attachment 1717973 [details] cluster-policy-controller.log
Created attachment 1717974 [details] kube-apiserver.log
Created attachment 1717976 [details] kube-controller-manager.log
Created attachment 1717977 [details] kube-scheduler.log
In the master kubelet logs:

x509: certificate relies on legacy Common Name field, use SANs or temporarily enable Common Name matching with GODEBUG=x509ignoreCN=0]): quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:3ecd439bba69ca2a6ed21df05911963baef721e9265f64095a31cd5f6c41d32b: Error reading manifest sha256:3ecd439bba69ca2a6ed21df05911963baef721e9265f64095a31cd5f6c41d32b in quay.io/openshift-release-dev/ocp-v4.0-art-dev: unauthorized: access to the requested resource is not authorized"

Looks like this is a known issue being tracked. There are several places requiring fixes, and this is one of them: https://github.com/openshift/installer/pull/4210
In this particular case it is coming from the network-operator and is seen only on master-0:

Sep 23 22:48:34 master-0.ospamgr2-sep22.zvmocp.notld hyperkube[1521]: E0923 22:48:34.954286    1521 kuberuntime_manager.go:730] createPodSandbox for pod "network-operator-77547d9b84-jj2f5_openshift-network-operator(9902f8f3-5fb0-43a3-8cd8-554b7f933e20)" failed: rpc error: code = Unknown desc = error creating pod sandbox with name "k8s_network-operator-77547d9b84-jj2f5_openshift-network-operator_9902f8f3-5fb0-43a3-8cd8-554b7f933e20_0": Error initializing source docker://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:3ecd439bba69ca2a6ed21df05911963baef721e9265f64095a31cd5f6c41d32b: (Mirrors also failed: [bastion:5000/ocp4/openshift4@sha256:3ecd439bba69ca2a6ed21df05911963baef721e9265f64095a31cd5f6c41d32b: error pinging docker registry bastion:5000: Get "https://bastion:5000/v2/": x509: certificate relies on legacy Common Name field, use SANs or temporarily enable Common Name matching with GODEBUG=x509ignoreCN=0]): quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:3ecd439bba69ca2a6ed21df05911963baef721e9265f64095a31cd5f6c41d32b: Error reading manifest sha256:3ecd439bba69ca2a6ed21df05911963baef721e9265f64095a31cd5f6c41d32b in quay.io/openshift-release-dev/ocp-v4.0-art-dev: unauthorized: access to the requested resource is not authorized
Hi, As requested, I will upload the 3 master logs that coincide with the bootstrap logs that were uploaded the day before from 9/30/2020. Please let me know if you need any additional logs and info. Thank you, -Phil
Created attachment 1718200 [details] master-0.kubelet.service.log.tgz from zKVM 9/30/2020
Created attachment 1718201 [details] master-1.kubelet.service.log.tgz from zKVM 9/30/2020
Created attachment 1718202 [details] master-2.kubelet.service.log.tgz from zKVM 9/30/2020
Yes, I see those logs in all the master nodes.

Phil, which version of the installer and which release image are you using? Can you use the latest, and also check that the installer you are using is in sync with the release image? Also, please remember to use the latest RHCOS image; the image you are using seems a bit old.

Thanks
Prashanth,

Thank you for all your assistance with this issue.

1. My colleague Phil Chan and I have also been regularly testing OCP 4.6 disconnected installs on a second zVM hosted cluster, unfortunately with the same unsuccessful results. This includes recent OCP 4.6 and RHCOS builds.

2. For comparison and debug purposes, we have also been testing OCP 4.5 disconnected installs with the latest OCP 4.5 builds on this same cluster (all successfully). This includes OCP 4.5.11, 4.5.12, 4.5.13, and 4.5.14.

3. Here is a summary of our initial OCP 4.5 and 4.6 connected and disconnected install related tests for today, all on this same zVM hosted cluster:

   OCP level         RHCOS level             Installation Type   Status         Comments
   ===============   ======================  =================   ============   ====================================================================
1. OCP 4.5.14        45.82.202009261457-0    connected           successful
2. OCP 4.5.14        45.82.202009261457-0    disconnected        successful
3. OCP 4.6.0-fc.9    46.82.202010010439-0    connected           successful
4. OCP 4.6.0-fc.9    46.82.202010010439-0    disconnected        unsuccessful   Master nodes do not successfully install; they encounter the x509 certificate issue mentioned in comments 19 and 20

4. Here is this zVM hosted cluster's bootstrap, master, and worker nodes' vCPU and real memory configuration. This zVM cluster is hosted on an IBM z15 server:

   Node          vCPU   Real Memory (GB)
   ===========   ====   ================
1. bootstrap-0   8      64
2. master-0      8      32
3. master-1      8      32
4. master-2      8      32
5. worker-0      8      32
6. worker-1      8      32

Thank you,
Kyle
A suggestion of something to look at, given the x509 errors, is something in the self-generated certs is causing the latest go TLS verification to fail. I think what is mentioned here might be worth focusing in on w/r/t the cert itself: "https://bastion:5000/v2/": x509: certificate relies on legacy Common Name field, use SANs or temporarily enable Common Name matching with GODEBUG=x509ignoreCN=0])
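To make that concrete, a self-signed registry certificate carrying SANs can be produced in one step with OpenSSL 1.1.1 or later (the -addext option does not exist in older releases). This is an illustrative sketch, not a command from the install docs; "bastion" and "bastion.example.com" are placeholder hostnames for this environment:

```shell
# Generate a self-signed cert whose identity lives in the subjectAltName
# extension rather than only in the legacy CN field (requires OpenSSL 1.1.1+).
openssl req -x509 -newkey rsa:4096 -nodes -days 365 \
  -keyout domain.key -out domain.crt \
  -subj "/CN=bastion" \
  -addext "subjectAltName=DNS:bastion,DNS:bastion.example.com"

# Confirm the SAN extension is actually present in the result.
openssl x509 -in domain.crt -noout -ext subjectAltName
```

A cert generated this way should satisfy the Go 1.15 verifier without needing GODEBUG=x509ignoreCN=0.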
Carvel and Prashanth,

Thank you for your assistance and the information on the x509 certificate SAN configuration for OCP 4.6 disconnected installs. Using certificates configured with SANs, Phil and I have successfully tested OCP 4.6 disconnected installs on both zKVM and zVM hosted clusters.

The following OCP 4.6 builds and associated RHCOS builds have been successfully tested for disconnected installs on zVM hosted clusters:

   OCP level      RHCOS level             Installation Type   Status       Comments
   ============   ======================  =================   ==========   ====================================================================
1. 4.6.0-fc.6     46.82.202009130739-0    disconnected        successful   RHCOS 46.82.202010010439-0 used to install bootstrap node
2. 4.6.0-fc.7     46.82.202009170439-0    disconnected        successful   RHCOS 46.82.202010010439-0 used to install bootstrap node
3. 4.6.0-fc.8     46.82.202009241338-0    disconnected        successful   RHCOS 46.82.202010010439-0 used to install bootstrap node
4. 4.6.0-fc.9     46.82.202010010439-0    disconnected        successful   RHCOS 46.82.202010010439-0 used to install bootstrap node

Thank you,
Kyle
Kyle,

Nice! Thanks for the confirmation. Now I am wondering whether this bug should be a generic doc bug with the OpenShift component for 4.6 disconnected installs.

Prashanth
Moving this to the docs team. This is not an IBM Z specific issue; rather, it needs to be addressed as a whole for disconnected installs across arches. In 4.6, because of the move to Go 1.15, and Go 1.15 requiring SANs in certificates, the certificates for the local image registry need to have SANs rather than CNs. This procedure needs to be documented as part of the disconnected install documentation.
Can you please test 4.6.0-fc.9 with certificates that lack a SAN? Our intent is to document this deprecation in 4.6 but defer enforcement until a later release given how late it was that we discovered this change. If we find that we break on certs that lack a SAN in 4.6.0-fc.9 we need to dig deeper and fix those components which still fail. If we find that things work for you then we can use this bug to track the documentation necessary to announce deprecation.
You will see in comment #27 that FC9 was tested with certs without SANs and it does break. This breakage was caused by this when we switched to Go 1.15: https://tip.golang.org/doc/go1.15#commonname

Not sure there is something to "fix" per se, except the documentation, unless I misunderstood your intent.

(In reply to Scott Dodson from comment #33)
> Can you please test 4.6.0-fc.9 with certificates that lack a SAN? Our intent
> is to document this deprecation in 4.6 but defer enforcement until a later
> release given how late it was that we discovered this change.
>
> If we find that we break on certs that lack a SAN in 4.6.0-fc.9 we need to
> dig deeper and fix those components which still fail.
>
> If we find that things work for you then we can use this bug to track the
> documentation necessary to announce deprecation.
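A quick way to tell in advance whether a mirror-registry certificate will trip this Go 1.15 behavior is to check it for a subjectAltName extension. A small sketch (the cert path is a placeholder for wherever the registry cert lives in a given environment; the -ext option requires OpenSSL 1.1.1+):

```shell
# Placeholder path to the mirror registry certificate; adjust as needed.
CERT=/opt/registry/certs/domain.crt

if openssl x509 -in "$CERT" -noout -ext subjectAltName 2>/dev/null | grep -q 'DNS:'; then
  echo "cert has SANs: accepted by Go 1.15 clients"
else
  echo "cert is CN-only: Go 1.15 clients reject it unless GODEBUG=x509ignoreCN=0"
fi
```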
Thanks, we need to ensure that host level services have the environment variable set. I'm working on coordinating that, moving to installer component for now.
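As an illustration of the mechanism (not the actual installer/MCO change, whose details are not shown in this bug), one way host-level services can all inherit an environment variable is a systemd manager-level DefaultEnvironment drop-in:

```shell
# Hypothetical sketch: a manager-level drop-in so every systemd service
# inherits GODEBUG=x509ignoreCN=0 (re-enabling legacy CN matching).
mkdir -p /etc/systemd/system.conf.d
cat > /etc/systemd/system.conf.d/10-default-env-godebug.conf << 'EOF'
[Manager]
DefaultEnvironment=GODEBUG=x509ignoreCN=0
EOF
systemctl daemon-reexec       # re-execute the manager so the default applies
systemctl show-environment    # GODEBUG=x509ignoreCN=0 should now be listed
```

The transcript in the next comment shows exactly this kind of verification on a 4.7 nightly: show-environment reports the variable, and a throwaway unit echoes it back.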
$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.7.0-0.nightly-2020-10-26-124513   True        False         5h4m    Cluster version is 4.7.0-0.nightly-2020-10-26-124513

$ oc debug node/ip-10-0-136-111.us-west-2.compute.internal
Starting pod/ip-10-0-136-111us-west-2computeinternal-debug ...
To use host binaries, run `chroot /host`
If you don't see a command prompt, try pressing enter.
sh-4.2# chroot /host
sh-4.4# systemctl show-environment
GODEBUG=x509ignoreCN=0
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin
sh-4.4# cat << EOF > /etc/systemd/system/test.service
> [Service]
> ExecStart=echo $GODEBUG
> EOF
sh-4.4# systemctl start test.service
sh-4.4# journalctl -lu test.service
-- Logs begin at Mon 2020-10-26 19:34:50 UTC, end at Tue 2020-10-27 01:08:08 UTC. --
Oct 27 01:06:07 ip-10-0-136-111 systemd[1]: /etc/systemd/system/test.service:1: Assignment outside of section. Ignoring.
Oct 27 01:06:07 ip-10-0-136-111 systemd[1]: test.service: Service lacks both ExecStart= and ExecStop= setting. Refusing.
Oct 27 01:06:55 ip-10-0-136-111 systemd[1]: Started test.service.
Oct 27 01:06:55 ip-10-0-136-111 echo[580628]: x509ignoreCN=0
Oct 27 01:06:55 ip-10-0-136-111 systemd[1]: test.service: Consumed 1ms CPU time
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2020:5633
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days