Bug 1808068

Summary: openshift-ca.crt replaced during 3.11.104 to 3.11.157 upgrade
Product: OpenShift Container Platform
Reporter: dtarabor
Component: Installer
Assignee: Russell Teague <rteague>
Installer sub component: openshift-ansible
QA Contact: Gaoyun Pei <gpei>
Status: CLOSED ERRATA
Docs Contact:
Severity: low
Priority: low
CC: cscribne, dominik.mierzejewski, rteague
Version: 3.11.0
Target Milestone: ---
Target Release: 3.11.z
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: The sync pod was missing mounts for the host pki directory (/usr/share/pki).
Consequence: The existing openshift-ca.crt was not accessible from the sync pod and was recreated.
Fix: Add the missing volume mounts and volumes to the sync pod.
Result: The openshift-ca.crt file is accessible in the sync pod.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2020-05-28 05:44:13 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1826448    
Bug Blocks:    

Description dtarabor 2020-02-27 18:13:24 UTC
Description of problem:

Version-Release number of the following components:
ansible-2.6.20-1.el7ae.noarch
openshift-ansible-3.11.157-1.git.0.10b76ed.el7.noarch
ansible 2.6.20

How reproducible:

Reproduced repeatedly in a test environment.

Steps to Reproduce:
1. Upgrade from 3.11.104 to 3.11.157

Actual results:
https://github.com/openshift/openshift-ansible/blob/release-3.11/roles/openshift_node_group/files/sync.yaml#L127-L133

[/usr/share/ansible/openshift-ansible/roles/openshift_node_group/tasks/sync.yml:]

The ca.crt is replaced, which breaks curl requests to endpoints that worked before the upgrade (in this case, an Artifactory repository that appeared unaltered until this play ran in the upgrade playbook).

Expected results:

curl requests to the Artifactory repository continue to succeed after the upgrade.

Additional info:

See SF case #02578970, comment #50, for:

curl attempts (curl_before*)
/etc/pki* (folder contents)
ansible logs (upgrade*.logs)

curl_before.log
pki_before.tgz
upgrade1.logs

curl_after_fail.log
pki_after_fail.tgz

pki_after_update_trust.tgz
curl_after_update_trust.log

upgrade2.logs
curl_after_2nd_upgrade.log
pki_after_2nd_upgrade.tgz
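
One way to confirm from the pki_before and pki_after_fail captures that a CA file was actually replaced, rather than only touched, is to compare certificate fingerprints. The sketch below is an assumption about how such a check could look and is not part of the case attachments. The function names are hypothetical, and the PEM text would come from whichever CA file is being compared inside the extracted archives.

```python
import hashlib
import re
import ssl

# Matches each PEM certificate block inside a bundle file.
_PEM_BLOCK = re.compile(
    r"-----BEGIN CERTIFICATE-----.*?-----END CERTIFICATE-----", re.DOTALL
)


def cert_fingerprints(pem_text):
    """Return the set of SHA-256 fingerprints of all certificates in pem_text."""
    prints = set()
    for block in _PEM_BLOCK.findall(pem_text):
        # Decode the base64 body to DER so formatting differences do not matter.
        der = ssl.PEM_cert_to_DER_cert(block)
        prints.add(hashlib.sha256(der).hexdigest())
    return prints


def bundle_diff(before_pem, after_pem):
    """Return (removed, added) fingerprint sets between two PEM bundles."""
    before = cert_fingerprints(before_pem)
    after = cert_fingerprints(after_pem)
    return before - after, after - before
```

Passing the text of the same CA file from each extracted archive to bundle_diff yields two empty sets when the certificates are unchanged. A non-empty result means certificates were dropped or added between the two captures.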

Comment 1 Chad Scribner 2020-03-05 14:39:06 UTC
This is the patch that was applied to rectify the issue.

--- /usr/share/ansible/openshift-ansible/roles/openshift_node_group/files/sync.yaml    2019-12-02 09:21:20.000000000 +0100
+++ /usr/share/ansible/openshift-ansible/roles/openshift_node_group/files/sync.yaml.fix    2020-02-21 17:03:00.000000000 +0100
@@ -226,6 +226,8 @@
           readOnly: true
         - mountPath: /etc/pki
           name: host-pki
+        - mountPath: /usr/share/pki
+          name: host-pki-usr

       volumes:
       # In bootstrap mode, the host config contains information not easily available
@@ -246,6 +248,10 @@
           path: /etc/pki
           type: ""
         name: host-pki
+      - hostPath:
+          path: /usr/share/pki
+          type: ""
+        name: host-pki-usr
       # Sync daemonset should tolerate all taints to make sure it runs on all nodes
       tolerations:
       - operator: "Exists"
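
To check whether a given copy of sync.yaml already carries this change, a text-level test is enough. The following is a minimal sketch and not part of openshift-ansible: the function name is hypothetical, and it assumes the manifest keeps the layout shown in the patch (mountPath followed by name for the mount; hostPath with path, an optional type, then name for the volume).

```python
import re


def declares_host_path(manifest, name, path):
    """Return True if manifest text has both a volumeMount and a hostPath
    volume called `name` that point at `path`."""
    n, p = re.escape(name), re.escape(path)
    # volumeMounts entry: "- mountPath: <path>" followed by "name: <name>".
    mount = re.search(
        r"mountPath:\s*" + p + r"\s*\n\s*name:\s*" + n + r"\s*$",
        manifest,
        re.MULTILINE,
    )
    # volumes entry: "hostPath:" / "path: <path>" / optional "type:" / "name: <name>".
    volume = re.search(
        r"hostPath:\s*\n\s*path:\s*" + p + r"\s*\n(?:\s*type:.*\n)?\s*name:\s*" + n + r"\s*$",
        manifest,
        re.MULTILINE,
    )
    return bool(mount and volume)
```

Calling declares_host_path(text, "host-pki-usr", "/usr/share/pki") on the contents of sync.yaml returns False for the unpatched file and True once both hunks of the patch are applied. A regex check is used here only to avoid a YAML dependency; parsing the template with a YAML library would be more robust against reordering of keys.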

Comment 5 Russell Teague 2020-05-07 17:15:41 UTC
The issues we had with 3.11 CI have been resolved and the PR has merged.  The change will be picked up by QE and tested before shipping in the next release.

Comment 10 Gaoyun Pei 2020-05-19 13:45:01 UTC
Thanks for the info.

With openshift-ansible-3.11.218-1.git.0.6f55149.el7.noarch, for both a fresh install and an upgrade from a previous 3.11 release, /usr/share/pki is now also mounted into the node sync pod.

[root@gpei-311bmaster-etcd-nfs-1 ~]# ls /usr/share/pki/ca-trust-source/anchors
test.crt

[root@gpei-311bmaster-etcd-nfs-1 ~]# oc describe -n openshift-node pod sync-ngmpq
...
Volumes:
...
  host-pki:
    Type:          HostPath (bare host directory volume)
    Path:          /etc/pki
    HostPathType:  
  host-pki-usr:
    Type:          HostPath (bare host directory volume)
    Path:          /usr/share/pki
    HostPathType:  

[root@gpei-311bmaster-etcd-nfs-1 ~]# oc -n openshift-node rsh sync-ngmpq
sh-4.2# ls /usr/share/pki/ca-trust-source/anchors
test.crt

So it should be OK to move this bug to verified.
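
The same check can be scripted against the pod's JSON form instead of reading the describe output by eye. The sketch below is an assumption and not part of the verification above: it expects the text produced by `oc get pod sync-ngmpq -n openshift-node -o json` as input, and the function name is hypothetical.

```python
import json


def host_path_mounts(pod_json):
    """Map each hostPath volume name to (host path, list of container mount paths).

    pod_json is the JSON text of a single Pod object.
    """
    spec = json.loads(pod_json).get("spec", {})
    # Collect hostPath volumes declared under spec.volumes.
    result = {
        vol["name"]: (vol["hostPath"]["path"], [])
        for vol in spec.get("volumes", [])
        if "hostPath" in vol
    }
    # Attach the mount paths of every container that uses those volumes.
    for container in spec.get("containers", []):
        for mount in container.get("volumeMounts", []):
            if mount["name"] in result:
                result[mount["name"]][1].append(mount["mountPath"])
    return result
```

For a pod created from the fixed sync.yaml, the result should contain a host-pki-usr entry whose host path is /usr/share/pki and whose mount list is not empty. A missing key, or an empty mount list, would indicate that only one half of the change is present.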

Comment 12 errata-xmlrpc 2020-05-28 05:44:13 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2215