Created attachment 1259427 [details]
ansible inventory file

Description of problem:
This issue was found while verifying https://bugzilla.redhat.com/show_bug.cgi?id=1426511. Logging 3.3.1 was installed first, with nodeselectors specified in the configmap (see 'Steps to Reproduce'). After upgrading logging from 3.3.1 to 3.5.0, the ES pod cannot start up; it is unable to read /etc/elasticsearch/secret/searchguard.truststore.

# oc get po
NAME                          READY     STATUS             RESTARTS   AGE
logging-curator-2-zn6lk       1/1       Running            5          24m
logging-deployer-1c4s8        0/1       Completed          0          50m
logging-es-63pgj4rj-2-0qn01   0/1       CrashLoopBackOff   9          24m
logging-fluentd-6sj96         1/1       Running            0          24m
logging-kibana-2-fxblb        2/2       Running            0          24m

# oc logs logging-es-63pgj4rj-2-0qn01
Comparing the specificed RAM to the maximum recommended for ElasticSearch...
Inspecting the maximum RAM available...
ES_JAVA_OPTS: '-Dmapper.allow_dots_in_name=true -Xms128M -Xmx4096m'
Checking if Elasticsearch is ready on https://localhost:9200
..[2017-03-03 07:27:06,021][INFO ][node                     ] [Iron Fist] version[2.4.4], pid[1], build[b3c4811/2017-01-18T03:01:12Z]
[2017-03-03 07:27:06,022][INFO ][node                     ] [Iron Fist] initializing ...
.[2017-03-03 07:27:07,065][INFO ][plugins                  ] [Iron Fist] modules [reindex, lang-expression, lang-groovy], plugins [search-guard-ssl, openshift-elasticsearch, cloud-kubernetes, search-guard-2], sites []
[2017-03-03 07:27:07,103][INFO ][env                      ] [Iron Fist] using [1] data paths, mounts [[/elasticsearch/persistent (/dev/xvda2)]], net usable_space [17.8gb], net total_space [24.9gb], spins? [possibly], types [xfs]
[2017-03-03 07:27:07,103][INFO ][env                      ] [Iron Fist] heap size [3.9gb], compressed ordinary object pointers [true]
Exception in thread "main" ElasticsearchException[Unable to read /etc/elasticsearch/secret/searchguard.truststore (/etc/elasticsearch/secret/searchguard.truststore) Please make sure this files exists and is readable regarding to permissions]
	at com.floragunn.searchguard.ssl.DefaultSearchGuardKeyStore.checkStorePath(DefaultSearchGuardKeyStore.java:551)
	at com.floragunn.searchguard.ssl.DefaultSearchGuardKeyStore.initSSLConfig(DefaultSearchGuardKeyStore.java:199)
	at com.floragunn.searchguard.ssl.DefaultSearchGuardKeyStore.<init>(DefaultSearchGuardKeyStore.java:139)
	at com.floragunn.searchguard.ssl.SearchGuardSSLModule.<init>(SearchGuardSSLModule.java:40)
	at com.floragunn.searchguard.ssl.SearchGuardSSLPlugin.nodeModules(SearchGuardSSLPlugin.java:126)
	at org.elasticsearch.plugins.PluginsService.nodeModules(PluginsService.java:263)
	at org.elasticsearch.node.Node.<init>(Node.java:179)
	at org.elasticsearch.node.Node.<init>(Node.java:140)
	at org.elasticsearch.node.NodeBuilder.build(NodeBuilder.java:143)
	at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:194)
	at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:286)
	at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:45)
Refer to the log for complete error details.
Version-Release number of selected component (if applicable):
openshift-ansible-3.5.20-1.git.0.5a5fcd5.el7.noarch

How reproducible:
Always

Steps to Reproduce:
1. Deploy the logging 3.3.1 stack (on OCP 3.5.0) with the journald log driver enabled and node selectors defined in the configmap; the curator, es and kibana nodeselectors differ from the fluentd nodeselector:
   "use-journal": "true"
   "curator-nodeselector": "logging-infra-east=true"
   "es-nodeselector": "logging-infra-east=true"
   "kibana-nodeselector": "logging-infra-east=true"
2. Upgrade to the logging 3.5.0 stack using ansible, specifying these parameters in the inventory file (as in the attachment); the curator, es and kibana nodeselectors differ from the fluentd nodeselector:
   openshift_logging_fluentd_use_journal=true
   openshift_logging_es_nodeselector={'logging-infra-east':'true'}
   openshift_logging_kibana_nodeselector={'logging-infra-east':'true'}
   openshift_logging_curator_nodeselector={'logging-infra-east':'true'}
   openshift_logging_fluentd_nodeselector={'logging-infra-fluentd':'true'}
3. Check the upgrade result.

Actual results:
The upgrade failed; the ES pod failed to start up.

Expected results:
The upgrade should be successful.

Additional info:
Ansible upgrade log attached.
Inventory file for the upgrade attached.
ES dc info attached.
Created attachment 1259430 [details]
ansible running log
Created attachment 1259431 [details]
es dc log
In ES 3.3 the truststore is named /etc/elasticsearch/secret/truststore. This commit changed

-    truststore_filepath: /etc/elasticsearch/secret/truststore

to

+    truststore.path: /etc/elasticsearch/secret/searchguard.truststore

commit b7f526dc6dfabf1a98db284984fbc7333080f067
Author: ewolinetz <ewolinet>
Date:   Tue Jul 12 10:05:20 2016 -0500

    bumping up versions to work with es 2.3.5 and kibana 4.5.4

I'm assuming this needs to be handled by the upgrade in ansible?
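To make the rename concrete: a 3.3-era secret only contains a file named truststore, while search-guard after this commit looks for searchguard.truststore. A minimal shell sketch of the mismatch and the carry-over an upgrade would need (illustration only, not the actual ansible fix; a temp directory stands in for the mounted secret at /etc/elasticsearch/secret, whose contents are really managed via `oc`):

```shell
# A temp directory stands in for the mounted secret volume.
SECRET_DIR=$(mktemp -d)
printf 'jks-bytes' > "$SECRET_DIR/truststore"   # pre-3.5 file name

# Carry the old truststore over to the name that search-guard-ssl now
# reads (truststore.path: .../searchguard.truststore), if it is missing.
if [ -f "$SECRET_DIR/truststore" ] && [ ! -f "$SECRET_DIR/searchguard.truststore" ]; then
    cp "$SECRET_DIR/truststore" "$SECRET_DIR/searchguard.truststore"
fi

ls "$SECRET_DIR"
```

Without a step like this, the mounted secret never contains searchguard.truststore and ES fails at startup exactly as in the description.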
How did you do the upgrade from 3.3 to 3.5? I'm following the official documentation for 3.4:
https://docs.openshift.com/container-platform/3.4/install_config/upgrading/automated_upgrades.html#preparing-for-an-automated-upgrade

except that I'm using

ansible-playbook -vvv -i /root/ansible-inventory playbooks/byo/openshift-cluster/upgrades/v3_5/upgrade.yml

And I get this error message:

MSG:

openshift_release is 3.3 which is not a valid release for a 3.5 upgrade
(In reply to Rich Megginson from comment #5)
> How did you do the upgrade from 3.3 to 3.5? I'm following the official
> documentation for 3.4:
> https://docs.openshift.com/container-platform/3.4/install_config/upgrading/
> automated_upgrades.html#preparing-for-an-automated-upgrade
>
> except that I'm using
>
> ansible-playbook -vvv -i /root/ansible-inventory
> playbooks/byo/openshift-cluster/upgrades/v3_5/upgrade.yml
>
> And I get this error message:
>
> MSG:
>
> openshift_release is 3.3 which is not a valid release for a 3.5 upgrade

Please see https://bugzilla.redhat.com/show_bug.cgi?id=1426511#c17
(In reply to Junqi Zhao from comment #6)
> (In reply to Rich Megginson from comment #5)
> > How did you do the upgrade from 3.3 to 3.5?
>
> please see https://bugzilla.redhat.com/show_bug.cgi?id=1426511#c17

excerpt:
> We specified the following ansible parameters to upgrade from 3.3.1 to 3.5.0

Specified where? How did you run ansible? What version of openshift-ansible did you use? Did you do a yum update (or git checkout) to go from openshift-ansible 3.3.1 to 3.5.0? Did you start with openshift-ansible 3.5.0 and somehow install logging 3.3.1?

> openshift_logging_install_logging=false
> openshift_logging_upgrade_logging=true

What I'm looking for is the exact, step-by-step instructions you used, because I am unable to reproduce based on the information given so far.
Created attachment 1260668 [details]
Deploy logging 3.3.1 shell script
@rmeggins,

1. Please use the attached 'Deploy logging 3.3.1 shell script' to deploy logging 3.3.1, changing the parameters according to your environment before deployment.

In my scenario there is one Master and one Node. The nodeSelector for fluentd is 'logging-infra-fluentd=true', and the nodeSelector for curator, es and kibana is 'logging-infra-east=true'. Since I have only one Node, you will see that both labels "logging-infra-fluentd=true" and "logging-infra-east=true" are applied to the Node.

I suggest you use JSON-FILE as the logging driver, since it is slow to show log entries in the Kibana UI. Please make sure log entries can be found in Kibana before your upgrade.

2. My openshift-ansible is installed by yum:

# rpm -qa | grep openshift-ansible
openshift-ansible-3.5.20-1.git.0.5a5fcd5.el7.noarch
openshift-ansible-docs-3.5.20-1.git.0.5a5fcd5.el7.noarch

You can install it from our puddle server 'rcm-guest/puddles/RHAOS/AtomicOpenShift/3.5/'. The playbooks are cloned from https://github.com/openshift/openshift-ansible/.

Use the following commands to upgrade to 3.5.0 with ansible:

# git clone https://github.com/openshift/openshift-ansible/
# cd openshift-ansible
# ansible-playbook -vvv -i $INVENTORY_FILE playbooks/common/openshift-cluster/openshift_logging.yml

$INVENTORY_FILE is the ansible inventory file used to do the upgrade work. Please use the attached 'ansible inventory file' to upgrade to logging 3.5, changing the parameters according to your environment as well. In the sample file, "ec2-52-202-98-194.compute-1.amazonaws.com" is the master and ansible_ssh_private_key_file is your private key file; please also change openshift_logging_kibana_hostname, openshift_logging_kibana_ops_hostname, public_master_url, openshift_logging_fluentd_hosts and other parameters. The nodeSelector part does not need to be changed; it is all the same as in logging 3.3.1.
This defect is not related to nodeSelector: even without setting a nodeSelector for curator, es and kibana, the ES pod still cannot start up after upgrading from 3.3.1 to 3.5.0 via ansible, with the same error as in this defect.

PS: Upgrading from 3.4.1 to 3.5.0 does not have this issue.
(In reply to Junqi Zhao from comment #9)
> @rmeggins,

0. I deployed an OSE 3.3 single host install using openshift-ansible-3.3:

yum install openshift-ansible openshift-ansible-docs openshift-ansible-callback-plugins openshift-ansible-filter-plugins openshift-ansible-lookup-plugins openshift-ansible-playbooks openshift-ansible-roles

then

cd /usr/share/ansible/openshift-ansible
ANSIBLE_LOG_PATH=/var/log/ansible.log ansible-playbook -vvv -i $INVENTORY playbooks/byo/config.yml

> 1. Please use the attached 'Deploy logging 3.3.1 shell script' to deploy
> logging 3.3.1, change the parameters according to your environment before
> deployment.

done

> In my scenario, there is one Master and one Node, nodeSelector for fluentd
> is 'logging-infra-fluentd=true', nodeSelector for curator, es and kibana is
> 'logging-infra-east=true', since I have only one Node, so you will see the
> nodeSelector "logging-infra-fluentd=true" and "logging-infra-east=true" are
> both labeled for Node.

done

> I suggest you use JSON-FILE as Logging driver, since it's slow to show log
> entry in Kibana UI.

yes

> Please make sure logging entries can be found in Kibana before your upgrade.

# date -u
Thu Mar  9 21:28:19 UTC 2017

# oc exec logging-es-mgvnymku-1-0v5ny -- curl -s -k --cert /etc/elasticsearch/secret/admin-cert --key /etc/elasticsearch/secret/admin-key https://localhost:9200/.operations.*/_search?size=1\&sort=time:desc | python -mjson.tool
...
"_index": ".operations.2017.03.09",
"time": "2017-03-09T16:28:22-05:00",

and

# oc exec logging-es-mgvnymku-1-0v5ny -- curl -s -k --cert /etc/elasticsearch/secret/admin-cert --key /etc/elasticsearch/secret/admin-key https://localhost:9200/logging.*/_search?size=1\&sort=time:desc | python -mjson.tool
...
"_index": "logging.bcd3dfc9-04f0-11e7-aed4-fa163ed71416.2017.03.10",
"time": "2017-03-09T21:24:09.738471446Z",

so Elasticsearch is up-to-date.

> 2. My openshift-ansible is installed by yum
> # rpm -qa | grep openshift-ansible
> openshift-ansible-3.5.20-1.git.0.5a5fcd5.el7.noarch
> openshift-ansible-docs-3.5.20-1.git.0.5a5fcd5.el7.noarch

I edited my /etc/yum.repos.d/rhaos.repo (puddle) file to look like this:

baseurl = http://download.eng.bos.redhat.com/rcm-guest/puddles/RHAOS/AtomicOpenShift/3.5/latest/x86_64/os/

Then I did `yum update openshift-ansible openshift-ansible-docs`:

# rpm -q openshift-ansible openshift-ansible-docs
openshift-ansible-3.5.28-1.git.0.103513e.el7.noarch
openshift-ansible-docs-3.5.28-1.git.0.103513e.el7.noarch

> you can install it from our puddle server
> 'rcm-guest/puddles/RHAOS/AtomicOpenShift/3.5/'
>
> playbooks are git cloned from git clone
> https://github.com/openshift/openshift-ansible/

I'm not sure why you are using openshift-ansible from rpm packaging, but using git for the playbooks, when they are available from

yum install openshift-ansible-callback-plugins openshift-ansible-filter-plugins openshift-ansible-lookup-plugins openshift-ansible-playbooks openshift-ansible-roles

but, ok.

> Use the following command to upgrade to 3.5.0 by ansible.
> # git clone https://github.com/openshift/openshift-ansible/
> # cd openshift-ansible
> # ansible-playbook -vvv -i $INVENTORY_FILE
> playbooks/common/openshift-cluster/openshift_logging.yml
>
> $INVENTORY_FILE is your ansible inventory file used to do upgrade work.
>
> Please use the attached 'ansible inventory file' to upgrade to logging 3.5,
> change the parameter according to your environment too. In the sample file,
> "ec2-52-202-98-194.compute-1.amazonaws.com" is master,
> ansible_ssh_private_key_file is your private key file, also please change
> openshift_logging_kibana_hostname, openshift_logging_kibana_ops_hostname,
> public_master_url, openshift_logging_fluentd_hosts and other parameters. The
> nodeSelector part does not need to be changed, they all the same with
> logging 3.3.1.

ok. I am now able to reproduce the problem.
Note that this _does not upgrade to ocp 3.5_ - this runs logging 3.5.0 containers _on top of ose 3.3_. Do we even support that?
submitted PR: https://github.com/openshift/openshift-ansible/pull/3616
@rmeggins,

We use OCP 3.5.0 now, so for this issue Logging 3.3.1 was installed on OCP 3.5.0 and then upgraded to Logging 3.5.0.

Verified with your fix: the ES pod is running now, but there are exceptions in the ES log (see the attached file), and curator's status changed from Running -> Error -> CrashLoopBackOff -> Running, and finally to CrashLoopBackOff; there is no log for the curator pod. This issue did not happen before your fix.

# oc get po
NAME                          READY     STATUS      RESTARTS   AGE
logging-curator-2-fsc4p       1/1       Running     5          11m
logging-deployer-pvvxt        0/1       Completed   0          1h
logging-es-s6smjn2c-2-5wz6d   1/1       Running     0          56m
logging-fluentd-4kf0b         1/1       Running     0          55m
logging-kibana-2-5v6pq        2/2       Running     0          55m

openshift-ansible and the playbooks are yum installed. Versions:
openshift-ansible-3.5.25-1.git.0.a40beae.el7.noarch
openshift-ansible-playbooks-3.5.25-1.git.0.a40beae.el7.noarch

# ansible --version
ansible 2.2.1.0
Created attachment 1261796 [details]
es pod log, SSL Problem Received fatal alert: unknown_ca
Well, it looks like https://github.com/openshift/openshift-ansible/pull/3616 doesn't help in this case, but it should be fixed anyway.
Commit pushed to master at https://github.com/openshift/openshift-ansible

https://github.com/openshift/openshift-ansible/commit/bc3042fbb66f6a231056d665f2f82cdc6f6d4a3b
Bug 1428711 - [IntService_public_324] ES pod is unable to read searchguard.truststore after upgarde logging from 3.3.1 to 3.5.0

https://bugzilla.redhat.com/show_bug.cgi?id=1428711

The list of secrets for elasticsearch was missing searchguard.truststore
ewolinetz, jcantrill:

The problem is that there are two different CA certs: one for the Elasticsearch certs, which is created by the 3.5 ansible playbooks, and one for the other components, created by the 3.3 deployer. Elasticsearch doesn't trust the 3.3 CA cert, and the other components do not trust the ES 3.5 CA.

I think the problem is the way ansible handles the upgrade. In 3.3 the certs are generated with an "ephemeral" CA; the key only exists inside the deployer pod:

+ openshift admin ca create-signer-cert --key=/etc/deploy/scratch/ca.key --cert=/etc/deploy/scratch/ca.crt --serial=/etc/deploy/scratch/ca.serial.txt --name=logging-signer-20170315001154

However, this CA key/serial is not saved anywhere, so it cannot be used again.

When 3.5 is installed for the first time, ansible will create a new CA (logging-signer-test) and create potential new certs/keys/truststores/keystores in case some are missing (generate-certs.yaml). However, it doesn't actually install them unless they are missing from the secrets in the openshift_logging_facts (generate_secrets.yaml). This is the case for elasticsearch: the task "Generating secrets for elasticsearch" will replace the existing secrets because the new list ["admin-cert", "searchguard.key", "admin-ca", "key", "truststore", "admin-key", "searchguard.truststore"] does not match the old list ["admin-ca", "admin-cert", "admin-key", "key", "searchguard.key", "truststore"]. The new certs use the new CA, and the old services don't trust ES (and vice versa).

In this case, ansible doesn't need to install new secrets for all of the above, perhaps just the searchguard.key, which has quite a different format post 3.3. And the contents of searchguard.truststore are identical to truststore.

One workaround is to add the new CA cert to the CA cert file of the other services, and add the old CA cert to the Elasticsearch CA and truststores. I guess this can be done by editing the secrets; I'll have to find out.
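The two-CA situation can be reproduced locally with openssl (throwaway self-signed CAs; the names logging-signer-old/-test here are placeholders standing in for the real deployer/ansible signers, not the actual secret contents): a cert signed by one CA fails verification against the other, and trusting both CA certs at once is what makes the append-the-CA workaround viable.

```shell
# Two independent signers, as in the 3.3-deployer / 3.5-ansible split.
workdir=$(mktemp -d) && cd "$workdir"
openssl req -x509 -newkey rsa:2048 -nodes -days 1 \
    -keyout old-ca.key -out old-ca.crt -subj "/CN=logging-signer-old" 2>/dev/null
openssl req -x509 -newkey rsa:2048 -nodes -days 1 \
    -keyout new-ca.key -out new-ca.crt -subj "/CN=logging-signer-test" 2>/dev/null

# A component cert signed by the old CA (standing in for e.g. kibana's cert).
openssl req -newkey rsa:2048 -nodes -keyout kibana.key -out kibana.csr \
    -subj "/CN=logging-kibana" 2>/dev/null
openssl x509 -req -in kibana.csr -CA old-ca.crt -CAkey old-ca.key \
    -CAcreateserial -days 1 -out kibana.crt 2>/dev/null

# Verifies against the CA that signed it, fails against the other one...
openssl verify -CAfile old-ca.crt kibana.crt
openssl verify -CAfile new-ca.crt kibana.crt || echo "rejected by the new CA"

# ...but a bundle containing both CA certs accepts it, which is why
# appending the other side's CA cert to each trust store works.
cat old-ca.crt new-ca.crt > both-ca.crt
openssl verify -CAfile both-ca.crt kibana.crt
```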
Finally, from what I've been able to find out, upgrade directly from 3.3 to 3.5 is not supported. The supported upgrade path is 3.3 to 3.4 to 3.5. But there still may be a problem if there is no way to handle the CA certs correctly. Eric, do you know if the old upgrade in 3.4 will correctly handle the certs as described above? If so, then this bug is likely CLOSED INVALID.
@rmeggins,

Upgrading from 3.4.1 to 3.5.0 via ansible was successful. Checking the upgrade log, there were these outputs:

No matching indices found - skipping update_for_uuid
No matching indexes found - skipping update_for_common_data_model

According to the defect about upgrading from 3.2 to 3.4, https://bugzilla.redhat.com/show_bug.cgi?id=1395170#c3, step 4) says: "Observe in upgrade pod that this isn't seen 'No matching indexes found - skipping update_for_common_data_model'".

Is it the same for the upgrade from 3.4.1 to 3.5.0 via ansible? Should "No matching indexes found - skipping update_for_common_data_model" not appear in the upgrade log?

I remember that the file roles/openshift_logging/files/fluent.conf has been changed to use

@include configs.d/openshift/filter-viaq-data-model.conf

so I think we can ignore this output; please correct me if I am wrong.
Created attachment 1263202 [details]
upgrade log from 3.4.1 to 3.5.0 via ansible
(In reply to Junqi Zhao from comment #19)
> @rmeggins,
>
> Upgraded from 3.4.1 to 3.5.0 via ansible was successful, checked the upgrade
> log, there were outputs:
> No matching indices found - skipping update_for_uuid
> No matching indexes found - skipping update_for_common_data_model
>
> according to defect which about upgrade from 3.2 to 3.4:
> https://bugzilla.redhat.com/show_bug.cgi?id=1395170#c3,
> 4) "Observe in upgrade pod that this isn't seen "No matching indexes found -
> skipping update_for_common_data_model".
>
> Is it the same with upgrade from 3.4.1 to 3.5.0 via ansible, should not
> there have "No matching indexes found - skipping
> update_for_common_data_model" in upgrade log?

It is not the same as the upgrade from 3.4 to 3.5. When upgrading from 3.4 to 3.5 I would expect to see "No matching indexes found - skipping update_for_common_data_model", because the 3.4 indices are already using the common data model.

> I remember the file roles/openshift_logging/files/fluent.conf, we have been
> changed to
> @include configs.d/openshift/filter-viaq-data-model.conf
>
> I think we can ignore this output, please correct me if I am wrong.

You are correct.

So there still may be a bug when upgrading from 3.3 to 3.4. Was the 3.3 to 3.4 upgrade tested for the OCP 3.4 release? If so, I think we can close this bug.

At any rate, if you run into this situation, the workaround is this:

* Dump all of your secrets that contain CA information, e.g.:

  $ oc get secret logging-kibana \
      --template='{{index .data "ca"}}' | base64 -d > kibana.ca
  $ oc get secret logging-elasticsearch \
      --template='{{index .data "truststore"}}' | base64 -d > es.truststore
  $ oc get secret logging-elasticsearch \
      --template='{{index .data "key"}}' | base64 -d > es.key
  $ oc get secret logging-elasticsearch \
      --template='{{index .data "admin-ca"}}' | base64 -d > es.ca

* For the PEM based CA files (kibana.ca, etc.), just append the es.ca:

  $ cat es.ca >> kibana.ca

* For the jks based files, import the kibana.ca:

  $ keytool -import -file kibana.ca -keystore es.truststore -storepass tspass -noprompt -alias old-ca
  $ keytool -import -file kibana.ca -keystore es.key -storepass kspass -noprompt -alias old-ca

* base64-encode all of these:

  $ for file in kibana.ca es.truststore .... ; do cat $file | base64 -w 0 > $file.b64 ; done

* Edit the secrets, e.g. `oc edit secret logging-kibana`, `oc edit secret logging-elasticsearch`, etc., and replace the CA value with the contents of the corresponding .b64 file. For example, in `oc edit secret logging-kibana`, replace the value of the "ca:" key with the contents of kibana.ca.b64.

* Redeploy and restart all logging pods.
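For reference, the base64 step above can be sanity-checked locally before pasting anything into `oc edit secret` (the file name kibana.ca matches the workaround; its contents here are a stand-in, not a real cert):

```shell
# Verify the encode/decode round trip GNU base64 performs on a secret value.
workdir=$(mktemp -d) && cd "$workdir"
printf -- '-----BEGIN CERTIFICATE-----\nc3RhbmQtaW4=\n-----END CERTIFICATE-----\n' > kibana.ca

# -w 0 disables line wrapping: a value inside a secret must be one
# unbroken base64 string, not the default 76-column wrapped form.
base64 -w 0 < kibana.ca > kibana.ca.b64

# Decoding must reproduce the original bytes exactly.
base64 -d < kibana.ca.b64 > kibana.ca.check
cmp kibana.ca kibana.ca.check && echo "round trip OK"
```

If `cmp` reports a difference, the value was mangled (usually by stray newlines from a wrapped encoding) and should not be pasted into the secret.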
@rmeggins,

We did 3.3 to 3.4 upgrade testing on the OCP 3.4 release, and we also did 3.2 to 3.4 upgrade testing; the upgrade from logging 3.2 to 3.4 was successful.

Since we are not going to fix this issue, I think this defect should be closed as WONTFIX instead of NOTABUG.