Bug 1442277 - ansible installer sometimes confuses PVs for metrics and logging
Summary: ansible installer sometimes confuses PVs for metrics and logging
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 3.5.0
Hardware: Unspecified
OS: Unspecified
low
medium
Target Milestone: ---
: ---
Assignee: ewolinet
QA Contact: Junqi Zhao
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-04-13 22:00 UTC by Wolfram Richter
Modified: 2017-08-16 19:51 UTC

Fixed In Version:
Doc Type: Enhancement
Doc Text:
Feature: Ability to provide PV selectors for PVs created during installation.
Reason: When installing logging and metrics with the installer, a PV created for logging would sometimes be bound to a metrics PVC, causing confusion.
Result: You can now provide a PV selector in your inventory when installing logging and metrics. The PVs created carry the matching label, so the generated PVCs bind to the correct PVs.
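As a rough illustration of why labels resolve the ambiguity, the sketch below models label-selector matching in plain Python. It is not the installer's code or the Kubernetes binder; the label key `storage`, its values, and the dictionary layout are assumptions chosen for the example.

```python
# Illustrative model only: not the Kubernetes binder or openshift-ansible code.
# The label key "storage" and its values are assumptions for this sketch.

def selector_matches(pv_labels, match_labels):
    """True if every key/value pair in match_labels is present in pv_labels."""
    return all(pv_labels.get(k) == v for k, v in match_labels.items())


def candidate_pvs(pvc, pvs):
    """Names of PVs whose labels satisfy the PVC's matchLabels-style selector.

    A PVC without a selector accepts any PV (capacity and access mode
    checks are deliberately left out of this sketch).
    """
    match_labels = pvc.get("selector", {})
    return [pv["name"] for pv in pvs if selector_matches(pv["labels"], match_labels)]


pvs = [
    {"name": "logging-volume", "labels": {"storage": "logging"}},
    {"name": "metrics-volume", "labels": {"storage": "metrics"}},
]

# Without a selector, either PV can satisfy the claim, so binding is ambiguous.
unlabelled_claim = {"name": "logging-es-0"}
# With a selector, only the PV carrying the matching label qualifies.
labelled_claim = {"name": "logging-es-0", "selector": {"storage": "logging"}}

print(candidate_pvs(unlabelled_claim, pvs))
print(candidate_pvs(labelled_claim, pvs))
```

With two same-sized NFS volumes and no selector, both PVs are valid candidates for either claim, which is consistent with the roughly random swapping reported here.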
Clone Of:
Environment:
Last Closed: 2017-08-10 05:20:02 UTC
Target Upstream Version:
Embargoed:


Attachments
ansible inventory (5.02 KB, text/plain)
2017-04-13 22:00 UTC, Wolfram Richter


Links
System ID Private Priority Status Summary Last Updated
Red Hat Bugzilla 1445071 0 low CLOSED Installer is tacking on additional -volume string to persistent volume names 2021-02-22 00:41:40 UTC
Red Hat Product Errata RHEA-2017:1716 0 normal SHIPPED_LIVE Red Hat OpenShift Container Platform 3.6 RPM Release Advisory 2017-08-10 09:02:50 UTC

Internal Links: 1445071

Description Wolfram Richter 2017-04-13 22:00:00 UTC
Created attachment 1271571
ansible inventory

Description of problem:
I installed an OCP 3.5 cluster via the Ansible advanced installer and configured it to roll out metrics and logging, using NFS shares as storage. After the rollout, the PVs for metrics and logging are swapped: each is bound to the other component's PVC.


Version-Release number of selected component (if applicable):

OpenShift Master:
    v3.5.5.5
Kubernetes Master:
    v1.5.2+43a9be4 

How reproducible:
50% - it seems to be random whether metrics and logging PVs are correct or swapped

Steps to Reproduce:
1. Configure the Ansible inventory as in the attached sample.
2. Run the installer.
3. Check the assignment with oc get pv.
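The check in step 3 could be scripted. The following is a minimal sketch, assuming the PV names and projects from this report; `find_swapped_pvs` and `EXPECTED_NAMESPACE` are hypothetical names, and the whitespace-based parsing relies on every column to the left of CLAIM being populated for bound volumes.

```python
# Hypothetical helper for checking `oc get pv` output; not part of the installer.
# Expected project (namespace) for each PV name, taken from this bug report.
EXPECTED_NAMESPACE = {
    "logging-volume": "logging",
    "metrics-volume": "openshift-infra",
}


def find_swapped_pvs(oc_get_pv_output, expected=None):
    """Return {pv_name: claim} for bound PVs claimed from an unexpected project."""
    expected = EXPECTED_NAMESPACE if expected is None else expected
    lines = [ln for ln in oc_get_pv_output.splitlines() if ln.strip()]
    if not lines:
        return {}
    header = lines[0].split()
    status_idx = header.index("STATUS")
    claim_idx = header.index("CLAIM")
    mismatches = {}
    for line in lines[1:]:
        fields = line.split()
        name = fields[0]
        # Skip PVs we have no expectation for, and rows too short to hold a claim.
        if name not in expected or len(fields) <= claim_idx:
            continue
        # Unbound PVs have an empty CLAIM column, so only inspect bound ones.
        if fields[status_idx] != "Bound":
            continue
        claim = fields[claim_idx]
        namespace = claim.split("/", 1)[0]
        if namespace != expected[name]:
            mismatches[name] = claim
    return mismatches


SAMPLE = """\
NAME             CAPACITY   ACCESSMODES   RECLAIMPOLICY   STATUS      CLAIM                        REASON    AGE
logging-volume   10Gi       RWO           Retain          Bound       openshift-infra/metrics-1              12m
metrics-volume   10Gi       RWO           Retain          Bound       logging/logging-es-0                   12m
"""

print(find_swapped_pvs(SAMPLE))
```

An empty result means every listed PV is claimed from its expected project.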

Actual results:
ovpn-116-18:~ wolfram$ oc get pv
NAME             CAPACITY   ACCESSMODES   RECLAIMPOLICY   STATUS      CLAIM                        REASON    AGE
logging-volume   10Gi       RWO           Retain          Bound       openshift-infra/metrics-1              12m
metrics-volume   10Gi       RWO           Retain          Bound       logging/logging-es-0                   12m

Expected results:
logging-volume is claimed in the logging project and metrics-volume is claimed in the openshift-infra project

Additional info:

Comment 1 Eric Paris 2017-04-17 18:24:15 UTC
marking this as low severity. Things work. NFS isn't supported for metrics as I understand it.

Now if we do want to fix it, we could/should partially fill in the claimRef when creating the PV. That way the binder will always find the right PVC.

In any case, other than looking silly, things should be working just fine.
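The claimRef idea could look roughly like the sketch below, which builds PV manifests with `spec.claimRef` pre-filled so that each volume is reserved for one specific claim. This is an illustration of the suggestion, not the fix that shipped; the helper name, the NFS server, and the export paths are placeholders, while the claim names and projects are the ones shown in this report.

```python
import json

# Sketch only: nfs_pv_with_claim_ref is a hypothetical helper, and the NFS
# server and export paths used below are placeholders.


def nfs_pv_with_claim_ref(name, namespace, claim, server, path, size="10Gi"):
    """Build a PersistentVolume manifest reserved for one specific PVC."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolume",
        "metadata": {"name": name},
        "spec": {
            "capacity": {"storage": size},
            "accessModes": ["ReadWriteOnce"],
            "persistentVolumeReclaimPolicy": "Retain",
            "nfs": {"server": server, "path": path},
            # Naming the claim here restricts which PVC may bind this volume.
            "claimRef": {"namespace": namespace, "name": claim},
        },
    }


logging_pv = nfs_pv_with_claim_ref(
    "logging-volume", "logging", "logging-es-0",
    server="nfs.example.com", path="/exports/logging",
)
metrics_pv = nfs_pv_with_claim_ref(
    "metrics-volume", "openshift-infra", "metrics-1",
    server="nfs.example.com", path="/exports/metrics",
)

print(json.dumps(logging_pv, indent=2))
```

The manifests are printed as JSON, so they can be inspected or saved to a file before anything is created in a cluster.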

Comment 2 Johnny Liu 2017-04-18 02:44:38 UTC
(In reply to Eric Paris from comment #1)
> marking this as low severity. Things work. NFS isn't supported for metrics
> as I understand it.

According to [1] and [2], our official docs do not state that NFS is unsupported for logging and metrics. If we really do not support it, we should remove all NFS storage options from those docs.


[1]: https://github.com/openshift/openshift-ansible/blob/master/inventory/byo/hosts.ose.example#L472
[2]: https://docs.openshift.com/container-platform/3.5/install_config/install/advanced_install.html#metrics-storage

Comment 17 Junqi Zhao 2017-06-20 09:54:12 UTC
# oc get pv
NAME             CAPACITY   ACCESSMODES   RECLAIMPOLICY   STATUS    CLAIM                       STORAGECLASS   REASON    AGE
logging-volume   10Gi       RWO           Retain          Bound     logging/logging-es-0                                 44m
metrics-volume   10Gi       RWO           Retain          Bound     openshift-infra/metrics-1                            44m

However, I tested on:
# openshift version
openshift v3.6.74
kubernetes v1.6.1+5115d708d7
etcd 3.1.0

This is not the latest OCP 3.6.0 puddle. I will re-test tomorrow, but I believe this issue is fixed.

Comment 18 Junqi Zhao 2017-06-21 01:30:15 UTC
Tested again with the latest OCP 3.6.0 puddle. The issue is fixed: the installer did not confuse the PVs for metrics and logging.

# oc get pv
NAME             CAPACITY   ACCESSMODES   RECLAIMPOLICY   STATUS    CLAIM                       STORAGECLASS   REASON    AGE
logging-volume   10Gi       RWO           Retain          Bound     logging/logging-es-0                                 55m
metrics-volume   10Gi       RWO           Retain          Bound     openshift-infra/metrics-1                            55m

# openshift version
openshift v3.6.121
kubernetes v1.6.1+5115d708d7
etcd 3.2.0


Setting it to VERIFIED.

Comment 19 Junqi Zhao 2017-06-21 01:31:33 UTC
The logging and metrics images are from the ops mirror:

# docker images | grep logging
logging-elasticsearch   v3.6                41d18242fe2e        2 hours ago         404.5 MB
logging-kibana          v3.6                1f9fde5cd870        2 hours ago         342.4 MB
logging-fluentd         v3.6                7e35b19b4e40        2 hours ago         232.5 MB
logging-curator         v3.6                a0148dd96b8d        12 days ago         221.5 MB


# docker images | grep metrics
metrics-hawkular-metrics   v3.6                fe12100d5533        About an hour ago   1.293 GB
metrics-cassandra          v3.6                500536a9d23e        2 hours ago         572.8 MB
metrics-heapster           v3.6                821075975cc9        2 hours ago         274.4 MB

Comment 21 errata-xmlrpc 2017-08-10 05:20:02 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:1716

