Bug 1482661

Summary: We do not preserve the security context and node selector for the elasticsearch dc after running ansible-playbook /usr/share/ansible/openshift-ansible/playbooks/byo/openshift-cluster/openshift-logging.yml after upgrade
Product: OpenShift Container Platform Reporter: Jeff Cantrill <jcantril>
Component: Logging Assignee: Jeff Cantrill <jcantril>
Status: CLOSED ERRATA QA Contact: Anping Li <anli>
Severity: medium Docs Contact:
Priority: unspecified    
Version: 3.6.0 CC: aivaras.laimikis, anli, aos-bugs, erich, jcantril, jokerman, misalunk, mmccomas, nnosenzo, pdwyer, pportant, rmeggins, sdodson, xiazhao
Target Milestone: ---   
Target Release: 3.6.z   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Cause: The nodeSelector and supplementalGroups from existing deployments were ignored because these parameters were expected to be defined in the inventory file. There was no way to define values for each DC, which hinders the use case where a deployment uses hostmount volumes.
Consequence: The nodeSelector and supplementalGroups were replaced with those defined in the inventory file.
Fix: Use the nodeSelector and supplementalGroups from the logging facts if they exist for a given DC.
Result: The nodeSelector and supplementalGroups are retained when changes are applied.
Story Points: ---
Clone Of: 1478771 Environment:
Last Closed: 2017-10-17 11:45:24 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1478771    
Bug Blocks:    
Attachments:
Description Flags
The deployment config before and after upgrade none

Comment 1 Jeff Cantrill 2017-08-17 21:01:25 UTC
Fixed in PR: openshift-ansible/pull/5124
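
The precedence described in the Doc Text (use the values recorded for a given DC if they exist, otherwise fall back to the inventory) can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from the PR: the function name, the shape of the facts dictionary, and the sample default values are all hypothetical.

```python
def resolve_dc_settings(dc_name, logging_facts, inventory_defaults):
    """Pick nodeSelector and supplementalGroups for one Elasticsearch DC.

    Hypothetical data shapes: `logging_facts` maps a DC name to the
    settings found on the existing deployment, and `inventory_defaults`
    holds the values that would come from the inventory file.
    """
    existing = logging_facts.get(dc_name, {})
    resolved = {}
    for key in ("nodeSelector", "supplementalGroups"):
        if existing.get(key):
            # A value already present on the DC wins over the inventory.
            resolved[key] = existing[key]
        else:
            # Nothing recorded for this DC: fall back to the inventory.
            resolved[key] = inventory_defaults.get(key)
    return resolved


# Illustrative values only; the defaults below are made up for the example.
facts = {"logging-es-gcpg63hx": {"nodeSelector": {"logging-es-node": "1"}}}
defaults = {"nodeSelector": {"region": "infra"}, "supplementalGroups": [65534]}
settings = resolve_dc_settings("logging-es-gcpg63hx", facts, defaults)
```

In this sketch the existing nodeSelector is kept for the DC, while supplementalGroups, which the facts do not contain, comes from the inventory defaults.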

Comment 2 Rich Megginson 2017-09-08 16:13:56 UTC
Has this been fixed in 3.6 or 3.7 atomic-openshift-ansible?  Can we move this to ON_QA?

Comment 4 Anping Li 2017-09-14 09:01:55 UTC
Using openshift-ansible-3.6.173.0.33, the securityContext and nodeSelector are preserved as they were before.
#oc get dc logging-es-gcpg63hx -o yaml | grep security -A 1
      securityContext:
        supplementalGroups:
#oc get dc logging-es-gcpg63hx -o yaml | grep nodeSelector -A 2
      nodeSelector:
        logging-es-node: "1"
      restartPolicy: Always
#oc get dc logging-es-mbtd8lz3 -o yaml | grep security -A 1
      securityContext:
        supplementalGroups:
#oc get dc logging-es-mbtd8lz3 -o yaml | grep nodeSelector -A 2
      nodeSelector:
        logging-es-node: "2"
      restartPolicy: Always
#oc get dc logging-es-tybl6f11 -o yaml | grep security -A 1
      securityContext:
        supplementalGroups:
#oc get dc logging-es-tybl6f11 -o yaml | grep nodeSelector -A 2
      nodeSelector:
        logging-es-node: "3"
      restartPolicy: Always
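
The grep checks above could also be done on the JSON form of a DC (for example the output of `oc get dc <name> -o json`). The sketch below is a hedged illustration: the helper name is hypothetical, and it assumes that both fields sit in the pod template spec, which is what the indentation of the YAML output above suggests.

```python
import json


def placement_settings(dc_json):
    """Extract nodeSelector and supplementalGroups from a DC given as JSON text.

    Assumes both fields live under spec.template.spec (the pod template).
    Missing fields are returned as None.
    """
    dc = json.loads(dc_json)
    pod_spec = dc.get("spec", {}).get("template", {}).get("spec", {})
    security = pod_spec.get("securityContext") or {}
    return {
        "nodeSelector": pod_spec.get("nodeSelector"),
        "supplementalGroups": security.get("supplementalGroups"),
    }
```

Running this helper against each DC before and after the playbook and comparing the two results gives the same check as the commands above, without depending on grep context lines.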

Comment 5 Anping Li 2017-09-14 09:03:49 UTC
openshift-ansible-3.6.173.0.33 isn't in errata 30362. Can you confirm whether we need an installer errata?

Comment 6 Anping Li 2017-09-21 11:29:51 UTC
@scott, the fix is in openshift-ansible. Could you move this bug to the proper installer errata?

Comment 9 Anping Li 2017-09-26 03:37:09 UTC
I missed one securityContext in comment 10. There are two securityContext entries in the DC.
The first is created by ansible.
The second is created by the 'oc patch' command, following the document [1].
The second one is still overwritten when using openshift-ansible-3.5.125 with the fix PR.

@Jeff, could you confirm whether we need to persist the second securityContext?


[1]
https://docs.openshift.com/container-platform/3.5/install_config/aggregate_logging.html -> Persistent Elasticsearch Storage -> 2. Each Elasticsearch replica definition must be patched to claim that privilege, for example:
$ for dc in $(oc get deploymentconfig --selector logging-infra=elasticsearch -o name); do
    oc scale $dc --replicas=0
    oc patch $dc \
       -p '{"spec":{"template":{"spec":{"containers":[{"name":"elasticsearch","securityContext":{"privileged": true}}]}}}}'
  done

Comment 10 Anping Li 2017-09-26 07:57:56 UTC
Created attachment 1330898 [details]
The deployment config before and after upgrade

[1] and [2] were removed during the upgrade; the nodeSelector [3] was preserved. From an upgrade perspective, we should preserve all changes except where the inventory requests a change.


[1]
securityContext:
  privileged: true
[2]
- hostPath:
    path: /usr/local/es-storage 
  name: elasticsearch-storage
[3]
  nodeSelector:
    logging-node: "1"
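
A before/after comparison like the one attached here can be automated once both DC definitions are loaded as dictionaries. The following is a rough sketch only: the function name is hypothetical, and it walks nested mappings but does not look inside lists, so an entry removed from a list such as volumes (item [2]) would need extra handling.

```python
def dropped_paths(before, after, prefix=""):
    """List dotted paths that exist in `before` but are missing from `after`.

    Only nested dictionaries are compared; list contents are not inspected.
    """
    missing = []
    for key, value in before.items():
        path = f"{prefix}.{key}" if prefix else key
        if key not in after:
            missing.append(path)
        elif isinstance(value, dict) and isinstance(after[key], dict):
            # Recurse into nested mappings to find keys dropped further down.
            missing.extend(dropped_paths(value, after[key], path))
    return missing


# Illustrative fragments modelled on items [1] and [3] in this comment.
before = {"securityContext": {"privileged": True},
          "nodeSelector": {"logging-node": "1"}}
after = {"securityContext": {},
         "nodeSelector": {"logging-node": "1"}}
lost = dropped_paths(before, after)
```

With these fragments the only path reported is the privileged flag, which matches the observation that the nodeSelector survived while the patched securityContext did not.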

Comment 11 Jeff Cantrill 2017-09-27 18:44:26 UTC
The fix addresses the original issues.  Moved items from #c10 to https://bugzilla.redhat.com/show_bug.cgi?id=1496271

Comment 12 Anping Li 2017-09-28 01:47:40 UTC
The nodeSelector values are persisted, so moving to verified. Note that the securityContext issue will be fixed in Bug 1496271.

Comment 13 openshift-github-bot 2017-10-05 10:45:10 UTC
Commit pushed to master at https://github.com/openshift/openshift-ansible

https://github.com/openshift/openshift-ansible/commit/ec7d1b04ef91a3d10675efe1c53a88ef100437b8
bug 1482661. Preserve ES dc nodeSelector and supplementalGroups

(cherry picked from commit 601e35cbf4410972c7fa0a1d3d5c6327b82353ac)

Comment 15 errata-xmlrpc 2017-10-17 11:45:24 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:2900