Bug 1490647 - 3.6.2: logging-fluentd deployed with openshift_logging_use_mux=false fails to start due to missing mux secrets
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Logging
Version: 3.6.1
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: low
Target Milestone: ---
Target Release: 3.7.z
Assignee: Noriko Hosoi
QA Contact: Anping Li
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-09-12 00:13 UTC by Mike Fiedler
Modified: 2017-11-28 22:10 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
When logging-fluentd is deployed with secure-forward to send the collected logs to logging-mux, the ansible inventory must set openshift_logging_mux_client_mode=maximal together with openshift_logging_use_mux=True if the fluentd container and the mux container are on the same node. If openshift_logging_mux_client_mode=maximal is set without openshift_logging_use_mux=True, the mux secret directory "/etc/fluent/muxkeys" is mounted in the fluentd container even though the secret directory does not exist. This makes fluentd hang at startup when it tries to access the mux secrets. This patch checks the values of openshift_logging_mux_client_mode and openshift_logging_use_mux in the ansible playbook; if the former is set while the latter is false, the mux secret directory is not mounted in the fluentd container. In addition, if the fluentd start script finds that the mux secret directory does not exist, it disables openshift_logging_mux_client_mode even if it is enabled.
Clone Of:
Environment:
Last Closed: 2017-11-28 22:10:32 UTC
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2017:3188 0 normal SHIPPED_LIVE Moderate: Red Hat OpenShift Container Platform 3.7 security, bug, and enhancement update 2017-11-29 02:34:54 UTC

Description Mike Fiedler 2017-09-12 00:13:44 UTC
Description of problem:

Deployed logging v3.6.173.0.32 with openshift_logging_use_mux=false.   The logging-fluentd pods all failed to start with the following event:

46s       4m        10        logging-fluentd-4nc46                      Pod                                                      Warning   FailedMount         kubelet, ip-172-31-9-234.us-west-2.compute.internal    MountVolume.SetUp failed for volume "kubernetes.io/secret/09afe9e0-974e-11e7-8bb7-027af757fb2c-muxcerts" (spec.Name: "muxcerts") pod "09afe9e0-974e-11e7-8bb7-027af757fb2c" (UID: "09afe9e0-974e-11e7-8bb7-027af757fb2c") with: secrets "logging-mux" not found
38s       2m        2         logging-fluentd-4nc46                      Pod                                                      Warning   FailedMount         kubelet, ip-172-31-9-234.us-west-2.compute.internal    Unable to mount volumes for pod "logging-fluentd-4nc46_logging(09afe9e0-974e-11e7-8bb7-027af757fb2c)": timeout expired waiting for volumes to attach/mount for pod "logging"/"logging-fluentd-4nc46". list of unattached/unmounted volumes=[muxcerts]
38s       2m        2         logging-fluentd-4nc46                      Pod                                                      Warning   FailedSync          kubelet, ip-172-31-9-234.us-west-2.compute.internal    Error syncing pod

The workaround is to remove the references to the missing secrets.

Version-Release number of selected component (if applicable):  logging v3.6.173.0.32


How reproducible: Always when openshift_logging_use_mux=false


Steps to Reproduce:
1.  Deploy logging with the inventory below. (Some mux vars are present; my assumption was that use_mux=false would be the master switch.)

Actual results:

logging-fluentd pods fail to start with the above events re: missing logging-mux secrets

Expected results:

Successful deployment


[oo_first_master]
ip-172-31-24-59

[oo_first_master:vars]
openshift_deployment_type=openshift-enterprise
openshift_release=v3.6.0

openshift_logging_install_logging=true
openshift_logging_master_url=https://ec2-54-186-107-126.us-west-2.compute.amazonaws.com:8443
openshift_logging_master_public_url=https://ec2-54-186-107-126.us-west-2.compute.amazonaws.com:8443
openshift_logging_kibana_hostname=kibana.0907-ihc.qe.rhcloud.com
openshift_logging_namespace=logging
openshift_logging_image_prefix=registry.ops.openshift.com/openshift3/
openshift_logging_image_version=v3.6.173.0.32
openshift_logging_es_cluster_size=3
openshift_logging_es_pvc_dynamic=true
openshift_logging_es_pvc_size=50Gi
openshift_logging_fluentd_use_journal=true
openshift_logging_fluentd_read_from_head=false
openshift_logging_use_mux=false
openshift_logging_mux_client_mode=maximal
openshift_logging_use_ops=false

openshift_logging_es_cpu_limit=4000m
openshift_logging_fluentd_cpu_limit=500m
openshift_logging_mux_cpu_limit=1000m
openshift_logging_kibana_cpu_limit=200m
openshift_logging_kibana_proxy_cpu_limit=100m
openshift_logging_es_memory_limit=9Gi
openshift_logging_fluentd_memory_limit=512Mi
openshift_logging_mux_memory_limit=2Gi
openshift_logging_kibana_memory_limit=1Gi
openshift_logging_kibana_proxy_memory_limit=256Mi

openshift_logging_mux_file_buffer_storage_type=pvc
openshift_logging_mux_file_buffer_pvc_name=logging-mux-pvc
openshift_logging_mux_file_buffer_pvc_size=4Gi

Comment 1 Rich Megginson 2017-09-12 02:23:43 UTC
You have specified

openshift_logging_mux_client_mode=maximal

and

openshift_logging_use_mux=false

which one should take precedence?

Comment 2 Rich Megginson 2017-09-12 02:24:39 UTC
Ok, I see - the assumption (and probably the more intuitive one) is that use_mux=false controls _all other mux parameters_.

Comment 3 Mike Fiedler 2017-09-12 02:31:12 UTC
+1 to comment 2.   Apologies for the mixed-up inventory.

Comment 4 Jeff Cantrill 2017-09-13 14:00:00 UTC
Do we need to modify ansible to resolve this, or is this user error?

Comment 5 Rich Megginson 2017-09-13 14:08:34 UTC
@jcantrill yes - if openshift_logging_use_mux=False, then all other mux related parameters should be set to False (if boolean) or removed  (e.g. openshift_logging_mux_client_mode should be undefined)
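
A minimal sketch of the rule above, written as a pre-deployment check on the inventory file. The function name check_mux_inventory is hypothetical and is not part of openshift-ansible; only the two variable names come from this report. The sketch ignores openshift_logging_mux_allow_external and only flags the combination that triggered this bug.

```shell
#!/bin/sh
# Hypothetical pre-flight check; NOT part of openshift-ansible.
# Returns 1 (and warns on stderr) when the inventory disables mux
# but still defines openshift_logging_mux_client_mode.
check_mux_inventory() {
    inventory="$1"
    # Match use_mux=false or use_mux=False, tolerating surrounding whitespace.
    if grep -Eiq '^[[:space:]]*openshift_logging_use_mux[[:space:]]*=[[:space:]]*false[[:space:]]*$' "$inventory" &&
       grep -Eq '^[[:space:]]*openshift_logging_mux_client_mode[[:space:]]*=' "$inventory"; then
        echo "openshift_logging_mux_client_mode is set while openshift_logging_use_mux=false" >&2
        return 1
    fi
    return 0
}
```

Run against the inventory in the description, this check would return 1, because openshift_logging_use_mux=false appears together with openshift_logging_mux_client_mode=maximal.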

Comment 7 openshift-github-bot 2017-10-07 16:51:45 UTC
Commits pushed to master at https://github.com/openshift/origin-aggregated-logging

https://github.com/openshift/origin-aggregated-logging/commit/9954e962e62c8ef512ce6c7a0af7974f2afdbaf9
Bug 1490647 - logging-fluentd deployed with openshift_logging_use_mux=false fails to start due to missing

If openshift_logging_use_mux=False and openshift_logging_mux_allow_external=False,
then all other mux related parameters should be set to False (if boolean) or
removed (e.g. openshift_logging_mux_client_mode should be undefined).

To determine if mux is configured in run.sh, check whether the directory
/etc/fluent/muxkeys exists or not.  If it does not exist, MUX_CLIENT_MODE
is unset.

https://github.com/openshift/origin-aggregated-logging/commit/c50b8551f75b69fa4b61885d234b64c97647c2cd
Merge pull request #705 from nhosoi/bz1490647

Automatic merge from submit-queue.

Bug 1490647 - logging-fluentd deployed with openshift_logging_use_mux=false fails to start due to missing

If openshift_logging_use_mux=False and openshift_logging_mux_allow_external=False,
then all other mux related parameters should be set to False (if boolean) or
removed (e.g. openshift_logging_mux_client_mode should be undefined).

To determine if mux is configured in run.sh, check whether the directory
/etc/fluent/muxkeys exists or not.  If it does not exist, MUX_CLIENT_MODE
is unset.
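
The run.sh check described in the commit message could look roughly like the sketch below. This is an illustration under assumptions, not the code from the commit: the function name disable_mux_client_if_no_keys and its optional directory argument are hypothetical, and only the path /etc/fluent/muxkeys and the variable MUX_CLIENT_MODE come from the commit message.

```shell
#!/bin/sh
# Hypothetical helper illustrating the described check; the real run.sh may differ.
# Unsets MUX_CLIENT_MODE when the mux secret directory is not present.
disable_mux_client_if_no_keys() {
    # Directory argument is an assumption added so the check can be exercised
    # outside the container; it defaults to the path named in the commit message.
    mux_keys_dir="${1:-/etc/fluent/muxkeys}"
    if [ ! -d "$mux_keys_dir" ]; then
        # No mux secrets mounted: behave as if mux client mode was never requested.
        unset MUX_CLIENT_MODE
    fi
}
```

Keying the decision on the directory rather than on an inventory variable means the container protects itself even when the deployment passes MUX_CLIENT_MODE without mounting the secrets.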

Comment 8 Rich Megginson 2017-10-09 21:28:01 UTC
koji_builds:
  https://brewweb.engineering.redhat.com/brew/buildinfo?buildID=605535
repositories:
  brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/logging-fluentd:rhaos-3.7-rhel-7-docker-candidate-91975-20171009193313
  brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/logging-fluentd:latest
  brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/logging-fluentd:v3.7.0
  brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/logging-fluentd:v3.7.0-0.146.0.1
  brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/logging-fluentd:v3.7

Comment 9 openshift-github-bot 2017-10-10 22:17:17 UTC
Commits pushed to master at https://github.com/openshift/openshift-ansible

https://github.com/openshift/openshift-ansible/commit/554a9281265e0234af6a1de4142c67f5f8061de1
Bug 1490647 - logging-fluentd deployed with openshift_logging_use_mux=false fails to start due to missing

If openshift_logging_use_mux=False and openshift_logging_mux_allow_external=False,
then all other mux related parameters should be set to False (if boolean) or
removed (e.g. openshift_logging_mux_client_mode should be undefined).

https://github.com/openshift/openshift-ansible/commit/af04da3ae11cfe5cc80de214a4ec665f1d1676b1
Merge pull request #5693 from nhosoi/bz1490647

Automatic merge from submit-queue.

 Bug 1490647 - logging-fluentd deployed with openshift_logging_use_mux=false fails to start due to missing

If openshift_logging_use_mux=False and openshift_logging_mux_allow_external=False,
then all other mux related parameters should be set to False (if boolean) or
removed (e.g. openshift_logging_mux_client_mode should be undefined).

Comment 10 Anping Li 2017-10-13 04:39:27 UTC
I successfully deployed logging with a similar inventory using openshift-ansible:v3.7.0-0.148.0. The bug should be fixed.

Comment 14 errata-xmlrpc 2017-11-28 22:10:32 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:3188

