Bug 1531157 - logging-fluentd v3.9.0-0.16.0.2 immediately starts flooding "missing namespace" errors on startup
Status: CLOSED ERRATA
Product: OpenShift Container Platform
Classification: Red Hat
Component: Logging
Version: 3.9.0
Hardware: x86_64 Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 3.9.0
Assigned To: Jeff Cantrill
QA Contact: Mike Fiedler
Keywords: TestBlocker
Depends On:
Blocks: 1502764
Reported: 2018-01-04 12:01 EST by Mike Fiedler
Modified: 2018-03-28 10:17 EDT

Doc Type: No Doc Update
Last Closed: 2018-03-28 10:17:25 EDT
Type: Bug

Attachments
logging-fluentd log (4.49 MB, text/plain)
2018-01-04 12:01 EST, Mike Fiedler


External Trackers
Tracker                  ID                                             Priority  Status  Summary  Last Updated
Github                   openshift/origin-aggregated-logging/pull/883   None      None    None     2018-01-04 22:05 EST
Github                   openshift/origin-aggregated-logging/pull/898   None      None    None     2018-01-17 18:42 EST
Red Hat Product Errata   RHBA-2018:0489                                 None      None    None     2018-03-28 10:17 EDT

Description Mike Fiedler 2018-01-04 12:01:54 EST
Created attachment 1377024
logging-fluentd log

Description of problem:

The latest logging-fluentd image as of 4 Jan (v3.9.0-0.16.0.2) appears to be broken. Immediately on startup, the fluentd pod floods its log with errors about records missing the kubernetes.namespace_id field, and the message content in those records is garbage. A partial message is below; the full log is attached.

There are no pods running on the system. Docker is configured with the json-file log driver.
This issue was not seen with logging-fluentd v3.9.0-0.9.0.

2018-01-04 16:43:04 +0000 [error]: record cannot use elasticsearch index name type project_full: record is missing kubernetes.namespace_id field: {"docker"=>{"container_id"=>"cdad990c2155e438df453d8caf4808424539ded32bba674536ff69df06b1e25e"}, "kubernetes"=>{"container_name"=>"fluentd-elasticsearch", "namespace_name"=>"logging", "pod_name"=>"logging-fluentd-7mcsn", "pod_id"=>"38f5d4c7-f16e-11e7-b343-024338e41dd2", "labels"=>{"component"=>"fluentd", "controller-revision-hash"=>"2355984793", "logging-infra"=>"fluentd", "pod-template-generation"=>"1", "provider"=>"openshift"}, "host"=>"ip-172-31-15-26.us-west-2.compute.internal", "master_url"=>"https://kubernetes.default.svc.cluster.local"}, "message"=>"\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\

<snip - see attached for full messages>
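For triaging a log like the attached one, a minimal sketch along these lines could summarize which records trigger the error. It assumes the error lines keep the shape quoted above (a `[error]:` prefix, the phrase `record is missing ... field`, and a Ruby-style hash dump of the record); the function name `summarize` and the regular expressions are illustrative, not part of fluentd or any plugin.

```python
import re
from collections import Counter

# Error text as quoted in the report; the field name is captured so that
# other "missing ... field" variants would be counted as well.
MISSING_FIELD = re.compile(r"\[error\]: .*record is missing (\S+) field")
# Template for a key as it appears in the Ruby-style hash dump of the record.
KEY_VALUE = r'"{key}"=>"([^"]*)"'


def summarize(lines):
    """Count missing-field errors per (field, namespace_name, pod_name)."""
    counts = Counter()
    for line in lines:
        match = MISSING_FIELD.search(line)
        if not match:
            continue  # not one of the errors we are looking for
        ns = re.search(KEY_VALUE.format(key="namespace_name"), line)
        pod = re.search(KEY_VALUE.format(key="pod_name"), line)
        key = (
            match.group(1),
            ns.group(1) if ns else None,
            pod.group(1) if pod else None,
        )
        counts[key] += 1
    return counts
```

Calling `summarize(open("fluentd.log"))` on a saved copy of the log (the file name is hypothetical) would return a `Counter` keyed by field, namespace, and pod. In the sample line above, the offending record comes from the logging-fluentd pod itself in the logging namespace.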

Version-Release number of selected component (if applicable): logging-fluentd v3.9.0-0.16.0.2


How reproducible: Always when starting logging-fluentd


Steps to Reproduce:
1. Deploy logging v3.9.0-0.16.2 normally using openshift-ansible (docker configured on all nodes for json-file).
2. Verify that Elasticsearch starts correctly.
3. Run oc logs <fluentd pod> on a system where no other pods are running.
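Step 3 can be turned into a quick pass/fail check. The sketch below is only an illustration: it assumes a logged-in `oc` client, assumes the fluentd pod lives in the `logging` namespace (taken from the record in the error message above), and uses hypothetical helper names `fetch_pod_log` and `count_errors`.

```python
import subprocess

# Marker taken from the error message quoted in the description.
ERROR_MARKER = "record is missing kubernetes.namespace_id field"


def fetch_pod_log(pod, namespace="logging"):
    """Return the pod log as text by running `oc logs` (needs a logged-in client)."""
    result = subprocess.run(
        ["oc", "logs", pod, "-n", namespace],
        capture_output=True,
        text=True,
        check=True,  # raise if oc exits with a non-zero status
    )
    return result.stdout


def count_errors(log_text, marker=ERROR_MARKER):
    """Count the log lines that contain the marker string."""
    return sum(1 for line in log_text.splitlines() if marker in line)
```

For example, `count_errors(fetch_pod_log("<fluentd pod>"))` returning a non-zero value shortly after startup would indicate that the pod shows the symptom described here.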

Actual results:

See the attached errors. Additionally, no pod logs appear in the Elasticsearch indices, although operations logs are created.


Expected results:

fluentd starts up normally.


Additional info:
Comment 1 Anping Li 2018-01-17 02:06:32 EST
fluentd.logs reached 323M in 10 min. There are pods in the Evicted state.

docker-registry-1-4nqhg       1/1       Running   0          20h
docker-registry-1-4sngb       0/1       Evicted   0          22h
docker-registry-1-78mz5       0/1       Evicted   0          22h
docker-registry-1-stnqn       0/1       Evicted   0          21h
docker-registry-1-tqsjk       0/1       Evicted   0          22h
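A listing like the one above can be tallied by status with a few lines of Python. This is a sketch that assumes the default `oc get pods` column order (NAME, READY, STATUS, RESTARTS, AGE); the function name `pods_by_status` is illustrative.

```python
from collections import Counter


def pods_by_status(listing):
    """Tally pods by the STATUS column (third whitespace-separated field)."""
    counts = Counter()
    for line in listing.splitlines():
        fields = line.split()
        if len(fields) < 3 or fields[0] == "NAME":
            continue  # skip blank or short lines and the header row, if present
        counts[fields[2]] += 1
    return counts


listing = """\
docker-registry-1-4nqhg       1/1       Running   0          20h
docker-registry-1-4sngb       0/1       Evicted   0          22h
docker-registry-1-78mz5       0/1       Evicted   0          22h
docker-registry-1-stnqn       0/1       Evicted   0          21h
docker-registry-1-tqsjk       0/1       Evicted   0          22h
"""
status_counts = pods_by_status(listing)
```

For the listing in this comment, the tally is four Evicted pods and one Running pod.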
Comment 3 Mike Fiedler 2018-01-22 14:18:30 EST
Problem still occurs on registry.reg-aws.openshift.com:443/openshift3/logging-fluentd:v3.9.0-0.22.0.0

registry.reg-aws.openshift.com:443/openshift3/logging-fluentd             v3.9.0-0.22.0.0     35b4c7263b16        2 days ago          275.5 MB
Comment 4 Jeff Cantrill 2018-01-22 16:10:25 EST
I don't see where [1] is in the latest puddles [2], which is the only way this issue will be resolved. Can you help us out? The gem [1] should be available in 3.6->3.9 puddles.

[1] https://brewweb.engineering.redhat.com/brew/buildinfo?buildID=646322
[2] http://download-node-02.eng.bos.redhat.com/rcm-guest/puddles/RHAOS/AtomicOpenShift/3.9/latest/x86_64/os/Packages/
Comment 5 Rich Megginson 2018-01-23 12:15:31 EST
(In reply to Jeff Cantrill from comment #4)
> I don't see where [1] is in the latest puddles [2], which is the only way
> this issue will be resolved. Can you help us out? The gem [1] should be
> available in 3.6->3.9 puddles.
> 
> [1] https://brewweb.engineering.redhat.com/brew/buildinfo?buildID=646322
> [2] http://download-node-02.eng.bos.redhat.com/rcm-guest/puddles/RHAOS/AtomicOpenShift/3.9/latest/x86_64/os/Packages/

1.0.1 was tagged into 3.9, 3.8, 3.7, 3.6, and those puddles were rebuilt.  You should be good to go for rebuilding the fluentd images for those releases.
Comment 6 Anping Li 2018-01-23 23:35:49 EST
The fix isn't in logging-fluentd/images/v3.9.0-0.23.0.0.
Comment 8 Mike Fiedler 2018-01-29 13:37:52 EST
Verified on 3.9.0-0.31.0.  logging-fluentd is working normally in this puddle.
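To check whether a given image tag is at or after the build verified in this comment, the tags can be compared numerically. The sketch below assumes tags follow the `v<version>-<release>` pattern seen in this report, with dot-separated integers on both sides; `parse_tag` and `at_or_after_verified` are hypothetical helper names.

```python
def parse_tag(tag):
    """Turn 'v3.9.0-0.16.0.2' into ((3, 9, 0), (0, 16, 0, 2)) for ordering."""
    version, _, release = tag.lstrip("v").partition("-")

    def to_ints(part):
        # Empty pieces are skipped so a tag without a release still parses.
        return tuple(int(n) for n in part.split(".") if n)

    return to_ints(version), to_ints(release)


# Build reported as verified in this bug.
VERIFIED_GOOD = parse_tag("3.9.0-0.31.0")


def at_or_after_verified(tag):
    """True if the tag sorts at or after the verified build."""
    return parse_tag(tag) >= VERIFIED_GOOD
```

The tags reported as still failing in this bug (v3.9.0-0.16.0.2, v3.9.0-0.22.0.0, v3.9.0-0.23.0.0) all sort before the verified build. The check says nothing about older builds such as v3.9.0-0.9.0, which predates the regression.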
Comment 11 errata-xmlrpc 2018-03-28 10:17:25 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0489
