Bug 1920839 - Systemd-journald MachineConfig ERROR in OCP 4.7 cluster logging stack for collecting (kernel, services, daemons) logs to Elasticsearch and viewing them in Kibana.
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Documentation
Version: 4.7
Hardware: s390x
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.7.0
Assignee: Rolfe Dlugy-Hegwer
QA Contact: Kabir Bharti
Docs Contact: Petr Kovar
URL:
Whiteboard:
Depends On:
Blocks: ocp-47-z-tracker
Reported: 2021-01-27 06:19 UTC by Sanjaya
Modified: 2021-06-21 14:53 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-06-21 14:52:58 UTC
Target Upstream Version:
Embargoed:


Attachments
ocp4-7 config error when applying machine config (95.46 KB, image/png)
2021-01-27 06:24 UTC, Sanjaya
OCP 4-6 node service logs (89.42 KB, image/jpeg)
2021-01-27 06:26 UTC, Sanjaya
OCP 4-6 node service logs in Kibana (192.78 KB, image/jpeg)
2021-01-27 06:28 UTC, Sanjaya
journald-setup-and-its-output (81.46 KB, image/png)
2021-01-28 04:29 UTC, Sanjaya


Links
System ID Private Priority Status Summary Last Updated
Github openshift openshift-docs pull 28995 0 None open BZ#1920839 Fix variable expansion in logging docs 2021-06-17 12:47:08 UTC
Red Hat Issue Tracker RHDEVDOCS-3074 0 Low Closed Tracker: Bug 1920839 - Systemd-journald MachineConfig ERROR in OCP 4.7 cluster logging stack for collecting (kernal, ser... 2021-06-21 14:53:56 UTC

Description Sanjaya 2021-01-27 06:19:06 UTC
Description of problem:
Hi,

I tried to configure systemd-journald for cluster logging so that kernel, service, and daemon logs are collected in Elasticsearch and visualized in the Kibana UI. Unfortunately, after running the command oc apply -f machineconfiguration-worker.yaml, I get a parsing error. The same procedure has been verified and works fine in OCP 4.6.

Reference doc: https://docs.openshift.com/container-platform/4.6/logging/config/cluster-logging-systemd.html

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1. Set up a cluster logging instance.
2. Create a journald.conf file with the required settings.
3. Convert the journald.conf file to base64: export jrnl_cnf=$( cat /journald.conf | base64 -w0 )
4. Create a new MachineConfig object for worker (copied from the doc) and apply it.
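
Steps 2 and 3 can be sketched as follows. This is a minimal sketch, assuming GNU coreutils base64 (for -w0 and -d) and a scratch directory instead of /journald.conf; the settings are a shortened subset of the ones posted later in this bug.

```shell
# Work in a scratch directory (assumption; the report used /journald.conf).
workdir=$(mktemp -d)

# Step 2: write a journald.conf with the desired settings.
cat > "$workdir/journald.conf" << 'EOF'
Compress=yes
ForwardToConsole=no
ForwardToSyslog=no
Storage=persistent
EOF

# Step 3: encode the file as a single base64 line (-w0 disables wrapping).
jrnl_cnf=$(base64 -w0 < "$workdir/journald.conf")
export jrnl_cnf

# Sanity check: decoding must reproduce the file byte for byte.
printf '%s' "$jrnl_cnf" | base64 -d | diff - "$workdir/journald.conf" \
  && echo "round trip OK"
```

The variable only exists in the current shell session, so whatever writes the MachineConfig file has to run in that same session and substitute the value itself.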

Actual results:


Expected results:
systemd journal logs (kernel, service, and daemon logs) should be collected in Elasticsearch and visualized in Kibana.

Additional info:

# cat machineconfiguration-worker.yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker
  name: 50-corp-journald
spec:
  config:
    ignition:
      version: 3.1.0
    storage:
      files:
      - contents:
          source: data:text/plain;charset=utf-8;base64,${jrnl_cnf}
        mode: 0644
        overwrite: true
        path: /etc/systemd/journald.conf


Error, seen after running oc apply -f machineconfiguration-worker.yaml and then oc describe machineconfigpool/worker:

parsing Ignition config spec v3.1 failed with error: config is not valid
Report: error at $.storage.files.0.contents.source, line 1 col 75: invalid data character .
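
The message points at the source value. One reading, consistent with the diagnosis given later in this thread, is that the file on disk still contained the literal text ${jrnl_cnf}: oc apply -f reads the file as-is and does not perform shell variable expansion. A minimal sketch of a pre-flight check, assuming a hypothetical helper named has_unexpanded_vars and a scratch file:

```shell
# Hypothetical helper: succeeds if the file still contains a ${name} placeholder.
has_unexpanded_vars() {
  grep -Eq '[$][{][A-Za-z_][A-Za-z0-9_]*[}]' "$1"
}

# Reproduce the situation: a quoted here-document ('EOF') suppresses expansion,
# so the placeholder is written literally, as it would be from a text editor.
mc_file=$(mktemp)
cat > "$mc_file" << 'EOF'
source: data:text/plain;charset=utf-8;base64,${jrnl_cnf}
EOF

if has_unexpanded_vars "$mc_file"; then
  echo "unexpanded placeholder found; do not apply this file"
fi
```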

Comment 1 Sanjaya 2021-01-27 06:24:57 UTC
Created attachment 1751138 [details]
ocp4-7 config error when applying machine config

Comment 2 Sanjaya 2021-01-27 06:26:21 UTC
Created attachment 1751139 [details]
OCP 4-6 node service logs

This was working fine in ocp 4.6

Comment 3 Sanjaya 2021-01-27 06:28:06 UTC
Created attachment 1751141 [details]
OCP 4-6 node service logs in Kibana

This was working fine in ocp 4.6

Comment 4 Carvel Baus 2021-01-27 19:25:28 UTC
The error appears to be in whatever your environment has $jrnl_cnf defined as. It appears it does not like something when that is expanded. 

Can you provide the output of

 echo $jrnl_cnf

as well as the contents of the journald.conf file you created? I did not see either attached.

Comment 6 Sanjaya 2021-01-28 04:25:39 UTC
Hi,

Please find below the output of echo $jrnl_cnf and the input file.
[root@bastion ~]# echo $jrnl_cnf
Q29tcHJlc3M9eWVzIApGb3J3YXJkVG9Db25zb2xlPW5vIApGb3J3YXJkVG9TeXNsb2c9bm8KTWF4UmV0ZW50aW9uU2VjPTFtb250aCAKUmF0ZUxpbWl0QnVyc3Q9MTAwMDAgClJhdGVMaW1pdEludGVydmFsPTFzClN0b3JhZ2U9cGVyc2lzdGVudCAKU3luY0ludGVydmFsU2VjPTFzIApTeXN0ZW1NYXhVc2U9OGcgClN5c3RlbUtlZXBGcmVlPTIwJSAKU3lzdGVtTWF4RmlsZVNpemU9MTBNIAo=
[root@bastion ~]#
[root@bastion ~]# cat /journald.conf
Compress=yes
ForwardToConsole=no
ForwardToSyslog=no
MaxRetentionSec=1month
RateLimitBurst=10000
RateLimitInterval=1s
Storage=persistent
SyncIntervalSec=1s
SystemMaxUse=8g
SystemKeepFree=20%
SystemMaxFileSize=10M
[root@bastion ~]#
[root@bastion ~]# cat machineconfiguration-worker.yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker
  name: 50-corp-journald
spec:
  config:
    ignition:
      version: 3.1.0
    storage:
      files:
      - contents:
          source: data:text/plain;charset=utf-8;base64,${jrnl_cnf}
        mode: 0644
        overwrite: true
        path: /etc/systemd/journald.conf
[root@bastion ~]# oc get mcp
NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
master   rendered-master-7cf67c5854a43275b73735c04016803b   True      False      False      3              3                   3                     0                      14d
worker   rendered-worker-0026e514b5a56359c5a2827f360d4cbe   True      False      True       3              3                   3                     0                      14d


Thanks.

Comment 7 Sanjaya 2021-01-28 04:29:14 UTC
Created attachment 1751563 [details]
journald-setup-and-its-output

Comment 8 Andy McCrae 2021-02-01 10:46:30 UTC
Hi Sanjaya, 

The issue is that the jrnl_cnf variable needs to be expanded in the yaml file before applying the machine config. If you follow this document you should be successful in making the change: https://docs.openshift.com/container-platform/4.6/post_installation_configuration/machine-configuration-tasks.html#machineconfig-modify-journald_post-install-machine-configuration-tasks - note the cat << EOF, which will create the file with the variable expanded. I've made a PR to update the documentation you linked to match this approach: https://github.com/openshift/openshift-docs/pull/
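
The cat << EOF approach mentioned above can be sketched as follows. This is a sketch under stated assumptions, not the exact text of the linked documentation: the object name is taken from the oc describe output later in this bug, the Ignition version and file fields come from the reporter's YAML, the output path is a hypothetical scratch file, and the journald.conf content is shortened. Because the here-document delimiter is unquoted, the shell substitutes ${jrnl_cnf} while writing the file.

```shell
# Stand-in for the real journald.conf (assumption: shortened content).
jrnl_cnf=$(printf 'Compress=yes\nStorage=persistent\n' | base64 -w0)

# Hypothetical output path for the generated manifest.
out=$(mktemp)

# Unquoted EOF: the shell expands ${jrnl_cnf} as it writes the file.
cat > "$out" << EOF
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker
  name: 40-worker-custom-journald
spec:
  config:
    ignition:
      version: 3.1.0
    storage:
      files:
      - contents:
          source: data:text/plain;charset=utf-8;base64,${jrnl_cnf}
        mode: 0644
        overwrite: true
        path: /etc/systemd/journald.conf
EOF

# Confirm the placeholder is gone before applying.
grep 'source:' "$out"
# oc apply -f "$out"   # run against a cluster once the file looks right
```

Quoting the delimiter (<< 'EOF') would disable the substitution and reproduce the original error, so the delimiter must stay unquoted here.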

Comment 9 Sanjaya 2021-02-04 08:17:29 UTC
(In reply to Andy McCrae from comment #8)
> Hi Sanjaya, 
> 
> The issue is that the jrnl_cnf variable needs to be expanded in the yaml
> file before applying the machine config. If you follow this document you
> should be successful in making the change:
> https://docs.openshift.com/container-platform/4.6/
> post_installation_configuration/machine-configuration-tasks.
> html#machineconfig-modify-journald_post-install-machine-configuration-tasks
> - note the cat << EOF, which will create the file with the variable
> expanded. I've made a PR to update the documentation you linked to match
> this approach: https://github.com/openshift/openshift-docs/pull/

Thanks Andy. I followed the steps in the linked doc. The jrnl_cnf variable is now expanded by the here-document (cat << EOF), and the MCP is working as expected. After that, journald.conf is reflected on each worker node, and I am able to collect kernel, daemon, and service logs from the worker nodes too.

# oc describe mc/40-worker-custom-journald
....
      Version:  3.1.0
    Networkd:
    Passwd:
    Storage:
      Files:
        Contents:
          Source:  data:text/plain;charset=utf-8;base64,Q29tcHJlc3M9eWVzCkZvcndhcmRUb0NvbnNvbGU9bm8KRm9yd2FyZFRvU3lzbG9nPW5vCk1heFJldGVudGlvblNlYz0xbW9udGgKUmF0ZUxpbWl0QnVyc3Q9MTAwMDAKUmF0ZUxpbWl0SW50ZXJ2YWw9MXMKU3RvcmFnZT1wZXJzaXN0ZW50ClN5bmNJbnRlcnZhbFNlYz0xcwpTeXN0ZW1NYXhVc2U9OGcKU3lzdGVtS2VlcEZyZWU9MjAlClN5c3RlbU1heEZpbGVTaXplPTEwTQo=
          Verification:
        Filesystem:  root
        Mode:        420
        Path:        /etc/systemd/journald.conf
    Systemd:
  Os Image URL:
Events:          <none>
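
To double-check a rendered MachineConfig like the one above, the Source payload can be decoded locally and the decimal Mode converted back to octal. A minimal sketch, assuming GNU base64; the payload here is a shortened stand-in generated in the script, not the value from the cluster.

```shell
# Stand-in data URL (assumption); in practice, copy the Source value from
# the oc describe output.
payload=$(printf 'Compress=yes\nStorage=persistent\n' | base64 -w0)
source_url="data:text/plain;charset=utf-8;base64,${payload}"

# Strip everything up to and including "base64," and decode the rest.
decoded=$(printf '%s' "${source_url#*base64,}" | base64 -d)
printf '%s\n' "$decoded"

# The describe output shows the file mode in decimal; 420 decimal is 0644 octal.
mode_octal=$(printf '%04o' 420)
echo "mode: $mode_octal"
```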

Comment 10 Dan Li 2021-03-15 16:52:02 UTC
Hi Andy, should this bug still be assigned to you since it's Documentation?

Comment 11 Dan Li 2021-04-01 13:16:10 UTC
Hi Andy, following up - can we link your PR in Comment 8 to this bug or put a full path of the PR in a comment (currently the address is incomplete)

Comment 12 Andy McCrae 2021-06-17 12:47:10 UTC
Apologies, I missed this update, and I accidentally did not paste the full link.
I've tracked down the change; the full PR URL is https://github.com/openshift/openshift-docs/pull/28995, which I've added as a link to the bz.

