Bug 1878857

Summary: Logging - Fixing a logic bug in elasticsearch output template
Product: Red Hat Enterprise Linux 8 Reporter: Noriko Hosoi <nhosoi>
Component: rhel-system-roles Assignee: Noriko Hosoi <nhosoi>
Status: CLOSED ERRATA QA Contact: Guilherme Santos <gdeolive>
Severity: unspecified Docs Contact: Eliane Ramos Pereira <elpereir>
Priority: unspecified    
Version: 8.3 CC: elpereir, gdeolive, kanderso, rmeggins, sradco
Target Milestone: rc Flags: pm-rhel: mirror+
Target Release: 8.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: rhel-system-roles-1.0-21.el8 Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2021-02-16 14:23:55 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Noriko Hosoi 2020-09-14 17:38:57 UTC
Description of problem:
The elasticsearch logging_output has an option retryfailures.

When the template evaluated retryfailures for "on", the value was negated by
a "not" that should not have been there, so "on" was rendered as "off".
The fix removes the "not".
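
The effect of the negation can be sketched in Python. This is only an illustration: the actual template is not quoted in this report, the function names render_buggy and render_fixed are hypothetical, and the sketch assumes the role receives the YAML value "on" as a boolean true.

```python
def render_buggy(value: bool) -> str:
    # Hypothetical reconstruction of the pre-fix behavior:
    # the extra "not" inverts the option before it is rendered.
    return "on" if not value else "off"


def render_fixed(value: bool) -> str:
    # Behavior after removing the "not": true maps to "on", false to "off".
    return "on" if value else "off"


for setting in (True, False):
    print(
        f"input={setting}: "
        f'buggy retryfailures="{render_buggy(setting)}", '
        f'fixed retryfailures="{render_fixed(setting)}"'
    )
```

With the input set to true, the buggy variant yields "off", which matches the actual result reported here, and the fixed variant yields "on".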

How reproducible:
always

Steps to Reproduce:
1. Configure ovirt+elasticsearch with "retryfailures: on" as follows.
  vars:
    logging_inputs:
      - name: ovirt_collectd_input
        type: ovirt
        subtype: collectd
      - name: ovirt_engine_input
        type: ovirt
        subtype: engine
      - name: ovirt_vdsm_input
        type: ovirt
        subtype: vdsm
    logging_outputs:
      - name: elasticsearch_output
        type: elasticsearch
        server_host: logging-es
        server_port: 9200
        index_prefix: project.
        input_type: ovirt
        retryfailures: on
        <<<snip>>>

2. Run ansible-playbook with the test playbook and inventory, etc.
3. Check the retryfailures value in /etc/rsyslog.d/31-output-elasticsearch-elasticsearch_output.conf
    $ grep retryfailure *.conf
    31-output-elasticsearch-elasticsearch_output.conf:            retryfailures="off"

Actual results:
    retryfailures="off"

Expected results:
    retryfailures="on"
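
The grep check in step 3 can also be scripted. The following is a minimal sketch, not part of the role: find_param is a hypothetical helper, the sample text is a shortened stand-in for the generated file, and matching the parameter name case-insensitively is a convenience assumed here.

```python
import re
from typing import Optional


def find_param(conf_text: str, name: str) -> Optional[str]:
    """Return the quoted value assigned to `name` in the config text, or None."""
    pattern = rf'\b{re.escape(name)}\s*=\s*"([^"]*)"'
    match = re.search(pattern, conf_text, re.IGNORECASE)
    return match.group(1) if match else None


# Shortened stand-in for the generated rsyslog file (illustrative only).
sample = '''
action(
    type="omelasticsearch"
    retryfailures="off"
)
'''

print(find_param(sample, "retryfailures"))
```

To check a real host, read the generated file under /etc/rsyslog.d/ and pass its contents as conf_text.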

Additional info:
  [Workaround] To get retryfailures="on" in the generated configuration, set retryfailures to off in the elasticsearch logging_output as follows.
    logging_outputs:
      - name: elasticsearch_output
        type: elasticsearch
        server_host: logging-es
        server_port: 9200
        index_prefix: project.
        input_type: ovirt
        retryfailures: off

Comment 5 Noriko Hosoi 2020-10-26 16:07:50 UTC
Steps to reproduce for PR/181.
1. Configure an elasticsearch output with the following parameters.
     dynSearchIndex: false
     bulkmode: false
     dynbulkid: false
     allowUnsignedCerts: true
     retryfailures: false
     usehttps: false
2. Run ansible-playbook with the logging role playbook/inventory and check that the parameters are converted as follows in the generated rsyslog output:
     dynSearchIndex="off"
     bulkmode="off"
     dynbulkid="off"
     allowUnsignedCerts="on"
     retryfailures="off"
     usehttps="off"
3. Check rsyslogd is up and running with no errors.
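
The comparison in step 2 can be expressed as a small check. This is a sketch under assumptions: the helper names expected_rendering and mismatches are hypothetical, and the rendered values are typed in by hand from the list above rather than read from a real host.

```python
from typing import Dict, Optional, Tuple


def expected_rendering(options: Dict[str, bool]) -> Dict[str, str]:
    # Booleans from the playbook should map to rsyslog "on"/"off" strings.
    return {name: ("on" if value else "off") for name, value in options.items()}


def mismatches(
    options: Dict[str, bool], rendered: Dict[str, str]
) -> Dict[str, Tuple[str, Optional[str]]]:
    # Return {parameter: (expected, actual)} for every parameter that differs.
    expected = expected_rendering(options)
    return {
        name: (want, rendered.get(name))
        for name, want in expected.items()
        if rendered.get(name) != want
    }


# Inputs from step 1.
options = {
    "dynSearchIndex": False,
    "bulkmode": False,
    "dynbulkid": False,
    "allowUnsignedCerts": True,
    "retryfailures": False,
    "usehttps": False,
}

# Values from step 2, copied by hand from the generated output.
rendered = {
    "dynSearchIndex": "off",
    "bulkmode": "off",
    "dynbulkid": "off",
    "allowUnsignedCerts": "on",
    "retryfailures": "off",
    "usehttps": "off",
}

print(mismatches(options, rendered))
```

An empty result means every parameter was converted as expected; any entry shows the expected value next to the one actually rendered.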

Comment 6 Noriko Hosoi 2020-10-30 05:56:40 UTC
Hello @Guilherme,

Could you please set qa_ack+ in this bz?
Once you set it, we can respin Logging role including this fix.
Thanks!

Comment 7 Noriko Hosoi 2020-11-02 18:50:56 UTC
Hello @gdeolive, Shirly,
If you have any difficulty setting qa_ack+, could you share it with us?
Thanks.

Comment 8 Guilherme Santos 2020-11-09 12:36:33 UTC
It has been tested: https://bugzilla.redhat.com/show_bug.cgi?id=1862134#c1

The ovirt-engine-metrics component integrates elasticsearch and ovirt logs through collectd and rsyslog, and its playbooks use the ovirt logging roles with the retryfailures option.

Comment 9 Noriko Hosoi 2020-11-12 23:00:49 UTC
Guilherme, thank you for your updates!

Your qa_ack+ allows us to build an rpm package including this bug fix.

Unfortunately, we have to change this bug's status back to ON_QA since the bug verification has to be made against the specific version of the rpm. (Sadly, we cannot count testing against the upstream code as the verification.)

Once the rpm package is ready, "Fixed In Version" in this bz will be updated with the new version ("Fixed In Version" is now empty since there is no rpm build yet). Please use the package and repeat the verification steps.

Thanks!

Comment 10 Noriko Hosoi 2020-11-13 01:53:38 UTC
Notes on Doc:
Setting "Doc Type" to "No Doc Update".
Since the oVirt input + elasticsearch output is new in 8.3.1 (See bz1889893), we do not have to document this bug in the release notes.
Bug 1889893 - Logging - Support oVirt input + elasticsearch output

Comment 13 Guilherme Santos 2021-01-05 14:03:05 UTC
Verified on:
rhel-system-roles-1.0-21.el8.noarch

Steps:
1. I mainly followed the same steps I used here: https://bugzilla.redhat.com/show_bug.cgi?id=1862134#c1
which backed my test with upstream on comment #4. Basically I ran/validated ovirt-engine-metrics playbooks that implement ovirt logging roles.

Results:
Playbooks ran successfully and ovirt-engine-metrics is working as expected (meaning, ovirt and hosts/rhel rsyslog data are being properly sent to and fetched from elasticsearch).

Comment 14 Guilherme Santos 2021-01-05 14:36:21 UTC
Typo: on comment #13, I meant comment #8 not 4.
Also, just complementing the steps, this is the rsyslog generated file used for the verification:

# cat /etc/rsyslog.d/31-output-elasticsearch-elasticsearch_metrics_output.conf
(...)
        action(
            type="omelasticsearch"
            name="elasticsearch_metrics_output"
            server="X.X.X.X"
            serverport="9200"
            template="es_template"
            searchIndex="index_template"
            dynSearchIndex="on"
            bulkmode="on"
            writeoperation="create"
            bulkid="id_template"
            dynbulkid="on"
            allowUnsignedCerts="on"
            retryfailures="on"
            retryruleset="try_es"
            usehttps="off"
        )
(...)

Whereas the values used before the playbook run are the following:
(...)
          server_host: '{{ elasticsearch_host }}'
          server_port: '{{ elasticsearch_port|d(9200) }}'
          index_prefix: '{{ rsyslog_elasticsearch_index_prefix_metrics|d("project.ovirt-metrics") }}'
          bulkmode: '{{ rsyslog_elasticsearch_bulkmode_metrics|d(true) }}'
          writeoperation: '{{ rsyslog_elasticsearch_writeoperation_metrics|d("create") }}'
          bulkid: '{{ rsyslog_elasticsearch_bulkid_metrics|d("id_template") }}'
          dynbulkid: '{{ rsyslog_elasticsearch_dynbulkid_metrics|d(true) }}'
          retryfailures: '{{ rsyslog_elasticsearch_retryfailures_metrics|d(false) }}'
          retryruleset: '{{ rsyslog_elasticsearch_retryruleset_metrics|d("try_es") }}'
          usehttps: '{{ rsyslog_elasticsearch_usehttps_metrics|d(true) }}'
          allowUnsignedCerts: '{{ rsyslog_elasticsearch_allowunsignedcerts_metrics|d(false) }}'

(...)

Comment 16 errata-xmlrpc 2021-02-16 14:23:55 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (rhel-system-roles bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2021:0533