Description of problem: The default log level is 'info', and Fluentd outputs 'info', 'warn', 'error' and 'fatal' logs by default. We want to decrease the verbosity level to log only 'error' messages. We will make this value a parameter that can be updated for debugging purposes.
Please check with the viaq team why this doesn't come out of the box with the log level you are suggesting.
Also, add the use case to why we need to change this.
Description of problem: Currently, when the client fluentd fails to connect to the remote fluentd (mux), it logs a warning message on every retry. The suggested solution is to decrease the verbosity level to log only 'error' messages.
Shirly - status? devel-ack?
Don't think this is an RFE. Please consider removing FutureFeature keyword.
I left the default log level as info and added log_level as an ansible parameter. Users can now set it in config.yml as needed.
2 issues found with this RFE that need fixing:
ovirt-engine-metrics-1.1.1-0.0.master.20171001113530.el7.centos.noarch

1. Missing documentation in /etc/ovirt-engine-metrics/config.yml.example
Please add it so we have all options mentioned here consistently.

2. This option needs data validation. At the moment you can add any value ('wrong-value') and break fluentd on all your machines.

[root@10-37-138-17 ~]# grep log_level /etc/ovirt-engine-metrics/config.yml
fluentd_log_level: wrong-value

... run configure-playbook....

[root@10-37-138-17 ~]# grep -R log_level /etc/fluentd/config.d/
/etc/fluentd/config.d/05-ovirt-fluentd-system-configurations.conf: log_level wrong-value
/etc/fluentd/config.d/10-http-input.conf: @log_level debug
(In reply to Lukas Svaty from comment #7)
> 2 issues found with this RFE that needs fixing:
> ovirt-engine-metrics-1.1.1-0.0.master.20171001113530.el7.centos.noarch
>
> 1. Missing documentation in /etc/ovirt-engine-metrics/config.yml.example
> Please add it so we have all options mentioned here consistently.

Added a patch with README files for all metrics roles.

>
> 2. This option needs data validation, at the moment you can add any value
> ('wrong-value') and break fluentd in all your machines.
>
> [root@10-37-138-17 ~]# grep log_level /etc/ovirt-engine-metrics/config.yml
> fluentd_log_level: wrong-value
>
> ... run configure-playbook....
>
> [root@10-37-138-17 ~]# grep -R log_level /etc/fluentd/config.d/
> /etc/fluentd/config.d/05-ovirt-fluentd-system-configurations.conf:
> log_level wrong-value
> /etc/fluentd/config.d/10-http-input.conf: @log_level debug

We do not validate every parameter. I don't think this is required.
This would mean the user will see that the fluentd failed to start.
According to what I see, the log message is very clear:
"unexpected error error="Unknown log level: level = wrong-value""
In my opinion, we can move this back to ON_QA and add an RFE, if needed, to validate the values. But please note that there are many parameters that we don't validate and that can break fluentd on all your machines.
From the QA side, I believe verification is required if we add a way to set these parameters; otherwise, an admin can just change it in a fluentd configuration file with the same effect and outcome. I don't see a reason to check some parameters (such as env_name) and not the others. If you believe this is out of scope for 4.2, we can add an RFE as you suggested; otherwise, from the QA perspective, I would suggest fixing this.
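For reference, the kind of check being discussed could be as small as the sketch below. It is not part of ovirt-engine-metrics; the function name and error text are hypothetical, and the only thing it relies on is the set of log levels Fluentd accepts (fatal, error, warn, info, debug, trace). It matches strictly, on the assumption that the value is written to the fluentd configuration unchanged.

```python
# Log levels accepted by Fluentd, from most to least severe.
FLUENTD_LOG_LEVELS = ("fatal", "error", "warn", "info", "debug", "trace")


def validate_fluentd_log_level(value):
    """Return value unchanged if it is a known Fluentd log level.

    Hypothetical helper: raises ValueError for anything else, so a bad
    setting is rejected before it is written to the fluentd configuration.
    """
    if value not in FLUENTD_LOG_LEVELS:
        allowed = ", ".join(FLUENTD_LOG_LEVELS)
        raise ValueError(
            "Invalid fluentd_log_level %r; expected one of: %s" % (value, allowed)
        )
    return value
```

With a check like this, 'wrong-value' would be rejected when the playbook runs instead of when fluentd fails to start.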
env_name is a parameter unrelated to fluentd. It is more complex, since it must be a valid OpenShift namespace. Didi, what is your take on this?
Re validation:

So far, we have these in config.yml.example - and please note that you can add there many other things, these are only the documented ones:

file/host names:
================
- fluentd_fluentd_host
- local_fluentd_ca_cert_path

It will be nice to verify them, but IMO bad values should be quite obvious to understand/debug, so not that important. Can still open future RFEs if you want.

Arbitrary data:
===============
- fluentd_shared_key

We can't do any syntax validation, as it has no syntax. Only validation we can do is to try to connect to the remote fluentd and see if we succeed. This is not trivial, but might still be useful. So can also open RFE if you want.

- ovirt_env_name

Does have syntax, already handled. Not sure we can do more than that, but if we can, might be useful, so perhaps consider opening an RFE. Example: Might be possible to somehow connect to remote fluentd/elastic and ask them what they think about the name - is it in use (so avoid collisions), is it valid (so handle cases where our validation is wrong/not-up-to-date), etc.

enum:
=====
(Not sure this is the best title, but I think you get what I mean)

- fluentd_log_level

I agree it should be in config.yml.example. Perhaps we should consider trying to create the file automatically, no idea how. If it's possible to attach some kind of "tag" to ansible vars, we can tag the vars that should be there, and try to write code that will loop over all vars with the tag and create a file with the associated doc comment. Might be cool as a kind of ansible-hacking-experience, not that important...

I agree it makes sense to validate it, I'd open another RFE for this. If we add it to the example file with a proper comment, I'd say such an RFE can nicely wait till 4.3.
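To make the "create the file automatically" idea a bit more concrete, here is a rough sketch. It assumes the tagged vars have already been collected into a plain mapping; the DOCUMENTED_VARS registry, its doc strings and render_example() are all hypothetical and do not exist in the project. The only default taken from the roles is fluentd_log_level: info; a var whose default is not known here is written without a value.

```python
# Hypothetical registry of "tagged" vars: name -> (default or None, doc comment).
DOCUMENTED_VARS = {
    "ovirt_env_name": (None, "Environment name; must be a valid OpenShift namespace."),
    "fluentd_log_level": ("info", "Fluentd log level: fatal, error, warn, info, debug or trace."),
}


def render_example(documented_vars):
    """Build the text of an example config with every var commented out."""
    lines = []
    for name, (default, doc) in documented_vars.items():
        lines.append("# " + doc)
        if default is None:
            # No known default: leave the value for the user to fill in.
            lines.append("# %s:" % name)
        else:
            lines.append("# %s: %s" % (name, default))
        lines.append("")  # blank line between entries
    return "\n".join(lines).rstrip("\n") + "\n"


if __name__ == "__main__":
    print(render_example(DOCUMENTED_VARS), end="")
```

The harder part, attaching the tag and doc comment to the ansible vars themselves, is not covered here.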
I'd also like to comment that IMO setting fluentd_log_level to error is an ugly hack, not a solution. If fluentd does not behave the way we want it to, it should be fixed. If it can't currently (didn't check) throttle warnings etc. to make the log less verbose (e.g. have a parameter skip_warnings_time or whatever that makes it not repeat the same warning for the configured time duration, or whatever), then we should open a bug on it.
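As an illustration only of the throttling behaviour described above: the sketch below is not an existing Fluentd option or API. The class name, the interval parameter (standing in for something like skip_warnings_time) and the injectable clock are all hypothetical. It suppresses a repeated warning until the configured time has passed since it was last emitted.

```python
import time


class WarningThrottle:
    """Suppress repeats of the same message within interval_seconds."""

    def __init__(self, interval_seconds, clock=time.monotonic):
        self.interval = interval_seconds
        self.clock = clock  # injectable so the behaviour can be tested
        self._last_emitted = {}  # message -> time it was last let through

    def should_log(self, message):
        """Return True if the message should be written to the log now."""
        now = self.clock()
        last = self._last_emitted.get(message)
        if last is not None and now - last < self.interval:
            return False  # same warning seen too recently, skip it
        self._last_emitted[message] = now
        return True
```

With a 60 second interval, a connection warning repeated on every retry would be written once and then skipped until 60 seconds after it was last written.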
Please open an RFE for data validation for this variable.
Missing documentation in config.yml.example.
Tested in ovirt-engine-metrics-1.1.1-0.0.master.20171017080230.el7.centos.noarch
Validation part moved here -> BZ#1504051
Verified in ovirt-engine-metrics-1.1.1-0.3.beta2.20171114114644.el7ev.noarch

[root@1-1-1-1ansible]# grep -R log_level .
./roles/ovirt_fluentd/http_input/templates/http-input.conf: @log_level debug
./roles/ovirt_fluentd/set_fluentd_system_configurations/README.md:- `fluentd_log_level:` (default: `"info"`)
./roles/ovirt_fluentd/set_fluentd_system_configurations/README.md: configure_ovirt_machines_for_metrics.sh -e "fluentd_log_level=debug"
./roles/ovirt_fluentd/set_fluentd_system_configurations/README.md: fluentd_log_level: debug
./roles/ovirt_fluentd/set_fluentd_system_configurations/defaults/main.yml:fluentd_log_level: info
./roles/ovirt_fluentd/set_fluentd_system_configurations/templates/ovirt-fluentd-system-configurations.conf: log_level {{ fluentd_log_level }}
This bugzilla is included in oVirt 4.2.0 release, published on Dec 20th 2017. Since the problem described in this bug report should be resolved in oVirt 4.2.0 release, published on Dec 20th 2017, it has been closed with a resolution of CURRENT RELEASE. If the solution does not work for you, please open a new bug report.