Bug 1464394
| Summary: | Metrics not reporting after successful deployment | ||
|---|---|---|---|
| Product: | [oVirt] ovirt-engine-metrics | Reporter: | Lukas Svaty <lsvaty> |
| Component: | Generic | Assignee: | Shirly Radco <sradco> |
| Status: | CLOSED NOTABUG | QA Contact: | Lukas Svaty <lsvaty> |
| Severity: | high | Docs Contact: | |
| Priority: | high | ||
| Version: | 1.0.4.3 | CC: | bugs, rmeggins |
| Target Milestone: | ovirt-4.2.0 | Keywords: | TestBlocker |
| Target Release: | --- | Flags: | rule-engine: ovirt-4.2+, rule-engine: blocker+ |
| Hardware: | All | ||
| OS: | All | ||
| Whiteboard: | |||
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2017-06-26 13:31:26 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | Metrics | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | |||
| Bug Blocks: | 1419858, 1458735, 1459764 | ||
This bug report has Keywords: Regression or TestBlocker. Since no regressions or test blockers are allowed between releases, it is also being identified as a blocker for this release. Please resolve ASAP.

These errors occur when fluentd cannot connect to the remote fluentd. I don't believe this is a blocker for the other bug. Rich should check why the remote fluentd is reporting these errors:

"2017-06-26 11:15:48 +0200 [warn]: emit transaction failed: error_class=Fluent::BufferQueueLimitError error=\"queue size exceeds limit\" tag=\"project.ovirt-metrics-lsvaty_test-@kibana-highlighted-field@ovirt@/kibana-highlighted-field@\""

So in the end there were two errors, at least that I am aware of:

1. Misconfiguration of the ViaQ setup: not all hostnames were resolvable from all the machines.
2. Misconfiguration of the non-ViaQ setup: fluentd was not able to establish a connection due to an outdated certificate.

Closing this issue because of these.
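The two root causes named above (unresolvable hostnames and an outdated certificate) can both be checked before deploying. The following is a minimal sketch, not part of ovirt-engine-metrics: the function names `unresolvable_hosts` and `cert_is_expired` are hypothetical, and it assumes the certificate's expiry date is available as an OpenSSL-style `notAfter` string.

```python
import socket
import ssl
from datetime import datetime, timezone


def unresolvable_hosts(hostnames, resolve=socket.getaddrinfo):
    """Return the hostnames that this machine cannot resolve.

    `resolve` is injectable so the check can be exercised without DNS
    (hypothetical helper, not part of any oVirt tooling).
    """
    failed = []
    for name in hostnames:
        try:
            resolve(name, None)
        except socket.gaierror:
            failed.append(name)
    return failed


def cert_is_expired(not_after, now=None):
    """Return True if an OpenSSL-style notAfter date is in the past.

    `not_after` looks like "Jun 26 13:31:26 2017 GMT"; `now` is a Unix
    timestamp and defaults to the current time.
    """
    expires = ssl.cert_time_to_seconds(not_after)
    if now is None:
        now = datetime.now(timezone.utc).timestamp()
    return expires < now
```

Run `unresolvable_hosts` on every machine in the setup with the full list of hostnames, since the ViaQ problem was that not all names resolved from all machines. The `notAfter` string can be read from a certificate with `openssl x509 -enddate -noout -in <certificate file>`; which certificate file fluentd uses depends on the deployment and is not stated in this report.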
Description of problem:
After successful deployment of engine metrics, nothing is being sent from fluentd. Tried with a custom metrics store as well as the ViaQ setup. I am not sure which logs you would like to see, as neither these nor syslog show any hint of an error. I still have the environment and can provide anything you need.

Version-Release number of selected component (if applicable):
ovirt-engine-metrics-1.0.4.3-1.el7ev.noarch

How reproducible:
100%

Steps to Reproduce:
1. Create config.yml with your metrics store
2. Run /usr/share/ovirt-engine-metrics/setup/ansible/configure_ovirt_machines_for_metrics.sh
3. Check that fluentd and collectd are running successfully

Actual results:

    # No incoming packets to metrics store
    # tcpdump -n dst port 24284
    tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
    listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes

Additional info:

    [root@ls-engine1 ~]# date && systemctl status collectd fluentd
    Fri Jun 23 12:36:50 CEST 2017
    ● collectd.service - Collectd statistics daemon
       Loaded: loaded (/usr/lib/systemd/system/collectd.service; enabled; vendor preset: disabled)
      Drop-In: /etc/systemd/system/collectd.service.d
               └─postgresql.conf
       Active: active (running) since Fri 2017-06-23 11:15:11 CEST; 1h 21min ago
         Docs: man:collectd(1)
               man:collectd.conf(5)
     Main PID: 709 (collectd)
       CGroup: /system.slice/collectd.service
               └─709 /usr/sbin/collectd

    Jun 23 11:15:06 ls-engine1.example.com collectd[709]: plugin_load: plugin "swap" successfully loaded.
    Jun 23 11:15:06 ls-engine1.example.com collectd[709]: plugin_load: plugin "df" successfully loaded.
    Jun 23 11:15:06 ls-engine1.example.com collectd[709]: plugin_load: plugin "aggregation" successfully loaded.
    Jun 23 11:15:06 ls-engine1.example.com collectd[709]: plugin_load: plugin "processes" successfully loaded.
    Jun 23 11:15:06 ls-engine1.example.com collectd[709]: plugin_load: plugin "postgresql" successfully loaded.
    Jun 23 11:15:06 ls-engine1.example.com collectd[709]: plugin_load: plugin "write_http" successfully loaded.
    Jun 23 11:15:11 ls-engine1.example.com collectd[709]: Systemd detected, trying to signal readyness.
    Jun 23 11:15:11 ls-engine1.example.com systemd[1]: Started Collectd statistics daemon.
    Jun 23 11:15:11 ls-engine1.example.com collectd[709]: Initialization complete, entering read-loop.
    Jun 23 11:15:11 ls-engine1.example.com collectd[709]: Successfully connected to database engine (user engine) at server localhost:5432 (server version: 9.2.18, protocol version: 3, pid: 739)

    ● fluentd.service - Fluentd
       Loaded: loaded (/usr/lib/systemd/system/fluentd.service; enabled; vendor preset: disabled)
       Active: active (running) since Fri 2017-06-23 11:14:58 CEST; 1h 21min ago
         Docs: http://www.fluentd.org/
     Main PID: 649 (fluentd)
       CGroup: /system.slice/fluentd.service
               ├─649 /usr/bin/ruby /usr/bin/fluentd -c /etc/fluentd/fluent.conf
               └─710 /usr/bin/ruby /usr/bin/fluentd -c /etc/fluentd/fluent.conf

    Jun 23 11:15:09 ls-engine1.example.com fluentd[649]: </match>
    Jun 23 11:15:09 ls-engine1.example.com fluentd[649]: </ROOT>
    Jun 23 11:15:09 ls-engine1.example.com fluentd[649]: 2017-06-23 11:15:09 +0200 [debug]: listening http on localhost:9880
    Jun 23 11:15:09 ls-engine1.example.com fluentd[649]: 2017-06-23 11:15:09 +0200 [info]: following tail of /var/log/ovirt-engine/engine.log
    Jun 23 11:15:14 ls-engine1.example.com fluentd[649]: 2017-06-23 11:15:14 +0200 [warn]: dead connection found: lsvaty-vm1.example.com, reconnecting...
    Jun 23 11:15:14 ls-engine1.example.com fluentd[649]: 2017-06-23 11:15:14 +0200 fluent.warn: {"message":"dead connection found: lsvaty-vm1.example.com, reconnecting..."}
    Jun 23 11:15:14 ls-engine1.example.com fluentd[649]: 2017-06-23 11:15:14 +0200 [info]: connection established to lsvaty-vm1.example.com
    Jun 23 11:15:14 ls-engine1.example.com fluentd[649]: 2017-06-23 11:15:14 +0200 fluent.info: {"message":"connection established to lsvaty-vm1.example.com"}
    Jun 23 11:15:19 ls-engine1.example.com fluentd[649]: 2017-06-23 11:15:19 +0200 [warn]: recovered connection to dead node: lsvaty-vm1.example.com
    Jun 23 11:15:19 ls-engine1.example.com fluentd[649]: 2017-06-23 11:15:19 +0200 fluent.warn: {"message":"recovered connection to dead node: lsvaty-vm1.example.com"}
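The tcpdump check in this report watches for incoming traffic on the metrics store. A complementary check from the sending side is to test whether a TCP connection to the store's port can be opened at all. The sketch below assumes the port 24284 from the tcpdump command and the remote host name from the fluentd log; `can_connect` is a hypothetical helper. It only tests TCP reachability, so it would not detect a TLS failure such as an outdated certificate.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        # create_connection resolves the name and tries each returned address.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

For example, `can_connect("lsvaty-vm1.example.com", 24284)` run on the engine machine would show whether the metrics store port is reachable from there.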