Bug 1579876 - MMV stats disappear rendering pmlogger unable to restart
Summary: MMV stats disappear rendering pmlogger unable to restart
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Satellite 6
Classification: Red Hat
Component: Logging
Version: 6.4
Hardware: Unspecified
OS: Unspecified
unspecified
medium
Target Milestone: Released
Assignee: satellite6-bugs
QA Contact: Sanket Jagtap
URL:
Whiteboard:
Depends On: 1586051
Blocks: 1537078
 
Reported: 2018-05-18 14:24 UTC by Sanket Jagtap
Modified: 2019-10-07 17:20 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-05-14 12:37:19 UTC



Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2019:1222 None None None 2019-05-14 12:37:27 UTC

Description Sanket Jagtap 2018-05-18 14:24:46 UTC
Description of problem:
The problem is related to the new telemetry support introduced in Satellite 6.4.0.

Version-Release number of selected component (if applicable):
Build: Satellite 6.4.0 snap3

How reproducible:


Steps to Reproduce:
1. Configure telemetry on the satellite 
2. Watch mmv.* metrics
3.

Actual results:
MMV stats disappear 

Expected results:
MMV stats should not disappear

Additional info:

Comment 2 Lukas Zapletal 2018-05-22 08:41:12 UTC
Yes. As a workaround, delete all logs in /var/log/pcp/HOSTNAME/* for now. I need to fix this.

Comment 4 Lukas Zapletal 2018-06-05 12:23:00 UTC
Nathan from PCP identified the issue in the MMV/PCP codebase, and a patch has been merged upstream. I asked the PCP team to backport it into RHEL 7.5:

https://bugzilla.redhat.com/show_bug.cgi?id=1586051

In the meantime, you can continue testing with this PCP version:

https://copr.fedorainfracloud.org/coprs/lzap/pcp/

Just install the repo file, upgrade all pcp packages, then remove the old logs and restart the services:

rm -rf /var/log/pcp/pmlogger/*/*
systemctl restart pmcd pmlogger pmie pmwebd

Then start testing again. Make sure this command still prints no error after an hour, or even a week, of uptime:

echo "log mandatory on 30seconds mmv" | /usr/bin/pmlc -P
Connected to primary pmlogger at local:

You should see lots of mmv metrics:

pminfo | grep mmv

Grafana should also work fine. If you don't see a metric you expect, run the "pmlc" command above and it will show up.
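
For anyone repeating the "pminfo | grep mmv" check over a long uptime, a minimal sketch along these lines may help. It is not part of the fix: the function names count_mmv and check_mmv are made up for illustration, and the only assumption about pminfo is that it prints one metric name per line, as it does by default.

```shell
#!/bin/sh
# Hypothetical helpers (not shipped with PCP or Satellite).

# Count metric names in the mmv namespace from pminfo-style output on stdin.
count_mmv() {
    # grep -c exits non-zero when nothing matches; keep the "0" it prints.
    grep -c '^mmv\.' || true
}

# Fail when no mmv metrics are visible, otherwise report how many there are.
check_mmv() {
    count=$(count_mmv)
    if [ "$count" -eq 0 ]; then
        echo "no mmv metrics visible" >&2
        return 1
    fi
    echo "$count mmv metrics visible"
}
```

On a host with PCP installed it could be run as "pminfo | check_mmv", for example from a periodic job, so that disappearing MMV stats show up as a non-zero exit status.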

Comment 8 Sanket Jagtap 2018-09-11 05:55:53 UTC
Build: Satellite 6.4.0 snap21

[root@smqa-x3550m3-03 ~]# cat /etc/redhat-release 
Red Hat Enterprise Linux Server release 7.6 Beta (Maipo)
[root@smqa-x3550m3-03 ~]# rpm -qa | grep pcp
pcp-conf-4.1.0-4.el7.x86_64
pcp-mmvstatsd-0.4-1.el7sat.x86_64
pcp-webapp-grafana-3.12.2-5.el7.noarch
pcp-pmda-apache-4.1.0-4.el7.x86_64
pcp-webapp-vector-3.12.2-5.el7.noarch
pcp-selinux-4.1.0-4.el7.x86_64
pcp-4.1.0-4.el7.x86_64
pcp-webapi-4.1.0-4.el7.x86_64
pcp-libs-4.1.0-4.el7.x86_64


I tested telemetry with the RHEL 7.6 Beta vault build. pmlogger was still functioning as expected after 3 days, and no errors were observed in the log.

I feel this bug should stay ON_QA until it is tested with the 7.6 GA build, to be sure we don't miss anything when we announce the feature after/for 7.6 GA.

Comment 11 Sanket Jagtap 2018-12-18 06:58:36 UTC
Build: Satellite 6.5.0 snap 8 on RHEL7.6

 rpm -qa | grep pcp
pcp-4.1.0-5.el7_6.x86_64
pcp-conf-4.1.0-5.el7_6.x86_64
pcp-mmvstatsd-0.4-2.el7sat.x86_64
pcp-libs-4.1.0-5.el7_6.x86_64
pcp-webapp-grafana-4.1.0-5.el7_6.noarch
pcp-selinux-4.1.0-5.el7_6.x86_64
pcp-webapp-vector-4.1.0-5.el7_6.noarch
pcp-pmda-apache-4.1.0-5.el7_6.x86_64
pcp-webapi-4.1.0-5.el7_6.x86_64


The logger did not report errors and was still functioning after 48 hours. No errors were recorded in the logs.

 pmval mmv.fm_rails_http_request_total_duration.hosts_controller.new

metric:    mmv.fm_rails_http_request_total_duration.hosts_controller.new
host:      ---
semantics: instantaneous value
units:     millisec
samples:   all

                 mean                   min                   max              variance    standard_deviation 
             921.0                 606.0                1236.                 9.922E+04              315.0    
             921.0                 606.0                1236.                 9.922E+04              315.0    
             921.0                 606.0                1236.                 9.922E+04              315.0    
             921.0                 606.0                1236.                 9.922E+04              315.0    
             921.0                 606.0                1236.                 9.922E+04              315.0    
             921.0                 606.0                1236.                 9.922E+04              315.0    
             921.0                 606.0                1236.                 9.922E+04              315.0

Comment 14 errata-xmlrpc 2019-05-14 12:37:19 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2019:1222

