Description of problem:

When pmlogger.service fails to start a working pmlogger process, this is not reflected in the systemd unit state. There aren't any journal entries either.

Steps to Reproduce:

1. Misconfigure the system so that pmlogger will fail to start

   # systemctl stop pmcd

2. Start pmlogger

   # systemctl start pmlogger

Actual results:

systemctl status reports pmlogger as running, but pmlogger has actually failed.

# systemctl status pmlogger
● pmlogger.service - Performance Metrics Archive Logger
   Loaded: loaded (/usr/lib/systemd/system/pmlogger.service; enabled)
   Active: active (exited) since Mon 2015-01-26 10:57:04 EET; 2s ago
     Docs: man:pmlogger(1)
  Process: 9912 ExecStop=/usr/share/pcp/lib/pmlogger stop (code=exited, status=0/SUCCESS)
  Process: 10090 ExecStart=/usr/share/pcp/lib/pmlogger start (code=exited, status=0/SUCCESS)
 Main PID: 10090 (code=exited, status=0/SUCCESS)

# cat /var/log/pcp/pmlogger/f21.cockpit.lan/pmlogger.log
Log for pmlogger on f21.cockpit.lan started Mon Jan 26 10:57:05 2015

pmlogger: Cannot connect to PMCD on host "local:": Connection refused

Log finished Mon Jan 26 10:57:05 2015

Expected results:

systemctl status pmlogger should report pmlogger as failed.
The situation is more convoluted than reflected here, I think. pmlogger can be configured to monitor (potentially many) remote systems and does not necessarily have to be configured to record from the local host. It's not always just a case of chkconfig pmlogger on, service start, and one daemon results - multiple loggers or none at all may need to be started (depending on the contents of the /etc/pcp/pmlogger/control configuration file). In summary, "it's complicated".

There are cron scripts active which verify that the pmloggers that are meant to be running are running, based on the contents of the control file - so if the unfortunate case arises whereby pmlogger wants to monitor locally and no local pmcd has been started yet, the situation will resolve itself in due course.

We can improve the situation further, however: there's some upstream work being considered that would make this problem scenario go away entirely (some precursor work to enabling pmlogger automatic-pmcd-reconnection). I had not considered that work in light of this problem though (so, thanks!) - perhaps we should be prioritising that work more highly.

cheers.
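To illustrate why one unit can map to several loggers or to none, here is a rough sketch that lists the loggers a control file asks for. It assumes a control file with whitespace-separated fields (host, primary flag, socks flag, directory, pmlogger arguments), `#` comment lines and `$` variable lines; that layout is an assumption here and should be checked against the installed PCP version. `list_control_hosts` is a hypothetical helper, not a PCP tool.

```shell
#!/bin/sh
# Hypothetical helper: print "host primary-flag" for each logger entry
# in a pmlogger control file. The field layout is an assumption:
#   host  primary(y/n)  socks(y/n)  directory  args...
list_control_hosts() {
    awk '
        /^[ \t]*#/  { next }        # skip comment lines
        /^[ \t]*\$/ { next }        # skip variable lines such as $version=1.1
        NF >= 4     { print $1, $2 } # host and primary flag
    ' "$1"
}
```

Calling `list_control_hosts /etc/pcp/pmlogger/control` would then show how many pmlogger processes the single pmlogger.service is expected to stand for.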
see also http://oss.sgi.com/bugzilla/show_bug.cgi?id=1096
> The situation is more convoluted than reflected here, I think. pmlogger can
> be configured to monitor (potentially many) remote systems and does not
> necessarily have to be configured to record from the local host.

Are you saying that it is impossible to say what the status of pmlogger.service is because it might consist of multiple processes that can each fail independently?

I think http://oss.sgi.com/bugzilla/show_bug.cgi?id=1096 is a most excellent list of improvements.
In addition to the case where pmlogger.service is active but no pmlogger process is running, it is also possible that pmlogger.service is inactive but there is in fact a pmlogger process running.

Steps from bug 1188193:

1. systemctl enable pmlogger
2. systemctl start pmlogger
3. systemctl stop pmlogger
4. sleep 1h or so
5. pgrep pmlogger
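Both mismatches can be detected by comparing the unit state with the process list. A minimal sketch, assuming `systemctl is-active` and procps `pgrep -c -x` are available; `classify_state` and `report_pmlogger_state` are hypothetical names, not part of PCP or systemd:

```shell
#!/bin/sh
# Hypothetical helpers: compare what systemd reports with what is running.

# $1: output of `systemctl is-active pmlogger` (active, inactive, failed, ...)
# $2: number of pmlogger processes found
classify_state() {
    unit_state=$1
    proc_count=$2
    if [ "$unit_state" = "active" ] && [ "$proc_count" -eq 0 ]; then
        echo "mismatch: unit active, no pmlogger process"
    elif [ "$unit_state" != "active" ] && [ "$proc_count" -gt 0 ]; then
        echo "mismatch: unit $unit_state, $proc_count pmlogger process(es) running"
    else
        echo "consistent"
    fi
}

# Gather the live values and classify them.
report_pmlogger_state() {
    live_state=$(systemctl is-active pmlogger 2>/dev/null)
    live_count=$(pgrep -c -x pmlogger)
    classify_state "$live_state" "${live_count:-0}"
}
```

Note that a unit shown as "active (exited)" still reports `active` from `systemctl is-active`, which is why the process count is needed as a second signal.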
Marius, in this case, note that the pmlogger service is still -enabled-, and so the periodic cron jobs feel entitled to restart/keep-running pmlogger jobs listed in the control file.
observation - perhaps it would help if pmlogger.service was split out into pmlogger.service and pmlogger-farm.service (or some such name). The former would just deal with a single pmlogger monitoring the localhost (aka primary pmlogger). The latter would manage logging one or more remote hosts (if enabled). Thoughts?
(In reply to Frank Ch. Eigler from comment #5)
> Marius, in this case, note that the pmlogger service is still -enabled-,
> and so the periodic cron jobs feel entitled to restart/keep-running
> pmlogger jobs listed in the control file.

Yeah, I know. If that is how people expect PCP to work, fine, but the current integration with systemd is still useless and arguably harmful since it adds confusion and frustration for someone who knows systemd but not pcp. IMO.

If the cron job feels entitled to start pmlogger, it should do that via "systemctl start pmlogger" or "service pmlogger start" and not behind their backs.
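A minimal sketch of what a cron check going through systemd might look like. This is not the existing PCP cron script; `decide_action` and `keep_pmlogger_running` are hypothetical names, and the process check via `pgrep -c -x` is an assumption about how "running" would be determined:

```shell
#!/bin/sh
# Hypothetical cron helper: let systemd do the (re)starting so that the
# unit state and the running processes stay in agreement.

# $1: output of `systemctl is-enabled pmlogger`
# $2: number of running pmlogger processes
decide_action() {
    if [ "$1" != "enabled" ]; then
        echo "none"       # service not enabled: leave it alone
    elif [ "$2" -eq 0 ]; then
        echo "restart"    # enabled but nothing running: ask systemd
    else
        echo "none"       # already running
    fi
}

keep_pmlogger_running() {
    enabled=$(systemctl is-enabled pmlogger 2>/dev/null)
    count=$(pgrep -c -x pmlogger)
    if [ "$(decide_action "$enabled" "${count:-0}")" = "restart" ]; then
        # restart rather than start: for a unit already in the
        # "active (exited)" state a plain start is a no-op
        systemctl restart pmlogger
    fi
}
```

With something along these lines, a logger revived by cron would show up as a systemd state change rather than as a process systemd does not know about.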
(In reply to Mark Goodwin from comment #6)
> observation - perhaps it would help if pmlogger.service was split out into
> pmlogger.service and pmlogger-farm.service (or some such name).

I think so. Ignoring any compatibility concerns and without any knowledge of how pmlogger is actually configured in detail, I would try to use unit templates and instantiate one per pmlogger process.

Frank has filed http://oss.sgi.com/bugzilla/show_bug.cgi?id=1096 so I assumed you know about these issues. Why are we even discussing this beyond "patches welcome"?

I would be willing to spend a few days producing and testing some patches for http://oss.sgi.com/bugzilla/show_bug.cgi?id=1096. Are you willing to take them?
(In reply to Mark Goodwin from comment #6)
> observation - perhaps it would help if pmlogger.service was split out into
> pmlogger.service and pmlogger-farm.service (or some such name). The former
> would just deal with a single pmlogger monitoring the localhost (aka primary
> pmlogger). The latter would manage logging one or more remote hosts (if
> enabled).
>
> Thoughts?

This issue goes away once we have pmlogger local context support and default logger using that, doesn't it Mark? (IOW, there is no dependence on a running pmcd at all then, for the default logger, and no dependence between start scripts, etc, etc).

It would be a good idea to bump that work up the priority list & not complicate the scripts/configuration further by splitting 'em, I think.

cheers.
> This issue goes away once we have pmlogger local context support and default
> logger using that, doesn't it Mark?

Can a local context use all pmdas?
(In reply to Marius Vollmer from comment #10)
> > This issue goes away once we have pmlogger local context support and default
> > logger using that, doesn't it Mark?
>
> Can a local context use all pmdas?

All DSO PMDAs (which are usually the most important ones, like the kernel PMDAs) ... but not all PMDAs.
This message is a reminder that Fedora 21 is nearing its end of life. Approximately 4 (four) weeks from now Fedora will stop maintaining and issuing updates for Fedora 21. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as EOL if it remains open with a Fedora 'version' of '21'.

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora 21 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to change the 'version' to a later Fedora version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.
Fedora 21 changed to end-of-life (EOL) status on 2015-12-01. Fedora 21 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. If you are unable to reopen this bug, please file a new report against the current release. If you experience problems, please add a comment to this bug. Thank you for reporting this bug and we are sorry it could not be fixed.
Hey Marius, this is a RHEL bug too right? Maybe we should file it there?