Bug 1188193
Summary: | pmlogger is spontaneously restarted after systemctl stop | ||
---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Marius Vollmer <mvollmer> |
Component: | pcp | Assignee: | Nathan Scott <nathans> |
Status: | CLOSED NOTABUG | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
Severity: | unspecified | Docs Contact: | |
Priority: | unspecified | ||
Version: | 21 | CC: | brolley, fche, lberk, mgoodwin, nathans, pcp, scox |
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | | Doc Type: | Bug Fix |
Doc Text: | | Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2015-02-02 15:56:59 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | |||
Bug Blocks: | 1185740 |
Description
Marius Vollmer
2015-02-02 10:10:46 UTC
This is expected behavior with the class service-pmlogger. There are cron jobs running every 30 mins or so that restart any dead pmloggers, related to the less-frequent log-rotation cron jobs. If you wish to disable pmlogger, you must systemctl disable it, not just systemctl stop it. (With pmmgr, this would not happen.)

> There are cron jobs running every 30 mins or so that restart any dead
> pmloggers, related to the less-frequent log-rotation cron jobs.
> If you wish to disable pmlogger, you must
> systemctl disable it, not just systemctl stop it.
FWLIW, this is not quite correct - the /etc/pcp/pmlogger/control file defines a set of expected hosts to be monitored by pmlogger processes. It is the combination of an entry (or entries) in this file and the pmlogger service enablement state that defines whether the cron and init scripts will start pmlogger(s).
By default, we enable a localhost entry in the pmlogger control file (which is the pmlogger you are observing, Marius) but sysadmins can and do certainly add remote monitoring for other hosts too. IOW, please take care if/when disabling this service.
cheers.
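
The rule described above (a stopped pmlogger comes back only while the service is still enabled and the control file lists at least one host) can be sketched as a toy model. This is an illustration in Python, not PCP's actual cron or init scripts; the class, field, and method names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PmloggerModel:
    """Toy model of the restart behaviour discussed in this bug (assumption, not PCP code)."""
    enabled: bool = True    # stands in for the `systemctl enable/disable` state
    running: bool = True    # stands in for the `systemctl start/stop` state
    # Stands in for host entries in /etc/pcp/pmlogger/control (localhost by default).
    control_hosts: List[str] = field(default_factory=lambda: ["localhost"])

    def stop(self) -> None:
        self.running = False

    def disable(self) -> None:
        self.enabled = False

    def cron_check(self) -> bool:
        """Periodic check: restart a dead pmlogger only if it is still expected to run."""
        if not self.running and self.enabled and self.control_hosts:
            self.running = True
        return self.running
```

In this model, `stop()` alone is undone by the next `cron_check()`, while `stop()` followed by `disable()` (or an empty `control_hosts` list) keeps the logger down.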
(In reply to Frank Ch. Eigler from comment #1)
> This is expected behavior with the class service-pmlogger.

It is not, however, expected behavior of a systemd unit.

When switching data collection on/off via Cockpit, we will both start/stop and enable/disable the pmlogger service, so this bug will not affect us much there. But we probably also want to point out problems with pmlogger, and it would be nice to use the normal mechanisms for that: systemd unit status, including some lines from the journal. We would just point to the generic systemd UI for this and be done.

This is an example of the general class of problem I was pointing out at https://github.com/cockpit-project/cockpit/pull/1689#issuecomment-71824146 whereby the system pmlogger.service does more stuff than you need/expect. The system pmmgr gives you more control, and a cockpit-specific pmmgr would give you complete control.

> It is not, however, expected behavior of a systemd unit.

There's a mismatch between what some of the PCP services (pmlogger, pmie, and pmmgr) do and the facilities systemd provides - PCP needs to be able to control services monitoring multiple hosts, and systemd unit files have only a notion of localhost service.

> whereby the system pmlogger.service does more stuff than you need/expect.

pmmgr has many of the same issues (in fact, it also tries to do all of pmie service management in addition to pmlogger, so one could argue in reverse that it does far more than you need/expect relative to the regular pmlogger scripts - *shrug*).

Anyway, fundamentally, controlling distributed services is a hard problem that doesn't really fit well into either the old-school init or systemd models, and that's the root issue here I think (and yeah, I understand the Cockpit folks are interested in the localhost case only so far).

(In reply to Frank Ch. Eigler from comment #4)
> whereby the system pmlogger.service does more stuff than you
> need/expect.
Arguably, this is a case of pmlogger doing less than I expect.

Marius, the thing is that the "service pmlogger" in general does more than localhost logging, and for whatever historical reasons, it has cron jobs to back up exit-prone individual subtasks. If you want to piggyback on "service pmlogger", you need to control *both* explicit and implicit restarts.

With "service pmmgr" (systemwide pmmgr), no cron jobs are used, so shutdown/restart works more like what you expect, but again systemwide pmmgr in general does more than localhost logging.

A private pmmgr-based service would let you opt out of those general cases and give you full control (and still some help in terms of log rotation etc). (A private pmlogger-based service is probably too much work.)

Each choice has pros & cons.

> [...] exit-prone individual subtasks

This is fixable BTW, and increasingly it looks like something we should tackle (pmlogger reconnect) - kenj is hacking in the area currently, so this will likely soon become a reality.

> A private pmmgr-based service would let you opt out of those
> general cases and give you full control (and still some help
> in terms of log rotation etc).

Private pmlogger-based setups are possible too (see -c option to pmlogger_check and friends) & without the need for more daemons.

> (A private pmlogger-based service is probably too much work.)

It's approx the same as pmmgr, but in principle I agree - both are more work than necessary. pmmgr also misses out on local-context opportunities (IOW when operating with no pmcd, and pmlogger as the only PCP daemon) that the GSS folks are interested in, which may well be of interest to the Cockpit folks also.

cheers.

(In reply to Frank Ch. Eigler from comment #7)
> Marius, the thing is that the "service pmlogger" in general does
> more than localhost logging, and for whatever historical reasons,
> it has cron jobs to back up exit-prone individual subtasks.

I understand.
I appreciate that it is not trivial to put multiple, independent processes behind a single systemd service, each with their own independent success/failure state.

> If you want to piggyback on "service pmlogger", you need to control
> *both* explicit and implicit restarts.

I wouldn't call it piggybacking. We want to do the right thing, not the easy thing, and we want to help pcp do the right thing as well, for everyone. (The easy thing is the old "cockpit-logger" plus "cockpit-logger-janitor" services which just reuse the pmlogger binary and control it with very little extra code on top.)

> With "service pmmgr" (systemwide pmmgr), no cron jobs are used,
> so shutdown/restart works more like what you expect, but again
> systemwide pmmgr in general does more than localhost logging.

I still have to seriously look at pmmgr.

> A private pmmgr-based service would let you opt out of those
> general cases and give you full control (and still some help
> in terms of log rotation etc).

We want to opt into the more complex case, so that a knowledgeable person can configure Cockpit's use of PCP along with his/her other needs. We would only go back to our own private pmlogger service if we can't get a good enough user experience out of the system pmlogger without too many workarounds.

I think we are still mostly good (since we will always enable/start and disable/stop pmlogger.service at the same time), but catching pmlogger failures is awkward and we can't in good faith point people to the pmlogger.service UI because it will not do what they expect.

Anyway, this is off-topic for this bug report, sorry for rambling on. I'll try to summarize this more coherently later.

> (A private pmlogger-based service is probably too much work.)

(I think we did it with "cockpit-logger.service", no? That was about one day of work after learning enough about pmlogger. Less work than grappling with the system pmlogger, actually. :-)
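
For reference, the pairing mentioned in this thread (always enable/start together and stop/disable together, so the cron jobs never see an enabled but stopped service) might look like the following sketch. It is not Cockpit's actual code; the function names are hypothetical, and only the plain `systemctl enable`, `start`, `stop`, and `disable` verbs are assumed.

```python
import subprocess
from typing import Callable, List

UNIT = "pmlogger.service"


def collection_commands(on: bool, unit: str = UNIT) -> List[List[str]]:
    """Return the systemctl invocations that switch data collection on or off."""
    if on:
        return [["systemctl", "enable", unit], ["systemctl", "start", unit]]
    # Stop first, then disable, so a periodic check does not restart the logger.
    return [["systemctl", "stop", unit], ["systemctl", "disable", unit]]


def set_collection(on: bool,
                   run: Callable[[List[str]], object] = subprocess.check_call) -> None:
    """Run the commands in order; `run` can be swapped out for a dry run."""
    for cmd in collection_commands(on):
        run(cmd)
```

Passing `run=print` gives a dry run that only shows the commands; the default needs sufficient privileges to manage system units.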