Description of problem: Everyday, I get these messages: Jun 17 00:10:13 <host> systemd[1]: pmlogger.service: New main PID 4059499 does not belong to service, and PID file is not owned by root. Refusing. Jun 17 00:10:13 <host> systemd[1]: pmlogger.service: New main PID 4059499 does not belong to service, and PID file is not owned by root. Refusing. Jun 17 00:11:38 <host> systemd[1]: pmlogger.service: Start operation timed out. Terminating. Jun 17 00:11:38 <host> systemd[1]: pmlogger.service: Failed with result 'timeout'. Jun 17 00:11:38 <host> systemd[1]: Failed to start Performance Metrics Archive Logger. Jun 17 00:11:46 <host> systemd[1]: pmlogger.service: Supervising process 4066656 which is not our child. We'll most likely not notice when it exits. How can I fix the service so that it doesn't give errors every day? Version-Release number of selected component (if applicable): pcp-4.3.2-1.fc30.x86_64 How reproducible: Happens every day. Steps to Reproduce: 1. No action is required to produce this error. I guess you need to have PCP services enabled and running, though. Actual results: errors and warnings Expected results: no errors or warnings Additional info: PCP data is being collected. I don't really know if there is anything wrong with the data, but I assume it is fine.
I can report the same issue, it was happening when my old box was running Fedora 29, then when upgraded to Fedora 30 it continued. Now with my brand new Fedora 30 machine the issue continues. SELinux is not reporting any blocks, and then I had this disabled with F29 so it wasn't that. What it does effect is the server is unable to store any more than a day of performance data. Same version of pcp.
This issue seems to be similar to BZ #1583159 - "Sample service file vncserver@.service is out of date with systemd changes", which was caused by an update to systemd in response to an upstream issue stating that PID run files should be owned by root, not an unprivileged user (such as 'pcp' in our case). The upstream systemd RFE requesting this is https://github.com/systemd/systemd/issues/6632 and the change that introduced the PID ownership check is https://github.com/systemd/systemd/commit/db256aab13d8a89d583ecd2bacf0aca87c66effc. The change subsequently resulted in a spate of issues with various service daemons unable to start correctly because their PID file was not owned by root (e.g. xinetd and other services, see for example https://github.com/systemd/systemd/issues/8085). As a result, systemd was subsequently updated to not follow symlinks, which apparently circumvented the original issue. The end result is that /run/PID files should be owned by root and symlinks should not be used. An example of an updated service file to comply with the new systemd is BZ #1583159. So I guess for PCP we're going to need to update the service unit files for all our services to be compatible with the newer versions of systemd. Exactly what changes to make is yet to be determined.
(In reply to Fernando Viñan-Cano from comment #1) > I can report the same issue, it was happening when my old box was running > Fedora 29, then when upgraded to Fedora 30 it continued. Now with my brand > new Fedora 30 machine the issue continues. > > SELinux is not reporting any blocks, and then I had this disabled with F29 > so it wasn't that. > > What it does effect is the server is unable to store any more than a day of > performance data. Same version of pcp. Fernando, can you provide any more details about your system not being able to store more than one day of data? e.g. compressed tarball of your logs below /var/log/pcp/pmlogger and journalctl -b -u pmlogger. If you don't want to attach them to this BZ, could you mail them to me? This might be a separate issue not related to the systemd service issue reported in this BZ because it looks like the pmlogger service is actually being started on subsequent attempts by the PCP scripts, despite all the systemd complaints about the PID files etc. Thanks
(In reply to Mark Goodwin from comment #3) > (In reply to Fernando Viñan-Cano from comment #1) > > I can report the same issue, it was happening when my old box was running > > Fedora 29, then when upgraded to Fedora 30 it continued. Now with my brand > > new Fedora 30 machine the issue continues. > > > > SELinux is not reporting any blocks, and then I had this disabled with F29 > > so it wasn't that. > > > > What it does effect is the server is unable to store any more than a day of > > performance data. Same version of pcp. > > Fernando, can you provide any more details about your system not being able > to store more than one day of data? e.g. compressed tarball of your logs > below /var/log/pcp/pmlogger and journalctl -b -u pmlogger. If you don't want > to attach them to this BZ, could you mail them to me? This might be a > separate issue not related to the systemd service issue reported in this BZ > because it looks like the pmlogger service is actually being started on > subsequent attempts by the PCP scripts, despite all the systemd complaints > about the PID files etc. > > Thanks Here's the journalctl output, I will email the log folder separately: -- Logs begin at Fri 2019-07-05 11:27:51 CEST, end at Mon 2019-07-15 15:12:12 CEST. -- Jul 15 13:10:51 MAGGIE.star-one.co.uk systemd[1]: Starting Performance Metrics Archive Logger... Jul 15 13:10:52 MAGGIE.star-one.co.uk pmlogger[1304]: Starting pmlogger ... Jul 15 13:10:52 MAGGIE.star-one.co.uk systemd[1]: pmlogger.service: Can't open PID file /run/pcp/pmlogger.pid (yet?) after start: No such file or directory Jul 15 13:11:08 MAGGIE.star-one.co.uk systemd[1]: pmlogger.service: Supervising process 8652 which is not our child. We'll most likely not notice when it exits. Jul 15 13:11:08 MAGGIE.star-one.co.uk systemd[1]: Started Performance Metrics Archive Logger.
FEDORA-2019-97183bed56 has been submitted as an update to Fedora 30. https://bodhi.fedoraproject.org/updates/FEDORA-2019-97183bed56
FEDORA-2019-44b383ec91 has been submitted as an update to Fedora 29. https://bodhi.fedoraproject.org/updates/FEDORA-2019-44b383ec91
pcp-4.3.4-1.fc30 has been pushed to the Fedora 30 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2019-97183bed56
pcp-4.3.4-1.fc29 has been pushed to the Fedora 29 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2019-44b383ec91
pcp-4.3.4-1.fc30 has been pushed to the Fedora 30 stable repository. If problems still persist, please make note of it in this bug report.
pcp-4.3.4-1.fc29 has been pushed to the Fedora 29 stable repository. If problems still persist, please make note of it in this bug report.
Marking this as a DUP of BZ#1737091 since it's essentially the same issue and should now be fixedupstream in pcp-4.3.4-1 and later. *** This bug has been marked as a duplicate of bug 1737091 ***
I am assuming the even newer version that got installed from updates-testing should also carry the fix. It does not appear to fix the problem. [<username>@<host> ~]$ sudo dnf update --enablerepo=updates-testing pcp ... Last metadata expiration check: 0:00:37 ago on Thu 24 Oct 2019 12:46:43 PM CDT. Dependencies resolved. ==================================================================================================================================== Package Architecture Version Repository Size ==================================================================================================================================== Upgrading: pcp x86_64 5.0.0-1.fc30 updates-testing 1.1 M pcp-conf x86_64 5.0.0-1.fc30 updates-testing 30 k pcp-libs x86_64 5.0.0-1.fc30 updates-testing 506 k pcp-selinux x86_64 5.0.0-1.fc30 updates-testing 26 k pcp-system-tools x86_64 5.0.0-1.fc30 updates-testing 167 k python3-pcp x86_64 5.0.0-1.fc30 updates-testing 142 k Transaction Summary ==================================================================================================================================== Upgrade 6 Packages Total download size: 2.0 M Is this ok [y/N]: y Downloading Packages: (1/6): pcp-conf-5.0.0-1.fc30.x86_64.rpm 73 kB/s | 30 kB 00:00 (2/6): pcp-selinux-5.0.0-1.fc30.x86_64.rpm 239 kB/s | 26 kB 00:00 (3/6): pcp-5.0.0-1.fc30.x86_64.rpm 1.7 MB/s | 1.1 MB 00:00 (4/6): pcp-system-tools-5.0.0-1.fc30.x86_64.rpm 1.2 MB/s | 167 kB 00:00 (5/6): python3-pcp-5.0.0-1.fc30.x86_64.rpm 1.9 MB/s | 142 kB 00:00 (6/6): pcp-libs-5.0.0-1.fc30.x86_64.rpm 337 kB/s | 506 kB 00:01 ------------------------------------------------------------------------------------------------------------------------------------ Total 1.0 MB/s | 2.0 MB 00:01 Running transaction check Transaction check succeeded. Running transaction test Transaction test succeeded. Running transaction Preparing : 1/1 Running scriptlet: pcp-selinux-5.0.0-1.fc30.x86_64 1/1 Upgrading : pcp-selinux-5.0.0-1.fc30.x86_64 1/12 Running scriptlet: pcp-selinux-5.0.0-1.fc30.x86_64 1/12 Upgrading : pcp-conf-5.0.0-1.fc30.x86_64 2/12 Upgrading : pcp-libs-5.0.0-1.fc30.x86_64 3/12 Running scriptlet: pcp-5.0.0-1.fc30.x86_64 4/12 Upgrading : pcp-5.0.0-1.fc30.x86_64 4/12 Running scriptlet: pcp-5.0.0-1.fc30.x86_64 4/12 Upgrading : python3-pcp-5.0.0-1.fc30.x86_64 5/12 Upgrading : pcp-system-tools-5.0.0-1.fc30.x86_64 6/12 Cleanup : pcp-system-tools-4.3.4-1.fc30.x86_64 7/12 Cleanup : python3-pcp-4.3.4-1.fc30.x86_64 8/12 Running scriptlet: pcp-4.3.4-1.fc30.x86_64 9/12 Cleanup : pcp-4.3.4-1.fc30.x86_64 9/12 Running scriptlet: pcp-selinux-4.3.4-1.fc30.x86_64 10/12 Cleanup : pcp-selinux-4.3.4-1.fc30.x86_64 10/12 Cleanup : pcp-libs-4.3.4-1.fc30.x86_64 11/12 Cleanup : pcp-conf-4.3.4-1.fc30.x86_64 12/12 Running scriptlet: pcp-conf-4.3.4-1.fc30.x86_64 12/12 Verifying : pcp-5.0.0-1.fc30.x86_64 1/12 Verifying : pcp-4.3.4-1.fc30.x86_64 2/12 Verifying : pcp-conf-5.0.0-1.fc30.x86_64 3/12 Verifying : pcp-conf-4.3.4-1.fc30.x86_64 4/12 Verifying : pcp-libs-5.0.0-1.fc30.x86_64 5/12 Verifying : pcp-libs-4.3.4-1.fc30.x86_64 6/12 Verifying : pcp-selinux-5.0.0-1.fc30.x86_64 7/12 Verifying : pcp-selinux-4.3.4-1.fc30.x86_64 8/12 Verifying : pcp-system-tools-5.0.0-1.fc30.x86_64 9/12 Verifying : pcp-system-tools-4.3.4-1.fc30.x86_64 10/12 Verifying : python3-pcp-5.0.0-1.fc30.x86_64 11/12 Verifying : python3-pcp-4.3.4-1.fc30.x86_64 12/12 Upgraded: pcp-5.0.0-1.fc30.x86_64 pcp-conf-5.0.0-1.fc30.x86_64 pcp-libs-5.0.0-1.fc30.x86_64 pcp-selinux-5.0.0-1.fc30.x86_64 pcp-system-tools-5.0.0-1.fc30.x86_64 python3-pcp-5.0.0-1.fc30.x86_64 Complete! [<username>@<host> ~]$ ------------------------------------------------------------------------------- Oct 24 12:49:56 <host> root[13365]: pmlogger_daily failed - see /var/log/pcp/pmlogger/pmlogger_daily-K.log Oct 24 12:49:56 <host> root[13370]: pmlogger_check failed - see /var/log/pcp/pmlogger/pmlogger_check.log Oct 24 12:50:18 <host> kernel: SELinux: Converting 2640 SID table entries... Oct 24 12:50:20 <host> kernel: SELinux: policy capability network_peer_controls=1 Oct 24 12:50:20 <host> kernel: SELinux: policy capability open_perms=1 Oct 24 12:50:20 <host> kernel: SELinux: policy capability extended_socket_class=1 Oct 24 12:50:20 <host> kernel: SELinux: policy capability always_check_network=0 Oct 24 12:50:20 <host> kernel: SELinux: policy capability cgroup_seclabel=1 Oct 24 12:50:20 <host> kernel: SELinux: policy capability nnp_nosuid_transition=1 Oct 24 12:50:20 <host> audit: MAC_POLICY_LOAD auid=1000 ses=2 lsm=selinux res=1 Oct 24 12:50:21 <host> dbus-broker-launch[25002]: avc: received policyload notice (seqno=3) Oct 24 12:50:24 <host> audit[1]: USER_AVC pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='avc: received policyload notice (seqno=3) exe="/usr/lib/systemd/systemd" sauid=0 hostname=? addr=? terminal=?' Oct 24 12:50:24 <host> systemd[1]: Reloading. Oct 24 12:50:24 <host> audit[1220]: USER_AVC pid=1220 uid=81 auid=4294967295 ses=4294967295 subj=system_u:system_r:system_dbusd_t:s0-s0:c0.c1023 msg='avc: received policyload notice (seqno=3) exe="/usr/bin/dbus-broker" sauid=81 hostname=? addr=? terminal=?' Oct 24 12:50:24 <host> systemd[1]: /usr/lib/systemd/system/gssproxy.service:9: PIDFile= references path below legacy directory /var/run/, updating /var/run/gssproxy.pid → /run/gssproxy.pid; please update the unit file accordingly. Oct 24 12:50:24 <host> systemd[1]: /usr/lib/systemd/system/sssd.service:11: PIDFile= references path below legacy directory /var/run/, updating /var/run/sssd.pid → /run/sssd.pid; please update the unit file accordingly. Oct 24 12:50:25 <host> systemd[1]: /usr/lib/systemd/system/mdmonitor.service:6: PIDFile= references path below legacy directory /var/run/, updating /var/run/mdadm/mdadm.pid → /run/mdadm/mdadm.pid; please update the unit file accordingly. Oct 24 12:50:25 <host> systemd[1]: /usr/lib/systemd/system/iscsid.service:10: PIDFile= references path below legacy directory /var/run/, updating /var/run/iscsid.pid → /run/iscsid.pid; please update the unit file accordingly. Oct 24 12:50:25 <host> systemd[1]: /usr/lib/systemd/system/iscsiuio.service:12: PIDFile= references path below legacy directory /var/run/, updating /var/run/iscsiuio.pid → /run/iscsiuio.pid; please update the unit file accordingly. Oct 24 12:50:25 <host> systemd[1]: Configuration file /usr/lib/systemd/system/besclient.service is marked executable. Please remove executable permission bits. Proceeding anyway. Oct 24 12:50:27 <host> sudo[551]: pam_unix(sudo:session): session closed for user root Oct 24 12:50:27 <host> audit[551]: USER_END pid=551 uid=0 auid=1000 ses=2 subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 msg='op=PAM:session_close grantors=pam_keyinit,pam_limits,pam_keyinit,pam_limits,pam_systemd,pam_unix acct="root" exe="/usr/bin/sudo" hostname=? addr=? terminal=/dev/pts/0 res=success' Oct 24 12:50:27 <host> audit[551]: CRED_DISP pid=551 uid=0 auid=1000 ses=2 subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 msg='op=PAM:setcred grantors=pam_env,pam_localuser,pam_unix acct="root" exe="/usr/bin/sudo" hostname=? addr=? terminal=/dev/pts/0 res=success' Oct 24 12:51:12 <host> kernel: mce: CPU2: Core temperature/speed normal Oct 24 12:51:12 <host> kernel: mce: CPU0: Package temperature/speed normal Oct 24 12:51:12 <host> kernel: mce: CPU1: Package temperature/speed normal Oct 24 12:51:12 <host> kernel: mce: CPU3: Core temperature/speed normal Oct 24 12:51:12 <host> kernel: mce: CPU2: Package temperature/speed normal Oct 24 12:51:12 <host> kernel: mce: CPU3: Package temperature/speed normal Oct 24 12:51:45 <host> systemd[1]: pmlogger.service: Start operation timed out. Terminating. Oct 24 12:51:45 <host> audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=pmlogger comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=failed' Oct 24 12:51:45 <host> systemd[1]: pmlogger.service: Failed with result 'timeout'. Oct 24 12:51:45 <host> systemd[1]: Failed to start Performance Metrics Archive Logger. Oct 24 12:51:45 <host> systemd[1]: pmlogger.service: Service RestartSec=100ms expired, scheduling restart. Oct 24 12:51:45 <host> systemd[1]: pmlogger.service: Scheduled restart job, restart counter is at 2. Oct 24 12:51:45 <host> audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=pmlogger comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success' Oct 24 12:51:45 <host> audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=pmlogger comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success' Oct 24 12:51:45 <host> systemd[1]: Stopped Performance Metrics Archive Logger. Oct 24 12:51:45 <host> systemd[1]: Starting Performance Metrics Archive Logger... Oct 24 12:51:45 <host> pmlogger[13436]: Starting pmlogger ... Oct 24 12:51:45 <host> systemd[1]: pmlogger.service: Can't open PID file /run/pcp/pmlogger.pid (yet?) after start: No such file or directory Oct 24 12:51:46 <host> pcp-pmie[7788]: High aggregate context switch rate 29903ctxsw/s@<host> Oct 24 12:51:56 <host> root[19159]: pmlogger_daily failed - see /var/log/pcp/pmlogger/pmlogger_daily-K.log Oct 24 12:51:56 <host> root[19164]: pmlogger_check failed - see /var/log/pcp/pmlogger/pmlogger_check.log ------------------------------------------------------------------------------- ------------------------------------------------------------------------------- Oct 24 12:53:46 <host> pmlogger[19176]: Starting pmlogger ... Oct 24 12:53:46 <host> systemd[1]: pmlogger.service: Can't open PID file /run/pcp/pmlogger.pid (yet?) after start: No such file or directory Oct 24 12:53:57 <host> root[24929]: pmlogger_daily failed - see /var/log/pcp/pmlogger/pmlogger_daily-K.log Oct 24 12:53:57 <host> root[24934]: pmlogger_check failed - see /var/log/pcp/pmlogger/pmlogger_check.log -------------------------------------------------------------------------------
I've given up waiting for this fix, plus I'm not sure how an older reported bug can be called a duplicate of a newer one surely it's the other way around, not that it really matters, plus isn't fixed in 4.3.4-1 as stated.
(In reply to Brandon J. Wyman from comment #12) > I am assuming the even newer version that got installed from updates-testing > should also carry the fix. It does not appear to fix the problem. > > [<username>@<host> ~]$ sudo dnf update --enablerepo=updates-testing pcp > ... > Oct 24 12:53:46 <host> pmlogger[19176]: Starting pmlogger ... > Oct 24 12:53:46 <host> systemd[1]: pmlogger.service: Can't open PID file > /run/pcp/pmlogger.pid (yet?) after start: No such file or directory ... Brandon, the more serious bug involving incorrect systemd dependencies for the pmlogger service unit is fixed in the pcp-5.0.0 version that is in updates-testing. The less serious (and believed to be benign) issue of systemd complaining about missing pid files and so forth is not fixed, as is explained in https://bugzilla.redhat.com/show_bug.cgi?id=1761962#c2 The systemd complaints also occur for other (non-pcp) services that use type=forking, not just pcp / pmlogger. We will be changing the pcp pmlogger.service config over to type=notify to silence those systemd messages in the near future. I'm currently travelling and unable to commit time until next week at the earliest. In the mean-time - unless your pmlogger service is actually failing to start, the issues are benign and just systemd being verbose. Regards
My pmlogger service is constantly failing to start now, so it's worse than before: pcp.x86_64 5.0.0-1.fc30 pmlogger_daily failed - see /var/log/pcp/pmlogger/pmlogger_daily-K.log PRIORITY 3 SYSLOG_FACILITY 3 SYSLOG_IDENTIFIER pcp SYSLOG_TIMESTAMP Oct 28 11:30:53 _BOOT_ID f1022eb69e6b460fabe067d7fe080578 _CAP_EFFECTIVE 0 _COMM logger _GID 985 _HOSTNAME MAGGIE _MACHINE_ID ab73b29407d94f1187cf13f29453ffc3 _PID 2695 _SELINUX_CONTEXT system_u:system_r:unconfined_service_t:s0 _SOURCE_REALTIME_TIMESTAMP 1572258653005928 _SYSTEMD_CGROUP /system.slice/pmlogger_check.service _SYSTEMD_INVOCATION_ID 5cc774ad29fb44c494c6668f189d5e16 _SYSTEMD_SLICE system.slice _SYSTEMD_UNIT pmlogger_check.service _TRANSPORT syslog _UID 985 __CURSOR s=08bcabd5ee904ef2b1e9e69dd6740a20;i=51d;b=f1022eb69e6b460fabe067d7fe080578;m=3640fd6;t=595f5fe81ea96;x=9e47b1cf04c52206 __MONOTONIC_TIMESTAMP 56889302 __REALTIME_TIMESTAMP 1572258653006486 pmlogger_check failed - see /var/log/pcp/pmlogger/pmlogger_check.log PRIORITY 3 SYSLOG_FACILITY 3 SYSLOG_IDENTIFIER pcp SYSLOG_TIMESTAMP Oct 28 11:30:53 _BOOT_ID f1022eb69e6b460fabe067d7fe080578 _GID 985 _HOSTNAME MAGGIE _MACHINE_ID ab73b29407d94f1187cf13f29453ffc3 _PID 2711 _SELINUX_CONTEXT system_u:system_r:unconfined_service_t:s0 _SOURCE_REALTIME_TIMESTAMP 1572258653018970 _TRANSPORT syslog _UID 985 __CURSOR s=08bcabd5ee904ef2b1e9e69dd6740a20;i=51e;b=f1022eb69e6b460fabe067d7fe080578;m=3644208;t=595f5fe821cc8;x=a5f9ecce071f0799 __MONOTONIC_TIMESTAMP 56902152 __REALTIME_TIMESTAMP 1572258653019336
(In reply to Fernando Viñan-Cano from comment #15) > My pmlogger service is constantly failing to start now, so it's worse than > before: > > pcp.x86_64 > 5.0.0-1.fc30 > > pmlogger_daily failed - see /var/log/pcp/pmlogger/pmlogger_daily-K.log ... > > pmlogger_check failed - see /var/log/pcp/pmlogger/pmlogger_check.log ... Fernando, what version of systemd, which distro/platform? Please also attach the above two logs and /var/log/pcp/pmlogger/*/pmlogger.log and also "journalctl -xe -u pmlogger" and "systemctl status pmlogger.service". If you'd rather email them to me directly rather than attach to a public BZ, then please do. Does it only fail during boot? Or even after a systemctl restart pmlogger.service? Thanks
FWIW, I probably saw something similar on Fedora 31 but after completely reinstalling all pcp packages I can't reproduce the issue any more.
(In reply to Mark Goodwin from comment #16) > (In reply to Fernando Viñan-Cano from comment #15) > > My pmlogger service is constantly failing to start now, so it's worse than > > before: > > > > pcp.x86_64 > > 5.0.0-1.fc30 > > > > pmlogger_daily failed - see /var/log/pcp/pmlogger/pmlogger_daily-K.log > ... > > > > pmlogger_check failed - see /var/log/pcp/pmlogger/pmlogger_check.log > ... > > Fernando, what version of systemd, which distro/platform? Please also attach > the above two logs and /var/log/pcp/pmlogger/*/pmlogger.log and also > "journalctl -xe -u pmlogger" and "systemctl status pmlogger.service". If > you'd rather email them to me directly rather than attach to a public BZ, > then please do. > > Does it only fail during boot? Or even after a systemctl restart > pmlogger.service? > > Thanks It's Fedora 30, it fails after a short while if I try to restart it manually, and I'd be happy to reinstall PCP if it could help, but I don't know the best way to do that [root@MAGGIE ~]# cat /var/log/pcp/pmlogger/pmlogger_daily-K.log === compressing PCP archives for host local: === pmlogger_daily: [control.d/local:26] Warning: removing lock file older than 30 minutes ----------. 1 pcp pcp 0 Oct 31 07:54 /var/log/pcp/pmlogger/MAGGIE/lock pmlogger_daily: [control.d/local:26] Warning: cannot get archive basename pmlogger PID=180099 pmcd.pmlogger.archive No value(s) available! pmlogger_daily: [control.d/local:26] Warning: current volume of current pmlogger not known, compression skipped [root@MAGGIE ~]# cat /var/log/pcp/pmlogger/pmlogger_check.log cat: /var/log/pcp/pmlogger/pmlogger_check.log: No such file or directory Oct 31 07:56:00 MAGGIE systemd[1]: pmlogger.service: Service RestartSec=100ms expired, scheduling restart. Oct 31 07:56:00 MAGGIE systemd[1]: pmlogger.service: Scheduled restart job, restart counter is at 392. Oct 31 07:56:00 MAGGIE systemd[1]: Stopped Performance Metrics Archive Logger. Oct 31 07:56:00 MAGGIE systemd[1]: Starting Performance Metrics Archive Logger... Oct 31 07:56:00 MAGGIE pmlogger[186695]: Starting pmlogger ... Oct 31 07:56:00 MAGGIE systemd[1]: pmlogger.service: Can't open PID file /run/pcp/pmlogger.pid (yet?) after start: No such file or directory Oct 31 07:56:05 MAGGIE systemd[1]: pmlogger.service: New main PID 192384 does not belong to service, and PID file is not owned by root. Refusing. Oct 31 07:56:05 MAGGIE systemd[1]: pmlogger.service: New main PID 192384 does not belong to service, and PID file is not owned by root. Refusing. [root@MAGGIE ~]# journalctl -xe -u pmlogger Oct 31 07:58:00 MAGGIE systemd[1]: pmlogger.service: Service RestartSec=100ms expired, scheduling restart. Oct 31 07:58:00 MAGGIE systemd[1]: pmlogger.service: Scheduled restart job, restart counter is at 393. -- Subject: Automatic restarting of a unit has been scheduled -- Defined-By: systemd -- Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel -- -- Automatic restarting of the unit pmlogger.service has been scheduled, as the result for -- the configured Restart= setting for the unit. Oct 31 07:58:00 MAGGIE systemd[1]: Stopped Performance Metrics Archive Logger. -- Subject: A stop job for unit pmlogger.service has finished -- Defined-By: systemd -- Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel -- -- A stop job for unit pmlogger.service has finished. -- -- The job identifier is 153999 and the job result is done. Oct 31 07:58:00 MAGGIE systemd[1]: Starting Performance Metrics Archive Logger... -- Subject: A start job for unit pmlogger.service has begun execution -- Defined-By: systemd -- Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel -- -- A start job for unit pmlogger.service has begun execution. -- -- The job identifier is 153999. Oct 31 07:58:00 MAGGIE pmlogger[193427]: Starting pmlogger ... Oct 31 07:58:00 MAGGIE systemd[1]: pmlogger.service: Can't open PID file /run/pcp/pmlogger.pid (yet?) after start: No such file or directory Oct 31 07:58:05 MAGGIE systemd[1]: pmlogger.service: New main PID 199132 does not belong to service, and PID file is not owned by root. Refusing. Oct 31 07:58:05 MAGGIE systemd[1]: pmlogger.service: New main PID 199132 does not belong to service, and PID file is not owned by root. Refusing.
(In reply to Fernando Viñan-Cano from comment #18) > It's Fedora 30, it fails after a short while if I try to restart it > manually, and I'd be happy to reinstall PCP if it could help, but I don't > know the best way to do that > ... > -- The job identifier is 153999. > Oct 31 07:58:00 MAGGIE pmlogger[193427]: Starting pmlogger ... > Oct 31 07:58:00 MAGGIE systemd[1]: pmlogger.service: Can't open PID file > /run/pcp/pmlogger.pid (yet?) after start: No such file or directory > Oct 31 07:58:05 MAGGIE systemd[1]: pmlogger.service: New main PID 199132 > does not belong to service, and PID file is not owned by root. Refusing. > Oct 31 07:58:05 MAGGIE systemd[1]: pmlogger.service: New main PID 199132 > does not belong to service, and PID file is not owned by root. Refusing. f30 originally shipped with systemd-241-7.gita2eaa1c.fc30, but that has now been updated to systemd-241-12.git323cdf4.fc30. WHat is your current version Fernando? Also, Marko, was your systemd updated prior to the problem going away? I ask because I'm running the latest -12 version and do not see this problem of pmlogger failing to start or restart, though I do get some of the messages (believed to be benign). In particular, I do not get the "PID file is not owned by root. Refusing." message. Regards
I have: systemd.i686 241-12.git323cdf4.fc30 I also tried a "dnf reinstall pcp" but that's made no difference even after a reboot.
On an up-to-date F30 and F31 VMs (4 vCPU / 4 GB) with systemd-241-12.git323cdf4.fc30.x86_64 / systemd-243-4.gitef67743.fc31.x86_64 this simple script is enough to reproduce the issue: # cat r #!/bin/sh systemctl stop pmcd pmie pmlogger pmproxy redis > /dev/null 2>&1 dnf remove --noautoremove -y pcp\* rm -rf /etc/pcp* /var/lib/pcp/ /var/log/pcp/ /var/lib/dnf/yumdb/p /var/cache/dnf/pcp* systemctl daemon-reload dnf install -y pcp systemctl daemon-reload systemctl stop pmcd pmie pmlogger pmproxy systemctl start pmcd pmie pmlogger pmproxy # rpm -q pcp pcp-5.0.0-1.fc30.x86_64
Thanks Marko - I ran your script on an up-to-date f31 VM and after it finished, I do see the "pmlogger.service failed because a timeout was exceeded" messages, however after no further action the pmlogger service is up and running: .... Job for pmlogger.service failed because a timeout was exceeded. See "systemctl status pmlogger.service" and "journalctl -xe" for details. Running systemctl status pmlogger.service now shows: [root@f31-systemd mgoodwin]# systemctl status pmlogger.service ● pmlogger.service - Performance Metrics Archive Logger Loaded: loaded (/usr/lib/systemd/system/pmlogger.service; enabled; vendor preset: enabled) Active: active (running) since Mon 2019-11-04 08:04:57 AEDT; 6s ago Docs: man:pmlogger(1) Process: 89860 ExecStart=/usr/share/pcp/lib/pmlogger start (code=exited, status=0/SUCCESS) Main PID: 95257 (pmlogger) Tasks: 1 (limit: 9506) Memory: 3.4M CPU: 6.362s CGroup: /system.slice/pmlogger.service └─95257 /usr/libexec/pcp/bin/pmlogger -P -r -T24h10m -c config.default -v 100mb -m pmlogger_check 20191104.08.04 Nov 04 08:04:52 f31-systemd systemd[1]: Starting Performance Metrics Archive Logger... Nov 04 08:04:52 f31-systemd pmlogger[89860]: Starting pmlogger ... Nov 04 08:04:52 f31-systemd systemd[1]: pmlogger.service: Can't open PID file /run/pcp/pmlogger.pid (yet?) after start: No such file o> Nov 04 08:04:57 f31-systemd systemd[1]: pmlogger.service: Supervising process 95257 which is not our child. We'll most likely not noti> Nov 04 08:04:57 f31-systemd systemd[1]: Started Performance Metrics Archive Logger. [root@f31-systemd mgoodwin]# pcp Performance Co-Pilot configuration on f31-systemd: platform: Linux f31-systemd 5.3.7-301.fc31.x86_64 #1 SMP Mon Oct 21 19:18:58 UTC 2019 x86_64 hardware: 2 cpus, 1 disk, 1 node, 7961MB RAM timezone: AEDT-11 services: pmcd pmproxy pmcd: Version 5.0.0-1, 9 agents, 4 clients pmda: root pmcd proc pmproxy xfs linux mmv kvm jbd2 pmlogger: primary logger: /var/log/pcp/pmlogger/f31-systemd/20191104.08.04 pmie: primary engine: /var/log/pcp/pmie/f31-systemd/pmie.log And the journalctl -xe -u pmlogger.service log shows : -- A start job for unit pmlogger.service has begun execution. -- -- The job identifier is 51208. Nov 04 08:02:51 f31-systemd pmlogger[82555]: Starting pmlogger ... Nov 04 08:02:51 f31-systemd systemd[1]: pmlogger.service: Can't open PID file /run/pcp/pmlogger.pid (yet?) after start: No such file or directory Nov 04 08:02:57 f31-systemd systemd[1]: pmlogger.service: New main PID 89164 does not belong to service, and PID file is not owned by root. Refusing. Nov 04 08:04:51 f31-systemd systemd[1]: pmlogger.service: start operation timed out. Terminating. Nov 04 08:04:51 f31-systemd systemd[1]: pmlogger.service: Failed with result 'timeout'. -- Subject: Unit failed -- Defined-By: systemd -- Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel -- -- The unit pmlogger.service has entered the 'failed' state with result 'timeout'. Nov 04 08:04:51 f31-systemd systemd[1]: Failed to start Performance Metrics Archive Logger. -- Subject: A start job for unit pmlogger.service has failed -- Defined-By: systemd -- Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel -- -- A start job for unit pmlogger.service has finished with a failure. -- -- The job identifier is 51208 and the job result is failed. Nov 04 08:04:52 f31-systemd systemd[1]: pmlogger.service: Service RestartSec=100ms expired, scheduling restart. Nov 04 08:04:52 f31-systemd systemd[1]: pmlogger.service: Scheduled restart job, restart counter is at 1. -- Subject: Automatic restarting of a unit has been scheduled -- Defined-By: systemd -- Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel -- -- Automatic restarting of the unit pmlogger.service has been scheduled, as the result for -- the configured Restart= setting for the unit. Nov 04 08:04:52 f31-systemd systemd[1]: Stopped Performance Metrics Archive Logger. -- Subject: A stop job for unit pmlogger.service has finished -- Defined-By: systemd -- Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel -- -- A stop job for unit pmlogger.service has finished. -- -- The job identifier is 51547 and the job result is done. Nov 04 08:04:52 f31-systemd systemd[1]: Starting Performance Metrics Archive Logger... -- Subject: A start job for unit pmlogger.service has begun execution -- Defined-By: systemd -- Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel -- -- A start job for unit pmlogger.service has begun execution. -- -- The job identifier is 51547. Nov 04 08:04:52 f31-systemd pmlogger[89860]: Starting pmlogger ... Nov 04 08:04:52 f31-systemd systemd[1]: pmlogger.service: Can't open PID file /run/pcp/pmlogger.pid (yet?) after start: No such file or directory Nov 04 08:04:57 f31-systemd systemd[1]: pmlogger.service: Supervising process 95257 which is not our child. We'll most likely not notice when it exits. Nov 04 08:04:57 f31-systemd systemd[1]: Started Performance Metrics Archive Logger. -- Subject: A start job for unit pmlogger.service has finished successfully -- Defined-By: systemd -- Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel -- -- A start job for unit pmlogger.service has finished successfully. -- -- The job identifier is 51547. So it looks like the service start initially failed with a timeout, and then the restart=always kicked in after 100ms, resulting in the service starting correctly (the second time). Is this exactly what everyone else is seeing? i.e. do you end up with the service actually running correctly, despite the initial timeout? Or is the pmlogger service still NOT running at the completion of the script? thanks
This morning this was logged for pmlogger_daily: [root@MAGGIE ~]# systemctl status pmlogger_daily ● pmlogger_daily.service - Process archive logs Loaded: loaded (/usr/lib/systemd/system/pmlogger_daily.service; static; vendor preset: disabled) Active: failed (Result: timeout) since Mon 2019-11-04 00:12:01 CET; 7h ago Docs: man:pmlogger_daily(1) Process: 216795 ExecStart=/usr/libexec/pcp/bin/pmlogger_daily $PMLOGGER_DAILY_PARAMS (code=exited, status=0/SUCCESS) Main PID: 216795 (code=exited, status=0/SUCCESS) Nov 04 00:10:01 MAGGIE systemd[1]: Starting Process archive logs... Nov 04 00:12:01 MAGGIE systemd[1]: pmlogger_daily.service: Start operation timed out. Terminating. Nov 04 00:12:01 MAGGIE systemd[1]: pmlogger_daily.service: Failed with result 'timeout'. Nov 04 00:12:01 MAGGIE systemd[1]: Failed to start Process archive logs. [root@MAGGIE ~]# journalctl -xe -u pmlogger_daily-- Defined-By: systemd-- Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel-- -- The unit pmlogger_daily.service has successfully entered the 'dead' state. Nov 03 00:11:48 MAGGIE systemd[1]: Started Process archive logs. -- Subject: A start job for unit pmlogger_daily.service has finished successfully -- Defined-By: systemd -- Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel -- -- A start job for unit pmlogger_daily.service has finished successfully. -- -- The job identifier is 366735. Nov 04 00:10:01 MAGGIE systemd[1]: Starting Process archive logs... -- Subject: A start job for unit pmlogger_daily.service has begun execution -- Defined-By: systemd -- Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel -- -- A start job for unit pmlogger_daily.service has begun execution. -- -- The job identifier is 579462. Nov 04 00:12:01 MAGGIE systemd[1]: pmlogger_daily.service: Start operation timed out. Terminating. Nov 04 00:12:01 MAGGIE systemd[1]: pmlogger_daily.service: Failed with result 'timeout'. -- Subject: Unit failed -- Defined-By: systemd -- Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel -- -- The unit pmlogger_daily.service has entered the 'failed' state with result 'timeout'. Nov 04 00:12:01 MAGGIE systemd[1]: Failed to start Process archive logs. -- Subject: A start job for unit pmlogger_daily.service has failed -- Defined-By: systemd -- Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel -- -- A start job for unit pmlogger_daily.service has finished with a failure. -- -- The job identifier is 579462 and the job result is failed. The service monitor in Cockpit shows pmlogger_daily as failed, and starting it manually causes the same error. November 4, 2019 8:10 AM Failed to start Process archive logs. systemd 8:10 AM pmlogger_daily.service: Failed with result 'exit-code'. systemd 8:10 AM pmlogger_daily.service: Main process exited, code=exited, status=1/FAILURE systemd 8:10 AM Starting Process archive logs... systemd 8:10 AM Failed to start Process archive logs. systemd 8:10 AM pmlogger_daily.service: Failed with result 'exit-code'. systemd 8:10 AM pmlogger_daily.service: Main process exited, code=exited, status=1/FAILURE systemd 8:10 AM pmlogger_daily failed - see /var/log/pcp/pmlogger/pmlogger_daily.log pcp 8:10 AM Starting Process archive logs... systemd 12:12 AM Failed to start Process archive logs. systemd I then ran the script and this was logged: [root@MAGGIE ~]# systemctl status pmlogger.service ● pmlogger.service - Performance Metrics Archive Logger Loaded: loaded (/usr/lib/systemd/system/pmlogger.service; enabled; vendor preset: enabled) Active: activating (start) since Mon 2019-11-04 08:04:03 CET; 1min 44s ago Docs: man:pmlogger(1) Process: 226303 ExecStart=/usr/share/pcp/lib/pmlogger start (code=exited, status=0/SUCCESS) Tasks: 0 (limit: 393) Memory: 1.9M CGroup: /system.slice/pmlogger.service Nov 04 08:04:03 MAGGIE systemd[1]: pmlogger.service: Service RestartSec=100ms expired, scheduling restart. Nov 04 08:04:03 MAGGIE systemd[1]: pmlogger.service: Scheduled restart job, restart counter is at 2. Nov 04 08:04:03 MAGGIE systemd[1]: Stopped Performance Metrics Archive Logger. Nov 04 08:04:03 MAGGIE systemd[1]: Starting Performance Metrics Archive Logger... Nov 04 08:04:03 MAGGIE pmlogger[226303]: Starting pmlogger ... Nov 04 08:04:03 MAGGIE systemd[1]: pmlogger.service: Can't open PID file /run/pcp/pmlogger.pid (yet?) after start: No such file or directory Nov 04 08:04:08 MAGGIE systemd[1]: pmlogger.service: New main PID 232026 does not belong to service, and PID file is not owned by root. Refusing. Nov 04 08:04:08 MAGGIE systemd[1]: pmlogger.service: New main PID 232026 does not belong to service, and PID file is not owned by root. Refusing. [root@MAGGIE ~]# journalctl -xe -- Subject: A stop job for unit pmlogger.service has finished -- Defined-By: systemd -- Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel -- -- A stop job for unit pmlogger.service has finished. -- -- The job identifier is 657493 and the job result is done. Nov 04 08:04:03 MAGGIE systemd[1]: Starting Performance Metrics Archive Logger... -- Subject: A start job for unit pmlogger.service has begun execution -- Defined-By: systemd -- Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel -- -- A start job for unit pmlogger.service has begun execution. -- -- The job identifier is 657493. Nov 04 08:04:03 MAGGIE pmlogger[226303]: Starting pmlogger ... Nov 04 08:04:03 MAGGIE systemd[1]: pmlogger.service: Can't open PID file /run/pcp/pmlogger.pid (yet?) after start: No such file or directory Nov 04 08:04:08 MAGGIE systemd[1]: pmlogger.service: New main PID 232026 does not belong to service, and PID file is not owned by root. Refusing. Nov 04 08:04:08 MAGGIE systemd[1]: pmlogger.service: New main PID 232026 does not belong to service, and PID file is not owned by root. Refusing. Nov 04 08:04:10 MAGGIE systemd[1]: pmlogger_check.service: Succeeded. -- Subject: Unit succeeded -- Defined-By: systemd -- Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel -- -- The unit pmlogger_check.service has successfully entered the 'dead' state. Nov 04 08:04:10 MAGGIE systemd[1]: Started Check pmlogger instances are running. -- Subject: A start job for unit pmlogger_check.service has finished successfully -- Defined-By: systemd -- Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel -- -- A start job for unit pmlogger_check.service has finished successfully. -- -- The job identifier is 657398. Nov 04 08:04:10 MAGGIE audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=pmlogger_check comm="syste> Nov 04 08:04:10 MAGGIE audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=pmlogger_check comm="system The pmlogger service is still running but is shown as "Activating (starting)" rather than "Active (running)".
I think I've made things worse since running that script, this is just a sample: November 5, 2019 5:25 AM Failed to start Performance Metrics Archive Logger. systemd 5:25 AM pmlogger.service: New main PID 242536 does not belong to service, and PID file is not owned by root. Refusing. systemd 2 5:23 AM pmlogger.service: New main PID 236432 does not belong to service, and PID file is not owned by root. Refusing. systemd 2 5:23 AM Failed to start Performance Metrics Archive Logger. systemd 5:21 AM pmlogger.service: New main PID 229862 does not belong to service, and PID file is not owned by root. Refusing. systemd 2 5:21 AM Failed to start Performance Metrics Archive Logger. systemd 5:19 AM pmlogger.service: New main PID 222366 does not belong to service, and PID file is not owned by root. Refusing. systemd 2 5:19 AM Failed to start Performance Metrics Archive Logger. systemd 5:17 AM pmlogger.service: New main PID 214518 does not belong to service, and PID file is not owned by root. Refusing. systemd 2 5:17 AM Failed to start Performance Metrics Archive Logger. systemd 5:15 AM pmlogger.service: New main PID 206115 does not belong to service, and PID file is not owned by root. Refusing. systemd 2 5:15 AM Failed to start Performance Metrics Archive Logger. systemd 5:13 AM pmlogger.service: New main PID 198474 does not belong to service, and PID file is not owned by root. Refusing. systemd 2 5:13 AM Failed to start Performance Metrics Archive Logger. systemd 5:11 AM pmlogger.service: New main PID 191237 does not belong to service, and PID file is not owned by root. Refusing. systemd 2 5:11 AM Failed to start Performance Metrics Archive Logger. systemd 5:09 AM pmlogger.service: New main PID 184405 does not belong to service, and PID file is not owned by root. Refusing. systemd 2 5:09 AM Failed to start Performance Metrics Archive Logger. systemd 5:07 AM pmlogger.service: New main PID 177929 does not belong to service, and PID file is not owned by root. Refusing. systemd 2 5:07 AM Failed to start Performance Metrics Archive Logger. systemd 5:05 AM pmlogger.service: New main PID 171365 does not belong to service, and PID file is not owned by root. Refusing. systemd 2 5:05 AM Failed to start Performance Metrics Archive Logger. systemd 5:03 AM pmlogger.service: New main PID 164854 does not belong to service, and PID file is not owned by root. Refusing. systemd 2 5:03 AM Failed to start Performance Metrics Archive Logger. systemd 5:01 AM pmlogger.service: New main PID 158367 does not belong to service, and PID file is not owned by root. Refusing. systemd 2 5:01 AM Failed to start Performance Metrics Archive Logger. systemd 4:59 AM pmlogger.service: New main PID 151718 does not belong to service, and PID file is not owned by root. Refusing. systemd 2 4:59 AM Failed to start Performance Metrics Archive Logger. systemd 4:57 AM pmlogger.service: New main PID 145079 does not belong to service, and PID file is not owned by root. Refusing. systemd 2 4:57 AM Failed to start Performance Metrics Archive Logger. systemd 4:55 AM pmlogger.service: New main PID 138537 does not belong to service, and PID file is not owned by root. Refusing. systemd 2 4:55 AM Failed to start Performance Metrics Archive Logger. systemd 4:53 AM pmlogger.service: New main PID 132039 does not belong to service, and PID file is not owned by root. Refusing. systemd 2 4:52 AM Failed to start Performance Metrics Archive Logger. systemd 4:51 AM pmlogger.service: New main PID 125556 does not belong to service, and PID file is not owned by root. Refusing. systemd 2 4:50 AM Failed to start Performance Metrics Archive Logger. systemd 4:49 AM pmlogger.service: New main PID 118993 does not belong to service, and PID file is not owned by root. Refusing. systemd 2 4:48 AM Failed to start Performance Metrics Archive Logger. systemd 4:47 AM pmlogger.service: New main PID 112518 does not belong to service, and PID file is not owned by root. Refusing. systemd 2 4:46 AM Failed to start Performance Metrics Archive Logger. systemd 4:45 AM pmlogger.service: New main PID 106018 does not belong to service, and PID file is not owned by root. Refusing. systemd 2 4:44 AM Failed to start Performance Metrics Archive Logger. systemd 4:43 AM pmlogger.service: New main PID 99440 does not belong to service, and PID file is not owned by root. Refusing. systemd 2 4:42 AM Failed to start Performance Metrics Archive Logger. systemd 4:41 AM pmlogger.service: New main PID 92965 does not belong to service, and PID file is not owned by root. Refusing. systemd 2 4:40 AM Failed to start Performance Metrics Archive Logger. systemd 4:39 AM pmlogger.service: New main PID 86390 does not belong to service, and PID file is not owned by root. Refusing. systemd 2 4:38 AM Failed to start Performance Metrics Archive Logger. systemd 4:37 AM pmlogger.service: New main PID 79914 does not belong to service, and PID file is not owned by root. Refusing. systemd 2 4:36 AM Failed to start Performance Metrics Archive Logger. systemd 4:35 AM pmlogger.service: New main PID 73385 does not belong to service, and PID file is not owned by root. Refusing. systemd 2 4:34 AM Failed to start Performance Metrics Archive Logger. systemd 4:32 AM pmlogger.service: New main PID 66828 does not belong to service, and PID file is not owned by root. Refusing. systemd 2 4:32 AM Failed to start Performance Metrics Archive Logger. systemd 4:30 AM pmlogger.service: New main PID 60353 does not belong to service, and PID file is not owned by root. Refusing. systemd 2 4:30 AM Failed to start Performance Metrics Archive Logger. systemd 4:28 AM pmlogger.service: New main PID 53806 does not belong to service, and PID file is not owned by root. Refusing. systemd 2 4:28 AM Failed to start Performance Metrics Archive Logger. systemd 4:26 AM pmlogger.service: New main PID 47089 does not belong to service, and PID file is not owned by root. Refusing. systemd 2 4:26 AM Failed to start Performance Metrics Archive Logger. systemd 4:24 AM pmlogger.service: New main PID 40598 does not belong to service, and PID file is not owned by root. Refusing. systemd 2 4:24 AM Failed to start Performance Metrics Archive Logger. systemd 4:22 AM pmlogger.service: New main PID 34041 does not belong to service, and PID file is not owned by root. Refusing. systemd 2 4:22 AM Failed to start Performance Metrics Archive Logger. systemd 4:20 AM pmlogger.service: New main PID 27559 does not belong to service, and PID file is not owned by root. Refusing. systemd 2 4:20 AM Failed to start Performance Metrics Archive Logger. systemd 4:18 AM pmlogger.service: New main PID 21064 does not belong to service, and PID file is not owned by root. Refusing. systemd 2 4:18 AM Failed to start Performance Metrics Archive Logger. systemd 4:16 AM pmlogger.service: New main PID 14488 does not belong to service, and PID file is not owned by root. Refusing. systemd 2 4:16 AM Failed to start Performance Metrics Archive Logger. systemd 4:14 AM pmlogger.service: New main PID 7998 does not belong to service, and PID file is not owned by root. Refusing. systemd 2 4:14 AM Failed to start Performance Metrics Archive Logger. systemd 4:12 AM pmlogger.service: New main PID 1073 does not belong to service, and PID file is not owned by root. Refusing. systemd 2 4:12 AM Failed to start Performance Metrics Archive Logger. systemd 4:10 AM pmlogger.service: New main PID 239873 does not belong to service, and PID file is not owned by root. Refusing. systemd 2 4:10 AM Failed to start Performance Metrics Archive Logger. systemd 4:08 AM pmlogger.service: New main PID 232602 does not belong to service, and PID file is not owned by root. Refusing. systemd 2 4:08 AM Failed to start Performance Metrics Archive Logger. systemd 4:06 AM pmlogger.service: New main PID 224727 does not belong to service, and PID file is not owned by root. Refusing. systemd 2 4:06 AM Failed to start Performance Metrics Archive Logger. systemd 4:04 AM pmlogger.service: New main PID 216547 does not belong to service, and PID file is not owned by root. Refusing. systemd 2 4:04 AM Failed to start Performance Metrics Archive Logger. systemd 4:02 AM pmlogger.service: New main PID 209010 does not belong to service, and PID file is not owned by root. Refusing. systemd 2 4:02 AM Failed to start Performance Metrics Archive Logger. systemd 4:00 AM pmlogger.service: New main PID 201856 does not belong to service, and PID file is not owned by root. Refusing. systemd 2 4:00 AM Failed to start Performance Metrics Archive Logger. systemd 3:58 AM pmlogger.service: New main PID 194773 does not belong to service, and PID file is not owned by root. Refusing. systemd 2 3:58 AM Failed to start Performance Metrics Archive Logger. systemd 3:56 AM pmlogger.service: New main PID 188154 does not belong to service, and PID file is not owned by root. Refusing. systemd 2 3:56 AM Failed to start Performance Metrics Archive Logger. systemd 3:55 AM pmlogger.service: New main PID 181900 does not belong to service, and PID file is not owned by root. Refusing. systemd 2 3:54 AM pmlogger.service: New main PID 175805 does not belong to service, and PID file is not owned by root. Refusing. systemd 2 3:54 AM Failed to start Performance Metrics Archive Logger. systemd 3:52 AM pmlogger.service: New main PID 169358 does not belong to service, and PID file is not owned by root. Refusing. systemd 2 3:52 AM Failed to start Performance Metrics Archive Logger. systemd 3:50 AM pmlogger.service: New main PID 162777 does not belong to service, and PID file is not owned by root. Refusing. systemd 2 3:50 AM Failed to start Performance Metrics Archive Logger. systemd 3:48 AM pmlogger.service: New main PID 156285 does not belong to service, and PID file is not owned by root. Refusing. systemd 2 3:48 AM Failed to start Performance Metrics Archive Logger. systemd 3:46 AM pmlogger.service: New main PID 149807 does not belong to service, and PID file is not owned by root. Refusing. systemd 2 3:46 AM Failed to start Performance Metrics Archive Logger. systemd 3:44 AM pmlogger.service: New main PID 143224 does not belong to service, and PID file is not owned by root. Refusing. systemd 2 3:44 AM Failed to start Performance Metrics Archive Logger. systemd 3:42 AM pmlogger.service: New main PID 136743 does not belong to service, and PID file is not owned by root. Refusing. systemd 2 3:42 AM Failed to start Performance Metrics Archive Logger. systemd 3:40 AM pmlogger.service: New main PID 130271 does not belong to service, and PID file is not owned by root. Refusing. systemd 2 3:40 AM Failed to start Performance Metrics Archive Logger. systemd 3:38 AM pmlogger.service: New main PID 123758 does not belong to service, and PID file is not owned by root. Refusing. systemd 2 3:38 AM Failed to start Performance Metrics Archive Logger.
> i.e. do you end up with the service actually running correctly, despite the initial timeout? Or is the pmlogger service still NOT running at the completion of the script? It should be mentioned what on latest Ubuntu PCP packages fail to install properly due to this issue.
re-opening this one, despite it being DUP'd elsewhere since people seem to be using this BZ rather than any of the others. After a fair bit of poking around with difficult-to-debug systemd configs and PCP log management scripts, it looks like we basically have what's known as a "readyness protocol mismatch" with what systemd is expecting. Given the PCP rc scripts pre-date systemd by about 2 decades, it's not surprising. Basically, systemd runs the PCP pmlogger.service rc script (/usr/share/pcp/lib/pmlogger), which then runs pmlogger_check to start the pmlogger service (to start at least the primary logger, but also any other loggers configured in the control file or control.d directory in a logging farm configuration). pmlogger_check then forks off a background shell function which then forks off each pmlogger process with appropriate options (as per the control file), and then busy waits in a loop checking with pmlc for the new pmlogger process. Once each pmlogger is started, pmlogger_check writes the pid files and does various other things and then exits - and then the rc script exits. With a Type=forking config, this is supposed to signal to systemd that the service has started. Unfortunately however, systemd is impatient and doesn't cope very well with the double forking and usually ends up killing the entire process tree (witness signal 15 messages in pmlogger.log), and reporting the failure as a service start timeout. Since we have restart=always in the systemd config, yet another rc script is then re-launched, which often succeeds because the initial pmlogconf work was previously completed for a new installation, and we end up with one or more pmlogger daemons active. To complicate things, some PCP QA tests that have been interrupted may have left the pmlogger systemd config with Restart=no, but this only affects systems running QA - i.e. mostly PCP developers. The fix will involve converting to the modern Type=notify readyness protocol - so each pmlogger sends an sd_notify message to systemd that it has started and completed initialization. This should function regardless of how much forking goes on with the rc and log management scripts. We'll also need to split out each pmlogger in a farm configuration so they're individually managed by systemd as distinct service units (there is a templating facility for this). Similar changes will be needed for other PCP services. References: https://unix.stackexchange.com/questions/401590/systemd-timeout-because-it-doesnt-detect-daemon-forking https://unix.stackexchange.com/questions/200280/systemd-kills-service-immediately-after-start/200365#200365 https://unix.stackexchange.com/questions/336031/systemd-service-restarts-every-90-seconds/336067#336067 .. and many others - this is not an uncommon problem with legacy service daemons being adapted to the systemd ecosystem.
Thanks Mark, let us know if we can help with getting any more info or for testing. Also, what's the best thing I can do for my system? Before I ran the script it would log problems once per day, now since the script it's spamming the log every few minutes. Should I simply uninstall PCP or can I reconfigure it somehow to the way it was before?
Hi Fernando, I can repro this issue on a virgin install (beginning with no PCP installed at all). This fails to start the pmlogger service with a timeout - as reported. If I then: systemctl restart pmlogger.service it seems to work OK and we end up with the pmlogger service running. Are you seeing that too? Also, can you please try and start the pmie service after a virgin install? The pmie systemd config is very similar to the pmlogger service and the rc scripts are also basically the same (just different service name and paths etc). This works for both a virgin install and for a restart. Can you please check that too? systemctl start pmie.service Thanks -- Mark
Doesn't look good. I used the commands from the earlier script to remove the current installation then reinstalled - I omitted to do the service commands. Then I did the tasks you requested: [root@MAGGIE ~]# systemctl status pmlogger.service ● pmlogger.service - Performance Metrics Archive Logger Loaded: loaded (/usr/lib/systemd/system/pmlogger.service; enabled; vendor preset: enabled) Active: inactive (dead) Docs: man:pmlogger(1) Nov 08 07:49:44 MAGGIE systemd[1]: pmlogger.service: Service RestartSec=100ms expired, scheduling restart. Nov 08 07:49:44 MAGGIE systemd[1]: pmlogger.service: Scheduled restart job, restart counter is at 1109. Nov 08 07:49:44 MAGGIE systemd[1]: Stopped Performance Metrics Archive Logger. Nov 08 07:49:44 MAGGIE systemd[1]: Starting Performance Metrics Archive Logger... Nov 08 07:49:44 MAGGIE pmlogger[82993]: Starting pmlogger ... Nov 08 07:49:44 MAGGIE systemd[1]: pmlogger.service: Can't open PID file /run/pcp/pmlogger.pid (yet?) after start: No such file > Nov 08 07:49:49 MAGGIE systemd[1]: pmlogger.service: New main PID 88678 does not belong to service, and PID file is not owned by> Nov 08 07:49:49 MAGGIE systemd[1]: pmlogger.service: New main PID 88678 does not belong to service, and PID file is not owned by> Nov 08 07:49:51 MAGGIE systemd[1]: pmlogger.service: Succeeded. Nov 08 07:49:51 MAGGIE systemd[1]: Stopped Performance Metrics Archive Logger. [root@MAGGIE ~]# systemctl restart pmlogger.service A dependency job for pmlogger.service failed. See 'journalctl -xe' for details. [root@MAGGIE ~]# journalctl -xe -- -- The process' exit code is 'exited' and its exit status is 1. Nov 08 07:59:30 MAGGIE systemd[1]: pmlogger_check.service: Failed with result 'exit-code'. -- Subject: Unit failed -- Defined-By: systemd -- Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel -- -- The unit pmlogger_check.service has entered the 'failed' state with result 'exit-code'. Nov 08 07:59:30 MAGGIE systemd[1]: Failed to start Check pmlogger instances are running. -- Subject: A start job for unit pmlogger_check.service has failed -- Defined-By: systemd -- Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel -- -- A start job for unit pmlogger_check.service has finished with a failure. -- -- The job identifier is 392873 and the job result is failed. Nov 08 07:59:30 MAGGIE systemd[1]: Dependency failed for Performance Metrics Archive Logger. -- Subject: A start job for unit pmlogger.service has failed -- Defined-By: systemd -- Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel -- -- A start job for unit pmlogger.service has finished with a failure. -- -- The job identifier is 392778 and the job result is dependency. Nov 08 07:59:30 MAGGIE systemd[1]: pmlogger.service: Job pmlogger.service/start failed with result 'dependency'. Nov 08 07:59:30 MAGGIE audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='> Nov 08 07:59:41 MAGGIE root[94420]: pmlogger_daily failed - see /var/log/pcp/pmlogger/pmlogger_daily-K.log Nov 08 07:59:41 MAGGIE root[94425]: pmlogger_check failed - see /var/log/pcp/pmlogger/pmlogger_check.log [root@MAGGIE ~]# systemctl start pmie.service Job for pmie.service failed because a timeout was exceeded. See "systemctl status pmie.service" and "journalctl -xe" for details. [root@MAGGIE ~]# systemctl status pmie.service ● pmie.service - Performance Metrics Inference Engine Loaded: loaded (/usr/lib/systemd/system/pmie.service; enabled; vendor preset: enabled) Active: activating (start) since Fri 2019-11-08 08:02:49 CET; 35s ago Docs: man:pmie(1) Process: 96118 ExecStart=/usr/share/pcp/lib/pmie start (code=exited, status=0/SUCCESS) Tasks: 2 (limit: 394) Memory: 3.1M CGroup: /system.slice/pmie.service ├─96199 /usr/bin/sh /usr/share/pcp/lib/pmie start └─96203 pmcd_wait Nov 08 08:02:49 MAGGIE systemd[1]: pmie.service: Service RestartSec=100ms expired, scheduling restart. Nov 08 08:02:49 MAGGIE systemd[1]: pmie.service: Scheduled restart job, restart counter is at 1. Nov 08 08:02:49 MAGGIE systemd[1]: Stopped Performance Metrics Inference Engine. Nov 08 08:02:49 MAGGIE systemd[1]: Starting Performance Metrics Inference Engine... Nov 08 08:02:49 MAGGIE pmie[96118]: Starting pmie ... Nov 08 08:02:49 MAGGIE systemd[1]: pmie.service: Can't open PID file /run/pcp/pmie.pid (yet?) after start: No such file or direc> [root@MAGGIE ~]# journalctl -xe Nov 08 08:03:21 MAGGIE systemd[1]: pmlogger.service: Service RestartSec=100ms expired, scheduling restart. Nov 08 08:03:21 MAGGIE systemd[1]: pmlogger.service: Scheduled restart job, restart counter is at 2. -- Subject: Automatic restarting of a unit has been scheduled -- Defined-By: systemd -- Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel -- -- Automatic restarting of the unit pmlogger.service has been scheduled, as the result for -- the configured Restart= setting for the unit. Nov 08 08:03:21 MAGGIE systemd[1]: Stopped Performance Metrics Archive Logger. -- Subject: A stop job for unit pmlogger.service has finished -- Defined-By: systemd -- Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel -- -- A stop job for unit pmlogger.service has finished. -- -- The job identifier is 393750 and the job result is done. Nov 08 08:03:21 MAGGIE audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='> Nov 08 08:03:21 MAGGIE audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='u> Nov 08 08:03:21 MAGGIE systemd[1]: Starting Performance Metrics Archive Logger... -- Subject: A start job for unit pmlogger.service has begun execution -- Defined-By: systemd -- Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel -- -- A start job for unit pmlogger.service has begun execution. -- -- The job identifier is 393750. Nov 08 08:03:21 MAGGIE pmlogger[96400]: Starting pmlogger ... Nov 08 08:03:21 MAGGIE systemd[1]: pmlogger.service: Can't open PID file /run/pcp/pmlogger.pid (yet?) after start: No such file > [root@MAGGIE ~]#
Committed an upstream patch that fixes the service timeout reproducer test posted by Marko in Comment#21, along with a bunch of QA updates. commit 0f3a1fa7d0d7a898acf2ee4a6f23b626d4fb7362 Author: Mark Goodwin <mgoodwin> Date: Wed Nov 13 09:50:57 2019 +1100 logutil: fix pmlogger service timeout on virgin install RHBZ#1721223 - pmlogger.service: Failed with result 'timeout' PCP GH#792 Do not invoke pmlogger_daily -K to compress archives from pmlogger_check on a virgin install - it confuses systemd's Type=forking exit status checks, resulting in the pmlogger service timing out. There is nothing to compress anyway! This fixes the installation issues reported by Marko in BZ#1721223 but does not update the service configs - they are still Type=forking and so there will still be some (benign) complaints from systemd as reported in the BZ. Related QA updates are following. The above has been pushed to the current master branch (5.0.2 devel), but has not been built or pushed to any Fedora or Deb/Ubuntu repos. @Fernando and others, after building and installing latest master branch with this patch if you're still seeing issues with the pmlogger service or pmlogger_daily and pmlogger_check scripts failing and/or the pmie service failing to start then we need to examine your system and installation. It Works For Me (and Marko tested it too). Thanks
@Mark, thanks, once I see it pushed to my machine when I check for updates I'll check what happens on my system and let you know.
I would also support a new PCP upstream release to address this or at least a new updated build for the affected distributions with selected fixes included (instead of waiting over a month for the next scheduled PCP release). Thanks.
(In reply to Marko Myllynen from comment #32) > I would also support a new PCP upstream release to address this or at least > a new updated build for the affected distributions with selected fixes > included (instead of waiting over a month for the next scheduled PCP > release). Thanks. agree we should ship an early update to fix the install issue - a month is a long time. However, I think it would first be a good idea to ask Fernando to test a build from current master branch to if the issues he's seeing still persist.
@Mark happy to test, just tell me how. I'm not a Linux admin, this server is just a hobby thing I use to run a few things, so no idea how to "test a build from current master branch"
This bug appears to have been reported against 'rawhide' during the Fedora 32 development cycle. Changing version to 32.
Closed as per the fix from Comment #30 and verified with Marko's recipe from Comment #21. The additional work for changing the pmlogger service to use type=notify is a separate issue being tracked in upstream issue https://github.com/performancecopilot/pcp/issues/798
In Fedora 31 for ARMv7 this error still exists.
I'm now on 5.1.0-1.fc31 and this problem still exists and just fills up the log with these messages constantly. Is this ever going to be fixed?
(In reply to Fernando Viñan-Cano from comment #40) > I'm now on 5.1.0-1.fc31 and this problem still exists and just fills up the > log with these messages constantly. > > Is this ever going to be fixed? Hi Fernando, it should be fixed now. Perhaps after your upgrade you need to run systemctl daemon-reload and then systemctl restart pmlogger ? If that doesn't work, please send your logs and a tarball of /etc/pcp. Thanks
Hi Mark, I've rebooted at least 3 times since I did the update and it's still the same, but I tried those commands and got this: [root@MAGGIE ~]# systemctl restart pmlogger Job for pmlogger.service failed because the service did not take the steps required by its unit configuration. See "systemctl status pmlogger.service" and "journalctl -xe" for details. The details are: May 04 13:14:26 MAGGIE.mydomain.com systemd[1]: Failed to start Performance Metrics Archive Logger. -- Subject: A start job for unit pmlogger.service has failed -- Defined-By: systemd -- Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel -- -- A start job for unit pmlogger.service has finished with a failure. -- -- The job identifier is 13372 and the job result is failed. May 04 13:14:26 MAGGIE.mydomain.com audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:> May 04 13:14:26 MAGGIE.mydomain.com systemd[1]: Starting Check pmlogger instances are running... -- Subject: A start job for unit pmlogger_check.service has begun execution -- Defined-By: systemd -- Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel -- -- A start job for unit pmlogger_check.service has begun execution. -- -- The job identifier is 13479. May 04 13:14:27 MAGGIE.mydomain.com audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:> May 04 13:14:27 MAGGIE.mydomain.com audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s> May 04 13:14:27 MAGGIE.mydomain.com systemd[1]: pmlogger.service: Scheduled restart job, restart counter is at 4. -- Subject: Automatic restarting of a unit has been scheduled -- Defined-By: systemd -- Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel -- -- Automatic restarting of the unit pmlogger.service has been scheduled, as the result for -- the configured Restart= setting for the unit. May 04 13:14:27 MAGGIE.mydomain.com systemd[1]: Stopped Performance Metrics Archive Logger. -- Subject: A stop job for unit pmlogger.service has finished -- Defined-By: systemd -- Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel -- -- A stop job for unit pmlogger.service has finished. -- -- The job identifier is 13582 and the job result is done. May 04 13:14:27 MAGGIE.mydomain.com systemd[1]: Starting Performance Metrics Archive Logger... -- Subject: A start job for unit pmlogger.service has begun execution -- Defined-By: systemd -- Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel -- -- A start job for unit pmlogger.service has begun execution. -- -- The job identifier is 13582. I have gathered all the files into archive and will email you them separately - hope that's ok. Cheers, Fernando
Fernando, the logs you sent me indicate pmlogger is starting but exiting very soon after (and then being repeatedly restarted by systemd). I don't see any evidence of a systemd unit configuration problem. The PCP archives are being created, but have a corrupted label record and I can't read any of them. e.g. here's what I get with some debugging enabled trying to examine one of your recent faulty archives : $ pminfo -Dlog,logmeta -a 20200504.11.16-00 __pmLogOpen: inspect file "./20200504.11.16-00.0" __pmLogOpen: inspect file "./20200504.11.16-00.index" __pmFopen("./20200504.11.16-00.index", "r"): no decompress __pmLogOpen: inspect file "./20200504.11.16-00.meta" __pmFopen("./20200504.11.16-00.meta", "r"): no decompress __pmFopen("./20200504.11.16-00.0", "r"): no decompress __pmLogChkLabel: fd=3 vol=0 trailer read -> 4 (expect 4) or bad trailer len=8650752 (expected 132) pminfo: Cannot open archive "20200504.11.16-00": Illegal label record at start of a PCP archive log file This may indicate a bug in pmlogger / libpcp. Can you please try running pmlogger manually, e.g. $ pmlogger -c config.default -h local: -l- -s 5 -t 1 /tmp/testlog Please send me the output of the above command (-l- will print messages to stdout) and also /var/lib/pcp/config/pmlogger/config.default and also /tmp/testlog.* If the above command fails, please: rm /var/lib/pcp/config/pmlogger/config.default and then systemctl restart pmlogger (this will regenerate a new config.default since yours may be corrupted somehow). After restarting the pmlogger service, please also send me the output of: systemctl status pmlogger; journalctl -xe -u pmlogger.service Also, back in Comment #20, you said you have systemd.i686 241-12.git323cdf4.fc30 installed. Are you running a 32bit machine? Regards --
Hi Mark, Ran the command Last login: Mon May 4 13:10:44 2020 from 192.168.1.19 [root@MAGGIE ~]# pmlogger -c config.default -h local: -l- -s 5 -t 1 /tmp/testlog Log for pmlogger on MAGGIE.star-one.co.uk started Tue May 5 07:32:37 2020 preprocessor cmd: /usr/libexec/pcp/bin/pmcpp -rs /var/lib/pcp/config/pmlogger/config.default -I /var/lib/pcp/config/pmlogger Config parsed Starting logger for host "MAGGIE.mydomain.com" via "local:" Archive basename: /tmp/testlog pmlogger: Sample limit reached, exiting Log finished Tue May 5 07:32:40 2020 Will send you the rest via email in due course. As for #20, no 64bit even at the time of the comment, but it looks right now. [root@MAGGIE ~]# dnf list systemd* Last metadata expiration check: 2:09:43 ago on Tue 05 May 2020 05:26:48 CEST. Installed Packages systemd.x86_64 243.8-1.fc31 @updates systemd-bootchart.x86_64 233-5.fc31 @fedora systemd-libs.x86_64 243.8-1.fc31 @updates systemd-pam.x86_64 243.8-1.fc31 @updates systemd-rpm-macros.noarch 243.8-1.fc31 @updates systemd-udev.x86_64 243.8-1.fc31 @updates 243.8-1.fc31
I am still getting these problems even though I am now upgraded to Fedora 32, and I used the uninstall/cleanup script from once before to try and sort out the PCP package. What is going on? June 29, 2020 8:19 AM Failed to start pmie_check.service. systemd 8:19 AM pmie_check.service: Failed at step EXEC spawning /usr/libexec/pcp/bin/pmie_check: No such file or directory systemd 8:19 AM pmie_check.service: Failed to execute command: No such file or directory systemd 8:19 AM Failed to start pmlogger_check.service. systemd 8:19 AM pmlogger_check.service: Failed at step EXEC spawning /usr/libexec/pcp/bin/pmlogger_check: No such file or directory systemd 8:19 AM pmlogger_check.service: Failed to execute command: No such file or directory systemd 8:19 AM pmlogger_daily-poll.timer: Unit to trigger vanished. systemd 8:19 AM pmie_daily.timer: Unit to trigger vanished. systemd 8:19 AM pmlogger_daily.timer: Unit to trigger vanished. systemd 8:19 AM pmie_check.timer: Unit to trigger vanished. systemd 8:19 AM pmlogger_check.timer: Unit to trigger vanished. systemd 8:15 AM Failed to start Performance Metrics Archive Logger. systemd 2 8:15 AM pmlogger_daily failed - see /var/log/pcp/pmlogger/pmlogger_daily-K.log root 8:15 AM Failed to start Performance Metrics Archive Logger. systemd 8:15 AM pmlogger_daily failed - see /var/log/pcp/pmlogger/pmlogger_daily-K.log root 8:14 AM Failed to start Performance Metrics Archive Logger. systemd 8:14 AM pmlogger_daily failed - see /var/log/pcp/pmlogger/pmlogger_daily-K.log root 8:14 AM Failed to start Performance Metrics Archive Logger. systemd 8:14 AM pmlogger_daily failed - see /var/log/pcp/pmlogger/pmlogger_daily-K.log root 8:14 AM Failed to start Performance Metrics Archive Logger. systemd 8:14 AM pmlogger_daily failed - see /var/log/pcp/pmlogger/pmlogger_daily-K.log root 8:06 AM Failed to start Check pmlogger instances are running. systemd 8:06 AM Failed to start Performance Metrics Archive Logger. systemd 8:06 AM Failed to start Check pmlogger instances are running. systemd 8:06 AM Failed to start Performance Metrics Archive Logger. systemd 8:06 AM pmlogger_daily failed - see /var/log/pcp/pmlogger/pmlogger_daily-K.log root 8:06 AM Failed to start Performance Metrics Archive Logger. systemd 8:06 AM pmlogger_daily failed - see /var/log/pcp/pmlogger/pmlogger_daily-K.log root 8:06 AM Failed to start Performance Metrics Archive Logger. systemd 8:06 AM pmlogger_daily failed - see /var/log/pcp/pmlogger/pmlogger_daily-K.log root 8:06 AM Failed to start Performance Metrics Archive Logger. systemd 8:06 AM pmlogger_daily failed - see /var/log/pcp/pmlogger/pmlogger_daily-K.log root 8:06 AM Failed to start Performance Metrics Archive Logger. systemd 8:06 AM pmlogger_daily failed - see /var/log/pcp/pmlogger/pmlogger_daily-K.log root 8:06 AM Failed to start Performance Metrics Archive Logger. systemd 8:06 AM pmlogger_daily failed - see /var/log/pcp/pmlogger/pmlogger_daily-K.log root 8:05 AM pam_systemd(crond:session): Failed to get user record: Connection timed out crond [root@MAGGIE ~]# cat /var/log/pcp/pmlogger/pmlogger_daily-K.log cat: /var/log/pcp/pmlogger/pmlogger_daily-K.log: No such file or directory Installed Packages pcp.x86_64 5.1.1-1.fc32 @updates pcp-conf.x86_64 5.1.1-1.fc32 @updates pcp-libs.x86_64 5.1.1-1.fc32 @updates pcp-selinux.x86_64 5.1.1-1.fc32 @updates
You installation of PCP is broken - this: pmie_check.service: Failed at step EXEC spawning /usr/libexec/pcp/bin/pmie_check: No such file or directory says you are missing critical files that are part of PCP: $ rpm -ql pcp | grep pmlogger_check /usr/lib/systemd/system/pmlogger_check.path /usr/lib/systemd/system/pmlogger_check.service /usr/lib/systemd/system/pmlogger_check.timer /usr/libexec/pcp/bin/pmlogger_check ... which is unrelated to the issue in this BZ.
I get this when I run that command: [root@MAGGIE ~]# rpm -ql pcp | grep pmlogger_check /usr/lib/systemd/system/pmlogger_check.path /usr/lib/systemd/system/pmlogger_check.service /usr/lib/systemd/system/pmlogger_check.timer /usr/libexec/pcp/bin/pmlogger_check Are you telling me now that this doesn't completely re-install PCP? systemctl stop pmcd pmie pmlogger pmproxy redis > /dev/null 2>&1 dnf remove --noautoremove -y pcp\* rm -rf /etc/pcp* /var/lib/pcp/ /var/log/pcp/ /var/lib/dnf/yumdb/p /var/cache/dnf/pcp* systemctl daemon-reload dnf install -y pcp cockpit-pcp systemctl daemon-reload systemctl stop pmcd pmie pmlogger pmproxy systemctl start pmcd pmie pmlogger pmproxy If that is wrong, how I completely reinstall it?
Fedora 32 armv7l, Fedora 32 x86_64: Unit pmlogger_check.path not found. $ rpm -ql pcp | grep pmlogger_check /usr/lib/systemd/system/pmlogger_check.service /usr/lib/systemd/system/pmlogger_check.timer /usr/libexec/pcp/bin/pmlogger_check
Hi all, The current release of PCP for f32 is pcp-5.2.0-1.fc32.x86_64.rpm https://dl.fedoraproject.org/pub/fedora/linux/updates/32/Everything/x86_64/Packages/p/pcp-5.2.0-1.fc32.x86_64.rpm contains all of the PCP service units and none of them reference pmlogger_check.path. So exactly what version of PCP are you running? Have you done a 'systemctl daemon-reload' after upgrading? What is still referencing pmlogger_check.path? find /usr/lib/systemd/system /etc/systemd/system -type f | xargs grep pmlogger_check.path thanks
pcp version 5.2.0 $ find /usr/lib/systemd/system /etc/systemd/system -type f | xargs grep pmlogger_check.path grep: /usr/lib/systemd/system/system-systemdx2dcryptsetup.slice: Нет такого файла или каталога
(In reply to Кощеев from comment #50) > pcp version 5.2.0 > > $ find /usr/lib/systemd/system /etc/systemd/system -type f | xargs grep > pmlogger_check.path > grep: /usr/lib/systemd/system/system-systemdx2dcryptsetup.slice: Нет такого > файла или каталога well that didnt find anything, but we only looked for files, not syslinks. How about: systemctl list-dependencies | grep pmlogger_check Maybe we have an upgrade issue where deprecated units or dependencies aren't being cleaned up properly. To remove stale dependencies: systemctl disable pmlogger_check.path thanks
> systemctl list-dependencies | grep pmlogger_check Empty result. > systemctl disable pmlogger_check.path Removed /etc/systemd/system/pmlogger.service.requires/pmlogger_check.path.
From the logwatch report: Packages Updated: cockpit-pcp-226-1.fc32.armv7hl -> cockpit-pcp-227-1.fc32.armv7hl **Unmatched Entries** Sep 9 04:11:30 ERROR Error in POSTIN scriptlet in rpm package cockpit-pcp