Bug 1901130 - Add container images for pmcd and pmlogger
Summary: Add container images for pmcd and pmlogger
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: pcp-container
Version: 8.3
Hardware: All
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: rc
Target Release: 8.0
Assignee: Andreas Gerstmayr
QA Contact: Jan Kurik
Docs Contact: Apurva Bhide
URL:
Whiteboard:
Depends On: 1854035
Blocks:
 
Reported: 2020-11-24 14:52 UTC by Peter Portante
Modified: 2021-09-17 12:44 UTC
CC List: 5 users

Fixed In Version: pcp-container-5-30
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-05-19 00:24:06 UTC
Type: Feature Request
Target Upstream Version:
Embargoed:


Attachments
Allow direct access to pmcd in the pcp container (2.43 KB, patch)
2020-11-25 23:34 UTC, Nathan Scott
no flags


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2021:2012 0 None None None 2021-05-19 00:24:08 UTC

Description Peter Portante 2020-11-24 14:52:53 UTC
It'd be great if we could have a pmcd container image, which would encapsulate everything the pmcd.service would need to run in isolation.  That way, one could just "podman run --net=host pmcd:latest" and we'd have a container with all the processes which a remote pmlogger could talk to.

Then if we had a "pmlogger" container into whose namespace we could mount a configuration file directing it to pull from all the remote pmcd hosts, we'd have a tight solution that deploys easily in RHEL or OpenShift environments.

And in particular, the pbench team could leverage those containers directly when running benchmarks in environments where PCP is not already installed.
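
For illustration only, roughly what that could look like (the "pmcd" and "pmlogger" image names and the control-file path here are hypothetical - no such images exist yet):

# hypothetical pmcd-only image; --net=host so remote pmloggers can reach pmcd on its default port 44321
podman run -d --net=host --name pmcd pmcd:latest

# hypothetical pmlogger-only image, with a control file listing the remote pmcd hosts bind-mounted in
podman run -d --name pmlogger \
  -v ./remote-hosts.control:/etc/pcp/pmlogger/control.d/remote:ro \
  pmlogger:latest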

Comment 1 Nathan Scott 2020-11-24 21:07:13 UTC
Here are my notes / reservations from #pbench Slack ...

I can see some pitfalls with this multi-container approach, BTW; it may prove to be a rat's nest (pros and cons, however).

A 'small pmcd container with all PMDAs' is not a small container at all.  The PMDAs bring in all of Perl and all of Python, and a whole heap of dependencies.

OTOH, having PMDAs in their own containers solves that, but introduces new problems (pmcd has a strong builtin expectation that it will be starting PMDAs as children).

I also worry, from a product / marketing POV, that having a lot of containers feeds the punters who say 'pcp is too complex' ... the existing single container is a big win from that POV.  Some of the daemons cooperate closely too - e.g. pmie is also being used to monitor pmcd and auto-correct (a pcp-pmie container would also be needed in the 'many containers' model)

In summary: from a testing & release POV, and an administrative POV, I think the existing pcp container actually has a lot going for it.  It may be 1000x simpler to document how to turn the existing container into a 'logging enabled' container, a 'pmie enabled' container, one with optional PMDAs enabled, etc.

Comment 2 Nathan Scott 2020-11-25 23:28:42 UTC
Also from #pbench slack ... (smaller part of larger discussion)

Another issue I remembered overnight is that the pcp-pmlogger container needs systemd too (for timers, it relies on those for compression and other general pcp archive housekeeping).  These 'small standalone one-daemon' pcp containers seem to always end up needing more than we at first thought.

Comment 3 Nathan Scott 2020-11-25 23:34:29 UTC
Created attachment 1733566 [details]
Allow direct access to pmcd in the pcp container

Comment 4 Nathan Scott 2020-11-25 23:36:02 UTC
Following on a little from earlier discussions:

The attached patch adds the exposing of the pmcd port to https://pkgs.fedoraproject.org/container/pcp

I noticed we already have a classic pmlogger setup there, incl. a VOLUME declaration for it, so that side of things was a no-op.

Not sure how best to tackle the switch-some-services-off-by-default aspects Peter was looking for today - should this be another environment variable alongside HOST_MOUNT and REDIS_SERVERS perhaps?  e.g. a space-separated list of services, with default value like:
PCP_SERVICES="pmcd pmie pmproxy pmlogger"

Maybe we also want a PCP_TARGET_HOSTS variable (array of hostnames) to facilitate a pmlogger+pmie farm setup?  (mirroring the ansible pcp role)

Comment 5 Andreas Gerstmayr 2020-11-26 15:51:08 UTC
(In reply to Nathan Scott from comment #4)
> The attached patch adds the exposing of the pmcd port to
> https://pkgs.fedoraproject.org/container/pcp
> 
> I noticed we already have a classic pmlogger setup there, incl. a VOLUME
> declaration for it, so that side of things was a no-op.

I've tested it, works great. Thanks!
I've applied the patch to the Fedora container. There's an outage of approximately one week at the moment (https://pagure.io/fedora-infrastructure/issue/9489); I'll do a new build after that.
For the RHEL container I'll need a BZ with all the appropriate flags (or should I hijack this one? :p).

> Not sure how best to tackle the switch-some-services-off-by-default aspects
> Peter was looking for today - should this be another environment variable
> alongside HOST_MOUNT and REDIS_SERVERS perhaps?  e.g. a space-separated list
> of services, with default value like:
> PCP_SERVICES="pmcd pmie pmproxy pmlogger"

Sounds good, I'll move the systemctl enable to the container-entrypoint script and enable all services listed in this env var.

> Maybe we also want a PCP_TARGET_HOSTS variable (array of hostnames) to
> facilitate a pmlogger+pmie farm setup?  (mirroring the ansible pcp role)

Not sure about that one. How much logic do we want in the entrypoint script?
I'd suggest keeping it as minimal as possible; for advanced customization, people should use this container as a base and add all customizations in a new container.
Fwiw, you can also use a container bind mount to mount the pmlogger configs into the container.
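
For example, something along these lines (the host path and control-file name are made up, and <pcp-image> stands for whichever pcp container image is in use):

podman run -d --systemd always --net host \
  -v /some/host/dir/pmlogger-remote.control:/etc/pcp/pmlogger/control.d/remote:ro \
  <pcp-image>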

Comment 6 Andreas Gerstmayr 2020-11-26 17:31:18 UTC
(In reply to Andreas Gerstmayr from comment #5)
> (In reply to Nathan Scott from comment #4)
> > Not sure how best to tackle the switch-some-services-off-by-default aspects
> > Peter was looking for today - should this be another environment variable
> > alongside HOST_MOUNT and REDIS_SERVERS perhaps?  e.g. a space-separated list
> > of services, with default value like:
> > PCP_SERVICES="pmcd pmie pmproxy pmlogger"
> 
> Sounds good, I'll move the systemctl enable to the container-entrypoint
> script and enable all services listed in this env var.

Implemented upstream: https://src.fedoraproject.org/container/pcp/c/a4590f7949a8efcb8c6ef9fff4df8ca064872e66

pcp-zeroconf actually enables most of the PCP services (in the %post part of the specfile), so this commit enables/disables the services based on the PCP_SERVICES environment variable.
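
In rough outline (a sketch only, not the literal upstream commit; the final handoff to systemd in the real image may differ):

#!/bin/sh
# container-entrypoint sketch: enable only the services named in PCP_SERVICES
# and disable the rest (pcp-zeroconf enables most of them in its %post)
PCP_SERVICES="${PCP_SERVICES:-pmcd pmie pmproxy pmlogger}"
for svc in pmcd pmie pmproxy pmlogger; do
    case " $PCP_SERVICES " in
        *" $svc "*) systemctl enable "$svc.service" ;;
        *)          systemctl disable "$svc.service" ;;
    esac
done
# hand control to systemd as PID 1 (exact init path depends on the image)
exec /sbin/init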

Comment 7 Nathan Scott 2020-11-26 22:11:05 UTC
(In reply to Andreas Gerstmayr from comment #5)
> [...]
> For the RHEL container I'll need a BZ with all the appropriate flags (or
> should I hijack this one? :p).

Yep - just use this one I reckon.

I also notice the recent config file additions to the pcp container repo.
Configs are starting to spread across several trees, which may become a
maintenance burden down the track (pcp, ansible-pcp, and pcp-container have
duplicate configs).  I wonder if we could/should add a container-building task
to the ansible-pcp repo to remove some duplication?  Not sure, there might be
issues relating to build deps using Ansible in RHEL/Fedora... just a thought anyway.

Comment 8 Andreas Gerstmayr 2020-11-27 13:50:30 UTC
(In reply to Nathan Scott from comment #7)
> (In reply to Andreas Gerstmayr from comment #5)
> > [...]
> > For the RHEL container I'll need a BZ with all the appropriate flags (or
> > should I hijack this one? :p).
> 
> Yep - just use this one I reckon.
> 
> I also notice the recent config file additions to the pcp container repo.
> Configs are starting to spread across several trees, which may become a
> maintenance burden down the track (pcp, ansible-pcp, and pcp-container have
> duplicate configs).  I wonder if we could/should add a container-building task
> to the ansible-pcp repo to remove some duplication?  Not sure, there might be
> issues relating to build deps using Ansible in RHEL/Fedora... just a thought anyway.

You mean building the container from the system roles using something like ansible-bender? We could do that upstream, but I'm certain our internal container tooling doesn't support that.

Yep, config file mgmt is indeed a burden, especially for sysadmins who manage tons of different services, where the config files of random services may need updates once in a while ;)
A bit unrelated, but how is the system role managing different config files across versions? Let's say I'm managing different PCP versions (RHEL 6, 7, 8 for example) with the same role. The role would need multiple configuration file templates, one for each PCP version (only if the config file changed of course).

Back on topic, I don't think it's feasible because we have to stick with the tooling support we have - I just run a colored diff between the template and the PCP default config; I have to do the same for the Grafana packaging, where the config file gets many additions with each release as well and I need to update the distribution default config (which uses different paths than the original config).

Comment 9 Andreas Gerstmayr 2020-11-27 14:21:20 UTC
(In reply to Andreas Gerstmayr from comment #8)
> A bit unrelated, but how is the system role managing different config files
> across versions? Let's say I'm managing different PCP versions (RHEL 6, 7, 8
> for example) with the same role. The role would need multiple configuration
> file templates, one for each PCP version (only if the config file changed of
> course).

Just checked, the redis role has multiple files (for each redis release) and symlinks (for each distro). Neat :)
The PCP configs don't have versioning (afaics not required yet, but possibly required in the future?).

Sorry for off-topic.

Comment 10 Nathan Scott 2020-11-29 23:01:07 UTC
(In reply to Andreas Gerstmayr from comment #9)
> (In reply to Andreas Gerstmayr from comment #8)
> > A bit unrelated, but how is the system role managing different config files
> > across versions? Let's say I'm managing different PCP versions (RHEL 6, 7, 8
> > for example) with the same role. The role would need multiple configuration
> > file templates, one for each PCP version (only if the config file changed of
> > course).
> 
> Just checked, the redis role has multiple files (for each redis release) and
> symlinks (for each distro). Neat :)
> The PCP configs don't have versioning (afaics not required yet, but possibly
> required in the future?).

Yep - Ansible has platform version info which we use for one or two configs,
and more commonly for knowing which packages to install on which platforms.

> Sorry for off-topic.

No problem at all - and no problem if the current status quo is the best set of
tradeoffs.  I mainly raised the issue to just make sure we're both aware of the
duplicated files going forward & the need to sync things up, back-compat, etc.

cheers.

Comment 12 Jan Kurik 2021-01-22 04:47:05 UTC
Which of the PCP services (pmcd, pmlogger, pmie, pmproxy) run in the PCP container can be controlled by the PCP_SERVICES env variable passed to the container.
Unfortunately pmlogger and pmie services contain the following line in their systemd-unit file:

Wants=pmcd.service

This starts pmcd when pmlogger or pmie is started, even if pmcd is not on the list of services that should be started.


i.e. the following command starts only the pmproxy service:

podman run -d --systemd always --privileged --net host -v /tmp/tmp.1BHcseB41n/pmlogger:/var/log/pcp/pmlogger -v /tmp/tmp.1BHcseB41n/logs:/mnt -v /:/host:ro,rslave -e HOST_MOUNT=/host -e PCP_SERVICES=pmproxy

However, the following command starts pmlogger and pmcd, even though only pmlogger is requested:

podman run -d --systemd always --privileged --net host -v /tmp/tmp.1BHcseB41n/pmlogger:/var/log/pcp/pmlogger -v /tmp/tmp.1BHcseB41n/logs:/mnt -v /:/host:ro,rslave -e HOST_MOUNT=/host -e PCP_SERVICES=pmlogger


There is an upstream discussion related to this issue about the behavior of the system startup scripts and unit files: https://github.com/performancecopilot/pcp/issues/1200
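
One conceivable workaround at the container level (not what was implemented here): mask the services that were not requested, since a masked unit cannot be pulled in even via Wants=, while Wants= is weak enough that pmlogger/pmie still start without pmcd. A sketch:

# in the entrypoint, instead of merely leaving an unrequested service disabled:
systemctl mask pmcd.service   # Wants=pmcd.service in pmlogger/pmie can no longer start it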

Comment 22 Jan Kurik 2021-03-04 18:35:22 UTC
The pcp-container image can run with a standalone pmcd, pmlogger, or pmproxy service only. Standalone pmie is expected in RHEL-8.5, once commit 68aaaa763fe76a0f1384064763fd517f28458f54 from the PCP GitHub repo arrives in RHEL.
Considering this verified.

Comment 25 errata-xmlrpc 2021-05-19 00:24:06 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (updated rhel8/pcp container image), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:2012

