Bug 1956608 - [RFE] switch PCP podman metrics to using the REST API
Summary: [RFE] switch PCP podman metrics to using the REST API
Keywords:
Status: MODIFIED
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: pcp
Version: 8.5
Hardware: All
OS: All
Priority: high
Severity: high
Target Milestone: rc
Target Release: 8.5
Assignee: Nathan Scott
QA Contact: Jan Kurik
Docs Contact: Apurva Bhide
URL:
Whiteboard:
Depends On: 1813845
Blocks:
Reported: 2021-05-04 04:18 UTC by Nathan Scott
Modified: 2021-05-14 04:31 UTC

Fixed In Version: pcp-5.3.0-4.el8
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed:
Type: Bug
Target Upstream Version:



Description Nathan Scott 2021-05-04 04:18:59 UTC
The libvarlink interface the PCP podman metrics agent is using to access values has been deprecated and will shortly be retired.  The PCP podman code should be rewritten to use the new REST API.

Comment 1 Tom Sweeney 2021-05-04 22:22:56 UTC
It is my understanding that without this change, PCP will not be able to monitor Podman starting in RHEL 8.4.  Once converted, hopefully for RHEL 8.5 and RHEL 9.0, then PCP will once again be able to monitor Podman.

Please let me know if there's any help/pointers that you need from the Podman side of things to accomplish this.

Comment 2 Nathan Scott 2021-05-07 06:54:40 UTC
(In reply to Tom Sweeney from comment #1)
> It is my understanding that without this change, PCP will not be able to
> monitor Podman starting in RHEL 8.4.  Once converted, hopefully for RHEL 8.5
> and RHEL 9.0, then PCP will once again be able to monitor Podman.
> 
> Please let me know if there's any help/pointers that you need from the
> Podman side of things to accomplish this.

Thanks Tom - one possible issue I've encountered is that REST API queries
to the pods/json endpoint appear to hang when there are no pods defined -
is this a known issue?  (I'm on F34, happy to open a separate BZ if it's a
new one).

$ sudo curl -s --unix-socket /run/podman/podman.sock http://d/v3.0.0/libpod/containers/json
[{"AutoRemove":false,"Command":["bash"],"Created":"2021-05-07T15:18:58.11280636+10:00","CreatedAt":"","Exited":false,"ExitedAt":-62135596800,"ExitCode":0,"Id":"d747ee1448150d051b34a8d675c9c8de49a9db78406da1b1fcad4c88523a1459","Image":"registry.fedoraproject.org/fedora:latest","ImageID":"eb7134a03cd6bd8a3de99c16cf174d66ad2d93724bac3307795efcd8aaf914c5","IsInfra":false,"Labels":{"license":"MIT","name":"fedora","vendor":"Fedora Project","version":"32"},"Mounts":[],"Names":["eloquent_euler"],"Namespaces":{},"Networks":["podman"],"Pid":3201686,"Pod":"","PodName":"","Ports":null,"Size":null,"StartedAt":1620364738,"State":"running","Status":""}]

$ sudo curl -s --unix-socket /run/podman/podman.sock http://d/v3.0.0/libpod/pods/json
[ hangs ]
^C

$ sudo curl -s --unix-socket /run/podman/podman.sock http://d/v3.0.0/libpod/containers/json
[ also hangs now ]
^C

(however, accesses to a regular user's podman socket continue just fine -
for both cases, possibly because I have both pods and containers defined
and running?  not sure)

Thanks!
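
For anyone reproducing the hang from a script rather than curl, here is a minimal Python sketch of the same query with a read timeout, so a stalled endpoint returns control instead of blocking forever. The class and function names (UnixHTTPConnection, podman_get) and the 5 second default are assumptions made for illustration; this is not how pmdapodman itself talks to the socket. The socket path and the v3.0.0 endpoint paths are taken from the curl commands above.

```python
import http.client
import json
import socket


class UnixHTTPConnection(http.client.HTTPConnection):
    """HTTPConnection that talks over a Unix domain socket instead of TCP."""

    def __init__(self, socket_path, timeout=5.0):
        # "d" is a dummy host name, matching the http://d/... URLs used with curl.
        super().__init__("d", timeout=timeout)
        self.socket_path = socket_path

    def connect(self):
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.settimeout(self.timeout)
        sock.connect(self.socket_path)
        self.sock = sock


def podman_get(path, socket_path="/run/podman/podman.sock", timeout=5.0):
    """GET a REST path; return the decoded JSON, or None if the read times out."""
    conn = UnixHTTPConnection(socket_path, timeout=timeout)
    try:
        conn.request("GET", path)
        resp = conn.getresponse()
        body = resp.read()
        if resp.status != 200:
            raise RuntimeError("unexpected HTTP status %d" % resp.status)
        return json.loads(body)
    except socket.timeout:
        # The service accepted the connection but never answered in time.
        return None
    finally:
        conn.close()
```

Calling podman_get("/v3.0.0/libpod/pods/json") as root would then be expected to return None after the timeout in the hanging case, rather than needing ^C. The timeout only protects the client; it does not unblock the podman service itself.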

Comment 4 Matthew Heon 2021-05-07 17:47:57 UTC
Sounds like a bug. Can you go ahead and open a fresh BZ for it?

Comment 5 Nathan Scott 2021-05-10 03:52:34 UTC
(In reply to Matthew Heon from comment #4)
> Sounds like a bug. Can you go ahead and open a fresh BZ for it?

Sure thing - https://bugzilla.redhat.com/show_bug.cgi?id=1958732

Comment 6 Nathan Scott 2021-05-14 02:01:53 UTC
This is fixed upstream now ... (we definitely need a podman REST
API bug fix too, but that's now a separate BZ).  In the meantime
I'll get a PCP build through so that the varlink retirement work
can proceed in parallel.


commit d1ab871011f7be688e9c25ed2589f48253ba9f56
Author: Nathan Scott <nathans@redhat.com>
Date:   Thu May 13 17:11:44 2021 +1000

    pmdapodman: switch from libvarlink to the podman REST API
    
    Recent versions of podman have retired the use of libvarlink,
    preferring end users access that functionality via a REST API
    nowadays.  This change converts PCP to using these REST calls
    and removes all references to libvarlink (QA, build, specfile
    and so on).  There's some minor selinux impact too.
    
    I've also implemented exporting of podman labels via PCP
    metric labels, and the agent now supports access to both
    root and rootless container metrics.  I'll tackle pmdaroot
    support for this too (container.* metrics) as a follow-up
    commit.
    
    There's one caveat to all this at present: accessing pods via
    the REST API causes podman to hang - I'm working with podman
    developers to diagnose and fix that - Red Hat BZ #1958732.
    
    Resolves Red Hat bug #1956608
    Resolves https://github.com/performancecopilot/pcp/issues/657
    Related to https://github.com/performancecopilot/pcp/issues/913
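
To illustrate the label export mentioned in the commit message, the following Python sketch pulls the "Labels" object out of a containers/json response and keys it by container name. It is only an illustration of the data involved: the function name container_labels is hypothetical, the sample is a trimmed copy of the response shown in comment 2, and the actual agent is written in C and may map labels differently.

```python
import json

# Trimmed copy of the containers/json response from comment 2.
SAMPLE = (
    '[{"Id":"d747ee1448150d051b34a8d675c9c8de49a9db78406da1b1fcad4c88523a1459",'
    '"Names":["eloquent_euler"],"State":"running",'
    '"Labels":{"license":"MIT","name":"fedora",'
    '"vendor":"Fedora Project","version":"32"}}]'
)


def container_labels(containers_json):
    """Map each container (first name, else Id) to a dict of its podman labels."""
    result = {}
    for entry in json.loads(containers_json):
        names = entry.get("Names") or []
        key = names[0] if names else entry["Id"]
        # "Labels" may be null when a container has none; treat that as empty.
        result[key] = dict(entry.get("Labels") or {})
    return result
```

For the sample above, container_labels(SAMPLE) yields one entry for eloquent_euler carrying the four labels from the image (license, name, vendor, version).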

