Bug 1956608

Summary: [RFE] switch PCP podman metrics to using the REST API
Product: Red Hat Enterprise Linux 8 Reporter: Nathan Scott <nathans>
Component: pcp Assignee: Nathan Scott <nathans>
Status: CLOSED ERRATA QA Contact: Jan Kurik <jkurik>
Severity: high Docs Contact: Apurva Bhide <abhide>
Priority: high    
Version: 8.5 CC: agerstmayr, dwalsh, jhonce, jkurik, mgoodwin, mheon, nathans, patrickm, pthomas, tsweeney
Target Milestone: rc Keywords: FutureFeature, Triaged
Target Release: 8.5   
Hardware: All   
OS: All   
Whiteboard:
Fixed In Version: pcp-5.3.1-4.el8 Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2021-11-09 17:50:37 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1813845, 1958732, 1962019    
Bug Blocks:    

Description Nathan Scott 2021-05-04 04:18:59 UTC
The libvarlink interface the PCP podman metrics agent is using to access values has been deprecated and will shortly be retired.  The PCP podman code should be rewritten to use the new REST API.

Comment 1 Tom Sweeney 2021-05-04 22:22:56 UTC
It is my understanding that without this change, PCP will not be able to monitor Podman starting in RHEL 8.4.  Once converted, hopefully for RHEL 8.5 and RHEL 9.0, then PCP will once again be able to monitor Podman.

Please let me know if there's any help/pointers that you need from the Podman side of things to accomplish this.

Comment 2 Nathan Scott 2021-05-07 06:54:40 UTC
(In reply to Tom Sweeney from comment #1)
> It is my understanding that without this change, PCP will not be able to
> monitor Podman starting in RHEL 8.4.  Once converted, hopefully for RHEL 8.5
> and RHEL 9.0, then PCP will once again be able to monitor Podman.
> 
> Please let me know if there's any help/pointers that you need from the
> Podman side of things to accomplish this.

Thanks Tom - one possible issue I've encountered is that REST API queries
to the pods/json endpoint appear to hang when there are no pods defined -
is this a known issue?  (I'm on F34, happy to open a separate BZ if it's a
new one).

$ sudo curl -s --unix-socket /run/podman/podman.sock http://d/v3.0.0/libpod/containers/json
[{"AutoRemove":false,"Command":["bash"],"Created":"2021-05-07T15:18:58.11280636+10:00","CreatedAt":"","Exited":false,"ExitedAt":-62135596800,"ExitCode":0,"Id":"d747ee1448150d051b34a8d675c9c8de49a9db78406da1b1fcad4c88523a1459","Image":"registry.fedoraproject.org/fedora:latest","ImageID":"eb7134a03cd6bd8a3de99c16cf174d66ad2d93724bac3307795efcd8aaf914c5","IsInfra":false,"Labels":{"license":"MIT","name":"fedora","vendor":"Fedora Project","version":"32"},"Mounts":[],"Names":["eloquent_euler"],"Namespaces":{},"Networks":["podman"],"Pid":3201686,"Pod":"","PodName":"","Ports":null,"Size":null,"StartedAt":1620364738,"State":"running","Status":""}]

$ sudo curl -s --unix-socket /run/podman/podman.sock http://d/v3.0.0/libpod/pods/json
[ hangs ]
^C

$ sudo curl -s --unix-socket /run/podman/podman.sock http://d/v3.0.0/libpod/containers/json
[ also hangs now ]
^C

(however, accesses to a regular user's podman socket continue just fine -
for both cases, possibly because I have both pods and containers defined
and running?  not sure)

Thanks!
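
For anyone reproducing the curl queries above, a minimal sketch of a wrapper that gives up after a timeout rather than hanging on the pods endpoint. The helper names podman_url and podman_get, the PODMAN_SOCK and PODMAN_TIMEOUT variables, and the 5 second default are assumptions made for illustration; the socket path, API version and URL layout are taken from the commands above, and --max-time is a standard curl option.

```shell
#!/bin/sh
# Sketch only: podman_url and podman_get are hypothetical helper names.

# Build the libpod REST URL for an endpoint such as "pods" or "containers".
# "d" is a dummy host name, as in the curl commands above.
podman_url() {
    version=$1
    endpoint=$2
    printf 'http://d/%s/libpod/%s/json' "$version" "$endpoint"
}

# Query one endpoint over the podman Unix socket, giving up after a few
# seconds instead of blocking forever.  The 5 second default is arbitrary.
podman_get() {
    sock=${PODMAN_SOCK:-/run/podman/podman.sock}
    curl -s --max-time "${PODMAN_TIMEOUT:-5}" --unix-socket "$sock" \
        "$(podman_url v3.0.0 "$1")"
}
```

Once these functions are sourced into a root shell, "podman_get pods || echo 'request failed or timed out'" should return instead of needing ^C, since curl exits non-zero (status 28) when --max-time expires. This only works around the hang on the client side; it does not address the podman bug itself.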

Comment 4 Matthew Heon 2021-05-07 17:47:57 UTC
Sounds like a bug. Can you go ahead and open a fresh BZ for it?

Comment 5 Nathan Scott 2021-05-10 03:52:34 UTC
(In reply to Matthew Heon from comment #4)
> Sounds like a bug. Can you go ahead and open a fresh BZ for it?

Sure thing - https://bugzilla.redhat.com/show_bug.cgi?id=1958732

Comment 6 Nathan Scott 2021-05-14 02:01:53 UTC
This is fixed upstream now ... (we definitely need a podman REST
API bug fix too, but that's now a separate BZ).  In the meantime
I'll get a PCP build through so that the varlink retirement work
can proceed in parallel.


commit d1ab871011f7be688e9c25ed2589f48253ba9f56
Author: Nathan Scott <nathans>
Date:   Thu May 13 17:11:44 2021 +1000

    pmdapodman: switch from libvarlink to the podman REST API
    
    Recent versions of podman have retired the use of libvarlink,
    preferring end users access that functionality via a REST API
    nowadays.  This change converts PCP to using these REST calls
    and removes all references to libvarlink (QA, build, specfile
    and so on).  There's some minor selinux impact too.
    
    Additionally, I've implemented exporting of podman labels via
    PCP metric labels now also.  Additionally, the agent supports
    access to both root and rootless container metrics now.  I'll
    tackle pmdaroot support for this too (container.* metrics) as
    a follow-up commit.
    
    There's one caveat to all this at present: accessing pods via
    the REST API causes podman to hang - I'm working with podman
    developers to diagnose and fix that - Red Hat BZ #1958732.
    
    Resolves Red Hat bug #1956608
    Resolves https://github.com/performancecopilot/pcp/issues/657
    Related to https://github.com/performancecopilot/pcp/issues/913

Comment 13 Nathan Scott 2021-05-19 07:54:35 UTC
Jan, I think we may need to clarify this one - we'll need a different BZ (that will go to podman folk) for the issues you're seeing.

This BZ is about the libvarlink dependency, which AFAIK is actually resolved.  The podman PMDA as-is cannot work in 8.4 due to podman changes we were not informed of before now, so we can't really blame the updates in this BZ on that pre-existing problem.

IOW, going on the actual issue in this BZ, failed-QA is IMO not appropriate here after all (I know we discussed this, but I didn't think of this last night) - I think we should mark this dependency aspect as resolved (if the dependency is really verified as gone), and open new BZs (linked to the existing Fedora one I opened) for the new aspects of this failure.

Comment 18 errata-xmlrpc 2021-11-09 17:50:37 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (pcp bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:4171