Bug 1672649 - Add dnf.package.Package API for getting pkgid of package from repo in DNF plugin
Summary: Add dnf.package.Package API for getting pkgid of package from repo in DNF plugin
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: dnf
Version: ---
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: unspecified
Target Milestone: rc
Target Release: 8.1
Assignee: Jaroslav Mracek
QA Contact: Jan Blazek
URL:
Whiteboard:
Depends On: 1681084
Blocks: 1701002
Reported: 2019-02-05 14:15 UTC by Jan Pazdziora
Modified: 2020-11-14 16:13 UTC (History)

Fixed In Version: dnf-4.2.7-1.el8
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-11-05 22:21:12 UTC
Type: Bug
Target Upstream Version:


Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2019:3583 None None None 2019-11-05 22:21:20 UTC

Description Jan Pazdziora 2019-02-05 14:15:18 UTC
Description of problem:

In my DNF plugin, I'd like to be able to find a package in my custom metadata file based on the package's pkgid.

In the *-primary.xml.gz file I see

[...]
<package type="rpm">
  <name>zsh</name>
  <arch>x86_64</arch>
  <version epoch="0" ver="5.5.1" rel="6.el8"/>
  <checksum type="sha256" pkgid="YES">305a2c9dd70c3d00293aca508478851612beeedbf28caa0db9bdde1dac6b2e9f</checksum>
[...]

and in *-filelists.xml.gz there is

[...]
<package pkgid="305a2c9dd70c3d00293aca508478851612beeedbf28caa0db9bdde1dac6b2e9f" name="zsh" arch="x86_64">
  <version epoch="0" ver="5.5.1" rel="6.el8"/>
  <file>/etc/skel/.zshrc</file>
  <file>/etc/zlogin</file>
  <file>/etc/zlogout</file>
  <file>/etc/zprofile</file>
[...]

So there is an element marked with pkgid="YES" in primary whose value seems to be used as @pkgid in the secondary metadata files, to match records between the multiple metadata files of one repo.
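
The matching described above can be sketched in Python. This is only an illustration under stated assumptions: the two fragments from this report are pasted as inline strings, the wrapping <metadata> and <filelists> root elements are added here to make them parseable, and the XML namespaces and gzip compression of real repodata files are left out.

```python
import xml.etree.ElementTree as ET

# Trimmed fragments copied from this report. The root elements are added
# for this sketch; real files are gzip-compressed and declare namespaces.
PRIMARY = """
<metadata>
<package type="rpm">
  <name>zsh</name>
  <arch>x86_64</arch>
  <version epoch="0" ver="5.5.1" rel="6.el8"/>
  <checksum type="sha256" pkgid="YES">305a2c9dd70c3d00293aca508478851612beeedbf28caa0db9bdde1dac6b2e9f</checksum>
</package>
</metadata>
"""

FILELISTS = """
<filelists>
<package pkgid="305a2c9dd70c3d00293aca508478851612beeedbf28caa0db9bdde1dac6b2e9f" name="zsh" arch="x86_64">
  <version epoch="0" ver="5.5.1" rel="6.el8"/>
  <file>/etc/skel/.zshrc</file>
  <file>/etc/zlogin</file>
</package>
</filelists>
"""


def primary_pkgids(xml_text):
    """Map each package name to the checksum flagged with pkgid="YES"."""
    ids = {}
    for pkg in ET.fromstring(xml_text).iter("package"):
        for chk in pkg.iter("checksum"):
            if chk.get("pkgid") == "YES":
                ids[pkg.findtext("name")] = chk.text.strip()
    return ids


def files_by_pkgid(xml_text):
    """Map each @pkgid in a filelists document to its list of file paths."""
    return {
        pkg.get("pkgid"): [f.text for f in pkg.iter("file")]
        for pkg in ET.fromstring(xml_text).iter("package")
    }


pkgids = primary_pkgids(PRIMARY)
files = files_by_pkgid(FILELISTS)
# Join the two documents on the shared identifier.
zsh_files = files[pkgids["zsh"]]
```

The join key is whatever checksum element carries pkgid="YES", so the lookup does not depend on the checksum type.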

Looking at the sources in

   /usr/lib/python3.6/site-packages/dnf/package.py

there is

    @property
    def _pkgid(self):
        if self.hdr_chksum is None:
            return None
        (_, chksum) = self.hdr_chksum
        return binascii.hexlify(chksum)

So it seems to derive the information from hdr_chksum instead of using the value from the repository metadata, but that should not matter.

However, when iterating over the dnf.package.Package objects returned by self.base.transaction.install_set in my DNF plugin's transaction hook, _pkgid is None.

Version-Release number of selected component (if applicable):

dnf-4.0.9.2-1.el8.noarch
python3-dnf-4.0.9.2-1.el8.noarch

How reproducible:

Deterministic.

Steps to Reproduce:
1. Create DNF plugin file /usr/lib/python3.6/site-packages/dnf-plugins/pkgid.py:

from dnf import Plugin
class pkgid(Plugin):
	name = "pkgid"
	def transaction(self):
		for i in self.base.transaction.install_set:
			print("Installed package " + str(i) + " (" + str(type(i)) + ") pkgid " + str(i._pkgid))

2. Run dnf install -y zsh

Actual results:

  Running scriptlet: zsh-5.5.1-6.el8.x86_64                      1/1 
  Verifying        : zsh-5.5.1-6.el8.x86_64                      1/1 
Installed products updated.
Installed package zsh-5.5.1-6.el8.x86_64 (<class 'dnf.package.Package'>) pkgid None

Expected results:

  Running scriptlet: zsh-5.5.1-6.el8.x86_64                      1/1 
  Verifying        : zsh-5.5.1-6.el8.x86_64                      1/1 
Installed products updated.
Installed package zsh-5.5.1-6.el8.x86_64 (<class 'dnf.package.Package'>) pkgid 305a2c9dd70c3d00293aca508478851612beeedbf28caa0db9bdde1dac6b2e9f

Additional info:

Of course, apart from returning a usable value, ideally the getter/property shouldn't start with an underscore.

Comment 1 Jan Pazdziora 2019-02-05 14:22:07 UTC
Using plugin /usr/lib/python3.6/site-packages/dnf-plugins/location.py

from dnf import Plugin
class location(Plugin):
	name = "location"
	def transaction(self):
		for i in self.base.transaction.install_set:
			print("Installed package " + str(i) + " (" + str(type(i)) + ") location " + str(i.location))

which displays the location (href) of the installed package, I can see that the value from the metadata is there:

  Running scriptlet: zsh-5.5.1-6.el8.x86_64                        1/1 
  Verifying        : zsh-5.5.1-6.el8.x86_64                        1/1 
Installed products updated.
Installed package zsh-5.5.1-6.el8.x86_64 (<class 'dnf.package.Package'>) location Packages/zsh-5.5.1-6.el8.x86_64.rpm

So some metadata is available, and I believe pkgid should be too.

Comment 2 Jan Pazdziora 2019-02-05 15:15:43 UTC
When I add

		for i in self.base.transaction.remove_set:
			print("Removed package " + str(i) + " (" + str(type(i)) + ") pkgid " + str(i._pkgid))

to the pkgid.py plugin source from comment 0 and run

# dnf reinstall -y zsh

I get

  Verifying        : zsh-5.5.1-6.el8.x86_64                                                                                                                         1/2 
  Verifying        : zsh-5.5.1-6.el8.x86_64                                                                                                                         2/2 
Installed products updated.
Installed package zsh-5.5.1-6.el8.x86_64 (<class 'dnf.package.Package'>) pkgid None
Removed package zsh-5.5.1-6.el8.x86_64 (<class 'dnf.package.Package'>) pkgid b'b8ad60bda6a5f13f1fbde3a266792f0f72439a9e'

That value seems to be matching

# rpm -q --qf '%{SHA1HEADER}\n' zsh
b8ad60bda6a5f13f1fbde3a266792f0f72439a9e

for the installed package. So perhaps my expectation that _pkgid returns the checksum marked with pkgid="YES" from the metadata is not justified: in the metadata file, that checksum is the checksum of the whole rpm file, while for a package in the rpm database it is the SHA1HEADER. But then there should be some other getter to get at that value from primary.xml.

I also wonder if the use of SHA1HEADER (and not SHA256HEADER) is long-term sustainable.

Comment 3 Jan Pazdziora 2019-02-06 08:14:28 UTC
I've found that using p.chksum[1].hex() gives me the string matching the checksum value from the primary.xml and the pkgid values used in filelists and other. However, that does not seem to be documented at https://dnf.readthedocs.io/en/latest/api_package.html and that documentation does not point to any parent class from which the value could be derived.

Also, I assume that the pkgid="YES" attribute in

<checksum type="sha256" pkgid="YES">305a2c9dd70c3d00293aca508478851612beeedbf28caa0db9bdde1dac6b2e9f</checksum>

means that for this particular repository, this is the element whose text() is used for referencing package entries between primary and filelists and other. That would suggest that in the future, a different element could be used for that purpose, so hardcoding the use of

p.chksum[1].hex()

does not sound like a generic solution.
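
The b'...' prefix in the "Removed package" output of comment 2 and the plain string from .hex() both come from how a raw digest is turned into text. The sketch below assumes the checksum is a (type, raw digest bytes) pair, as the tuple unpacking in package.py and the chksum[1] indexing suggest; the integer type id is a placeholder, not a real library constant.

```python
import binascii

# Hypothetical stand-in for a (type, digest) pair; 0 is a placeholder type id.
digest = bytes.fromhex(
    "305a2c9dd70c3d00293aca508478851612beeedbf28caa0db9bdde1dac6b2e9f"
)
chksum = (0, digest)

# binascii.hexlify returns bytes, which str() renders with a b'...' prefix.
as_bytes = binascii.hexlify(chksum[1])

# bytes.hex() returns str, directly comparable to the text in primary.xml.
as_str = chksum[1].hex()

# Both carry the same hex digits; only the Python type differs.
same_digits = as_bytes.decode("ascii") == as_str
```

A plugin comparing against values read from the XML metadata would want the str form, or would need to decode the bytes first.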

Comment 7 Jaroslav Mracek 2019-03-20 16:03:03 UTC
Could you please try the function pkg.returnIdSum()? Its second element is the checksum that originates from the metadata.

In libsolv we can search for SOLVABLE_CHECKSUM or SOLVABLE_PKGID. I am not sure whether there is a difference, but my question is which you would prefer: do you want a checksum, or the id that is used to combine metadata from different files for a particular package?

Comment 8 Jaroslav Mracek 2019-03-20 19:45:39 UTC
I can also confirm that there is no difference between the data received using the SOLVABLE_CHECKSUM or SOLVABLE_PKGID type (based on the Fedora data set).

Comment 10 Jan Pazdziora 2019-03-29 10:39:12 UTC
I confirm that replacing

   p.chksum[1].hex()

with

   p.returnIdSum()[1]

works on Fedora 28+ and RHEL 8. Is the second one preferred from the long-term point of view?

Comment 11 Jaroslav Mracek 2019-03-29 17:16:56 UTC
returnIdSum is not marked as API, but we can change that. Which do you prefer?

Comment 12 Jan Pazdziora 2019-03-29 21:21:50 UTC
I'd like *some* API call to get the information. I don't care which call gets marked as API, primarily because I am not familiar with the future plans to say which call is more likely to stay and be easy to support long term.

Comment 13 Jan Pazdziora 2019-04-05 11:21:20 UTC
Hello, what should I plan for here? Use p.returnIdSum()[1], p.chksum[1].hex(), or something else?

Comment 15 Jan Pazdziora 2019-04-11 06:58:55 UTC
Moving back to assigned because it's not clear to me what the decision is.

Comment 16 Jaroslav Mracek 2019-05-10 12:42:17 UTC
I documented the package attribute chksum as API (https://github.com/rpm-software-management/dnf/pull/1396). Is that what you requested, or is there something else?

Comment 17 Jan Pazdziora 2019-05-14 13:50:14 UTC
I've added a comment to the pull request to make the purpose of that value a bit clearer. Otherwise good, we will go with p.chksum[1].hex(). Thanks.
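
For reference, the plugin from comment 0 adapted to the agreed p.chksum[1].hex() call might look like the sketch below. It is not a definitive implementation: the helper name repo_pkgid, the None guard, and the import fallback (which only lets the helper run on a machine without DNF) are assumptions added for illustration.

```python
try:
    import dnf
except ImportError:  # lets the helper below be exercised without DNF
    dnf = None


def repo_pkgid(pkg):
    """Return the repository checksum of pkg as a hex string, or None.

    Hypothetical helper: assumes pkg.chksum is a (type, raw digest) pair
    and guards defensively against a missing checksum.
    """
    chksum = pkg.chksum
    if chksum is None:
        return None
    return chksum[1].hex()


if dnf is not None:
    class PkgId(dnf.Plugin):
        name = "pkgid"

        def transaction(self):
            # Runs after the transaction, as in the original reproducer.
            for pkg in self.base.transaction.install_set:
                print("Installed package {} pkgid {}".format(pkg, repo_pkgid(pkg)))
```

Keeping the conversion in one helper means only that function changes if a different call is later preferred.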

Comment 25 errata-xmlrpc 2019-11-05 22:21:12 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2019:3583

