Bug 1120253 - dnf uses different rpmdb version checksum than yum (due to output change)
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: dnf
Version: rawhide
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Daniel Mach
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2014-07-16 14:27 UTC by James Antill
Modified: 2018-12-12 13:18 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-12-12 13:18:47 UTC



Description James Antill 2014-07-16 14:27:29 UTC
Description of problem:

yum had:

class PackageSackVersion:
[...]
    def update(self, pkg, csum):
        self._num += 1
        self._chksum.update(str(pkg))
        if csum is not None:
            self._chksum.update(csum[0])
            self._chksum.update(csum[1])

...and it's mostly the same in dnf, except that dnf changed what str(pkg) produces (moving from pkg.ui_envra to pkg.ui_nevra, in yum API terms). This changes all the checksums produced.
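
To make the effect concrete, here is a minimal sketch; it is not the actual yum or dnf code. It assumes SHA-1 as the digest, uses a hypothetical `Pkg` tuple with made-up package values, and assumes the two string forms are `epoch:name-version-release.arch` (envra) and `name-epoch:version-release.arch` (nevra).

```python
import hashlib
from collections import namedtuple

# Hypothetical stand-in for a package object; not a yum or dnf class.
Pkg = namedtuple("Pkg", "name epoch version release arch")


def envra(pkg):
    # Assumed yum-style form: epoch first.
    return "%s:%s-%s-%s.%s" % (pkg.epoch, pkg.name, pkg.version, pkg.release, pkg.arch)


def nevra(pkg):
    # Assumed dnf-style form: name first, epoch in front of the version.
    return "%s-%s:%s-%s.%s" % (pkg.name, pkg.epoch, pkg.version, pkg.release, pkg.arch)


class SackVersion:
    """Simplified analogue of PackageSackVersion (SHA-1 is an assumption)."""

    def __init__(self, to_str):
        self._num = 0
        self._chksum = hashlib.sha1()
        self._to_str = to_str

    def update(self, pkg, csum=None):
        self._num += 1
        self._chksum.update(self._to_str(pkg).encode("utf-8"))
        if csum is not None:
            self._chksum.update(csum[0].encode("utf-8"))
            self._chksum.update(csum[1].encode("utf-8"))

    def __str__(self):
        # Rendered as "<package count>:<hex digest>".
        return "%d:%s" % (self._num, self._chksum.hexdigest())


def rpmdb_version(pkgs, to_str):
    ver = SackVersion(to_str)
    for pkg in sorted(pkgs):
        ver.update(pkg)
    return str(ver)


# Made-up package set, for illustration only.
PKGS = [
    Pkg("bash", "0", "4.3.30", "2.fc21", "x86_64"),
    Pkg("bind-libs", "32", "9.9.4", "8.fc21", "x86_64"),
]

yum_style = rpmdb_version(PKGS, envra)
dnf_style = rpmdb_version(PKGS, nevra)
print(yum_style)
print(dnf_style)
print(yum_style == dnf_style)
```

The package set is identical in both calls; only the string fed to the hash differs, and that alone is enough to give two unrelated digests.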

Comment 1 Ales Kozumplik 2014-07-16 14:33:52 UTC
Hi James,

We don't provide any guarantees about how the checksum is generated except that it is unique for the given set of packages. What DNF does is working fine from this specification's standpoint. I wonder whether we even expose the checksums somewhere.

Comment 2 James Antill 2014-07-16 15:00:30 UTC
They are in history/ts-load/etc., and exposed via "yum version" (although last I checked that didn't work in dnf) ... ostree has recently got support to look at them for groups of trees, so it'll be a giant snafu if it has to guess which type of rpmdbv needs to be generated, or have to generate both yum and dnf rpmdbvs.

 It's a one line fix at this point, after DNF goes live it'll be almost impossible to fix and everything will have to live with the incompatibility forever.

Comment 3 Ales Kozumplik 2014-07-21 07:14:19 UTC
Please describe the use case in more detail: why does ostree need to make a hashsum of the rpmdb, why do you think it needs to be the same one for Yum and for DNF? What kind of problems would different checksum from Yum and from DNF create for ostree? Ostree should be able to make its own rpmdb hashsum no? Are there any real problems this has caused already or is this bug a speculation?

In DNF we always try to do the right thing. Even if this is "one line fix at this point"---this doesn't excuse any change from being unsystematic or wrong. If I change this now there is no guarantee somebody else will find the checksum generation inconsistent and change it to something else later unless the use case and the reasoning for a particular algorithm is clearly described.

Comment 4 James Antill 2014-08-29 05:10:58 UTC
> why does ostree need to make a hashsum of the rpmdb

For the same reason yum makes it ... so users can see/use it and compare package sets easily across machines. I understand dnf still hasn't implemented the version/tsload commands yet, but it is still visible in history (and presumably the next maintainer will fix the missing command regressions).

> why do you think it needs to be the same one for Yum and for DNF?

 Because having two identifiers for the same things is almost worse than having none?

> In DNF we always try to do the right thing. Even if this is "one line fix at this point"---this doesn't excuse any change from being unsystematic or wrong. If I change this now there is no guarantee somebody else will find the checksum generation inconsistent and change it to something else later unless the use case and the reasoning for a particular algorithm is clearly described.

I would have hoped that the words "backwards compatibility" would have been enough here, even given your previous actions.
This wasn't a change you thought about and made, you changed the output of packages on a whim and didn't check to see if you'd broken anything accidentally. You had. It affects all users and programs that want to use the feature.

Comment 5 Ales Kozumplik 2014-09-08 07:07:26 UTC
James, let's keep the discussion on a technical level. I assure you we put a lot of weight on backwards compatibility but at the same time balance it with other priorities, maintainability, flexibility and consistency.

If I understand it correctly: what ostree needs in DNF now is the 'version' command that for the same set of packages shows the same hash as Yum. Can you confirm that would be sufficient? Thanks!

Comment 6 Ales Kozumplik 2014-09-08 07:27:40 UTC
Also, as Panu pointed out to me, would you please explain what benefit of 'yum version' is there in this use case (i.e. determining whether two installations have the same set of packages) against simply doing:

$ rpm -qa|sort|sha1sum

Thank you.
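
For comparison, a rough Python equivalent of that pipeline is sketched below. It works on an in-memory list of package names rather than calling rpm; the function name `package_set_digest` and the sample names are made up for illustration, and `sorted()` orders by code point, which may differ from `sort` under a non-C locale.

```python
import hashlib


def package_set_digest(package_names):
    """Hash a sorted, newline-terminated package list, like `sort | sha1sum`."""
    data = "".join(name + "\n" for name in sorted(package_names))
    return hashlib.sha1(data.encode("utf-8")).hexdigest()


# Made-up package lists standing in for `rpm -qa` output on three hosts.
host_a = ["bash-4.3.30-2.fc21.x86_64", "zlib-1.2.8-7.fc21.x86_64"]
host_b = ["zlib-1.2.8-7.fc21.x86_64", "bash-4.3.30-2.fc21.x86_64"]
host_c = host_a + ["vim-minimal-7.4.475-2.fc21.x86_64"]

# Same set in a different order, then a different set.
print(package_set_digest(host_a) == package_set_digest(host_b))
print(package_set_digest(host_a) == package_set_digest(host_c))
```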

Comment 7 Jaroslav Reznik 2015-03-03 17:03:00 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 22 development cycle.
Changing version to '22'.

More information and reason for this action is here:
https://fedoraproject.org/wiki/Fedora_Program_Management/HouseKeeping/Fedora22

Comment 8 James Antill 2015-03-12 19:45:08 UTC
If you guys are going to fix this regression, then the sooner the better. Esp. after F22 has gone GA I think we'll be stuck with this stupid incompatibility forever.

Comment 9 Radek Holy 2015-03-12 20:35:32 UTC
James, could you please answer Ales' questions?

Comment 10 James Antill 2015-03-26 04:10:05 UTC
What questions?

This:

> Also, as Panu pointed out to me, would you please explain what benefit of 'yum version' is there in this use case (i.e. determining whether two installations have the same set of packages) against simply doing:

> $ rpm -qa|sort|sha1sum

That's a bit like asking what benefit there is to yum list when you can just as easily do rpm -qa|sort|grep .... but I don't even care if you implement dnf version again, I'm just asking that you fix the fact that rpmdb versions in dnf and yum don't match anymore due to a random one line UI change Ales did in dnf.
If you don't change it then rpmdb's outside of the main package manager (and remember there'll be at least 3 in F22/el8, with the distinct possibility of 2 in el7) will require knowing if it's dnf or yum.

Comment 11 Radek Holy 2015-03-26 09:23:47 UTC
Thank you. As you can see, we are having trouble understanding the problem that you've reported. So, please, be patient.

I understand the implementation of the checksums as something internal to DNF. So far, I even thought that they were completely unnecessary and we shouldn't print them at all. Now, I understand that they may help users compare system states between different historical transactions. But we still haven't declared the properties of the checksums anywhere, so for now no user can expect them to be compatible with other programs.

(In reply to James Antill from comment #10)
> That's a bit like asking what benefit there is to yum list when you can just
> as easily do rpm -qa|sort|grep ....

Well, the added value of "dnf list" is that it can list the packages available in repositories. You cannot achieve that with "rpm" easily. On the other hand, I don't see which value can DNF add into the potential "dnf version" command.

So, can you elaborate on the ostree use case that you mentioned above? Why does the user need to use DNF for such a thing? Also, if it should work regardless of which package manager is used on the given system, I think even more that the user should rather use RPM and not force all the package managers to implement this feature.

Comment 12 Radek Holy 2015-03-26 09:43:18 UTC
Oh, and I want to make clear one thing. We are not lazy to change the way how are the checksums generated. We just don't want users to use DNF in a way in which it wasn't designed. So, if it turns out that it is something that DNF should support (and for this we need to understand the use case), we'll start thinking about the proper implementation.

Comment 13 Radek Holy 2015-04-02 09:13:45 UTC
Actually, thinking about it more, since in theory one can install such package set that may result in another state with the same checksum, I think again that the checksums are just an internal information that shouldn't be exposed to the user at all. (Or we should implement it better...)

Comment 14 James Antill 2015-04-22 04:15:49 UTC
> Well, the added value of "dnf list" is that it can list the packages available in repositories. On the other hand, I don't see which value can DNF add into the potential "dnf version" command.

 Just s/list/version/

> We are not lazy to change the way how are the checksums generated.

 I'd assume not given it's a trivial 1 character change which broke it, and a simple one line change to fix it again without affecting anything else.

> We just don't want users to use DNF in a way in which it wasn't designed.

Good luck with that.

> So, if it turns out that it is something that DNF should support (and for this we need to understand the use case), we'll start thinking about the proper implementation.

Yeh, sure, whatever ... there aren't many ways that 1 character change can happen. But, hey, at least it's incompatible.

> Actually, thinking about it more, since in theory one can install such package set that may result in another state with the same checksum, I think again that the checksums are just an internal information that shouldn't be exposed to the user at all. (Or we should implement it better...)

I'm not sure what you mean here, that they aren't affected by tsflags=nodocs/_install_langs? That's true, but it's a fine line between that and looking for random file changes etc. ... at which point it's not really a package checksum but a file system one.

The checksums are currently exposed in dnf history, at least (hiding them there is only going to cause pain, unless you also remove the checking functionality, but you have 4-5 package managers for f22 now so I'd assume you'd find that even more helpful).
 As you say, you don't implement "version" or "tsload" and maybe you don't have to reimplement them or anything like them.

Comment 15 Radek Holy 2015-04-22 08:32:50 UTC
What I mean is that in theory, the following situation may happen:
1) you have only 'foo' and 'bar' installed
2) the checksum may be e.g. 41065134d285f43fefe709f4fb7ae39413288e00
3) you install 'baz' on top of that
4) but coincidentally, the resulting checksum may be 41065134d285f43fefe709f4fb7ae39413288e00 again (due to the nature of hashes)

Because of that I think that the checksum has no informational value for the user and thus there is no need to expose it.

And since the checksums are printed by accident (I mean, someone just took the code from YUM), this is not a feature and thus we cannot talk about incompatibilities between different DNF versions. And since it is not a feature, the only incompatibility between YUM and DNF (I mean in context of this report) is that DNF provides no supported means to display RPMDB checksums. And until we understand the use case that requires such a feature, there is no intention to support it AFAIK.

Comment 16 James Antill 2015-05-18 19:57:45 UTC
(In reply to Radek Holy from comment #15)
> What I mean is that in theory, the following situation may happen:

1. The rpmdbvs (as shown to the user) would be 2:410...e00 and 3:410...e00 which are different.

2. Even if you changed your "example" to remove bar and install baz this would be incredibly unlikely (maybe even impossible due to the constraints on input).

The reality is that this situation never happens, and the rpmdb version just works for showing machines with different packages (esp. when rpm-qa is identical).
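
Point 1 can be illustrated with a small sketch. It assumes the displayed rpmdb version has the form `<package count>:<digest>`, as in the `2:410...e00` example above; the helper `format_rpmdbv` is hypothetical.

```python
def format_rpmdbv(num_packages, hexdigest):
    # Displayed form assumed to be "<package count>:<hex digest>".
    return "%d:%s" % (num_packages, hexdigest)


# Digest value quoted in comment 15.
digest = "41065134d285f43fefe709f4fb7ae39413288e00"

before = format_rpmdbv(2, digest)  # 'foo' and 'bar' installed
after = format_rpmdbv(3, digest)   # 'baz' added, digest hypothetically unchanged

# The count prefix keeps the two versions distinct even if the digests collide.
print(before != after)
```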

> Because of that I think that the checksum has no informational value for the user and thus there is no need to expose it.

We added it in response to user demand, and I spoke to sysadmins who were aware of it and really wanted us to backport it to rhel5 after we created it.
Yes, everyone (and all software) can create their own (almost certainly worse) reimplementation of it so users now have 2/3/4/666 reimplementations that mean mostly the same thing ... but how is that a net gain?

> And since the checksums are printed by accident (I mean, someone just took
> the code from YUM), this is not a feature and thus we cannot talk about
> incompatibilities between different DNF versions.

When you reimplement ts-load you'll want the checks that are in yum, in fact I find it hard to believe you won't regret removing the checks in history/etc.

Comment 17 Honza Silhan 2015-07-20 11:04:09 UTC
The change back to yum computed checksums would cause incompatibility with older DNF versions.

Comment 18 Daniel Mach 2018-07-18 13:09:56 UTC
We're going to revisit this to eventually revert to YUM3 behavior and remain compatible.

Comment 19 Daniel Mach 2018-12-12 13:18:47 UTC
This has been resolved in the following PRs:
https://github.com/rpm-software-management/dnf/pull/1276
https://github.com/rpm-software-management/libdnf/pull/650

The algorithm has changed, because rpmdb version in yum3 depended on checksums in repodata and it did not entirely relate to rpmdb content.
The new algorithm is reproducible from shell and gives identical results every time.
See commit messages for more details.

