Bug 1548586 - get_best_query().filter(latest=True) is returning incorrect results
Summary: get_best_query().filter(latest=True) is returning incorrect results
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: dnf
Version: 28
Hardware: Unspecified
OS: Linux
Priority: urgent
Severity: unspecified
Target Milestone: ---
Assignee: rpm-software-management
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard: PrioritizedBug
Duplicates: 1548635
Depends On:
Blocks: IoT 1636239
Reported: 2018-02-23 22:33 UTC by Brian Lane
Modified: 2018-11-22 17:31 UTC

Fixed In Version: dnf-4.0.4
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1636239
Environment:
Last Closed: 2018-11-22 17:31:52 UTC
Type: Bug


Attachments
Reproducer script (2.31 KB, text/x-python)
2018-02-23 22:34 UTC, Brian Lane
Output from reproducer (5.71 KB, text/plain)
2018-02-23 22:35 UTC, Brian Lane

Description Brian Lane 2018-02-23 22:33:45 UTC
In lorax we use dnf queries to select packages to be installed. Today I've run into two problems which look like they are related.

When adding the package "system-logos", the query returns both generic-logos and fedora-logos, which conflict with each other. I had expected it to pick one or the other. Also, when using the F27 Everything repo together with the F27 updates repo, it includes two different versions of fedora-logos instead of only the latest.

Related to that, selecting '*-firmware' returns multiple versions of the firmware instead of just the newest from the updates repo, e.g. two versions of iwl1000-firmware. I'm not sure when this behavior changed, but it used to work on F26.

I'll include a Python script to reproduce this.

The same behavior happens with F27 and rawhide.
rawhide dnf version - dnf-2.7.5-8.fc28.noarch
F27 dnf version - dnf-2.7.5-2.fc27.noarch

Comment 1 Brian Lane 2018-02-23 22:34:38 UTC
Created attachment 1400046
Reproducer script

Run this to reproduce the problem.

Comment 2 Brian Lane 2018-02-23 22:35:55 UTC
Created attachment 1400047
Output from reproducer

Note the multiple packages returned by the query+filter, and the multiple versions of firmware, eg. iwl1000-firmware from both repos with different versions.

Comment 3 Jaroslav Mracek 2018-02-27 12:27:49 UTC
I think that the first problem is not a bug. get_best_query() returns all packages that match the provided string. Because you used a provide, multiple package names can be returned. The correct solution would be to pass the resulting query to base.goal.install(select=sltr, optional=(not strict)), where:
sltr = dnf.selector.Selector(base.sack)
sltr.set(pkg=query)


The second part, with filter(latest=True), is fixed upstream. Could you please check it with our copr repo ("dnf copr enable rpmsoftwaremanagement/dnf-nightly")? It is fixed in libdnf-0.13.0+.

If you think the issue is somewhere else, or the upstream version doesn't work as you expect, please don't hesitate to reopen the bug report.

Comment 4 Jaroslav Mracek 2018-02-27 12:35:15 UTC
*** Bug 1548635 has been marked as a duplicate of this bug. ***

Comment 5 Adam Williamson 2018-02-27 17:14:15 UTC
"The get_best_query returns all packages that were represented by provided string."

In that case, what does "best" mean?

Comment 6 Jaroslav Mracek 2018-02-28 07:57:16 UTC
I am not sure, but get_best_query() checks whether the provided string is a NEVRA or part of one, then whether it is a valid provide or a file provide. I think "best" was meant as the best choice in a priority order. For example, if the string gives a positive result for both a NEVRA search and a provide search, only the result of the NEVRA search is returned. Similarly, a string can often be parsed as either a name or a name-version; if both give a result, only one is returned, according to the given priority (the forms argument, or the default priority).
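
The priority order described in this comment can be sketched as "try each kind of match in turn and return the first non-empty result". The sketch below is an illustration only, not dnf's implementation: `PACKAGES`, the matcher functions, and the file paths are hypothetical, and a plain name match stands in for the NEVRA forms.

```python
# Toy package database; provides and file paths are made up for illustration.
PACKAGES = [
    {
        "name": "fedora-logos",
        "provides": {"fedora-logos", "system-logos"},
        "files": {"/usr/share/example/fedora-logo.png"},
    },
    {
        "name": "generic-logos",
        "provides": {"generic-logos", "system-logos"},
        "files": {"/usr/share/example/generic-logo.png"},
    },
]


def by_name(spec):
    """Stand-in for the NEVRA forms: match on the package name only."""
    return sorted(p["name"] for p in PACKAGES if p["name"] == spec)


def by_provide(spec):
    """Match packages that list the string among their provides."""
    return sorted(p["name"] for p in PACKAGES if spec in p["provides"])


def by_file(spec):
    """Match packages that own the given file path."""
    return sorted(p["name"] for p in PACKAGES if spec in p["files"])


# Order encodes the priority: name first, then provide, then file provide.
MATCHERS = [("name", by_name), ("provide", by_provide), ("file", by_file)]


def best_match(spec, matchers=MATCHERS):
    """Return (kind, packages) for the first matcher that finds anything."""
    for kind, match in matchers:
        found = match(spec)
        if found:
            return kind, found
    return None, []


print(best_match("fedora-logos"))   # matched by name; the provide match is never consulted
print(best_match("system-logos"))   # no package has this name, so the provide match is used
```

Under this reading, "system-logos" resolves through the provide step and therefore returns every package that provides it, which is consistent with both fedora-logos and generic-logos appearing in the original report.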

Comment 7 Dusty Mabe 2018-05-31 17:22:16 UTC
Can we get this bug fixed in F28? I'm hitting an issue in lorax on F28: https://github.com/weldr/lorax/issues/368

Comment 8 Jaroslav Mracek 2018-06-01 07:45:09 UTC
The problem can be fixed in F28 in about the next 30 days, with the dnf-3.0 release.

Comment 9 Dusty Mabe 2018-06-01 12:28:49 UTC
should we open another bug to track that?

Comment 10 Peter Robinson 2018-06-03 14:34:54 UTC
(In reply to Jaroslav Mracek from comment #8)
> The problem can be fixed in F28 in about the next 30 days, with the dnf-3.0 release.

That's not ideal, it's a regression causing rel-eng issues in particular around composing atomic updates.

Comment 11 Chad 2018-06-04 14:29:49 UTC
I ran into this today as well.  I agree with Peter; 30 days isn't ideal or acceptable.

Comment 12 Peter Robinson 2018-06-04 14:40:57 UTC
Actually, thinking about this, any regression _AT_ _ALL_ like this in a stable release is completely unacceptable!!

Where is the test process to ensure no regression? This is core functionality now affecting (at a minimum):
* Fedora Atomic host
* Fedora Atomic Workstation
* IoT

Comment 13 Matthew Miller 2018-06-04 16:17:54 UTC
Sooooo.... we are past 30 days. Any news?

Comment 14 Dusty Mabe 2018-06-04 16:23:41 UTC
(In reply to Matthew Miller from comment #13)
> Sooooo.... we are past 30 days. Any news?

AFAICT the comment about 30 days was made 3 days ago.

Comment 15 Matthew Miller 2018-06-04 16:26:42 UTC
(Oh, sorry, calendar math error. I see that we are not past thirty days, just three. Still. That seems like a long time!)

Comment 16 Matthew Miller 2018-06-06 14:28:14 UTC
As I understand it, Atomic Host has a workaround in place. Peter, do you have a similar workaround for IoT, or is this blocking you?

Comment 17 Colin Walters 2018-06-06 17:16:41 UTC
rpm-ostree uses a (currently forked) version of libdnf and some of the behavior around queries is different.  To my knowledge rpm-ostree isn't affected by this.

Comment 18 Colin Walters 2018-06-06 17:20:06 UTC
For a lot of our editions, it's actually just a pungi-imposed constraint that the generated artifacts use an Anaconda version (generated by lorax) from the same package set.  We can trivially unblock ourselves in a lot of these cases by just using a known-good installer image, and updating the "known good" version periodically, etc.

Comment 19 Dusty Mabe 2018-06-06 17:20:57 UTC
(In reply to Matthew Miller from comment #16)
> As I understand it, Atomic Host has a workaround in place. 

Yes the workaround is here: https://infrastructure.fedoraproject.org/cgit/ansible.git/commit/roles/bodhi2/backend/templates?id=7855fe096908181b5fb272651659a196596083e2

(In reply to Colin Walters from comment #17)
> rpm-ostree uses a (currently forked) version of libdnf and some of the
> behavior around queries is different.  To my knowledge rpm-ostree isn't
> affected by this.

This was an issue in building the ISO. Not in rpm-ostree itself.

Comment 20 Dusty Mabe 2018-06-06 17:23:22 UTC
(In reply to Colin Walters from comment #18)
> For a lot of our editions, it's actually just a pungi-imposed constraint
> that the generated artifacts use an Anaconda version (generated by lorax)
> from the same package set.  We can trivially unblock ourselves in a lot of
> these cases by just using a known-good installer image, and updating the
> "known good" version periodically, etc.

Yeah, I agree. Having an installer that is maintained separately wouldn't be a bad idea. I've mentioned it before: https://pagure.io/pungi-fedora/pull-request/598#comment-50701

Comment 21 Brian Lane 2018-07-19 17:49:00 UTC
Still hitting this in f28 (at least) with libdnf-0.11.1-3.fc28.x86_64 and nothing in updates-testing. Any ETA on a fix?

Comment 22 Jaroslav Mracek 2018-07-20 07:07:46 UTC
Could you please try dnf-3.0 from the copr repo (dnf copr enable rpmsoftwaremanagement/dnf-nightly)?

Comment 23 Brian Lane 2018-07-20 18:57:28 UTC
(In reply to Jaroslav Mracek from comment #22)
> Could you please try dnf-3.0 from the copr repo (dnf copr enable
> rpmsoftwaremanagement/dnf-nightly)?

Yes, libdnf-0.16.0-0.11g230dc638.fc28.x86_64 solves the problem for me.

Comment 24 Adam Williamson 2018-09-04 21:05:57 UTC
Note that this prevents a test I'm currently working on from being useful for stable releases. I'd like to have an openQA test run on candidate updates that creates a netinst image using lorax then runs install tests with it; obviously for stable releases it should include all packages from the release repo, packages from the stable updates repo, and packages from the candidate update. But because of this bug, the image build always fails if both the release repo and the stable updates repo are used.

Comment 25 Matthew Miller 2018-09-12 15:45:53 UTC
Just to make the above comment more obvious since I'd missed the implication:

We'd really like this bug fixed in the stable F28 release, not just in future releases, because it impedes testing of updates.

Comment 26 Jaroslav Mracek 2018-10-04 21:16:13 UTC
Backporting will be difficult due to massive upstream changes that cannot even be compiled against libdnf-0.11.1. Additionally, the new behavior will break dnf and probably other tools, because in some cases dnf-2.7.5 relies on the behavior of libdnf-0.11. It is therefore risky and requires a lot of resources for backporting and testing (postponing other work). Are you OK with that?

Comment 27 Adam Williamson 2018-10-04 21:26:14 UTC
If this will work from F29 forwards, that sounds like more effort and risk than would be worthwhile for the purposes of doing update testing on F28 and F27.

Marking private as this is a reply to a private comment, but did it really need to be private? I see nothing in it that is confidential.

Comment 28 Matthew Miller 2018-10-04 21:41:05 UTC
I'm okay with that as long as Fedora QA is.

I appreciate there's a lot of work and everything is overwhelming, so if this is something we can reduce in favor of things being better on F29+, that's probably the right call.

Also, yes, I share Adam's wish to make these comments un-private, if that's okay with you.

Comment 29 Brian Lane 2018-10-04 21:46:34 UTC
(In reply to Jaroslav Mracek from comment #26)
> Backporting will be difficult due to massive upstream changes that cannot
> even be compiled against libdnf-0.11.1. Additionally, the new behavior will
> break dnf and probably other tools, because in some cases dnf-2.7.5 relies
> on the behavior of libdnf-0.11. It is therefore risky and requires a lot of
> resources for backporting and testing (postponing other work). Are you OK
> with that?

So does this mean I'm going to need to kludge together my own solution to this in lorax?

