Bug 1649921

Summary: DNF should skip "default" entries in comps.xml if they introduce conflicts
Product: [Fedora] Fedora Reporter: Stephen Gallagher <sgallagh>
Component: dnf Assignee: Jaroslav Mracek <jmracek>
Status: CLOSED WONTFIX QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high Docs Contact:
Priority: unspecified    
Version: 31 CC: awilliam, dmach, jantill, kevin, kparal, mblaha, mhatina, Michael.Riss, mkolman, packaging-team-maint, pasik, redhat, robatino, rpm-software-management, sbueno, thomas.tomdan, vmukhame
Target Milestone: --- Keywords: CommonBugs, Reopened, Triaged
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard: https://fedoraproject.org/wiki/Common_F30_bugs#fedora-release-conflicts https://fedoraproject.org/wiki/Common_F30_bugs#upgrade-fedora-release-conflict
Last Closed: 2019-10-22 15:33:37 UTC Type: Bug

Description Stephen Gallagher 2018-11-14 19:25:52 UTC
Description of problem:
If I `dnf install @somegroup` and the group contains an entry marked as "default" that conflicts with an installed package, I would expect DNF to skip that entry (or warn in the transaction summary) and proceed with the transaction, rather than require me to pass --skip-broken.

Version-Release number of selected component (if applicable):
dnf-4.0.4-2.fc30.noarch
libdnf-0.22.0-8.fc30.x86_64


How reproducible:
Every time


Steps to Reproduce:
1. `docker pull fedora:rawhide`
2. `docker run --rm --tty -i fedora:rawhide /bin/bash`
3. `dnf install @server-product-environment`

Actual results:
Error: 
 Problem: problem with installed package fedora-release-30-0.13.noarch
  - package fedora-release-30-0.13.noarch conflicts with system-release provided by fedora-release-server-30-0.13.noarch
  - package fedora-release-server-30-0.13.noarch conflicts with system-release provided by fedora-release-30-0.13.noarch
  - conflicting requests
(try to add '--allowerasing' to command line to replace conflicting packages or '--skip-broken' to skip uninstallable packages)


Expected results:
Since fedora-release-server is not *required* by comps.xml, it should be skipped and the transaction should proceed correctly.

Additional info:

Comment 1 Daniel Mach 2018-11-26 12:38:06 UTC
Semantics of mandatory/default/optional came from Anaconda and other GUI installers. It means:
* mandatory: package gets installed (checked in UI, can't be unchecked)
* default: package gets installed, but can be unchecked
* optional: package can be installed (unchecked, can be checked)

If we implemented the proposed behavior, it would impact how comps installation works in other cases. We believe that using --allowerasing is the correct approach. You can also use the `dnf swap` command to replace the fedora-release package prior to installing the group.
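
As a rough illustration of the semantics listed above, the following Python sketch models which entries of a group end up selected for installation. The group contents and the `selected_for_install` helper are hypothetical; this is not DNF or libcomps code.

```python
from enum import Enum


class PkgType(Enum):
    MANDATORY = "mandatory"   # always installed with the group
    DEFAULT = "default"       # installed unless the user deselects it
    OPTIONAL = "optional"     # installed only if the user selects it


def selected_for_install(group, deselected=(), selected=()):
    """Return the package names a UI would mark for installation."""
    result = []
    for name, pkg_type in group.items():
        if pkg_type is PkgType.MANDATORY:
            result.append(name)               # cannot be unchecked
        elif pkg_type is PkgType.DEFAULT and name not in deselected:
            result.append(name)               # checked unless unchecked
        elif pkg_type is PkgType.OPTIONAL and name in selected:
            result.append(name)               # unchecked unless checked
    return result


# Hypothetical group contents, for illustration only.
example_group = {
    "core-pkg": PkgType.MANDATORY,
    "fedora-release-server": PkgType.DEFAULT,
    "extra-tool": PkgType.OPTIONAL,
}
```

With no user input, `selected_for_install(example_group)` returns the mandatory and default entries; deselecting a mandatory entry has no effect.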

Comment 2 Stephen Gallagher 2018-11-26 13:11:38 UTC
(In reply to Daniel Mach from comment #1)

Reopening.

You misunderstood the request. I neither want nor expect that the behavior would be that the fedora-release-server package would be installed here. Exactly the opposite, in fact. If something is not mandatory (either optional or default), an attempt to install it when installing a comps group should skip it, so long as doing so would not cause the transaction to be unresolvable.

In this case, fedora-release-server is extraneous; yes, I *could* pass --skip-broken explicitly, but my expectation as a user is that DNF should just ignore it if it's not a mandatory part of the group.

Also CCing Anaconda folks for their input. I agree there could be some impact on the installer case, such as if conflicting groups were selected in the netinstall, but I imagine such cases would be fairly minimal.
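
To make the difference between the two behaviours concrete, here is a toy Python model, not DNF's solver. It contrasts strict handling (any conflicting group entry aborts the transaction) with the behaviour requested in this comment (non-mandatory entries that conflict with an installed package are skipped). The function name, the `skip_non_mandatory` flag, the conflict table, and the `some-server-pkg` entry are illustrative assumptions.

```python
class ConflictError(Exception):
    """Raised when a group entry conflicts with an installed package."""


def plan_group_install(group, installed, conflicts, skip_non_mandatory=False):
    """Return (to_install, skipped) for a group.

    group: dict mapping package name -> "mandatory" or "default"
    installed: set of installed package names
    conflicts: set of frozenset({a, b}) pairs that cannot coexist
    """
    to_install, skipped = [], []
    for name, pkg_type in group.items():
        clash = [other for other in installed
                 if frozenset({name, other}) in conflicts]
        if not clash:
            to_install.append(name)
        elif skip_non_mandatory and pkg_type != "mandatory":
            skipped.append(name)      # requested behaviour: drop the entry
        else:
            raise ConflictError(f"{name} conflicts with {clash[0]}")
    return to_install, skipped


# fedora-release names come from the error output in the description;
# "some-server-pkg" is a hypothetical stand-in for the rest of the group.
group = {"some-server-pkg": "mandatory", "fedora-release-server": "default"}
installed = {"fedora-release"}
conflicts = {frozenset({"fedora-release", "fedora-release-server"})}
```

Calling `plan_group_install(group, installed, conflicts)` raises `ConflictError`, while passing `skip_non_mandatory=True` installs the rest of the group and reports `fedora-release-server` as skipped.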

Comment 3 Stephen Gallagher 2018-12-12 21:53:47 UTC
*** Bug 1657493 has been marked as a duplicate of this bug. ***

Comment 4 Stephen Gallagher 2019-04-09 19:26:35 UTC
*** Bug 1698187 has been marked as a duplicate of this bug. ***

Comment 5 Stephen Gallagher 2019-04-10 12:39:52 UTC
DNF Team, could we get an update here? We're starting to get more reports of this in Fedora 30 (such as BZ #1698187).

adamw/kparal: I think we should deal with this in a Common Bugs entry, at least until this gets resolved. My suggestion for text (feel free to massage):


== Conflicts in fedora-release when trying to upgrade to F30 ==
If you see conflicts between two fedora-release packages when trying to run the upgrade transaction, manually remove one of them before attempting the upgrade.

== Conflicts in fedora-release when trying to install desktop or server groups ==
If you attempt to install the Fedora Server or one of the desktop environment groups on an existing system, you may see the following message:
```
Error: 
 Problem: problem with installed package fedora-release-workstation-30-0.24.noarch
  - package fedora-release-workstation-30-0.24.noarch conflicts with system-release provided by fedora-release-matecompiz-30-0.25.noarch
  - package fedora-release-matecompiz-30-0.25.noarch conflicts with system-release provided by fedora-release-workstation-30-0.24.noarch
  - package fedora-release-workstation-30-0.25.noarch conflicts with system-release provided by fedora-release-matecompiz-30-0.25.noarch
  - package fedora-release-matecompiz-30-0.25.noarch conflicts with system-release provided by fedora-release-workstation-30-0.25.noarch
  - conflicting requests
(try to add '--allowerasing' to command line to replace conflicting packages or '--skip-broken' to skip uninstallable packages)
```

(I'll use "workstation" and "matecompiz" as stand-ins for "current system" and "version being installed", but the instructions are the same for any set of conflicts.)

You have four choices for how to resolve this issue:

* [RECOMMENDED] Pass `--excludepkg fedora-release-matecompiz` to DNF. This will remove just this package from the transaction and allow the rest of the group to be resolved normally. This option will retain your system's "identity" to what it was from initial installation.

* Pass `--excludepkg fedora-release-workstation` to DNF. This will remove just this package from the transaction and allow the rest of the group to be resolved normally. This option will *change your system identity* such that it will now report as a MATE Compiz system rather than a Workstation one.

* Pass `--skip-broken`. This is similar to passing `--excludepkg fedora-release-matecompiz` except that it will also skip any other potential conflicts which may result in an incomplete group install. Make sure to carefully examine the transaction summary before proceeding. This option will retain your system's "identity" to what it was from initial installation.

* Pass `--allowerasing`. This is similar to passing `--excludepkg fedora-release-workstation` except that it may also result in replacing other packages on your system unexpectedly. Make sure to carefully examine the transaction summary before proceeding. This option will retain your system's "identity" to what it was from initial installation.
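
As a sketch of how the options above translate into command lines, the Python helper below builds (but does not run) a dnf invocation. It uses DNF's `--exclude` option for the exclusion; the `dnf_group_install` helper is hypothetical and `mate-desktop` is a placeholder group id, so substitute the group you are actually installing.

```python
import shlex


def dnf_group_install(group, exclude=(), skip_broken=False, allowerasing=False):
    """Build (not run) a dnf command line as an argument list."""
    argv = ["dnf", "install", f"@{group}"]
    argv += [f"--exclude={pkg}" for pkg in exclude]
    if skip_broken:
        argv.append("--skip-broken")
    if allowerasing:
        argv.append("--allowerasing")
    return argv


# First option: keep the current identity by excluding the incoming release package.
keep_identity = dnf_group_install(
    "mate-desktop", exclude=["fedora-release-matecompiz"]
)
print(shlex.join(keep_identity))
```

The returned list can be passed to `subprocess.run` once the transaction has been reviewed; building it separately makes it easy to inspect the exact command first.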

Comment 6 Adam Williamson 2019-04-10 15:05:14 UTC
@sgallagh FWIW I have specifically requested the *opposite* behaviour in the past, and that's what dmach was trying to tell you.

It is in fact quite a big problem if dnf just skips packages that are not 'mandatory' in comps, because that can frequently result in broken images and systems and it can be hard to figure out why.

What dmach described is the behaviour yum had and the behaviour I (and others) worked with the dnf team for several years to get it to implement. yum did not treat 'default' and 'optional' as meaning "if these packages are selected for install but are not installable, that's fine, just proceed" - it treated that as a fatal error unless --skip-broken was passed. This was known and established behaviour for many years and things grew up around it.

At one point dnf actually behaved as you described, and the result was that we constantly got images and installs that were missing packages which should have been present - exactly *because of* this behaviour. I actually wrote the PR that changed dnf back to behaving the way yum did: https://github.com/rpm-software-management/dnf/pull/1038

So I'm afraid I definitely can't support your request here.

In theory if someone commits to going through comps with a fine-tooth comb and marking everything as 'mandatory' that ought to be marked as 'mandatory' under your expectation of how things should work, and does that for RHEL too, and then educates everyone who commits stuff to comps about this new expectation...then we can maybe change this.

Comment 7 Stephen Gallagher 2019-04-10 15:18:02 UTC
(In reply to Adam Williamson from comment #6)

OK... can we do this another way and add a new type? "if-no-conflict" or something? Then we could have the behavior I asked for *just* for those packages that we know have to work around conflicts (the fedora-release-* ones in particular).

I figure the behavior should then be the same as "default" unless there is a conflict, then they can be excluded to resolve the conflict.
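
A minimal sketch of the proposal, assuming a new entry type literally named "if-no-conflict" (no such type exists in comps today). The `resolve_entry` helper is hypothetical and only models the per-entry decision.

```python
def resolve_entry(name, pkg_type, conflicting):
    """Decide what happens to one group entry.

    conflicting: set of entry names that conflict with installed packages
    Returns "install", "skip", or "error".
    """
    if name not in conflicting:
        return "install"      # behaves like "default" when nothing conflicts
    if pkg_type == "if-no-conflict":
        return "skip"         # proposed type: drop the entry to resolve the conflict
    return "error"            # mandatory/default keep the strict behaviour
```

Under this model only entries explicitly marked with the new type are ever dropped, so existing "default" entries would still fail loudly.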

Comment 8 Adam Williamson 2019-04-10 15:26:37 UTC
I mean, it's probably possible? We'd have to tweak the dnf implementation again, and add the new type to libcomps, I guess.

Comment 9 Stephen Gallagher 2019-04-10 15:31:12 UTC
Well, let's keep this open to figure out the right long-term solution to this problem, but in the meantime can we get https://fedoraproject.org/wiki/Common_F30_bugs updated with the workarounds I listed above?

Comment 10 Stephen Gallagher 2019-04-11 12:28:24 UTC
Oops, I had a mistake in the Common Bugs text above. The last sentence of the fourth option should have been "This option will *change your system identity* such that it will now report as a MATE Compiz system rather than a Workstation one." I fixed this on https://fedoraproject.org/wiki/Common_F30_bugs just now.

Comment 11 kevin 2019-05-26 17:27:04 UTC
This just hit me during an upgrade and caused a black screen and gave me a good ol' scare. Resolved as follows:

I upgraded from F29 to F30. On rebooting, all I see is a black screen with Generating “/run/initramfs/rdsosreport.txt”

What do I do? I tried all four grub options with the same result. I saw no obvious errors while upgrading.

Edit1: After some time, it dropped into a terminal and I ran $(cat /run/initramfs/rdsosreport.txt) and one thing that stands out is:

systemctl: Failed to switch root: Specified switch root path '/sysroot' does not seem to be an OS tree. os-release file is missing.

Edit2: It seems my main filesystem is mounted read-only under /sysroot. Running $(ls -l /sysroot/etc/os-release) shows it’s a symlink to “…/usr/lib/os-release”. Running $(ls -l /sysroot/usr/lib/os-release) shows it’s a symlink to “./os.release.d/os-release-fedora”. Running $(ls -l /sysroot/usr/os.release.d) shows it does not exist.

Edit3: I’m able to mount the main filesystem as read-write with $(mount -o remount,rw /sysroot). Now I just need to figure out how to correct the os-release file.

Edit4: I was able to boot successfully with:

rm /sysroot/etc/os-release
cp /usr/lib/os-release /sysroot/etc/os-release
/sysroot/bin/sync
reboot

Then I found some conflicts in dnf and ran $(dnf install --allowerasing fedora-release-workstation-30-3) which removed fedora-release-matecompiz-30-3 and I don’t know why that was installed. Then I ran:

sudo rm /etc/os-release
sudo ln -s /usr/lib/os-release /etc/os-release
sudo reboot
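
The manual `ls -l` checks above can be approximated with a short Python sketch that follows a symlink chain one hop at a time and reports where it breaks. The `describe_link_chain` helper is hypothetical. It resolves relative targets against the link's own directory and does not re-root absolute targets, so inside the initramfs shell (where the real system is under /sysroot) absolute link targets would need the /sysroot prefix added by hand.

```python
import os


def describe_link_chain(path, max_hops=10):
    """Follow symlinks from path and return a list of (path, status, target)."""
    steps = []
    current = path
    for _ in range(max_hops):
        if os.path.islink(current):
            target = os.readlink(current)
            # Relative targets are resolved against the link's own directory.
            nxt = os.path.normpath(
                os.path.join(os.path.dirname(current), target)
            )
            steps.append((current, "symlink", nxt))
            current = nxt
        elif os.path.exists(current):
            steps.append((current, "exists", None))
            return steps
        else:
            steps.append((current, "missing", None))
            return steps
    steps.append((current, "too many hops", None))
    return steps
```

For example, printing each element of `describe_link_chain("/etc/os-release")` shows every link in the chain and whether the final target exists.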

What I originally blew through naively with --allowerasing that caused this:

2019-05-26T01:41:03Z CRITICAL Error: 
 Problem: problem with installed package fedora-release-matecompiz-30-3.noarch
  - package fedora-release-matecompiz-30-3.noarch conflicts with system-release provided by fedora-release-workstation-30-3.noarch
  - package fedora-release-workstation-30-3.noarch conflicts with system-release provided by fedora-release-matecompiz-30-3.noarch
  - package fedora-release-workstation-30-3.noarch conflicts with system-release provided by fedora-release-matecompiz-30-1.noarch
  - package fedora-release-matecompiz-30-1.noarch conflicts with system-release provided by fedora-release-workstation-30-3.noarch
  - conflicting requests
2019-05-26T01:41:03Z INFO (try to add '--allowerasing' to command line to replace conflicting packages or '--skip-broken' to skip uninstallable packages)

Comment 12 Ben Cotton 2019-08-13 16:53:26 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 31 development cycle.
Changing version to '31'.

Comment 13 Ben Cotton 2019-08-13 18:56:23 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 31 development cycle.
Changing version to 31.

Comment 14 Jaroslav Mracek 2019-10-22 15:27:14 UTC
(In reply to kevin from comment #11)

I am sorry, but this is completely unrelated to the original bug report. Please open a new bug against the distribution.

Comment 15 Jaroslav Mracek 2019-10-22 15:33:37 UTC
(In reply to Stephen Gallagher from comment #7)
> OK... can we do this another way and add a new type? "if-no-conflict" or
> something? Then we could have the behavior I asked for *just* for those
> packages that we know have to work around conflicts (the fedora-release-*
> ones in particular).
> 
> I figure the behavior should then be the same as "default" unless there is a
> conflict, then they can be excluded to resolve the conflict.


I believe that your request could be fulfilled by conditional packages: a group package foo that requires foo. That is all that we can provide. I am closing this as WONTFIX because it conflicts with other requests or would break the comps metadata.
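
For reference, a conditional entry in comps pairs a package with another package it requires, and the entry is pulled in only when that other package is present on the system. A minimal Python sketch of that rule follows. The package names and the `conditional_selection` helper are hypothetical, and this models the idea only, not libcomps or DNF internals.

```python
def conditional_selection(conditionals, present):
    """Pick conditional entries whose required package is present.

    conditionals: list of (package, required_package) pairs
    present: set of package names present on the system
    """
    return [pkg for pkg, required in conditionals if required in present]


# Hypothetical entries: "foo-extras" is only wanted when "foo" is present.
entries = [("foo-extras", "foo"), ("bar-extras", "bar")]
```

With only `foo` present, `conditional_selection(entries, {"foo"})` selects `foo-extras` and leaves `bar-extras` out.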