Bug 2228827

Summary: Removing Modular tree from composes means we have a stale Modular tree in the public rawhide repo
Product: [Fedora] Fedora
Reporter: Adam Williamson <awilliam>
Component: fedora-repos
Assignee: Miro Hrončok <mhroncok>
Status: CLOSED ERRATA
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high
Priority: unspecified
Version: 38
CC: fedoraproject, kevin, mboddu, mhroncok, pbrobinson, thrcka, zbyszek
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard: AcceptedBlocker openqa
Last Closed: 2023-08-24 07:38:39 UTC
Type: ---
Bug Blocks: 2143444, 2226798    

Description Adam Williamson 2023-08-03 11:11:25 UTC
Per https://fedoraproject.org/wiki/Changes/RetireModularity , the Modular variant was dropped from today's Rawhide compose after https://pagure.io/pungi-fedora/pull-request/1186 was merged. This breaks pretty much all upgrades to Rawhide. All previous installs had the modular repos enabled by default, and these have `skip_if_unavailable` set to `False`. When upgrading, dnf tries to find those repos for the new release; since it can no longer find them and `skip_if_unavailable` is `False`, it fails. To get a successful upgrade you would have to add something like `--disablerepo=*modular*` (not tested) to disable all the modular repos; for graphical upgrades you'd have to use the GUI tool's interfaces to disable the modular repos before upgrading, I guess.
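
As a rough illustration of that command-line workaround, here is a minimal sketch, assuming the upgrade is done with the `dnf system-upgrade` plugin. Like the suggestion itself, it has not been tested against a real upgrade. The helper name `upgrade_without_modular` and the `DNF` override variable are hypothetical and exist only for this example.

```shell
#!/bin/sh
# Hypothetical helper: download the upgrade to the given release while
# disabling every repo whose ID matches the *modular* glob.
# DNF can be overridden (for example DNF=echo) to preview the command.
upgrade_without_modular() {
    releasever="$1"
    ${DNF:-dnf} system-upgrade download \
        --releasever="$releasever" \
        --disablerepo='*modular*'
}

# Example (needs root and network access):
#   upgrade_without_modular rawhide
```

Quoting the glob keeps the shell from expanding it against local file names, so dnf receives the pattern unchanged.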

I'm filing this as a bug against fedora-repos because one fix I can think of would be to change the repo definitions on stable releases (F37 and F38) to say `skip_if_unavailable=True`. That should cause upgrades to work okay (not yet tested). That could potentially cause problems with transactions that *ought* to involve modular packages if the repo were transiently unavailable, but we could just assume that modularity usage is so low by now that we don't need to worry about it too much?
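
The proposal here is to change the packaged repo definitions, but the effect can be approximated locally. The following is only a sketch under assumptions: it assumes the modular repo files have "modular" in their file names and already contain a `skip_if_unavailable=` line, and the helper name `relax_modular_repos` is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: set skip_if_unavailable=True in every *modular*.repo
# file under the given directory (default: /etc/yum.repos.d).
relax_modular_repos() {
    repo_dir="${1:-/etc/yum.repos.d}"
    for f in "$repo_dir"/*modular*.repo; do
        [ -e "$f" ] || continue   # the glob matched nothing
        # Rewrite existing skip_if_unavailable lines, whatever their value.
        sed 's/^skip_if_unavailable=.*/skip_if_unavailable=True/' "$f" > "$f.tmp" \
            && mv "$f.tmp" "$f"
    done
}

# Example (needs root):
#   relax_modular_repos /etc/yum.repos.d
```

Non-modular repo files are left untouched, so a transient failure of the main repos would still stop the transaction.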

Other possible options would be to just ship dummy empty repos in the appropriate places till F41, or just document the necessary repo disablements (but I don't really like that option).

Reproducible: Always

Steps to Reproduce:
1. Install Fedora 37 or 38 normally.
2. Try to upgrade to Rawhide according to https://docs.fedoraproject.org/en-US/quick-docs/upgrading/

Actual Results:
The upgrade process fails, complaining that the modular repos cannot be found.

Expected Results:  
The upgrade process should succeed.

Comment 1 Adam Williamson 2023-08-03 11:14:25 UTC
Proposing as a Beta blocker per https://fedoraproject.org/wiki/Fedora_39_Beta_Release_Criteria#Upgrade_requirements - "For each one of the release-blocking package sets, it must be possible to successfully complete a direct upgrade from a fully updated, clean default installation of each of the last two stable Fedora releases with that package set installed."

Comment 2 Miro Hrončok 2023-08-03 12:18:04 UTC
I won't be able to look at this today, as I am going offline now, but I acknowledge this is a problem that needs to be solved. Shipping an empty modular repository for Fedora 39 and 40 sounds like the safest option to me.

Comment 3 Miro Hrončok 2023-08-04 09:14:00 UTC
I'll explain the options in an email thread and ask folks for opinions.

Comment 5 Miro Hrončok 2023-08-04 10:21:25 UTC
Adam, do you have a reproducer?

I tried today, and the mirrored Fedora 39 modular repo seems to still exist at https://download.fedoraproject.org/pub/fedora/linux/development/rawhide/Modular/

Comment 6 Adam Williamson 2023-08-04 10:36:52 UTC
ooooh. so, the openQA tests actually tweak the dnf config to point to the compose tree, because we need to make sure we're testing the right thing. I didn't remember about that wrinkle.

It looks like when we sync the compose out, it's not a "destructive" sync - we don't wipe the target location before syncing. So the official location *does* still have a Modular tree, which is a hangover from the last compose that had that tree (20230802.n.0).
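
The effect of a non-destructive sync can be reproduced in a throwaway directory. This is only a simulation: the actual sync tooling is not named here, so plain `cp -R` stands in for it (rsync without `--delete` likewise leaves extra files in the destination), and the directory names are illustrative.

```shell
#!/bin/sh
# Simulate two composes being synced to a public location without
# wiping the target first. Directory names are illustrative only.
work=$(mktemp -d)
mkdir -p "$work/compose-0802/Everything" "$work/compose-0802/Modular" \
         "$work/compose-0803/Everything" "$work/public"

# Non-destructive sync: copy each compose on top of the public tree.
cp -R "$work/compose-0802/." "$work/public/"
cp -R "$work/compose-0803/." "$work/public/"

# The later compose has no Modular tree, but the public copy keeps the old one.
ls "$work/public"
```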

I'm not sure how we feel about that? Just having the last set of modular packages there...frozen in time...for ever...doesn't seem great. But, it does prevent the upgrade problem happening immediately. I'll update the bug title. Sorry, if I weren't terminally conference sleep-deprived I would probably have noticed that sooner!

Comment 7 Adam Williamson 2023-08-05 08:42:29 UTC
Another thing to think about here: I *think* (I'm not sure) that if normal branching policy is followed, the 'upgrades fail' bug will hit F39 after branching, because I think when we branch we just dump the first branched compose into the new official repo location. That compose won't have a Modular tree. Unless someone actually copies the stale one over from the rawhide repo, the bug will show up.

Comment 8 Kevin Fenzi 2023-08-06 18:51:56 UTC
Yes, we will need to do something after the first branched compose syncs out. 

I think it would be best to manually sync an empty repo to there and also to the current modular repos in rawhide. I don't think we should allow people to install modules that are unmaintained and could be very broken. Then, after say F41 branching we just stop making that manual repo?

Tomas: what do you think?

Comment 9 Adam Williamson 2023-08-09 21:01:46 UTC
Bumping this back to Beta blocker, as indeed, upgrades to F39 really are broken in 'the real world' - F37 and F38 installs have the modular repos defined, but there is no Modular tree at https://dl.fedoraproject.org/pub/fedora/linux/development/39/ and the various 'modular' repos don't exist in mirrormanager for F39, so we get 404s from mirrormanager.
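
A quick way to see the mirrormanager response for a given repo is to request its metalink and print only the HTTP status. This is a sketch under assumptions: the URL pattern and the repo ID `fedora-modular-39` are assumed from the usual form of Fedora repo definitions rather than taken from this bug, and both helper names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: build a metalink URL of the assumed form
#   https://mirrors.fedoraproject.org/metalink?repo=ID&arch=ARCH
metalink_url() {
    repo_id="$1"
    arch="${2:-x86_64}"
    printf 'https://mirrors.fedoraproject.org/metalink?repo=%s&arch=%s\n' \
        "$repo_id" "$arch"
}

# Hypothetical helper: print only the HTTP status code for a URL.
http_status() {
    curl -s -o /dev/null -w '%{http_code}' "$1"
}

# Example (needs network access):
#   http_status "$(metalink_url fedora-modular-39)"
```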

Comment 10 Kevin Fenzi 2023-08-14 20:57:49 UTC
We set up redirects in mirrormanager to redirect the modular repos for f39+ to the main repos. 

So, I think this should be fixed now, can you confirm?

Comment 11 Adam Williamson 2023-08-14 21:49:14 UTC
I'll remove the workarounds and test, thanks.

Redirecting to the main repos will lead to all the metadata being downloaded again and parsed again, though, won't it? That seems a bit suboptimal. Will it double memory use too?

Comment 12 Kevin Fenzi 2023-08-14 22:15:42 UTC
I was thinking that dnf would see they had the same repomd.xml and not do both, but on further pondering I am not sure why I thought that or if it's true. ;( 

If this proves to be a problem, we can try another solution. ;(
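
One thing that can be checked independently of dnf's behaviour is whether the redirected modular repo and the main repo really serve the same `repomd.xml`. The sketch below compares two locally saved copies; the helper name `same_repomd` and the file names are hypothetical. It says nothing about whether dnf avoids downloading or parsing the metadata twice.

```shell
#!/bin/sh
# Hypothetical helper: succeed if two locally saved repomd.xml files are
# byte-for-byte identical.
same_repomd() {
    cmp -s "$1" "$2"
}

# Example, after saving repodata/repomd.xml from each repo:
#   same_repomd modular-repomd.xml main-repomd.xml && echo identical
```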

Comment 13 Miro Hrončok 2023-08-21 15:36:46 UTC
I believe this is now fixed.

The "Fedora Modular 39" repository exists. The same goes for Fedora 40.

I've tried running the following commands from a Fedora 38 podman container:

# dnf --releasever=40 --setopt=module_platform_id=platform:f40 --assumeno distro-sync
# dnf --releasever=40 --setopt=module_platform_id=platform:f40 --assumeno distro-sync


Please verify that real upgrades work.

Comment 14 Miro Hrončok 2023-08-21 15:37:12 UTC
Obviously, I meant:

# dnf --releasever=39 --setopt=module_platform_id=platform:f39 --assumeno distro-sync
# dnf --releasever=40 --setopt=module_platform_id=platform:f40 --assumeno distro-sync
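
The same dry run can be wrapped in a small loop so that each target release is checked with matching `--releasever` and `module_platform_id` values. This is a sketch only: the helper name `dry_run_sync` and the `DNF` override variable are hypothetical, and the options are exactly the ones used in the commands above.

```shell
#!/bin/sh
# Hypothetical helper: run the distro-sync dry run once per target release.
# --assumeno answers "no" at the confirmation prompt, so nothing is changed.
# DNF can be overridden (for example DNF=echo) to preview the commands.
dry_run_sync() {
    for rel in "$@"; do
        ${DNF:-dnf} --releasever="$rel" \
            --setopt=module_platform_id="platform:f$rel" \
            --assumeno distro-sync
    done
}

# Example (run as root, for instance in a Fedora 38 container):
#   dry_run_sync 39 40
```

Deriving both values from one loop variable avoids the copy-and-paste slip where two commands end up targeting the same release.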

Comment 15 Adam Williamson 2023-08-23 19:46:05 UTC
+3 in https://pagure.io/fedora-qa/blocker-review/issue/1173 , marking accepted blocker (there's nothing to push here, just cleaning up the tickets).

Comment 16 Adam Williamson 2023-08-24 07:38:27 UTC
OK, I think I tested this on openQA stg and it looks alright. Not sure how much duplication of metadata there is, but I guess we can deal with that if anyone complains...