Bug 924162
Summary: | A software selection with dependency errors is allowed to proceed if the dependency check runs twice | ||||||||||||||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Ľuboš Kardoš <lkardos> | ||||||||||||||||||||||||||||||
Component: | yum | Assignee: | Packaging Maintenance Team <packaging-team-maint> | ||||||||||||||||||||||||||||||
Status: | CLOSED ERRATA | QA Contact: | Fedora Extras Quality Assurance <extras-qa> | ||||||||||||||||||||||||||||||
Severity: | unspecified | Docs Contact: | |||||||||||||||||||||||||||||||
Priority: | unspecified | ||||||||||||||||||||||||||||||||
Version: | 19 | CC: | admiller, awilliam, dshea, ffesti, firas.alkafri, g.kaviyarasu, james.antill, jonathan, jorton, jreznik, jzeleny, kparal, mfabian, mkolman, mkrizek, packaging-team-maint, robatino, rtguille, satellitgo, sbueno, srinivas_g_gowda, tim.lauridsen, vanmeeuwen+fedora, zpavlas | ||||||||||||||||||||||||||||||
Target Milestone: | --- | ||||||||||||||||||||||||||||||||
Target Release: | --- | ||||||||||||||||||||||||||||||||
Hardware: | Unspecified | ||||||||||||||||||||||||||||||||
OS: | Unspecified | ||||||||||||||||||||||||||||||||
Whiteboard: | AcceptedBlocker | ||||||||||||||||||||||||||||||||
Fixed In Version: | yum-3.4.3-99.fc19 | Doc Type: | Bug Fix | ||||||||||||||||||||||||||||||
Doc Text: | Story Points: | --- | |||||||||||||||||||||||||||||||
Clone Of: | Environment: | ||||||||||||||||||||||||||||||||
Last Closed: | 2013-06-26 04:28:49 UTC | Type: | Bug | ||||||||||||||||||||||||||||||
Regression: | --- | Mount Type: | --- | ||||||||||||||||||||||||||||||
Documentation: | --- | CRM: | |||||||||||||||||||||||||||||||
Verified Versions: | Category: | --- | |||||||||||||||||||||||||||||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||||||||||||||||||||||||||
Cloudforms Team: | --- | Target Upstream Version: | |||||||||||||||||||||||||||||||
Embargoed: | |||||||||||||||||||||||||||||||||
Bug Depends On: | |||||||||||||||||||||||||||||||||
Bug Blocks: | 834090, 920667 | ||||||||||||||||||||||||||||||||
Attachments: |
|
Description
Ľuboš Kardoš
2013-03-21 10:18:12 UTC
Created attachment 713729 [details]
anaconda.log
Created attachment 713730 [details]
program.log
Created attachment 713731 [details]
packaging.log
*** Bug 926549 has been marked as a duplicate of this bug. *** From packaging.log: 06:05:00,472 WARN packaging: gvfs-afc-1.15.4-1.fc19.i686 requires libimobiledevice.so.3 06:05:00,472 WARN packaging: libgpod-0.8.2-8.fc19.i686 requires libimobiledevice.so.3 06:05:00,472 WARN packaging: upower-0.9.19-2.fc19.i686 requires libimobiledevice.so.3 is this just those again? oh, I see, another bug on anaconda behaviour. Discussed at 2013-03-27 blocker review meeting: http://meetbot.fedoraproject.org/fedora-blocker-review/2013-03-27/f19alpha-blocker-review-3.2013-03-27-16.01.log.txt . Agreed that this specific behaviour - the ability to 'override' the dependency errors - is not a blocker, it doesn't hit any of the criteria and we didn't think there was any reason to add a criterion for this kind of problem. I've definitely seen this in the past, but I cannot reproduce it with the latest anaconda plus updates for other (seemingly unrelated) fixes. Can you give this another test with anaconda-19.16 and let me know if it's still a problem? Thanks. Yes, I can reproduce with anaconda-19.16 but because now there aren't dependency problems in GNOME Desktop environment, steps to reproduce are sligtly different: 1. Start anaconda proceed to the "SOFTWARE SELECTION" spoke. 2. Select "GNOME Desktop" environment. 3. Check "Robotics" from add-ons. 4. Click "Done" (Now you can see error message: "Error checking software selection") 5. Click on "SOFTWARE SELECTION". 6. In "SOFTWARE SELECTION" spoke make no changes and click "Done". Now error message dissappers but dependency problem is still here and you can see dependency error during installation. For whatever reason, the dependency solving is working the second time through. I wonder what thing we're not cleaning up properly between runs. *** Bug 673221 has been marked as a duplicate of this bug. *** Any ideas what we might be failing to do here? Sorry, not sure I can help.. There are actually THREE depsolve runs in packaging.log. I've split it with: $ perl -ne 'if (/checking dependencies/) { ++$n; open OUT, ">ds$n.log"; next; }; print OUT substr($_, 40); close OUT if /Depsolve time/' packaging.log $ for i in ds*.log; do grep 'as install for' $i|wc -l; done 882 129 883 1 and 3 failed, 2 is okay. Opening it with vimdiff, ds1 and ds3 are very similar, but ds2 is completely different. Also the depsolve time was ~1s instead of ~5s. I'd say that a lot of packages were not pulled in. I have seen this problem again with F19 TC2 x86_64 netinst. The repo was broken because of sssd. Created attachment 760668 [details]
hub in initial state
Created attachment 760669 [details]
software details after you click on the bottom bar
Created attachment 760671 [details]
hub seemingly OK (BUG!!) after you click Done in software selection
Created attachment 760672 [details]
dependency error after you disk was modified, only quit possible
Created attachment 760674 [details]
anaconda.log
Created attachment 760675 [details]
anaconda-yum.conf
Created attachment 760676 [details]
ifcfg.log
Created attachment 760677 [details]
packaging.log
Created attachment 760678 [details]
program.log
Created attachment 760679 [details]
storage.log
Created attachment 760681 [details]
syslog
Since I'm the usual troublemaker, I'm proposing this as FinalBlocker. It was rejected just for Alpha. The justification: Software repositories are sometimes broken. The main 'fedora' repository will be frozen in an OK state, but 'updates' repository can be broken any time. This can affect users during the whole F19 life cycle. If anaconda wants to allow installation even with broken repos, that's fine. But it should be an informed decision of the user. Currently the user just opens the software spoke, hits Done, and everything _seems_ to be OK. He was not warned about any possible consequences. He naturally proceeds with installation. His disk layout is destroyed and re-created and only then a packaging error is displayed, ending the installation and leaving the system non-functional. Because bootloader installation did not proceed, but disk changes were performed, this can leave some of his other OSes unbootable, even though their contents might be fine. All of this could have been prevented if the error stays displayed, instead of getting hidden. So this is kind of "don't destroy user data" criterion. If the partitions weren't modified before installation failure, everything would be fine. Discussed at 2013-06-17 blocker review meeting: http://meetbot.fedoraproject.org/fedora-blocker-review/2013-06-17/f19final-blocker-review-6.2013-06-17-16.01.log.txt . We accepted this as a blocker with a moderate stretch of the criterion "When doing a graphical install using the dedicated installer images, the installer must be able to install each of the release blocking desktops, as well as the minimal package set." in the case of using the 'updates' repo (which is default for a net install) at a time when it has a depsolving problem. This is a bit of a stretch, but we really feel there at least ought to be a proper warning in this case. Note that the maintainer ping-pong is in an uncertain state here. Chris, you re-assigned it to packaging-team-maint on 06-13 without further comment, but in comment #13, they seemed to disclaim responsibility: "Sorry, not sure I can help.. There are actually THREE depsolve runs in packaging.log...1 and 3 failed, 2 is okay. Opening it with vimdiff, ds1 and ds3 are very similar, but ds2 is completely different. Also the depsolve time was ~1s instead of ~5s. I'd say that a lot of packages were not pulled in." So they seem to be saying anaconda is somehow doing an invalid depsolv here - testing only a small package set, for some reason, which passes - and it's not the packaging tools' fault. You two need to straighten out whose responsibility this is. If you look at comment #12, you'll see I was asking for information on what we might need to be doing in anaconda to clean things up for a second successful run. I already know there are multiple runs, and I already know the time in one was very short and that a lot of packages were not pulled in, and I also already know there's invalid depsolving. What I need to know is: Do I need to be calling something from yum to clean up between depsolving runs? What I do not need is a lecture on responsibility. Sorry, I'm not trying to lecture, just trying to make sure that someone acknowledges this is currently their ball, rather than it just sitting there with nothing happening. > Do I need to be calling something from yum to clean up between depsolving runs?
IMO, "del yb.tsInfo" should be enough, but I'm not sure at all..
The big hammer here (what "ts reset", in shell, does ... among other things) is: yb.closeRpmDB() ...you might be able to get away without the last two (yb._up = None and yb.comps = None), but I wouldn't want to recommend less than that. Back to you Chris. If there is any other information you need, let us know. clumens mentioned on IRC today that: <adamw> clumens: did you get everything you needed from yum folks for 924162 ? (we're doing blocker review) <clumens> adamw: not really, we're already calling yumWhateverWhateverDB so it looks like some more co-ordinated debugging may be required here. A quick lazy grep gives me: [adamw@adam anaconda (f19-branch %)]$ grep -10 -R closeRpmDB * pyanaconda/packaging/yumpayload.py- self._yum._pkgSack = None pyanaconda/packaging/yumpayload.py- pyanaconda/packaging/yumpayload.py- self._yum.repos.pkgSack = MetaSack() pyanaconda/packaging/yumpayload.py- pyanaconda/packaging/yumpayload.py- for repo in self._yum.repos.repos.values(): pyanaconda/packaging/yumpayload.py- repo._sack = None pyanaconda/packaging/yumpayload.py- pyanaconda/packaging/yumpayload.py- def deleteYumTS(self): pyanaconda/packaging/yumpayload.py- with _yum_lock: pyanaconda/packaging/yumpayload.py- log.debug("deleting yum transaction info") pyanaconda/packaging/yumpayload.py: self._yum.closeRpmDB() pyanaconda/packaging/yumpayload.py- del self._yum.tsInfo pyanaconda/packaging/yumpayload.py- del self._yum.ts pyanaconda/packaging/yumpayload.py- pyanaconda/packaging/yumpayload.py- def preStorage(self): pyanaconda/packaging/yumpayload.py- self.release() pyanaconda/packaging/yumpayload.py- with _yum_lock: pyanaconda/packaging/yumpayload.py- self._yum.close() pyanaconda/packaging/yumpayload.py- pyanaconda/packaging/yumpayload.py- ### pyanaconda/packaging/yumpayload.py- ### METHODS FOR WORKING WITH REPOSITORIES https://git.fedorahosted.org/cgit/anaconda.git/tree/pyanaconda/packaging/yumpayload.py?h=f19-branch Looking at the first packaging.log, if you dig around a bit, you'll see the following line: 06:04:29,742 CRIT yum.YumBase: Warning: Group core does not have any packages to install. There's a lot of other lines like that. Now this isn't exactly the same as the problem reported in this bug, but it is suspiciously similar - for whatever reason, some yum structures are simply missing the data required to operate. This fits with my theory that whatever cleanup/resetting anaconda is doing, we're either doing too much or not enough. Note that we are definitely already calling closeRpmDB. If I look in yum a bit more and do some debugging, it looks like all groups and packages are getting skipped in selectGroup because all packages are listed as "blacklisted-available". And that gets set by _groupInstalledData, because the following is returning an empty list: ipkgs = self.rpmdb.searchNames([pkg_name]) So, what thing do I either need to do or not need to do for the second run to make sure the rpmdb is populated with enough stuff to return a non-empty list? Couldn't we at least go with a warning dialog solution as we're getting to the point we need the fix if we want to release on time? It's pretty late for a new string although. And setting back needinfo, sorry, forgot to unchech the box and even as BZ is slow as ... I was slow to cancel submission. jreznik: from my reading of the issue it is not subject to a 'simple' fix like that. The problem is ultimately that the package set consistency check is being run multiple times, some of which runs use an entirely incorrect set of packages and pass when they should fail. We don't actually *know* when the package set is inconsistent, so when should we pop up a warning? Chris may be able to correct me, but I think that's the case. Looks like James has a patch for this built in koji already. Building for F19 now: http://koji.fedoraproject.org/koji/taskinfo?taskID=5536337 Good news: "Fix igroups not being reset in "ts reset" and anaconda. BZ 924162." Bad news: "Update to latest HEAD." In fact, as the current stable yum build is -95, as things stand we have to take all this churn just to fix this single blocker bug: * Mon Jun 24 2013 James Antill <james at fedoraproject.org> - 3.4.3-99 - Update to latest HEAD. - Fix igroups not being reset in "ts reset" and anaconda. BZ 924162. - Check for bad checksum types at runtime. BZ 902357. - fix --cacheonly edge case. BZ 975619. * Wed Jun 19 2013 James Antill <james at fedoraproject.org> - 3.4.3-98 - Update to latest HEAD. - Add simple way to specify a group of pkgs. for version. BZ 913461. - Change group install => group upgrade for installed groups. BZ 833087. - Give more text when telling user about y-c-t, mention history. BZ 974576. - Fix the tolerant yum.conf text. - Skip installonly limit, if something has set check_future_rpmdbv. BZ 962757. * Mon Jun 17 2013 James Antill <james at fedoraproject.org> - 3.4.3-97 - Update to latest HEAD. - Minor updates to fssnapshot command. - metalink: Don't assume maxconnections is always set. BZ 974826. - DeltaPackage: provide returnIdSum(). BZ 974394. * Fri Jun 14 2013 James Antill <james at fedoraproject.org> - 3.4.3-96 - Update to latest HEAD. - Add real logging to updateinfo parsing. - Merge fssnap command. - Extend repos.findRepos() API so it can handle repolist input. - Add a prelistenabledrepos plugin point. - Auto. enable disabled repos. in repo-pkgs. - Do cacheRequirement() tests before doCheck(). I could really wish packages, especially critpath packages, would be more conservative with changes during Final freeze. I thought -97 was already in F19, but apparently forgot to make update after building. Even then the "big" change is in -96 adding "yum fssnap" and that does nothing if the user doesn't run that command/config. So this looks worse than it is (pretty much all the other stuff are simple bug fixes), but I can revert the merge and add the one line patch if required. if you're confident it shouldn't break anything, and clumens doesn't see any red flags, i think we can give it a shot. let's just hope it's okay. yum-3.4.3-99.fc19 has been submitted as an update for Fedora 19. https://admin.fedoraproject.org/updates/yum-3.4.3-99.fc19 Seems to be fixed in F19 RC1. With Martin we have removed some packages from DVD repo and adjusted the metadata. We verified that the problem was present in TC6, but was fixed in RC1. Setting as verified. yum-3.4.3-99.fc19 has been pushed to the Fedora 19 stable repository. If problems still persist, please make note of it in this bug report. |