Bug 924162

Summary: A software selection with dependency errors is allowed to proceed if the dependency check runs twice
Product: [Fedora] Fedora Reporter: Ľuboš Kardoš <lkardos>
Component: yumAssignee: Packaging Maintenance Team <packaging-team-maint>
Status: CLOSED ERRATA QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 19CC: admiller, awilliam, dshea, ffesti, firas.alkafri, g.kaviyarasu, james.antill, jonathan, jorton, jreznik, jzeleny, kparal, mfabian, mkolman, mkrizek, packaging-team-maint, robatino, rtguille, satellitgo, sbueno, srinivas_g_gowda, tim.lauridsen, vanmeeuwen+fedora, zpavlas
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard: AcceptedBlocker
Fixed In Version: yum-3.4.3-99.fc19 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2013-06-26 04:28:49 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 834090, 920667    
Attachments:
Description Flags
anaconda.log
none
program.log
none
packaging.log
none
hub in initial state
none
software details after you click on the bottom bar
none
hub seemingly OK (BUG!!) after you click Done in software selection
none
dependency error after you disk was modified, only quit possible
none
anaconda.log
none
anaconda-yum.conf
none
ifcfg.log
none
packaging.log
none
program.log
none
storage.log
none
syslog none

Description Ľuboš Kardoš 2013-03-21 10:18:12 UTC
Description of problem:
After starting installation and proceeding to hub, you can see error message: "Error checking software selection". After entering "SOFTWARE SELECTION" spoke and making no changes and clicking "Done", error message disappear.

Version-Release number of selected component (if applicable):
Fedora-19-Alpha-TC1
anaconda-19.12

How reproducible:
always

Steps to Reproduce:
1. Start anaconda proceed to main hub. (Now you can see error message: "Error checking software selection").
2. Click on "SOFTWARE SELECTION".
3. In "SOFTWARE SELECTION" spoke make no changes and click "Done".
  
Actual results:
Error message dissapper but dependency problem is still here and you can see dependency error during installation.

Expected results:
Error message shouldn't disapper and you shouldn't be allowed to begin installation.


Additional info:

Comment 1 Ľuboš Kardoš 2013-03-21 10:21:33 UTC
Created attachment 713729 [details]
anaconda.log

Comment 2 Ľuboš Kardoš 2013-03-21 10:21:56 UTC
Created attachment 713730 [details]
program.log

Comment 3 Ľuboš Kardoš 2013-03-21 10:24:29 UTC
Created attachment 713731 [details]
packaging.log

Comment 4 Chris Lumens 2013-03-25 14:06:59 UTC
*** Bug 926549 has been marked as a duplicate of this bug. ***

Comment 5 Adam Williamson 2013-03-25 15:12:52 UTC
From packaging.log:

06:05:00,472 WARN packaging: gvfs-afc-1.15.4-1.fc19.i686 requires libimobiledevice.so.3
06:05:00,472 WARN packaging: libgpod-0.8.2-8.fc19.i686 requires libimobiledevice.so.3
06:05:00,472 WARN packaging: upower-0.9.19-2.fc19.i686 requires libimobiledevice.so.3

is this just those again?

Comment 6 Adam Williamson 2013-03-25 15:13:16 UTC
oh, I see, another bug on anaconda behaviour.

Comment 7 Adam Williamson 2013-03-27 16:29:23 UTC
Discussed at 2013-03-27 blocker review meeting: http://meetbot.fedoraproject.org/fedora-blocker-review/2013-03-27/f19alpha-blocker-review-3.2013-03-27-16.01.log.txt . Agreed that this specific behaviour - the ability to 'override' the dependency errors - is not a blocker, it doesn't hit any of the criteria and we didn't think there was any reason to add a criterion for this kind of problem.

Comment 8 Chris Lumens 2013-04-04 16:59:51 UTC
I've definitely seen this in the past, but I cannot reproduce it with the latest anaconda plus updates for other (seemingly unrelated) fixes.  Can you give this another test with anaconda-19.16 and let me know if it's still a problem?  Thanks.

Comment 9 Ľuboš Kardoš 2013-04-05 09:13:13 UTC
Yes, I can reproduce with anaconda-19.16 but because now there aren't dependency problems in GNOME Desktop environment, steps to reproduce are sligtly different:

1. Start anaconda proceed to the "SOFTWARE SELECTION" spoke.
2. Select "GNOME Desktop" environment.
3. Check "Robotics" from add-ons.
4. Click "Done" (Now you can see error message: "Error checking software selection")
5. Click on "SOFTWARE SELECTION".
6. In "SOFTWARE SELECTION" spoke make no changes and click "Done".

Now error message dissappers but dependency problem is still here and you can see dependency error during installation.

Comment 10 Chris Lumens 2013-04-05 14:59:25 UTC
For whatever reason, the dependency solving is working the second time through.  I wonder what thing we're not cleaning up properly between runs.

Comment 11 Chris Lumens 2013-04-25 16:01:24 UTC
*** Bug 673221 has been marked as a duplicate of this bug. ***

Comment 12 Chris Lumens 2013-05-28 20:46:33 UTC
Any ideas what we might be failing to do here?

Comment 13 Zdeněk Pavlas 2013-05-29 08:26:00 UTC
Sorry, not sure I can help.. There are actually THREE depsolve runs in packaging.log.  I've split it with:

$ perl -ne 'if (/checking dependencies/) { ++$n; open OUT, ">ds$n.log"; next; }; print OUT substr($_, 40); close OUT if /Depsolve time/' packaging.log
$ for i in ds*.log; do grep 'as install for' $i|wc -l; done
882
129
883

1 and 3 failed, 2 is okay.  Opening it with vimdiff, ds1 and ds3 are very similar, but ds2 is completely different.  Also the depsolve time was ~1s instead of ~5s.  I'd say that a lot of packages were not pulled in.

Comment 14 Kamil Páral 2013-06-13 12:58:02 UTC
I have seen this problem again with F19 TC2 x86_64 netinst. The repo was broken because of sssd.

Comment 15 Kamil Páral 2013-06-13 12:59:41 UTC
Created attachment 760668 [details]
hub in initial state

Comment 16 Kamil Páral 2013-06-13 13:00:01 UTC
Created attachment 760669 [details]
software details after you click on the bottom bar

Comment 17 Kamil Páral 2013-06-13 13:00:29 UTC
Created attachment 760671 [details]
hub seemingly OK (BUG!!) after you click Done in software selection

Comment 18 Kamil Páral 2013-06-13 13:01:09 UTC
Created attachment 760672 [details]
dependency error after you disk was modified, only quit possible

Comment 19 Kamil Páral 2013-06-13 13:01:43 UTC
Created attachment 760674 [details]
anaconda.log

Comment 20 Kamil Páral 2013-06-13 13:01:55 UTC
Created attachment 760675 [details]
anaconda-yum.conf

Comment 21 Kamil Páral 2013-06-13 13:02:06 UTC
Created attachment 760676 [details]
ifcfg.log

Comment 22 Kamil Páral 2013-06-13 13:02:21 UTC
Created attachment 760677 [details]
packaging.log

Comment 23 Kamil Páral 2013-06-13 13:02:41 UTC
Created attachment 760678 [details]
program.log

Comment 24 Kamil Páral 2013-06-13 13:02:52 UTC
Created attachment 760679 [details]
storage.log

Comment 25 Kamil Páral 2013-06-13 13:03:05 UTC
Created attachment 760681 [details]
syslog

Comment 26 Kamil Páral 2013-06-13 13:32:36 UTC
Since I'm the usual troublemaker, I'm proposing this as FinalBlocker. It was rejected just for Alpha.

The justification:
Software repositories are sometimes broken. The main 'fedora' repository will be frozen in an OK state, but 'updates' repository can be broken any time. This can affect users during the whole F19 life cycle.

If anaconda wants to allow installation even with broken repos, that's fine. But it should be an informed decision of the user. Currently the user just opens the software spoke, hits Done, and everything _seems_ to be OK. He was not warned about any possible consequences. He naturally proceeds with installation. His disk layout is destroyed and re-created and only then a packaging error is displayed, ending the installation and leaving the system non-functional. Because bootloader installation did not proceed, but disk changes were performed, this can leave some of his other OSes unbootable, even though their contents might be fine.

All of this could have been prevented if the error stays displayed, instead of getting hidden. So this is kind of "don't destroy user data" criterion. If the partitions weren't modified before installation failure, everything would be fine.

Comment 27 Adam Williamson 2013-06-17 18:39:56 UTC
Discussed at 2013-06-17 blocker review meeting: http://meetbot.fedoraproject.org/fedora-blocker-review/2013-06-17/f19final-blocker-review-6.2013-06-17-16.01.log.txt . We accepted this as a blocker with a moderate stretch of the criterion "When doing a graphical install using the dedicated installer images, the installer must be able to install each of the release blocking desktops, as well as the minimal package set." in the case of using the 'updates' repo (which is default for a net install) at a time when it has a depsolving problem. This is a bit of a stretch, but we really feel there at least ought to be a proper warning in this case.

Note that the maintainer ping-pong is in an uncertain state here. Chris, you re-assigned it to packaging-team-maint on 06-13 without further comment, but in comment #13, they seemed to disclaim responsibility:

"Sorry, not sure I can help.. There are actually THREE depsolve runs in packaging.log...1 and 3 failed, 2 is okay.  Opening it with vimdiff, ds1 and ds3 are very similar, but ds2 is completely different.  Also the depsolve time was ~1s instead of ~5s.  I'd say that a lot of packages were not pulled in."

So they seem to be saying anaconda is somehow doing an invalid depsolv here - testing only a small package set, for some reason, which passes - and it's not the packaging tools' fault. You two need to straighten out whose responsibility this is.

Comment 28 Chris Lumens 2013-06-17 18:47:13 UTC
If you look at comment #12, you'll see I was asking for information on what we might need to be doing in anaconda to clean things up for a second successful run.  I already know there are multiple runs, and I already know the time in one was very short and that a lot of packages were not pulled in, and I also already know there's invalid depsolving.

What I need to know is:  Do I need to be calling something from yum to clean up between depsolving runs?  What I do not need is a lecture on responsibility.

Comment 29 Adam Williamson 2013-06-17 19:15:29 UTC
Sorry, I'm not trying to lecture, just trying to make sure that someone acknowledges this is currently their ball, rather than it just sitting there with nothing happening.

Comment 30 Zdeněk Pavlas 2013-06-18 07:19:14 UTC
> Do I need to be calling something from yum to clean up between depsolving runs?

IMO, "del yb.tsInfo" should be enough, but I'm not sure at all..

Comment 31 James Antill 2013-06-18 20:21:07 UTC
 The big hammer here (what "ts reset", in shell, does ... among other things) is:

yb.closeRpmDB()

...you might be able to get away without the last two (yb._up = None and yb.comps = None), but I wouldn't want to recommend less than that.

Comment 32 Jan Zeleny 2013-06-19 06:50:32 UTC
Back to you Chris. If there is any other information you need, let us know.

Comment 33 Adam Williamson 2013-06-21 02:35:26 UTC
clumens mentioned on IRC today that:

<adamw> clumens: did you get everything you needed from yum folks for 924162 ? (we're doing blocker review)
<clumens> adamw: not really, we're already calling yumWhateverWhateverDB

so it looks like some more co-ordinated debugging may be required here.

A quick lazy grep gives me:

[adamw@adam anaconda (f19-branch %)]$ grep -10 -R closeRpmDB *
pyanaconda/packaging/yumpayload.py-                self._yum._pkgSack = None
pyanaconda/packaging/yumpayload.py-
pyanaconda/packaging/yumpayload.py-            self._yum.repos.pkgSack = MetaSack()
pyanaconda/packaging/yumpayload.py-
pyanaconda/packaging/yumpayload.py-            for repo in self._yum.repos.repos.values():
pyanaconda/packaging/yumpayload.py-                repo._sack = None
pyanaconda/packaging/yumpayload.py-
pyanaconda/packaging/yumpayload.py-    def deleteYumTS(self):
pyanaconda/packaging/yumpayload.py-        with _yum_lock:
pyanaconda/packaging/yumpayload.py-            log.debug("deleting yum transaction info")
pyanaconda/packaging/yumpayload.py:            self._yum.closeRpmDB()
pyanaconda/packaging/yumpayload.py-            del self._yum.tsInfo
pyanaconda/packaging/yumpayload.py-            del self._yum.ts
pyanaconda/packaging/yumpayload.py-
pyanaconda/packaging/yumpayload.py-    def preStorage(self):
pyanaconda/packaging/yumpayload.py-        self.release()
pyanaconda/packaging/yumpayload.py-        with _yum_lock:
pyanaconda/packaging/yumpayload.py-            self._yum.close()
pyanaconda/packaging/yumpayload.py-
pyanaconda/packaging/yumpayload.py-    ###
pyanaconda/packaging/yumpayload.py-    ### METHODS FOR WORKING WITH REPOSITORIES

https://git.fedorahosted.org/cgit/anaconda.git/tree/pyanaconda/packaging/yumpayload.py?h=f19-branch

Comment 34 Chris Lumens 2013-06-21 16:15:42 UTC
Looking at the first packaging.log, if you dig around a bit, you'll see the following line:

06:04:29,742 CRIT yum.YumBase: Warning: Group core does not have any packages to install.

There's a lot of other lines like that.  Now this isn't exactly the same as the problem reported in this bug, but it is suspiciously similar - for whatever reason, some yum structures are simply missing the data required to operate.  This fits with my theory that whatever cleanup/resetting anaconda is doing, we're either doing too much or not enough.

Note that we are definitely already calling closeRpmDB.

If I look in yum a bit more and do some debugging, it looks like all groups and packages are getting skipped in selectGroup because all packages are listed as "blacklisted-available".  And that gets set by _groupInstalledData, because the following is returning an empty list:

            ipkgs = self.rpmdb.searchNames([pkg_name])

So, what thing do I either need to do or not need to do for the second run to make sure the rpmdb is populated with enough stuff to return a non-empty list?

Comment 35 Jaroslav Reznik 2013-06-24 11:33:15 UTC
Couldn't we at least go with a warning dialog solution as we're getting to the point we need the fix if we want to release on time? It's pretty late for a new string although.

Comment 36 Jaroslav Reznik 2013-06-24 11:36:57 UTC
And setting back needinfo, sorry, forgot to unchech the box and even as BZ is slow as ... I was slow to cancel submission.

Comment 37 Adam Williamson 2013-06-24 15:36:04 UTC
jreznik: from my reading of the issue it is not subject to a 'simple' fix like that. The problem is ultimately that the package set consistency check is being run multiple times, some of which runs use an entirely incorrect set of packages and pass when they should fail. We don't actually *know* when the package set is inconsistent, so when should we pop up a warning?

Chris may be able to correct me, but I think that's the case.

Comment 38 Chris Lumens 2013-06-24 17:25:20 UTC
Looks like James has a patch for this built in koji already.

Comment 39 James Antill 2013-06-24 17:29:11 UTC
Building for F19 now:

http://koji.fedoraproject.org/koji/taskinfo?taskID=5536337

Comment 40 Adam Williamson 2013-06-24 17:41:27 UTC
Good news:

"Fix igroups not being reset in "ts reset" and anaconda. BZ 924162."

Bad news:

"Update to latest HEAD."

In fact, as the current stable yum build is -95, as things stand we have to take all this churn just to fix this single blocker bug:

* Mon Jun 24 2013 James Antill <james at fedoraproject.org> - 3.4.3-99 
- Update to latest HEAD. 
- Fix igroups not being reset in "ts reset" and anaconda. BZ 924162. 
- Check for bad checksum types at runtime. BZ 902357. 
- fix --cacheonly edge case. BZ 975619. 
* Wed Jun 19 2013 James Antill <james at fedoraproject.org> - 3.4.3-98 
- Update to latest HEAD. 
- Add simple way to specify a group of pkgs. for version. BZ 913461. 
- Change group install => group upgrade for installed groups. BZ 833087. 
- Give more text when telling user about y-c-t, mention history. BZ 974576. 
- Fix the tolerant yum.conf text. 
- Skip installonly limit, if something has set check_future_rpmdbv. BZ 962757.
* Mon Jun 17 2013 James Antill <james at fedoraproject.org> - 3.4.3-97 
- Update to latest HEAD. 
- Minor updates to fssnapshot command. 
- metalink: Don't assume maxconnections is always set. BZ 974826. 
- DeltaPackage: provide returnIdSum(). BZ 974394. 
* Fri Jun 14 2013 James Antill <james at fedoraproject.org> - 3.4.3-96 
- Update to latest HEAD. 
- Add real logging to updateinfo parsing. 
- Merge fssnap command. 
- Extend repos.findRepos() API so it can handle repolist input. 
- Add a prelistenabledrepos plugin point. 
- Auto. enable disabled repos. in repo-pkgs. 
- Do cacheRequirement() tests before doCheck().

I could really wish packages, especially critpath packages, would be more conservative with changes during Final freeze.

Comment 41 James Antill 2013-06-24 17:57:58 UTC
I thought -97 was already in F19, but apparently forgot to make update after building. Even then the "big" change is in -96 adding "yum fssnap" and that does nothing if the user doesn't run that command/config.

So this looks worse than it is (pretty much all the other stuff are simple bug fixes), but I can revert the merge and add the one line patch if required.

Comment 42 Adam Williamson 2013-06-24 18:01:28 UTC
if you're confident it shouldn't break anything, and clumens doesn't see any red flags, i think we can give it a shot. let's just hope it's okay.

Comment 43 Fedora Update System 2013-06-24 20:27:24 UTC
yum-3.4.3-99.fc19 has been submitted as an update for Fedora 19.
https://admin.fedoraproject.org/updates/yum-3.4.3-99.fc19

Comment 44 Martin Krizek 2013-06-25 14:24:31 UTC
Seems to be fixed in F19 RC1.

Comment 45 Kamil Páral 2013-06-25 14:30:47 UTC
With Martin we have removed some packages from DVD repo and adjusted the metadata. We verified that the problem was present in TC6, but was fixed in RC1. Setting as verified.

Comment 46 Fedora Update System 2013-06-26 04:28:49 UTC
yum-3.4.3-99.fc19 has been pushed to the Fedora 19 stable repository.  If problems still persist, please make note of it in this bug report.