Bug 209306

Summary: Severe RPM bug, causing file removal, FC-6 showstopper?
Product: [Fedora] Fedora
Reporter: Hans de Goede <hdegoede>
Component: rpm
Assignee: Paul Nasrat <nobody+pnasrat>
Status: CLOSED RAWHIDE
Severity: high
Priority: medium
Version: rawhide
CC: aleksey, axel.thimm, christoph.wickert, curtis, djuran, goeran, herrold, hugh, kevin, k.georgiou, mishu, nmiell, notting, philipp, pmatilai, tmraz, zing
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2007-06-26 07:52:20 UTC
Bug Blocks: 235752    
Attachments:
Trivial patch reverting #187308 fix (flags: none)

Description Hans de Goede 2006-10-04 14:47:51 UTC
Description of problem:
A couple of days ago someone posted that removing the i386 versions of packages
for which he had both the x86_64 and i386 versions installed caused doc files
and po files to be removed, even though the x86_64 version was still installed.
I expected to find something about this in BZ, but found nothing.

I just saw rpm remove files owned by multiple package versions when only one
version gets removed :( This is really bad IMHO.


How reproducible:
Here is what happened:
1) I did yum -y update
2) It took long and I needed my computer for something else, so I did
   CTRL-C while it was updating
3) Thus it had 2 versions of fedora-release-notes installed, as it never
   got to the cleanup part
4) I fixed this with "rpm -e fedora-release-notes-5.92-5" (the old ver)
5) When I started firefox it said it couldn't find:
   /usr/share/doc/HTML/index.html

 
Actual results:
/usr/share/doc/HTML/index.html was removed

Expected results:
/usr/share/doc/HTML/index.html should still be on the system
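
The expected behaviour can be stated as a rule: erasing a package should delete a file only if no other installed package still lists it. Below is a minimal Python sketch of that rule. It is a toy model, not rpm's implementation; the `installed` mapping, the function name `files_to_delete` and the `old-only.html` path are made up for illustration, and the newer package's name is a placeholder because the report does not give its version.

```python
def files_to_delete(installed, erased):
    """Return the files that may be deleted when `erased` is removed.

    installed: dict mapping package name -> set of file paths it owns.
    A file is deletable only if no *other* installed package owns it.
    """
    still_owned = set()
    for name, files in installed.items():
        if name != erased:
            still_owned |= files
    return installed[erased] - still_owned


if __name__ == "__main__":
    installed = {
        # Old version named in the report.
        "fedora-release-notes-5.92-5": {
            "/usr/share/doc/HTML/index.html",
            "/usr/share/doc/HTML/old-only.html",  # hypothetical file
        },
        # Placeholder name: the report does not state the new version.
        "fedora-release-notes-NEW": {
            "/usr/share/doc/HTML/index.html",
        },
    }
    # index.html is still owned by the newer package, so it must survive.
    print(sorted(files_to_delete(installed, "fedora-release-notes-5.92-5")))
```

Under this rule only the file unique to the old version would be deleted; index.html would stay.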



Additional info:

Comment 1 Hans de Goede 2006-10-04 16:09:44 UTC
As you're probably aware, this is also being discussed on f-d-l. I'm copying
and pasting some useful comments here for tracking:

Rex Dieter wrote:
> Ralf Ertzinger wrote:
>  
>> On Wed, 4 Oct 2006 11:07:23 -0400, Jesse Keating wrote:
>>
>>> I do believe this should have been a --justdb flag.
>> Still, if there is still a package owning these files installed
>> after the removal the files ought to stay.
> 

Exactly, this used to work fine with previous versions (FC-5) of rpm. Actually,
you do not want to use --justdb, because if files were removed / renamed from
one version-release to the other you would end up with stale unowned files.
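
To make the stale-file argument concrete, here is a small Python sketch. It assumes, as described above, that a `--justdb` erase drops only the database entry and leaves every file on disk; the function name and the file lists are hypothetical.

```python
def stale_after_justdb(old_files, new_files):
    """Files left on disk without an owner after a database-only erase.

    old_files: paths owned by the old version (its db entry is dropped).
    new_files: paths owned by the version that stays installed.
    Anything only the old version owned stays on disk but is now unowned.
    """
    return set(old_files) - set(new_files)


if __name__ == "__main__":
    # Hypothetical file lists: NEWS.old was renamed to NEWS in the new version.
    old = {"/usr/share/doc/pkg/README", "/usr/share/doc/pkg/NEWS.old"}
    new = {"/usr/share/doc/pkg/README", "/usr/share/doc/pkg/NEWS"}
    print(sorted(stale_after_justdb(old, new)))
```

A normal erase that honours shared ownership would delete exactly this set and nothing else.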

I've been using rawhide for a long time and as such often have yum breakage
causing this kind of dual-version install, and I have been happily using
rpm -e old-version without --justdb to fix this for a long time.

Also as reported earlier the same thing happens when removing one arch of a
multi-arch package.



Comment 2 Paul Nasrat 2006-10-04 17:31:05 UTC
The patch for removing skipDirs was deliberately removed as it caused huge
performance degradation.  See bug #187308.  Revisiting the fingerprinting code
is not something that will happen for FC6.

Comment 3 Hans de Goede 2006-10-04 17:38:49 UTC
I must say that doesn't seem a good trade-off. If I've read bug #187308
correctly, there wasn't a real problem, as the guy's DIMMs were bad. So the
patch got dropped because if someone decides to install many kernels, rpm starts
using a lot of memory.

So we can choose between:
1) Using lots of memory if someone has many kernels installed
   (unlikely with our current yum setup)
2) Remove semi random files when someone decides to remove an i386 package
   of which the x86_64 version is also installed.

Hmmm, yeah really hard choice


Comment 4 Jeff Johnson 2006-10-04 17:49:29 UTC
Hint: You could pare the skipDir list back to "/lib/modules" and nothing else.

FWIW, this issue is fixed (by reverting the band-aid needed because yum was
installing kernel of the day on every bleeping user machine) in rpm-4.4.7 
(since 4/1/2006).

Comment 5 Tomas Mraz 2006-10-04 18:55:02 UTC
I agree with Hans. This has bitten me a few times.

Comment 6 Jeff Johnson 2006-10-04 19:22:30 UTC
The set of "semi-random" files is exactly deterministic: it is identical to the
files on the paths mentioned in the skipDirs list:
    _skip("/usr/share/zoneinfo"),
    _skip("/usr/share/locale"),
    _skip("/usr/share/i18n"),
    _skip("/usr/share/doc"),
    _skip("/usr/lib/locale"),
    _skip("/usr/src"),
    _skip("/lib/modules"),
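
A rough Python illustration of which paths the list covers. The directory names are copied from the list above; the matching rule (a path is covered when it is one of the directories or lies below one) and the function name `in_skip_dirs` are assumptions made for this sketch, not taken from rpm's source. `/usr/bin/firefox` is just an example path outside the list.

```python
SKIP_DIRS = (
    "/usr/share/zoneinfo",
    "/usr/share/locale",
    "/usr/share/i18n",
    "/usr/share/doc",
    "/usr/lib/locale",
    "/usr/src",
    "/lib/modules",
)


def in_skip_dirs(path, skip_dirs=SKIP_DIRS):
    """Return True if `path` is one of `skip_dirs` or lies below one of them.

    Assumption: matching is done on whole path components, so
    /usr/share/documents is NOT treated as being under /usr/share/doc.
    """
    return any(path == d or path.startswith(d + "/") for d in skip_dirs)


if __name__ == "__main__":
    for p in ("/usr/share/doc/HTML/index.html", "/usr/bin/firefox"):
        print(p, in_skip_dirs(p))
```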

And it's not exactly like this is a new behavior. In fact, it is *exactly*
the behavior ordered by the RH devel team when I did the implementation.

All issues were pointed out *by me* at that time. "I'm shocked, simply shocked!" exclamations
and discussion days before FC6 final are ingenuous.


Comment 7 Hans de Goede 2006-10-04 20:07:35 UTC
I've done some more testing and I'm starting to understand now. Can we please
take /usr/share/doc and /usr/share/locale out of this list? With the current
setup I get missing docs and, worse, missing translations when removing the
i386 versions of double-installed packages.
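
The effect of trimming the list can be sketched with a toy model in Python. It assumes that rpm does not look for other owners of files under a skipped directory, so such files are treated as unshared when a package is erased; this is a simplification of the behaviour described in this report, not rpm's code. The package names `foo.i386` and `foo.x86_64`, their file paths and all function names are hypothetical.

```python
FULL_SKIP = (
    "/usr/share/zoneinfo",
    "/usr/share/locale",
    "/usr/share/i18n",
    "/usr/share/doc",
    "/usr/lib/locale",
    "/usr/src",
    "/lib/modules",
)
# The proposal: drop the doc and locale directories from the list.
TRIMMED_SKIP = tuple(
    d for d in FULL_SKIP if d not in ("/usr/share/doc", "/usr/share/locale")
)


def skipped(path, skip_dirs):
    """True if `path` is one of `skip_dirs` or lies below one of them."""
    return any(path == d or path.startswith(d + "/") for d in skip_dirs)


def erase(installed, pkg, skip_dirs):
    """Return the set of files deleted when `pkg` is erased.

    Assumption: for paths under `skip_dirs` no other owner is looked up,
    so they are deleted even if another package still owns them.
    """
    others = set()
    for name, files in installed.items():
        if name != pkg:
            others |= files
    return {f for f in installed[pkg] if skipped(f, skip_dirs) or f not in others}


if __name__ == "__main__":
    shared = {
        "/usr/share/doc/foo/README",
        "/usr/share/locale/de/LC_MESSAGES/foo.mo",
    }
    installed = {
        "foo.i386": shared | {"/usr/lib/libfoo.so.1"},
        "foo.x86_64": shared | {"/usr/lib64/libfoo.so.1"},
    }
    print("full list:   ", sorted(erase(installed, "foo.i386", FULL_SKIP)))
    print("trimmed list:", sorted(erase(installed, "foo.i386", TRIMMED_SKIP)))
```

In this model the full list deletes the shared doc and translation files along with the i386 library, while the trimmed list deletes only the i386 library.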



Comment 8 Jeff Johnson 2006-10-04 20:22:05 UTC
(aside) The alternative is to identify multiply owned files on those paths
and decide a single package as owner for shared files.

No matter what, the skipDirs band-aid needs to be eliminated from rpm.

Comment 9 Hans de Goede 2006-10-05 04:51:10 UTC
(In reply to comment #8)
> (aside) The alternative is to identify multiply owned files on those paths
> and decide a single package as owner for shared files.
> 

That won't work with a multi-arch setup where both the i386 and x86_64 rpms
will own the common files; this is the scenario that has me worried. I still
think that removing /usr/share/doc and /usr/share/locale from the skipDirs list
is a good fix for now and should be done ASAP, preferably before FC-6. It seems
like a quick and safe fix to me.

Comment 10 Jeff Johnson 2006-10-05 12:45:21 UTC
Yes, infeasible (but would "work").

If up to me, I'd remove skipDirs entirely. In fact, that was done on 4/1/2006.

Comment 11 Kevin Kofler 2006-12-16 05:57:40 UTC
I agree with comment #3, I think there really ought to be an RPM update 
reverting the #187308 fix ASAP, as that was likely not even a real bug, whereas 
this regression is definitely real.

By the way, this not only affects x86_64 multilibs, but also removal of 
duplicates after a failed transaction.

Comment 12 Kevin Kofler 2006-12-30 03:58:05 UTC
Created attachment 144573 [details]
Trivial patch reverting #187308 fix

I have attached a trivial specfile patch which should fix this problem for
those who want to try it. (I'm not uploading SRPMs or RPMs because I don't have
a good place to upload them to.)

WARNING: This patch is NOT endorsed by the Fedora Project or Red Hat, it is
only provided in the hope that it will be useful. I doubt this patch will break
RPM as it was effectively in force up to FC5, but if it does break, don't blame
me. You have been warned.

Comment 13 D. Hugh Redelmeier 2007-02-13 21:32:15 UTC
I feel that the problem is that RPM does not model sharing between packages that
are the same except for architecture.  Another symptom is reported in
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=128622

Comment 14 Jeff Johnson 2007-02-13 21:57:47 UTC
I feel that computers were intended to have only a single architecture, just like paths
can contain one and only one type of contents. In the real, non-virtual world, it's perfectly obvious
that 2 objects cannot occupy the same space-time point, quantum mechanics be damned.

All depends on the definition of "same". If truly the "same", why do you need two packages?

Comment 15 Aleksey Nogin 2007-02-13 22:46:39 UTC
This was originally reported as bug 119372, later marked as dup of bug 140055
(see also bug 140055 comment #13).

Comment 16 Jeff Johnson 2007-02-13 23:25:27 UTC
FWIW, this problem has been solved in rpm-4.4.6 and later (almost a year now, 4/1/2006)
by removing deliberately introduced breakage.

Marking DUPES is rearranging deck chairs on the Titanic, the cause is well known ...

Comment 17 Nicholas Miell 2007-02-13 23:46:11 UTC
Jeff, this is a Red Hat bugzilla, tracking bugs in Red Hat and Fedora packages.

Go be crazy somewhere else, ok?

Comment 18 Jeff Johnson 2007-02-14 00:06:31 UTC
Then *fix* the bleeping problem.

Comment 19 David Woodhouse 2007-04-10 19:28:27 UTC
One potential answer to the problem is "Don't have conflicting files then".
That's the gist of bug 235757.

Although that answer is more viable for binaries than documentation, perhaps.

Comment 20 Kevin Kofler 2007-06-08 01:39:54 UTC
*** Bug 243224 has been marked as a duplicate of this bug. ***

Comment 21 Kevin Kofler 2007-06-20 12:30:33 UTC
Panu, any chance you can do something about this?

Comment 22 Kevin Kofler 2007-06-20 12:49:18 UTC
Oh, I see Panu is already on it, see this 1-week-old thread:
https://lists.dulug.duke.edu/pipermail/rpm-maint/2007-June/000379.html
That's good news.

Comment 23 Jeff Johnson 2007-06-23 11:54:41 UTC
CLOSED

Comment 24 Kevin Kofler 2007-06-23 12:37:23 UTC
No, this is NOT closed until this is fixed in Fedora, i.e. either this is fixed 
in the rpm.org tree and Fedora updates to a new version with the fix or this is 
fixed in a Fedora patch.

The rpm5.org tree is entirely irrelevant.

Comment 25 Kevin Kofler 2007-06-23 12:43:27 UTC
Oh, it has been fixed upstream for 4.4.2.1 on June 19:
http://hg.rpm.org/rpm?cs=c9fb2eb5ae26

Let's hope 4.4.2.1 gets released and into Fedora soon.

Comment 26 Panu Matilainen 2007-06-26 07:52:20 UTC
Fixed in next rawhide push by rpm 4.4.2.1-rc1