Bug 507309

Summary: installroot option fails
Product: [Fedora] Fedora
Reporter: Anil Seth <seth.anil>
Component: rpm
Assignee: Panu Matilainen <pmatilai>
Status: CLOSED ERRATA
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium
Priority: low
Version: 11
CC: borgan, cwhuang, ffesti, james.antill, jnovy, j, kdudka, mads, n3npq, pmatilai, roth, tim.lauridsen
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: 4.7.1-3.fc11
Doc Type: Bug Fix
Last Closed: 2009-10-27 06:35:31 UTC

Description Anil Seth 2009-06-22 09:50:02 UTC
Description of problem:

The --installroot option fails, so I can't update a Fedora 11 image.

-bash-4.0$ sudo yum --installroot=/mnt list mplayer
Loaded plugins: dellsysidplugin2, fastestmirror, presto, refresh-packagekit
Traceback (most recent call last):
  File "/usr/bin/yum", line 29, in <module>
    yummain.user_main(sys.argv[1:], exit_code=True)
  File "/usr/share/yum-cli/yummain.py", line 309, in user_main
    errcode = main(args)
  File "/usr/share/yum-cli/yummain.py", line 157, in main
    base.getOptionsConfig(args)
  File "/usr/share/yum-cli/cli.py", line 189, in getOptionsConfig
    self.conf
  File "/usr/lib/python2.6/site-packages/yum/__init__.py", line 652, in <lambda>
    conf = property(fget=lambda self: self._getConfig(),
  File "/usr/lib/python2.6/site-packages/yum/__init__.py", line 239, in _getConfig
    self._conf = config.readMainConfig(startupconf)
  File "/usr/lib/python2.6/site-packages/yum/config.py", line 794, in readMainConfig
    yumvars['releasever'] = _getsysver(startupconf.installroot, startupconf.distroverpkg)
  File "/usr/lib/python2.6/site-packages/yum/config.py", line 873, in _getsysver
    hdr = idx.next()
StopIteration


However, without sudo, it works:

-bash-4.0$ yum --installroot=/mnt list mplayer
Loaded plugins: dellsysidplugin2, fastestmirror, presto, refresh-packagekit
Available Packages
mplayer.x86_64                1.0-0.109.20090329svn.fc11                 rpmfusion-free

But, it seems to use the cache of the host.



Version-Release number of selected component (if applicable):
yum-3.2.22-4.fc11.noarch

How reproducible:
every time

Steps to Reproduce:
1. Mount an F11 image on /mnt
2. sudo yum --installroot=/mnt list fedora-release
  
Actual results:
Exception.

Expected results:
Installed Packages
fedora-release.noarch            11-1              @fedora





Additional info:
This was working fine on Fedora 10.

Comment 1 Anil Seth 2009-06-22 10:07:19 UTC
Manually testing the code as root:

[root@amd anil]# python
Python 2.6 (r26:66714, Jun  8 2009, 16:07:29) 
[GCC 4.4.0 20090506 (Red Hat 4.4.0-4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import rpmUtils.transaction
>>> ts = rpmUtils.transaction.initReadOnlyTransaction(root='/mnt')
>>> idx = ts.dbMatch('provides','redhat-release')
>>> idx.count()
1
>>> hdr=idx.next()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
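
The `StopIteration` above is raised because `next()` is called on a match iterator that yields nothing, even though `count()` reports one entry. A minimal sketch of a defensive lookup follows, using plain Python iterators as stand-ins for the rpm match iterator. The helper name `first_or_none` is hypothetical and is not part of yum or rpm; it only shows how a caller could turn the bare traceback into a testable condition.

```python
def first_or_none(iterator):
    """Return the first item of an iterator, or None if it yields nothing."""
    for item in iterator:
        return item
    return None


# Stand-ins for the result of ts.dbMatch(...): one healthy, one empty.
healthy = iter(["fedora-release"])
broken = iter([])

first_healthy = first_or_none(healthy)
first_broken = first_or_none(broken)

if first_broken is None:
    print("no header returned; the rpm database may be unreadable")
```

A caller that gets `None` back can report a readable error instead of letting `StopIteration` escape.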

Comment 2 Mads Kiilerich 2009-07-06 01:08:08 UTC
"me too". This prevents me from using mock.

But poking around I noticed that it helps to remove /mnt/var/lib/rpm/__db.00*. 

Anil: does this workaround work for you?

Comment 3 Kamil Dudka 2009-07-06 07:03:39 UTC
For me it didn't. I have a fresh F-11 installation and mock does not run for user root. If I skip past the error in yum/config.py, line 873, it crashes on another piece of nonsense (broken rpm database). If I create an ordinary user and add that user to group 'mock', it works perfectly for the user.

Comment 4 Mads Kiilerich 2009-07-06 08:59:31 UTC
Ok.

I can reproduce a chroot with the problem as root by running:
mock -v -r fedora-11-i386 rebuild mock-0.9.16-1.fc11.src.rpm

The problem is then seen as root with any yum command such as:
/usr/bin/yum --installroot /var/lib/mock/fedora-11-i386/root/ provides redhat-release

Yes, I probably shouldn't run mock as root. Running mock as non-root is a fine workaround, thank you! But I don't understand how avoiding a privilege escalation can solve the problem. That is confusing, and thus a problem.

The test case tricks yum/rpm into doing something which "breaks" it. It shouldn't be possible to trick yum/rpm into failing that badly.

And the problem seems to be a broken rpm database - and the rpm python lib showing that in a strange way.

Kamil, I don't understand your comment #3. Could you please be a bit more verbose? Do you agree that there is a problem, or are you saying that there is no problem? Reading your comment again it seems like you agree with me and that I'm just repeating what you just said ;-)

Comment 5 Kamil Dudka 2009-07-06 09:30:49 UTC
(In reply to comment #4)
> Kamil, I don't understand your comment #3. Could you please be a bit more
> verbose? Do you agree that there is a problem, or are you saying that there is
> no problem? Reading your comment again it seems like you agree with me and that
> I'm just repeating what you just said ;-)  

Yes, there *is* a problem. mock does not work under root on my machine (fresh up-to-date F-11 installation). It should be working.

Comment 6 seth vidal 2009-07-06 19:45:50 UTC
If you attempt to access the rpmdb inside the chroot using rpm itself - does it succeed or fail?

Comment 7 Kamil Dudka 2009-07-06 20:02:30 UTC
(In reply to comment #6)
> If you attempt to access the rpmdb inside the chroot using rpm itself - does it
> succeed or fail?  

What kind of access should I try to perform? I tried 'rpm --rebuilddb -vv' and 'rpm -qa', it works well. I don't have yum within the chroot as it crashed too early.

Are you able to reproduce the crash with mock or not?

Comment 8 seth vidal 2009-07-06 20:05:41 UTC
run a:
rpm -q --whatprovides redhat-release

but I can't replicate this as mock, unless I run mock as root which it is not supposed to be run as.

Comment 9 Kamil Dudka 2009-07-06 20:18:04 UTC
(In reply to comment #8)
> run a:
> rpm -q --whatprovides redhat-release

bash-4.0# rpm -q --whatprovides redhat-release
fedora-release-11.90-1.noarch

> but I can't replicate this as mock, unless I run mock as root which it is not
> supposed to be run as.  

The same for me: It crashes if I run mock as root. (please read my comment #3 and comment #5)

mock is not supposed to be run as root? Then you can chalk it up to a misunderstanding on my part. Please point me to the proper documentation saying this crucial thing.

Anyway, it might be a pretty good enhancement if it failed with a message "mock is not supposed to be run as root". The above-mentioned backtrace does not say anything to an ordinary user like me :-)

Comment 10 Mads Kiilerich 2009-07-06 23:40:42 UTC
Info requested by svidal in comment #8:

# mock -v -r fedora-11-i386 shell
...
mock-chroot> rpm -q --whatprovides redhat-release
error: cannot open Packages index using db3 - No such file or directory (2)
error: cannot open Packages database in /var/lib/rpm
error: cannot open Packages database in /var/lib/rpm
no package provides redhat-release
mock-chroot> ls -l /var/lib/rpm/
total 8540
-rw-r--r-- 1 root mockbuild  688128 2009-07-06 11:02 Basenames
-rw-r--r-- 1 root mockbuild   12288 2009-07-06 11:02 Conflictname
-rw-r--r-- 1 root root            0 2009-07-06 11:02 __db.000
-rw-r--r-- 1 root root        24576 2009-07-07 00:44 __db.001
-rw-r--r-- 1 root root       180224 2009-07-07 00:44 __db.002
-rw-r--r-- 1 root root      1318912 2009-07-07 00:44 __db.003
-rw-r--r-- 1 root root       491520 2009-07-07 00:44 __db.004
-rw-r--r-- 1 root mockbuild  122880 2009-07-06 11:02 Dirnames
-rw-r--r-- 1 root mockbuild  684032 2009-07-06 11:02 Filedigests
-rw-r--r-- 1 root mockbuild   12288 2009-07-06 11:02 Group
-rw-r--r-- 1 root mockbuild   12288 2009-07-06 11:02 Installtid
-rw-r--r-- 1 root mockbuild   12288 2009-07-06 11:02 Name
-rw-r--r-- 1 root mockbuild 5156864 2009-07-06 11:02 Packages
-rw-r--r-- 1 root mockbuild   90112 2009-07-06 11:02 Providename
-rw-r--r-- 1 root mockbuild   28672 2009-07-06 11:02 Provideversion
-rw-r--r-- 1 root mockbuild   12288 2009-07-06 03:21 Pubkeys
-rw-r--r-- 1 root mockbuild   53248 2009-07-06 11:02 Requirename
-rw-r--r-- 1 root mockbuild   57344 2009-07-06 11:02 Requireversion
-rw-r--r-- 1 root mockbuild   12288 2009-07-06 11:02 Sha1header
-rw-r--r-- 1 root mockbuild   12288 2009-07-06 11:02 Sigmd5
-rw-r--r-- 1 root mockbuild   12288 2009-07-06 11:02 Triggername
mock-chroot> rm -f /var/lib/rpm/__db.00*
mock-chroot> rpm -q --whatprovides redhat-release
fedora-release-11-1.noarch

So the problem could just as well be in rpm. And I assume that it probably is not in mock.

BUT that is after having run mock as root. Seth, if I understand you right then you can reproduce it that way too?

Seth, are you implying that because the test case uses mock in a way you claim it wasn't intended to be used, the test case isn't valid, what we see is correct behaviour, and there is no bug?

Comment 11 Mads Kiilerich 2009-07-06 23:50:44 UTC
Test case with rpm and yum, no mock:

# rm -rf /tmp/x
# rpm --root /tmp/x -ihv fedora-release-11-1.noarch.rpm 
# yum --installroot=/tmp/x install -y filesystem
# yum --installroot=/tmp/x install -y rpm
# chroot /tmp/x
bash-4.0# rpm -q --whatprovides redhat-release
error: cannot open Packages index using db3 - No such file or directory (2)
error: cannot open Packages database in /var/lib/rpm
error: cannot open Packages database in /var/lib/rpm
no package provides redhat-release
bash-4.0# rm -f /var/lib/rpm/__db.00*
bash-4.0# rpm -q --whatprovides redhat-release
fedora-release-11-1.noarch

Comment 12 seth vidal 2009-07-07 04:24:37 UTC
reassigning over to rpm since it looks like something happening independent of yum, too.

Comment 13 Kamil Dudka 2009-07-07 07:02:37 UTC
Seth, is it worth opening a separate bug against mock so that it fails with an appropriate message when run as root?

Comment 14 Mads Kiilerich 2009-07-07 10:16:43 UTC
I can confirm that the same problem is seen when installing the same set of packages with rpm. No mock and no yum and no non-root users involved.

strace shows that rpm fails after
open("/tmp/x/var/lib/rpm/Packages", O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file or directory)

Apparently the __db files contain a reference to this un-chrooted name:

# grep /tmp/x/var/lib/rpm/Packages /tmp/x/var/lib/rpm/__*
Binary file /tmp/x/var/lib/rpm/__db.003 matches

(Mock nukes __db.* on startup. But after a chrooted "rpm -Uvh --nodeps x.src.rpm" new __db.* files are left when started as root, not as ordinary users. I'm sure that mock will work fine as root again when the rpm issue has been solved, but I'm surprised that it behaves differently when started suid.)
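
The `grep` check above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `region_files_mentioning` is hypothetical, and the demo runs against a throwaway temporary directory with made-up file contents rather than a real rpm database.

```python
import glob
import os
import tempfile


def region_files_mentioning(dbdir, needle):
    """Return names of __db.* files in dbdir whose raw bytes contain needle."""
    needle_bytes = needle.encode()
    hits = []
    for path in sorted(glob.glob(os.path.join(dbdir, "__db.*"))):
        with open(path, "rb") as handle:
            if needle_bytes in handle.read():
                hits.append(os.path.basename(path))
    return hits


# Demo on fake files; the contents are invented for illustration only.
with tempfile.TemporaryDirectory() as dbdir:
    with open(os.path.join(dbdir, "__db.001"), "wb") as handle:
        handle.write(b"\x00\x01\x02")
    with open(os.path.join(dbdir, "__db.003"), "wb") as handle:
        handle.write(b"\x00/tmp/x/var/lib/rpm/Packages\x00")
    with open(os.path.join(dbdir, "Packages"), "wb") as handle:
        handle.write(b"not a region file")
    demo_hits = region_files_mentioning(dbdir, "/tmp/x/var/lib/rpm/Packages")

print(demo_hits)
```

Only files matching `__db.*` are scanned, so the database files themselves (`Packages`, `Name`, and so on) are never reported.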

Comment 15 Carl Roth 2009-07-25 22:09:16 UTC
After I build a mock chroot on F11, I see a similar failure:

  $ mock -v -r XXX --clean
  $ mock -v -r XXX --init
  $ mock -v --no-clean -r XXX --shell

  mock-chroot> rpm -q rpm
  error: cannot open Packages index using db3 - No such file or directory (2)
  error: cannot open Packages database in /var/lib/rpm
  error: cannot open Packages database in /var/lib/rpm
  package rpm is not installed

  mock-chroot> /usr/lib/rpm/rpmdb_stat -e -h /var/lib/rpm | grep /var/lib/rpm
  Pool File: /local/spool/mock/fedora-11-x86_64/root/var/lib/rpm/Packages
  Pool File: /local/spool/mock/fedora-11-x86_64/root/var/lib/rpm/Name

Comment 16 Carl Roth 2009-07-25 23:11:43 UTC
I tried the same steps on an i386 box.  Both systems are running rpm-4.7.0-2.fc11.  The i386 system did not exhibit this behavior (i.e. chroot path embedded into 'Pool File').

Comment 17 Huang, Chih-Wei 2009-08-23 16:32:03 UTC
Any progress on this bug?
I was hit by this bug today when I tried mock --rebuild.
Eventually I found it was caused by the broken rpm db, as described.
My rpm, yum and mock are all fully up to date.

yum-3.2.23-3.fc11.noarch
rpm-4.7.1-1.fc11.i586
mock-0.9.16-1.fc11.noarch

Carl, I also tested on an i386 (i586) box. This is not an x86_64-specific issue.

Besides, I found that another bug, 511158, results from this one
(but that bug was incorrectly closed).

Comment 18 Huang, Chih-Wei 2009-08-23 16:43:37 UTC
By the way, in the above discussion
someone seemed to say one shouldn't run mock as root.
That really surprised me.
I can't find any doc that says mock should not be run as root.
Indeed, when I run mock as a normal user,
it pops up a dialog asking me for the root password.
So it definitely needs root privilege.
This seems obvious to me, since it needs root to do mount, chroot, ...

Please correct me if I'm wrong.

Comment 19 Huang, Chih-Wei 2009-08-23 16:53:21 UTC
Hi Carl,
I haven't tested x86_64. But on an i386 box, you can reproduce this behavior
by doing mock --rebuild YYY.src.rpm or mock --install ZZZ
after mock --init and before mock --shell.
These commands run 'yum --installroot ...', which breaks the rpm db.

For example, here is my output after mock --rebuild
mock-chroot> /usr/lib/rpm/rpmdb_stat -e -h /var/lib/rpm | grep /var/lib/rpm
Pool File: /var/lib/rpm/Packages
Pool File: /var/lib/rpm/Name
Pool File: /var/mock/ippbx/root/var/lib/rpm/Providename
          ^^^^^^^^^^^^^^^^^^^^^^ chroot path was embedded !
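
The check done by eye above can be sketched as a small parser for the `Pool File:` lines printed by `rpmdb_stat`. The function name `foreign_pool_files` is hypothetical, and the sketch assumes the output format is exactly the one quoted in this comment; it flags any pool file that is not directly under `/var/lib/rpm`.

```python
def foreign_pool_files(stat_output, expected_dir="/var/lib/rpm"):
    """Return pool file paths that do not live under expected_dir."""
    prefix = "Pool File:"
    wanted = expected_dir.rstrip("/") + "/"
    foreign = []
    for line in stat_output.splitlines():
        line = line.strip()
        if not line.startswith(prefix):
            continue
        path = line[len(prefix):].strip()
        if not path.startswith(wanted):
            foreign.append(path)
    return foreign


# Sample taken from the rpmdb_stat output quoted above.
sample = """\
Pool File: /var/lib/rpm/Packages
Pool File: /var/lib/rpm/Name
Pool File: /var/mock/ippbx/root/var/lib/rpm/Providename
"""

suspicious = foreign_pool_files(sample)
print(suspicious)
```

Run from inside the chroot, any path reported this way was presumably recorded by a process that saw the database from outside the chroot.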

Comment 20 Kamil Dudka 2009-08-23 17:02:18 UTC
(In reply to comment #18)
> Please correct me if I'm wrong.  

I've added myself to group 'mock'. Now it doesn't ask for the root password. Not sure if this is documented somewhere, some hints can be found here:

https://fedoraproject.org/wiki/Projects/Mock#Build_User

Comment 21 Mads Kiilerich 2009-08-23 22:22:26 UTC
Re comment #17:
AFAIK there hasn't been any progress and nobody is working on it. 

It seems like the db4 files are opened by full path without chroot, but unfortunately these paths are stored in the __db cache files, and even more unfortunately rpm later chokes on it in a chroot where the paths are invalid. I don't know how it should be solved. Should rpm be more tolerant to invalid paths? Or make sure the path is stored without the chroot? Or should rpm avoid using __db files in chroot - or just remove them afterwards?

It is easy to reproduce, so I assume that it would be easy for an rpm developer to fix it ;-)

[root@localhost tmp]# rm -rf /tmp/bugger/
[root@localhost tmp]# rpm --root /tmp/bugger -ihv fedora-release-11-1.noarch.rpm 
warning: fedora-release-11-1.noarch.rpm: Header V3 RSA/SHA256 signature: NOKEY, key ID d22e77f2
Preparing...                ########################################### [100%]
   1:fedora-release         ########################################### [100%]
[root@localhost tmp]# grep /tmp/bugger /tmp/bugger/var/lib/rpm/*
Binary file /tmp/bugger/var/lib/rpm/__db.003 matches
[root@localhost tmp]# /usr/lib/rpm/rpmdb_stat -e -h /tmp/bugger/var/lib/rpm
...

Comment 22 Jeff Johnson 2009-08-27 19:35:04 UTC
Remove the __db* files after installing in chroot. The dbenv files will be
recreated as needed.

Automating the removal assumes that RPM can "know" whether an
rpmdb is in use or not. Removal of dbenv files opens lock races
when an active dbenv is removed.

(aside)
Automatic removal during dbenv open failure is quite possible,
but patches to achieve that have been rejected because of the
possibility of "lock races". The only way to tell whether locks
are active (or stale) is to see which processes/threads are still
active, which is intrinsically racy. It takes a finite amount
of time to test for active processes, and there is a time
interval between test and removal in which the dbenv might
become active.

Note that the existence of a chroot'ed path (or not) within
dbenv files is purely a symptom. Yes, there are paths within
__db* files, exactly as intended. The flaw is whether the
path is correct with respect to chroot (or not); one or the other access
will always be "unhappy" with whatever path is within dbenv files.
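
The manual cleanup described at the top of this comment can be sketched as below. This is a minimal sketch, not what rpm, mock or Koji actually do: the function name `remove_region_files` is hypothetical, and it assumes that no other process has the rpmdb open, which is exactly the assumption that cannot be verified without the lock races described above.

```python
import glob
import os
import tempfile


def remove_region_files(root):
    """Delete __db.* files under <root>/var/lib/rpm and return their names.

    Assumes nothing else is using this rpmdb; removing an active
    environment is racy.
    """
    dbdir = os.path.join(root, "var", "lib", "rpm")
    removed = []
    for path in sorted(glob.glob(os.path.join(dbdir, "__db.*"))):
        os.remove(path)
        removed.append(os.path.basename(path))
    return removed


# Demo on a throwaway directory tree standing in for a chroot.
with tempfile.TemporaryDirectory() as root:
    dbdir = os.path.join(root, "var", "lib", "rpm")
    os.makedirs(dbdir)
    for name in ("Packages", "__db.001", "__db.002"):
        open(os.path.join(dbdir, name), "wb").close()
    demo_removed = remove_region_files(root)
    demo_remaining = sorted(os.listdir(dbdir))

print(demo_removed, demo_remaining)
```

The database files themselves are left alone; only the environment files, which are recreated as needed, are deleted.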

Comment 23 Mads Kiilerich 2009-08-27 21:39:19 UTC
I think it is a fair assumption that when rpm (or yum) is used with a --root option, it will be the only process accessing that rpmdb, and it can and should thus safely remove the __db* files both before and after. At least if the path names show that the __db* files have been created with a different --root option than the one currently used.

I assume that the problem is that db4 automatically writes the paths of the db files into the __db* files, and that it has no "--root" functionality built in.

Is there no clean way rpm could tell db4 "I am no longer using file X, so if I was the only user then please remove it from the region files"?

Hmm ... thinking about it again ... will simultaneous access of the rpmdb with both --root and in a real chroot be detected at all? They access the same file by different names, so how can db4 ensure that file access is properly locked? Isn't a lock to use of one chroot at a time necessary anyway? Wouldn't a racy lock be better than no lock?

Comment 24 Jeff Johnson 2009-08-27 21:59:33 UTC
You can assume anything you wish. RPM itself cannot make those assumptions.

FYI: Historically, with beehive, a predecessor of Koji, that assumption was
dead wrong. Beehive actively installed from outside chroot while installs
from within chroot were also active.

But having Koji remove /var/lib/rpm/__db* files after chroot set up is easiest.
The result will be no __db* files with chroot paths.

Yes concurrent access from inside and outside a chroot will use
the same dbenv file with NPTL -> futex locks provided by the kernel.

You can (of course) add additional locks that behave however you
wish. But I've described how Berkeley DB Concurrent access works,
with the paths and locks within __db* files.

No library "knows" whether chroot(2) has been done. That's exactly
how chroot(2) was designed.

Comment 25 Fedora Update System 2009-10-08 10:05:41 UTC
rpm-4.7.1-3.fc11 has been submitted as an update for Fedora 11.
http://admin.fedoraproject.org/updates/rpm-4.7.1-3.fc11

Comment 26 Mads Kiilerich 2009-10-08 10:56:34 UTC
I assume the fix is
http://cvs.fedoraproject.org/viewvc/rpms/rpm/devel/rpm-4.7.1-chroot-env-paths.patch?revision=1.1&view=markup

Panu, can you give a brief description of how it is solved? The discussion above does not give a clear idea of a good solution. What behaviour should we expect?

Comment 27 Panu Matilainen 2009-10-08 11:56:51 UTC
There are two parts to the "fix":
a) When a db environment is used, the db files are now accessed with paths relative to the environment instead of absolute paths, avoiding the outside chroot paths (c#12 and c#13)
b) In chrooted operations, rpm now automatically cleans up the environment when it becomes free, avoiding the most pathological cases of accidentally copying the environment around different hosts etc. It does open up some races in rpmdb open/close sequences but as this is limited to chrooted operation, it seems like reasonable compromise as there's typically some other means to limit the access (through mock etc)

These tweaks don't fix every possible scenario, but they do help quite a bit.
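
Why part a) helps can be shown with plain path arithmetic. The sketch below is an illustration only and says nothing about how rpm or Berkeley DB are implemented; the helper `host_path` is hypothetical and simply maps a path as seen by a process chrooted at a given directory to the corresponding path on the host.

```python
import posixpath


def host_path(chroot, path_seen_by_process):
    """Map a path seen by a process chrooted at `chroot` to the host path."""
    return posixpath.normpath(
        posixpath.join(chroot, path_seen_by_process.lstrip("/"))
    )


root = "/tmp/x"

# Absolute name recorded by a process running outside the chroot.
recorded_abs = "/tmp/x/var/lib/rpm/Packages"
abs_from_outside = host_path("/", recorded_abs)
abs_from_inside = host_path(root, recorded_abs)

# Name relative to the environment directory, which each process
# locates in its own view of the filesystem.
recorded_rel = "Packages"
rel_from_outside = host_path("/", posixpath.join("/tmp/x/var/lib/rpm", recorded_rel))
rel_from_inside = host_path(root, posixpath.join("/var/lib/rpm", recorded_rel))

print(abs_from_outside, abs_from_inside)
print(rel_from_outside, rel_from_inside)
```

With the absolute name, the chrooted process ends up looking for a file under `/tmp/x/tmp/x/...` on the host, which does not exist; with the relative name, both views resolve to the same file.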

Comment 28 Mads Kiilerich 2009-10-08 12:54:35 UTC
Thanks, Panu. It sounds like a fine compromise = good solution. It seems to work fine for my use cases.

I hope to see it in f12 soon ;-)

Comment 29 Panu Matilainen 2009-10-08 12:55:49 UTC
*** Bug 513699 has been marked as a duplicate of this bug. ***

Comment 30 Panu Matilainen 2009-10-08 13:04:25 UTC
(In reply to comment #28)
> Thanks, Panu. It sounds like a fine compromise = good solution. It seems to
> work fine for my use cases.
> 
> I hope to see it in f12 soon ;-)  

It is in F12 already :)

Comment 31 Jeff Johnson 2009-10-08 14:32:23 UTC
"Blindly" removing a dbenv in a chroot, introducing known races, is permitted, but checking
for DB_VERSION_MISMATCH returned from an open is not?!?

Have fun!

Comment 32 Fedora Update System 2009-10-09 03:40:26 UTC
rpm-4.7.1-3.fc11 has been pushed to the Fedora 11 testing repository.  If problems still persist, please make note of it in this bug report.
 If you want to test the update, you can install it with 
 su -c 'yum --enablerepo=updates-testing update rpm'.  You can provide feedback for this update here: http://admin.fedoraproject.org/updates/F11/FEDORA-2009-10354

Comment 33 James Antill 2009-10-14 14:11:25 UTC
*** Bug 528743 has been marked as a duplicate of this bug. ***

Comment 34 Fedora Update System 2009-10-27 06:34:33 UTC
rpm-4.7.1-3.fc11 has been pushed to the Fedora 11 stable repository.  If problems still persist, please make note of it in this bug report.