Bug 1607223 - rpm-ostree install fails with newer librpm
Summary: rpm-ostree install fails with newer librpm
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Fedora
Classification: Fedora
Component: rpm-ostree
Version: rawhide
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Colin Walters
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks: IoT
Reported: 2018-07-23 03:20 UTC by Kevin Fenzi
Modified: 2018-07-31 10:21 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-07-30 17:13:38 UTC


Links
System ID Priority Status Summary Last Updated
Github https://github.com/projectatomic/rpm-ostree/issues/1462 None None None 2018-07-23 12:50:56 UTC

Description Kevin Fenzi 2018-07-23 03:20:29 UTC
rpm-ostree-2018.6-2.fc29.x86_64

[root@localhost ~]# rpm-ostree install sl
Checking out tree 62e8f64... done
Enabled rpm-md repositories: rawhide
rpm-md repo 'rawhide' (cached); generated: 2018-07-21 08:51:59
Importing metadata [=============] 100%
Resolving dependencies... done
Will download: 1 package (17.0 kB)
  Downloading from rawhide: [=============] 100%
Importing (1/1) [=============] 100%
Checking out packages (1/1) [=============] 100%
Running pre scripts... 0 done
Running post scripts... 15 done
Writing rpmdb... error: Error running transaction:
[root@localhost ~]# rpm -q sl
package sl is not installed
[root@localhost ~]# getenforce
Permissive

Comment 1 Dusty Mabe 2018-07-23 12:35:07 UTC
Anything in the journal?

Comment 2 Dusty Mabe 2018-07-23 12:47:34 UTC
I was able to reproduce:

```
[root@vanilla-rawhide-atomic ~]# rpm-ostree status
State: idle; auto updates disabled
Deployments:
● ostree://fedora-atomic:fedora/rawhide/x86_64/atomic-host
                   Version: Rawhide.20180722.n.0 (2018-07-22 08:42:18)
                    Commit: fad2798a531d714ab5d82258321cb44225b4af5a6c2c369e20a70d370d03cac7
              GPGSignature: Valid signature by 5A03B4DD8254ECA02FDA1637A20AA56B429476B4
[root@vanilla-rawhide-atomic ~]#
[root@vanilla-rawhide-atomic ~]# rpm-ostree install sl
Checking out tree fad2798... done
Enabled rpm-md repositories: rawhide
Updating metadata for 'rawhide': [=============] 100%
rpm-md repo 'rawhide'; generated: 2018-07-22 08:13:02
Importing metadata [=============] 100%
Resolving dependencies... done
Will download: 1 package (17.0 kB)
  Downloading from rawhide: [=============] 100%
Importing (1/1) [=============] 100%
Checking out packages (1/1) [=============] 100%
Running pre scripts... 0 done
Running post scripts... 7 done
Writing rpmdb... error: Error running transaction:
[root@vanilla-rawhide-atomic ~]#
```



Relevant output from the journal:

```
rpm-ostree[3437]: client(id:cli dbus:1.194 unit:session-6.scope uid:0) added; new total=1
rpm-ostree[3437]: client(id:cli dbus:1.194 unit:session-6.scope uid:0) vanished; remaining=0
rpm-ostree[3437]: In idle state; will auto-exit in 63 seconds
rpm-ostree[3437]: client(id:cli dbus:1.196 unit:session-6.scope uid:0) added; new total=1
rpm-ostree[3437]: Initiated txn PkgChange for client(id:cli dbus:1.196 unit:session-6.scope uid:0): /org/projectatomic/rpmostree1/fedora_atomic
rpm-ostree[3437]: Preparing pkg txn; enabled repos: ['rawhide'] solvables: 57841
rpm-ostree[3437]: Imported 1 pkg
rpm-ostree[3437]: No files matched %transfiletriggerin(lib) for glibc-common
rpm-ostree[3437]: No files matched %transfiletriggerin(lib64) for glibc-common
kernel: fuse init (API version 7.27)
systemd[1]: Mounting FUSE Control File System...
systemd[1]: Mounted FUSE Control File System.
rpm-ostree[3437]: Executed %transfiletriggerin(glibc-common) for lib, lib64, usr/lib, usr/lib64 in 630ms; 23993 matched files
rpm-ostree[3437]: Executed %transfiletriggerin(systemd-udev) for usr/lib/udev/hwdb.d in 137ms; 18 matched files
rpm-ostree[3437]: Executed %transfiletriggerin(systemd-udev) for usr/lib/udev/rules.d in 126ms; 50 matched files
rpm-ostree[3437]: Executed %transfiletriggerin(info) for usr/share/info in 407ms; 2 matched files
rpm-ostree[3437]: Executed %transfiletriggerin(shared-mime-info) for usr/share/mime in 159ms; 792 matched files
rpm-ostree[3437]: Executed %transfiletriggerin(glib2) for usr/lib64/gio/modules in 244ms; 4 matched files
rpm-ostree[3437]: Executed %transfiletriggerin(glib2) for usr/share/glib-2.0/schemas in 208ms; 30 matched files
rpm-ostree[3437]: sanitycheck(/usr/bin/true) successful
rpm-ostree[3437]: g_string_insert_len: assertion 'len == 0 || val != NULL' failed
rpm-ostree[3437]: Txn PkgChange on /org/projectatomic/rpmostree1/fedora_atomic failed: Error running transaction:
rpm-ostree[3437]: client(id:cli dbus:1.196 unit:session-6.scope uid:0) vanished; remaining=0
rpm-ostree[3437]: In idle state; will auto-exit in 61 seconds
rpm-ostree[3437]: In idle state; will auto-exit in 64 seconds
```

Comment 3 Dusty Mabe 2018-07-23 12:50:19 UTC
Found an upstream bug: https://github.com/projectatomic/rpm-ostree/issues/1462

Comment 4 Colin Walters 2018-07-23 12:55:10 UTC
Can you also paste `rpm-ostree status`?

This works fine for me on FAH28.  As Dusty said, the place to look is the journal; see e.g.
https://github.com/projectatomic/rpm-ostree/blob/b66337e0cbd94024ce249c022482d03978db81c1/src/libpriv/rpmostree-scripts.c#L414
which is what we print on script failure, the most common case.

This is a failure writing the rpmdb; it looks like we should be calling `dnf_rpmts_look_for_problems()` to surface the underlying problem.  However, it's pretty unusual to see errors here; running out of disk space is the one cause I can think of.

Comment 5 Colin Walters 2018-07-23 13:19:19 UTC
OK wow random side note, looking at rawhide: `/usr/lib/systemd/user/grub-boot-success.service` being on by default for FAH is problematic for a bunch of reasons (the first one being that we don't have `pkexec` wired up passwordless by default).

Comment 6 Dusty Mabe 2018-07-23 14:33:21 UTC
(In reply to Colin Walters from comment #4)
> Can you also paste `rpm-ostree status`?
> 

Is what I posted in comment #2 good enough, or did you want Kevin's `rpm-ostree status`?

Comment 7 Colin Walters 2018-07-23 16:07:44 UTC
OK, with the libdnf patch applied I get:

Writing rpmdb... error: Error running transaction: package sl-5.02-9.fc29.x86_64 does not verify: Payload SHA256 digest: BAD (Expected abca348ceff42e65f20eabc36df43e49a69696dcf3482c3967a834ae942b5972 != e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855)

This appears to be true of every package.  Now, we've been doing
`rpmtsSetVSFlags (rpmdb_ts, _RPMVSF_NOSIGNATURES | _RPMVSF_NODIGESTS);`
for the rpmdb writing for a while, which *should* disable the SHA256 checks.  However, a quick glance at the rpm git log shows:

https://github.com/rpm-software-management/rpm/commit/f9f85af2d3a73a31ee099c6da809be8ebdeb2dc3
(And later commits)

which I highly suspect are involved here.
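
One detail in the error above is worth pointing out: the second digest in the message, e3b0c442...b855, is the well-known SHA-256 of zero bytes of input. That is consistent with librpm hashing an empty payload at this stage, although that reading is an inference rather than something the error states. A minimal check with Python's hashlib (the variable names are only illustrative):

```python
import hashlib

# Second digest from the "Payload SHA256 digest: BAD" message above.
reported = "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"

# SHA-256 computed over zero bytes of input.
empty_digest = hashlib.sha256(b"").hexdigest()

print(empty_digest == reported)
```

So the mismatch does not by itself indicate a corrupted download; the digest on the right is what any SHA-256 implementation returns when it is given no data at all.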

Comment 8 Colin Walters 2018-07-23 17:10:06 UTC
I tried the obvious patch:

```
$ git diff
diff --git a/libdnf b/libdnf
index b3fcc53f..6ff8f797 160000
--- a/libdnf
+++ b/libdnf
@@ -1 +1 @@
-Subproject commit b3fcc53f6f3baf4f51f836f5e1eb54eb82d5df49
+Subproject commit 6ff8f79745ddd548841a8831da06a71bcc71d6f7
diff --git a/src/libpriv/rpmostree-core.c b/src/libpriv/rpmostree-core.c
index b4aa6f43..4f1fe745 100644
--- a/src/libpriv/rpmostree-core.c
+++ b/src/libpriv/rpmostree-core.c
@@ -4010,7 +4010,9 @@ rpmostree_context_assemble (RpmOstreeContext      *self,
 
   g_auto(rpmts) rpmdb_ts = rpmtsCreate ();
   rpmtsSetVSFlags (rpmdb_ts, _RPMVSF_NOSIGNATURES | _RPMVSF_NODIGESTS);
-  rpmtsVfyFlags (rpmdb_ts, _RPMVSF_NOSIGNATURES | _RPMVSF_NODIGESTS);
+  /* https://github.com/rpm-software-management/rpm/commit/f9f85af2d3a73a31ee099c6da809be8ebdeb2dc3 */
+  rpmtsSetVfyFlags (rpmdb_ts, RPMVSF_MASK_NOSIGNATURES | RPMVSF_MASK_NODIGESTS |
+                    RPMVSF_MASK_NOHEADER | RPMVSF_MASK_NOPAYLOAD);
   rpmtsSetFlags (rpmdb_ts, RPMTRANS_FLAG_JUSTDB);
 
   tdata.ctx = self;
```

But that just gets me to: Writing rpmdb... error: Error running transaction: package sl-5.02-9.fc29.x86_64 does not verify: no digest

Offhand, it looks like rpmvsVerify() doesn't actually return success when we ask it to skip verification.

Comment 9 Peter Robinson 2018-07-23 17:37:19 UTC
> But that just gets me to: Writing rpmdb... error: Error running transaction:
> package sl-5.02-9.fc29.x86_64 does not verify: no digest
> 
> Offhand, it looks like rpmvsVerify() doesn't actually return success when we
> ask it to skip verification.

Why are we skipping the verification?

Comment 10 Colin Walters 2018-07-23 18:02:08 UTC
> Why are we skipping the verification?

rpm-ostree implements an entirely different model for filesystem management than "classic" librpm.  Basically we import packages into ostree branches locally, caching just their header.

Every operation reassembles the filesystem from scratch (today); this is why, e.g., rpm-ostree doesn't suffer from the directory <-> symlink bug and things like that.  libostree is doing the filesystem I/O, not librpm.

We do verification during import.  The step that is failing here is a separate pass that synthesizes the rpmdb (which is also regenerated from scratch each time), so we don't want to verify things twice.

(Note the librpm checksum verification is generally redundant with the "outer" checksum from the rpm-md, and per https://theupdateframework.github.io/ you really want to be verifying that instead)
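
To illustrate the "verify once, at import" idea, here is a rough Python sketch of checking a downloaded package against the checksum advertised by the repository metadata before importing it. `verify_outer_checksum` and the sample bytes are hypothetical; this is not rpm-ostree or libdnf code.

```python
import hashlib
import hmac


def verify_outer_checksum(package_bytes: bytes, expected_sha256_hex: str) -> bool:
    """Return True if the whole package file matches the repo-advertised SHA-256."""
    actual = hashlib.sha256(package_bytes).hexdigest()
    # compare_digest is a constant-time comparison; plain == would also work here.
    return hmac.compare_digest(actual, expected_sha256_hex.lower())


# Stand-in data: in reality these would be the downloaded .rpm file and the
# checksum taken from the rpm-md metadata.
fake_rpm = b"not a real rpm, just sample bytes"
advertised = hashlib.sha256(fake_rpm).hexdigest()

ok = verify_outer_checksum(fake_rpm, advertised)
tampered = verify_outer_checksum(fake_rpm + b"x", advertised)
print(ok, tampered)
```

Once a package has passed a check like this and been imported, a later pass that only rewrites the database has nothing new to verify.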

Comment 11 Colin Walters 2018-07-23 21:41:07 UTC
Splitting off the patch to https://github.com/projectatomic/rpm-ostree/pull/1469

In the meantime, one option is to revert librpm and retry the new verify API in a way that's a bit more compatible.  But let's see what Panu says.

Comment 12 Panu Matilainen 2018-07-30 09:42:52 UTC
The "outer" checksum is not entirely reliable as a package could've been corrupted/tampered with before createrepo gets run (it might be rare but it can happen). And not all packages come off a repo. 

It's implemented the way it is to enforce the verification for all existing API users without them learning new tricks - instead they need to learn new tricks to bypass it. Which is very much intentional.

The problem with your patch is that you're basically messing with the wrong thing. To entirely disable the verification step, either use

    rpmtsVfyLevel(ts, 0);

...or pass the RPMPROB_FILTER_VERIFY flag in the rpmtsRun() ignoreSet argument, whichever suits your purposes better.

I do wonder what exactly you're doing to get an invalid payload there, though. It's simply expecting the full untampered package at that time, which doesn't seem like an unreasonable thing to expect at the start of a transaction...
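
The difference between the two knobs, as described in this thread, can be modelled roughly as follows. This is a toy Python model and not librpm code: `DIGEST`, `SIGNATURE`, `verify_package` and the result strings are invented for illustration, and the real logic is more involved. The idea is that the verify level says which kinds of checks must succeed, while the flags only switch individual checks off, so masking the checks without lowering the level leaves a required check unsatisfied.

```python
DIGEST = 1 << 0      # hypothetical "a digest must verify" level bit
SIGNATURE = 1 << 1   # hypothetical "a signature must verify" level bit


def verify_package(level: int, disabled: int, payload_ok: bool) -> str:
    """Toy model: `level` lists required check types, `disabled` masks checks off."""
    if level == 0:
        return "OK"  # nothing is required, so verification is skipped entirely
    if level & DIGEST:
        if disabled & DIGEST:
            return "no digest"      # required, but the check never ran
        if not payload_ok:
            return "digest: BAD"    # required, ran, and failed
    if level & SIGNATURE and disabled & SIGNATURE:
        return "no signature"       # signature validity itself is not modelled
    return "OK"


# Digests required, nothing masked, payload unavailable (comment 7).
print(verify_package(DIGEST, 0, payload_ok=False))
# Digests still required, but the digest checks masked off (comment 8).
print(verify_package(DIGEST, DIGEST, payload_ok=False))
# Level 0: nothing required.
print(verify_package(0, 0, payload_ok=False))
```

Under this model, only lowering the level (or filtering the resulting problem when running the transaction) makes the missing payload a non-issue.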

Comment 13 Colin Walters 2018-07-30 15:31:34 UTC
> rpmtsVfyLevel(ts, 0);

That works, thanks!

> I do wonder what exactly you're doing to get an invalid payload there, though. It's simply expecting the full untampered package at that time, which doesn't seem like an unreasonable thing to expect at the start of a transaction...

Yep, see comment #10 - this phase of operation is *just* doing `rpmtsSetFlags (rpmdb_ts, RPMTRANS_FLAG_JUSTDB)`.

rpm-ostree basically works like:

 - check out base tree
 - for each layered package: checkout(package)
 - for each layered package: run_pre_scripts(package)
 - for each layered package: run_post_scripts(package)
 - copyup(/usr/share/rpm)  # Creates a *copy* (reflink if possible) of the base rpmdb
 - Use librpm to write updates to the new rpmdb

This last part is the one that's failing now.

And we do all of these operations *every single upgrade* (or package install/uninstall).  The "re-check things out via hardlinks" is what makes this fast.

Note the checkout() operation is using libostree, not librpm.  At this point we're mostly using librpm to write the db.

So we're basically not using the payload checksum at this stage.  I could imagine computing it while importing, but we'd need to walk the payload twice.
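
The steps listed above could be sketched like this. It is only a toy Python model: `assemble` and the step strings are hypothetical stand-ins that record the order of operations, and none of it is rpm-ostree's actual code.

```python
from typing import List


def assemble(base_tree: str, layered: List[str]) -> List[str]:
    """Record, in order, the steps taken to rebuild a tree with layered packages."""
    steps = [f"checkout base tree {base_tree}"]
    # Each phase runs over all layered packages before the next phase starts.
    steps += [f"checkout {pkg}" for pkg in layered]  # done by libostree, not librpm
    steps += [f"run pre scripts for {pkg}" for pkg in layered]
    steps += [f"run post scripts for {pkg}" for pkg in layered]
    steps.append("copy up /usr/share/rpm")  # private copy of the base rpmdb
    # Only this last phase goes through librpm, and it only touches the database.
    steps += [f"write rpmdb entry for {pkg}" for pkg in layered]
    return steps


steps = assemble("fad2798", ["sl"])
for step in steps:
    print(step)
```

The point of the model is that the package payload is consumed in the second phase, so by the time the rpmdb is written there is only header data left to work with.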

Comment 14 Colin Walters 2018-07-30 17:13:38 UTC
Should be fixed in rpm-ostree-2018.6-4.fc29

Comment 15 Panu Matilainen 2018-07-31 10:19:58 UTC
Hmm okay, with RPMTRANS_FLAG_JUSTDB it might well be reasonable to skip the payload entirely.

Comment 16 Panu Matilainen 2018-07-31 10:21:26 UTC
...meh, too many edits syndrome. That should've said something:

it might well be reasonable to have rpm automatically skip the payload entirely.

