Bug 1848201 - Setting netsharedpath in .rpmmacros causes lorax to fail when run inside of mock
Summary: Setting netsharedpath in .rpmmacros causes lorax to fail when run inside of mock
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: mock
Version: 32
Hardware: Unspecified
OS: Unspecified
unspecified
high
Target Milestone: ---
Assignee: Copr Team
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2020-06-17 23:09 UTC by Brian Lane
Modified: 2020-08-06 04:26 UTC (History)
11 users (show)

Fixed In Version: mock-2.4-1.fc32 mock-2.4-1.fc31 mock-2.4-1.el8 mock-2.4-1.el7
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-07-23 01:06:38 UTC
Type: Bug


Attachments (Terms of Use)

Description Brian Lane 2020-06-17 23:09:32 UTC
Setting the /builddir/.rpmmacros %netsharedpath macro to /proc:/sys is causing all rpm processes run inside the mock to skip creating /proc and /sys, even when running with a different root. This includes running `rpm --root`, `dnf --installroot`, and when using lorax to create a boot.iso

lorax expects the installroot it is using to include /proc (dracut needs /proc/modules to exist) so setting this macro causes the install of the filesystem package, in a separate path, to unexpectedly exclude /proc and /sys, which then causes lorax to fail to create the temporary /proc/modules file.

It looks like this macro is meant to be used for installing packages into the mock itself, so maybe it could be written only when needed?

Comment 1 Adam Williamson 2020-06-17 23:25:24 UTC
Note, we think https://github.com/rpm-software-management/mock/commit/85d544291da69dbc521369c8b9b87baa3f7ffc42 changed things here. It changed how this netsharedpath thing is done, we think probably in a way which made it apply to other processes that run inside the mock root whereas it did not before.

Comment 2 Pavel Raiskup 2020-06-18 12:06:11 UTC
Thank you for the report!  Can you please elaborate on how mock is used
for lorax.rpm build, or how mock is used by lorax?

Considering the myths about how bad is running /bin/rpm from rpmbuild
process, this problem sounds suspicious.  I thought that no-one will ever
try to install packages from within mock chroot, so the %_netsharedpath
shouldn't affect package builds...?

That said, I need to see the use-cases to decide what is the proper fix
(whether is it worth to split the ~/.rpmmacros generator again) or whether
we need to fix lorax (or lorax.rpm build).

Comment 3 Adam Williamson 2020-06-18 14:49:05 UTC
lorax is the tool that builds installer trees and images. As part of the compose process, we run it inside mock chroots on Koji builders, as the 'runroot' task. We were getting failures with that due to this bug, like:
https://koji.fedoraproject.org/koji/taskinfo?taskID=45690119

That was trying to build the Server DVD installer image for a Rawhide nightly compose. Part of what lorax actually does in this case is create a temporary directory and install a bunch of packages into it using libdnf (to create the installer environment). We think due to this issue, that temporary install root did not contain a /proc , even though it should have one as part of the 'filesystem' package; there was code in lorax that expects /proc to be present, and so it blew up.

We've worked around it for now by doing a lorax build with a patch I wrote to create /proc if it's not there - https://github.com/weldr/lorax/pull/1035 - but both Brian and I were curious as to what changed to make this suddenly a problem. We think it's this (note that this problem started happening after the builders were updated from mock 2.0 to mock 2.3 as part of the datacenter move).

So the process looks like this:

Builder
  ˅
Koji
  ˅
Mock
  ˅
Lorax
  ˅
Temporary 'root' inside mock chroot

and it's that temporary 'root' where we used to get a /proc but don't any more.

Comment 4 Pavel Raiskup 2020-06-18 19:56:18 UTC
Thank you for the info.  Ok, I suppose you run something like `mock
--shell` to execute the lorax, not --rebuild (rpmbuild).

Good catch mock's patch 85d544291da69dbc521 caused this.

So the question is what to do about this.  Isn't it actually good that
mock doesn't pre-populate unexpected content under /proc?  That lorax has
complete control of it?

Or should we stop defining that macro?  The thing is that I didn't realize
before this is an issue at all..., but from mock perspective - we
only need this for installing mock chroot from bootstrap chroot in fedora
toolbox.  So we could special-case this, somehow.  I suppose.  I suppose
it would break lorax in mock in toolbox very likely (but nobody does that
because no such report existed before).

Comment 5 Adam Williamson 2020-06-18 20:05:43 UTC
Yes, that is the question :) I would certainly be curious as to why we do this at all. It seems to be an abuse of what the property is meant for - per http://ftp.rpm.org/max-rpm/s1-rpm-query-parts.html , "The net shared state is used to support client systems that NFS mount portions of their filesystems from a server. Since the server most likely exports filesystems to more than one client, if a client erased a package that contained files on a shared filesystem, other client systems would have incompletely installed packages. The net shared state is used to alert RPM to the fact that a file is on a shared filesystem and should not be erased.", which does not seem to be the case here.

I can do a bit of git blame digging to try and figure out why we started using it in the first place, I guess.

Comment 6 Adam Williamson 2020-06-19 00:12:32 UTC
Hum, it's not some ancient thing as I imagined, it's quite recent. Here it is:

https://github.com/rpm-software-management/mock/commit/d8d224ff6f5be4f8ed7f73a056d63e104df3d48c

the commit message notes it was a "Suggestion from Panu Matilainen", so I guess I can't argue too hard that it's an abuse of RPM :P CC'ing Panu in case he has any thoughts here.

So, we probably can't just get rid of it - the reason why it was introduced is almost certainly still valid. I guess the choices I can think of are 1) try to go back to having it apply only to package installs *to the mock chroot itself* somehow or other, or 2) just note it as a wrinkle and accept the patch I wrote for mock?

Comment 7 Pavel Raiskup 2020-06-19 06:18:22 UTC
> 2) just note it as a wrinkle and accept the patch I wrote for mock?

Here you meant "patch for lorax" fixing use-cases in mock.

I'd honestly prefer this variant.  As long as we can agree that
a) the use-case lorax has is a very rare thing for mock (unlikely that anyone
   else does something similar), and
b) it's reasonably safe to build packages with pre-set _netsharedpath

Practically, how much are we in hurry now?  Can we give this bug a few
weeks for a decision?

Comment 8 Adam Williamson 2020-06-19 06:25:35 UTC
yeah, I meant "patch for lorax".

we're not in any particular hurry since we've applied the patch downstream and it seems to be working.

Comment 9 Brian Lane 2020-06-22 16:41:53 UTC
The patch doesn't really belong in Lorax, it is reasonable to expect that the files and directories installed by the filesystem package actually exist. OTOH if we could make dracut work without /proc present that would also solve it.

Comment 10 Adam Williamson 2020-06-22 16:50:23 UTC
sure, I just meant that we're not in a roaring hurry to come up with a definitive fix right now because we do have a workaround and composes are running.

The dracut angle is interesting indeed; the comment in lorax suggests that we're touching /proc/modules at that point to make dracut happy somehow, but I don't recall exactly what happens if we *don't*. is it a straight up failure or just an ugly warning or something? Have you tried it lately?

Comment 11 Brian Lane 2020-06-22 18:25:26 UTC
(In reply to Adam Williamson from comment #10)
> sure, I just meant that we're not in a roaring hurry to come up with a
> definitive fix right now because we do have a workaround and composes are
> running.

I fine with keeping it until a cleaner solution is found :)

> 
> The dracut angle is interesting indeed; the comment in lorax suggests that
> we're touching /proc/modules at that point to make dracut happy somehow, but
> I don't recall exactly what happens if we *don't*. is it a straight up
> failure or just an ugly warning or something? Have you tried it lately?

IIRC it was a failure. I haven't tried without it lately, there are also some other complaints that show up but aren't fatal, so maybe things have changed... I'll take a closer look when I have a chance this week.

Comment 12 Brian Lane 2020-06-22 23:31:52 UTC
(In reply to Brian Lane from comment #11)

> IIRC it was a failure. I haven't tried without it lately, there are also
> some other complaints that show up but aren't fatal, so maybe things have
> changed... I'll take a closer look when I have a chance this week.

https://github.com/weldr/lorax/pull/1042

Looks like it never caused a failure, just a bit more warning output. The initrd ends up exactly the same with or without it so I propose dropping it.

Comment 13 Panu Matilainen 2020-06-23 06:20:30 UTC
> the commit message notes it was a "Suggestion from Panu Matilainen", so I guess I can't argue too hard that it's an abuse > of RPM :P CC'ing Panu in case he has any thoughts here.

No abuse there, netsharedpath is in no way NFS-specific although the name seems to imply something like that, it's merely a mechanism to tell rpm a path is managed by something else and is untouchable. Which seems pretty fitting for /sys and /proc bind-mounted into a user namespace.

But that of course assumes something has created those directories to begin with. I thought mock will have done this anyhow when setting up the mounts, which are necessary for normal operation. I don't know about lorax so dunno why that doesn't happen here.

Comment 14 Pavel Raiskup 2020-06-23 13:42:28 UTC
Ok, I'm tempted to revert the guilty mock commit [1] now.  That's because
it looks to me now a bit semantically cleaner (the fact that build-time
macros don't affect the mock install-time and vice versa).  The
filesystem.rpm is responsible for installing those directories for certain
reasons.

Btw., note that there's a long-time open question [2] how to handle the
/proc and /sys directories in filesystem.rpm WRT containers (opinions are
welcome).

[1] https://github.com/rpm-software-management/mock/pulls
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1548403

Comment 15 Fedora Update System 2020-07-21 18:44:41 UTC
FEDORA-2020-619a166380 has been submitted as an update to Fedora 31. https://bodhi.fedoraproject.org/updates/FEDORA-2020-619a166380

Comment 16 Fedora Update System 2020-07-21 18:44:41 UTC
FEDORA-EPEL-2020-7429d567a7 has been submitted as an update to Fedora EPEL 7. https://bodhi.fedoraproject.org/updates/FEDORA-EPEL-2020-7429d567a7

Comment 17 Fedora Update System 2020-07-21 18:44:42 UTC
FEDORA-2020-23409355f7 has been submitted as an update to Fedora 32. https://bodhi.fedoraproject.org/updates/FEDORA-2020-23409355f7

Comment 18 Fedora Update System 2020-07-21 18:44:42 UTC
FEDORA-EPEL-2020-08d93fc368 has been submitted as an update to Fedora EPEL 8. https://bodhi.fedoraproject.org/updates/FEDORA-EPEL-2020-08d93fc368

Comment 19 Fedora Update System 2020-07-22 01:16:37 UTC
FEDORA-2020-23409355f7 has been pushed to the Fedora 32 testing repository.
In short time you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --advisory=FEDORA-2020-23409355f7`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2020-23409355f7

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 20 Fedora Update System 2020-07-22 01:48:52 UTC
FEDORA-2020-619a166380 has been pushed to the Fedora 31 testing repository.
In short time you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --advisory=FEDORA-2020-619a166380`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2020-619a166380

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 21 Fedora Update System 2020-07-22 02:12:06 UTC
FEDORA-EPEL-2020-7429d567a7 has been pushed to the Fedora EPEL 7 testing repository.

You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-EPEL-2020-7429d567a7

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 22 Fedora Update System 2020-07-22 02:13:48 UTC
FEDORA-EPEL-2020-08d93fc368 has been pushed to the Fedora EPEL 8 testing repository.

You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-EPEL-2020-08d93fc368

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 23 Fedora Update System 2020-07-23 01:06:38 UTC
FEDORA-2020-23409355f7 has been pushed to the Fedora 32 stable repository.
If problem still persists, please make note of it in this bug report.

Comment 24 Fedora Update System 2020-07-30 19:08:22 UTC
FEDORA-2020-619a166380 has been pushed to the Fedora 31 stable repository.
If problem still persists, please make note of it in this bug report.

Comment 25 Fedora Update System 2020-08-06 04:15:01 UTC
FEDORA-EPEL-2020-08d93fc368 has been pushed to the Fedora EPEL 8 stable repository.
If problem still persists, please make note of it in this bug report.

Comment 26 Fedora Update System 2020-08-06 04:26:46 UTC
FEDORA-EPEL-2020-7429d567a7 has been pushed to the Fedora EPEL 7 stable repository.
If problem still persists, please make note of it in this bug report.


Note You need to log in before you can comment on or make changes to this bug.