Bug 2166556

Summary: supermin + pacman should use "pacman -Qii package_name" to get list of conf files
Product: [Community] Virtualization Tools Reporter: George Tsiamasiotis <gtsiam>
Component: superminAssignee: Richard W.M. Jones <rjones>
Status: NEW --- QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium Docs Contact:
Priority: unspecified    
Version: unspecifiedCC: lersek, ptoscano, rjones
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description George Tsiamasiotis 2023-02-02 07:07:21 UTC
Description of problem:

This started as an investigation of some junky resolv.conf handling for virt-builder (on archlinux). Turns out, the issue is in what supermin considers to be a config file.

I typically have /etc/resolv.conf symlinked to /run/systemd/resolve/stub-resolv.conf, but this [1] code doesn't consider symlinks to be configuration. And so it copies what it thinks is not a configuration file to the final filesystem.

At this point I would just submit a patch myself, but I know neither ocaml nor where to submit patches (mailing list maybe?). Admittedly the first reason is more important than the second.

Version-Release number of selected component (if applicable):

supermin 5.3.2

How reproducible:

very.

Steps to Reproduce:
1. Replace `/etc/resolv.conf` with a symlink
2. Prepare a supermin.d folder (I used the one distributed with libguestfs on arch)
3. $ supermin --build -f ext2 supermin.d -o appliance.d # This doesn't seem to affect -f chroot, not sure why.
4. $ mkdir root && sudo mount -o loop appliance.d/root root
5. $ ls -l root/etc/resolv.conf

Actual results:

root/etc/resolv.conf is the /etc/resolv.conf symlink, copied from the host.

Expected results:

root/etc/resolv.conf is a regular file, copied from the base in supermin.d.

Additional info:

[1]: https://github.com/libguestfs/supermin/blob/86fd6f3e86ab99d54a22b475aecccfc19bdff07e/src/ph_pacman.ml#L164 (Also on other package handlers)

Comment 1 Richard W.M. Jones 2023-02-06 13:26:26 UTC
You can just email a patch to libguestfs without needing
to subscribe.  If the patch is sufficiently understandable even if it
doesn't work then I can fix it up before applying it.  Having said that
I'm not sure I understand the proposal - should we treat symlinks
as configuration files?

Comment 2 Richard W.M. Jones 2023-02-06 13:30:41 UTC
& should that be all symlinks or just /etc/resolv.conf?

Also the RPM format annotates configuration files ("%config") which
is very useful for dealing with upgrades - it's how RPM knows not
to blindly overwrite your local configuration when a newer package
contains an updated %config file.

Does pacman have anything similar to that?

Comment 3 Laszlo Ersek 2023-02-07 11:04:03 UTC
(following reference [1] from comment#0):

      let config =
	try string_prefix "/etc/" path && (lstat path).st_kind = S_REG
	with Unix_error _ -> false in
      { ft_path = path; ft_source_path = path; ft_config = config }

So I guess the point is that we might not want to call "lstat" <https://v2.ocaml.org/api/Unix.html#VALlstat> there, but "stat" <https://v2.ocaml.org/api/Unix.html#VALstat>. IOW, if it is a symlink, dereference it, and check the referenced file's type.

Is that correct?

I don't think we can decide this ourselves generally. Many of the common utilities have specific flags for dereferencing symlinks or not:
- cp: --dereference, --no-dereference
- tar: --dereference
- find: -P, -L

This is not an easy question though, as supermin seems to use lstat in numerous places. Figuring out which ones should consider such a flag will likely not be easy.

Comment 4 George Tsiamasiotis 2023-02-10 06:55:27 UTC
Sorry for the late reply. Had other stuff going on.

Regarding the title change: I just took a look at the other package handlers, and this should also affect dpkg. Only rpm-based distros are unaffected.

> You can just email a patch

Except I don't know the first thing about ocaml. Plus, if I'm being honest, I moved away from virt-builder due to all the bugs. Trying to --update a debian image complains how /dev/vda does not exist. It's just more effort than it's worth.

> Does pacman have anything similar to that?

Yes, the backup property [1] in PKGBUILDs. Quick google search also reveals the Debian equivalent [2].

> should that be all symlinks or just /etc/resolv.conf?

All symlinks. While resolv.conf is the one that obviously breaks the system, /etc/systemd/system/multi-user.target.wants/ is a good example of a directory that contains symlinks as configuration. It just so happens that the files in multi-user.wants are not in the package file list, but that only suppresses the bug. /etc/mtab is another example, but that happens to not be a problem since on arch it symlinks to ../proc/self/mounts, which should exist on the final system. If someone mounted /proc to /myproc and changed their /etc/mtab on the local filesystem things would also break (I know it's an insane thing to do, but it demonstrates my point).

> So I guess the point is that we might not want to call "lstat"

No, lstat should be fine. I was talking about amending the config file check to accept both S_REG and S_LNK as configuration files to prevent configuration symlinks from the live system. However Richard makes a good point, sourcing the configuration file list from `backup` on arch, `conffiles` on debian and whatever equivalent mechanism there is on other distributions would likely be a better solution overall.

---

But I just noticed I didn't go into much detail about how the original bug in guestfs-tools happens, so:
- the preparation step, during build time for guestfs-tools, correctly detects /etc/resolv.conf as a configuration file and it correctly puts it in base.tar.gz
- but then the build step, (which for some reason happens in the live system) by guestfs-tools, does not consider /etc/resolv.conf to be a configuration file (since it is a symlink) and thus copies it to the new rootfs, overriding the /etc/resolv.conf from the base with a symlink that only makes sense in the host file system.

It'd be really nice if the supermin build step by guestfs-tools was moved to guestfs-tools build time, but that's another project. Though it would resolve the dreaded /etc/resolv.conf bug. And similar any live system mismatch issues. And remove a bunch of unnecessary dependencies.

---

[1]: https://wiki.archlinux.org/title/Pacman/Pacnew_and_Pacsave#Package_backup_files
[2]: https://www.debian.org/doc/debian-policy/ap-pkg-conffiles.html

Comment 5 Richard W.M. Jones 2023-02-10 08:45:40 UTC
> It'd be really nice if the supermin build step by guestfs-tools was moved to guestfs-tools build time, but that's another project. Though it would resolve the dreaded /etc/resolv.conf bug. And similar any live system mismatch issues. And remove a bunch of unnecessary dependencies.

You can build libguestfs like this if you want:

https://libguestfs.org/guestfs-faq.1.html#how-can-i-compile-and-install-libguestfs-without-supermin
https://libguestfs.org/guestfs-internals.1.html#fixed-appliance

However it has a twoa serious drawbacks which is why we don't do it:

 - It means you have to distribute the whole appliance, several hundreds MBs,
   when that same data is already on the disk.

 - You have to redistribute the appliance every time there's a security update.

> [1]: https://wiki.archlinux.org/title/Pacman/Pacnew_and_Pacsave#Package_backup_files
> [2]: https://www.debian.org/doc/debian-policy/ap-pkg-conffiles.html

I'll see if we can modify supermin to use these.