Bug 1663040

Summary:	selinux prevents any service with PrivateDevices=true from starting (systemd-hostnamed, dbus-broker, ...) with systemd-240
Product:	[Fedora] Fedora	Reporter:	Adam Williamson <awilliam>
Component:	lorax	Assignee:	Brian Lane <bcl>
Status:	CLOSED RAWHIDE	QA Contact:	Fedora Extras Quality Assurance <extras-qa>
Severity:	urgent	Docs Contact:
Priority:	high
Version:	rawhide	CC:	anaconda-maint-list, awilliam, bcl, bobgus, dac.override, devin, dick, dmach, dshea, dwalsh, hdegoede, io, jonathan, kellin, kevin.paetzold, lam, lnykryn, lvrabec, mgrepl, mikhail.v.gavrilov, msekleta, nembard, ol+redhat, ovasik, pkotvan, plautrba, rebus, redhat-bugzilla, robatino, rvykydal, satellitgo, ssahani, s, systemd-maint, vanmeeuwen+fedora, vponcova, wwoods, zbyszek, zpytela
Target Milestone:	---	Keywords:	CommonBugs, Reopened
Target Release:	---
Hardware:	x86_64
OS:	Linux
Whiteboard:	openqa https://fedoraproject.org/wiki/Common_F30_bugs#selinux-private-devices-unlabelled
Fixed In Version:		Doc Type:	If docs needed, set a value
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2019-01-17 20:33:49 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks:	1574713, 1574714

Description Adam Williamson 2019-01-02 19:57:13 UTC

Since Fedora-Rawhide-20181222.n.1 (previous compose, where this worked, was 20181219.n.1), the installer does not run in live images. No error is shown - just the UI never appears after you try to launch it, and if launched from a console it seems to be stuck. Looking /tmp/dbus.log shows:

dbus-daemon[2402]: Activating service name='org.fedoraproject.Anaconda.Modules.Network' requested by ':1.1' (uid=0 pid=2406 comm="python3 -m pyanaconda.modules.boss " label="unconfined_u:system_r:install_t:s0-s0:c0.c1023")
dbus-daemon[2402]: Failed to activate service 'org.fedoraproject.Anaconda.Modules.Network': timed out (service_start_timeout=120000ms)

and /tmp/anaconda.log shows:

05:34,965 ERR misc: Waiting for modules to be started timed out.
14:05:34,965 ERR stdout: Anaconda DBus modules failed to start on time.

I haven't yet found any logs that tell us much about *why* the Network module timed out, though.

anaconda and obvious network-related things did not change in the 20181222.n.1 compose. The most obvious suspect among packages that *did* change is probably systemd, which went from 239-10.git3bf819c to 240-2, so CCing zbyszek.

Comment 1 Adam Williamson 2019-01-02 20:00:51 UTC

sorry, correction: this happens on the *workstation* live. The *KDE* live does not seem to be affected. Not sure about other lives, yet.

Proposing as a Beta blocker as a violation of basic criterion "The installer must run when launched normally from the release-blocking images" - the Workstation live is a release-blocking image.

Comment 2 Adam Williamson 2019-01-02 20:13:07 UTC

Poking around a bit, if I run this manually:

python3 -m pyanaconda.modules.network

it seems to get stuck at:

INFO:anaconda.dbus.connection:Connecting to the system bus.

throwing some extra log lines in modules/network/network.py , it seems to get stuck right here:

        if conf.system.provides_system_bus:
            log.info("NetworkModule initializing #5.1")
            self._hostname_service_proxy = HOSTNAME.get_proxy()
            log.info("NetworkModule initializing #5.2")

I get the 5.1 log line, but not the 5.2 log line, so it's the HOSTNAME.get_proxy() call that's getting stuck...

Comment 3 Radek Vykydal 2019-01-03 09:44:29 UTC

systemd-hostnamed service fails to start. If I try to start it in the live environment manually I get:

● systemd-hostnamed.service - Hostname Service
   Loaded: loaded (/usr/lib/systemd/system/systemd-hostnamed.service; static; vendor preset: disabled)
   Active: failed (Result: exit-code) since Thu 2019-01-03 04:37:53 EST; 51s ago
     Docs: man:systemd-hostnamed.service(8)
           man:hostname(5)
           man:machine-info(5)
           https://www.freedesktop.org/wiki/Software/systemd/hostnamed
  Process: 2626 ExecStart=/usr/lib/systemd/systemd-hostnamed (code=exited, status=226/NAMESPACE)
 Main PID: 2626 (code=exited, status=226/NAMESPACE)

Jan 03 04:37:53 localhost systemd[1]: Starting Hostname Service...
Jan 03 04:37:53 localhost systemd[2626]: systemd-hostnamed.service: Failed to set up mount namespacing: Permission denied
Jan 03 04:37:53 localhost systemd[2626]: systemd-hostnamed.service: Failed at step NAMESPACE spawning /usr/lib/systemd/systemd-hostnamed: Permission denied
Jan 03 04:37:53 localhost systemd[1]: systemd-hostnamed.service: Main process exited, code=exited, status=226/NAMESPACE
Jan 03 04:37:53 localhost systemd[1]: systemd-hostnamed.service: Failed with result 'exit-code'.
Jan 03 04:37:53 localhost systemd[1]: Failed to start Hostname Service.

Comment 4 Adam Williamson 2019-01-03 16:44:47 UTC

ooh, well spotted.

Comment 5 Zbigniew Jędrzejewski-Szmek 2019-01-04 06:33:33 UTC

It's selinux-related. With setenforce 0, hostnamed starts without issue. Strangely, there's no AVC messages in either enforcing or permissive mode. So it seems we're hitting one of the noaudit rules. The two errors from PID 2626 are from systemd itself, pre-exec.

Comment 6 Zbigniew Jędrzejewski-Szmek 2019-01-04 08:41:20 UTC

mount("/tmp/namespace-dev-t6Rwrn/dev", "/run/systemd/unit-root/dev", NULL, MS_MOVE, NULL) = -1 EACCESS

Setting PrivateDevices=no also works around the issue. But systemd-hostnamed.service has had PrivateDevices=yes since commit v211-132-gd99a705296, 4 years ago. So it seems to be a regression in the selinux policy. Alternate explanation would be that some internal implementation detail in systemd changed. Either way, a selinux policy update seems necessary.

FTR, this is with selinux-policy-targeted-3.14.3-15.fc30.

Comment 7 Zbigniew Jędrzejewski-Szmek 2019-01-04 08:50:39 UTC

> some internal implementation detail in systemd changed
https://github.com/systemd/systemd/commit/0722b35934?

Comment 8 Adam Williamson 2019-01-04 20:57:05 UTC

It is most likely that the trigger was a change in systemd, yes, because selinux-policy-targeted did not change between 20181219.n.1 and 20181222.n.1, which is when the bug showed up.

Comment 9 Lukas Vrabec 2019-01-04 22:43:53 UTC

Hi All, 

Another nice case. :) 

Do we know if this bug is affecting only live images? 

Zbigniew, 
How did you look for denials?  Could it be dbus related? Could you put SELinux to permissive and when systemd-hostnamed will be started run:
# ausearch -m AVC -m USER_AVC -ts today

Thanks,
Lukas.

Comment 10 Zbigniew Jędrzejewski-Szmek 2019-01-05 11:13:44 UTC

Hi,

> Do we know if this bug is affecting only live images? 

I don't know.

> How did you look for denials?

Yes, there are no denials.

> Could it be dbus related?

Not really. The mount call is from systemd itself, in the child, after it has forked, and before it has exec'ed the new process. At that time it's busy arranging contents of the private /dev and other elements of the sandbox. There is no dbus communication at that time.

> Could you put SELinux to permissive

With "permissive" the issue is gone.

> and when systemd-hostnamed will be started run:
> # ausearch -m AVC -m USER_AVC -ts today

Just two "received setenforce notice" messages.
Like I said, I think some noaudit rule must be active.

Comment 11 Zbigniew Jędrzejewski-Szmek 2019-01-06 18:00:37 UTC

Pfff, something runs setenforce 1 in the live image. It happens near the end of installation to disk, so I suspect it's either anaconda or something that anaconda invokes.

> Do we know if this bug is affecting only live images? 

After installation to disk, hostnamed starts without issue. The issue appears to be limited to live images.

Comment 12 satellitgo 2019-01-07 16:46:35 UTC

rawhide soas live  liveinst times out and does not install 

(Fedora-SoaS-Live-x86_64-Rawhide-20181212.n.1.iso  worked)

Comment 13 satellitgo 2019-01-07 16:58:19 UTC

https://wiki.sugarlabs.org/go/File:90106-startup.png

Comment 14 Lukas Vrabec 2019-01-07 17:03:16 UTC

I tried it and it don't working even in permissive mode, only when SELinux is disabled. This could be some problem with SELinux object managers.

Comment 15 Lukas Vrabec 2019-01-07 17:05:53 UTC

Adam, 

Is it possible to create live image with older version of systemd (e.g from: 20181219.n.1), because for me it's not working also in permissive mode, which could be some problem in systemd or dbus.

Comment 16 Adam Williamson 2019-01-07 17:12:05 UTC

Just to be clear: you don't just do 'setenforce 0' then run anaconda. That won't help as hostnamed is still not running. You do 'setenforce 0', 'systemctl restart systemd-hostnamed', *then* start anaconda. That worked in openQA, for me (it's how I got your desktop tests to run).

Comment 17 Lukas Vrabec 2019-01-07 21:24:55 UTC

Hi All,

I think, we find the issue here. 

boot live image without dontaudit rules and find these unlabeled files:

# find /etc/selinux/ -context unlabeled_t                    0
/etc/selinux/targeted/contexts/files/file_contextsstem_u:object_r:unlabeled_t:s0 
/etc/selinux/targeted/contexts/files/file_contexts.bin
/etc/selinux/targeted/contexts/files/file_contexts.homedirs.bin
/etc/selinux/targeted/contexts/files/file_contexts.homedirs
/etc/selinux/targeted/seusers
/etc/selinux/targeted/policy/policy.31


This is one of the problematic AVCs: 
 type=AVC msg=audit(1546912562.945:143): avc:  denied  { open } for  pid=1255 comm="systemd-hostnam" path="/etc/selinux/targeted/contexts/files/file_contexts.bin" dev="dm-0" ino=142247 scontext=system_u:system_r:systemd_hostnamed_t:s0 tcontext=system_u:object_r:unlabeled_t:s0 tclass=file permissive=1

After discussion with plautrba it looks this is the root of cause:
https://github.com/weldr/lorax/pull/523 

and this commit:
https://github.com/weldr/lorax/pull/523/commits/8b11705ea0697dc2ac59f482de3891f4ed03ef19

Comment 18 Lukas Vrabec 2019-01-07 21:28:18 UTC

Re-assigning to lorax component.

Comment 19 Brian Lane 2019-01-07 21:45:06 UTC

Why are these files created unlabeled? We can't just revert that patch, in my tests setfiles wouldn't run with SELinux in Enforcing mode, and really, the files should be created with the right labels from the start, shouldn't they?

Comment 20 Zbigniew Jędrzejewski-Szmek 2019-01-08 08:18:48 UTC

I think a different AVC is important:

Jan 08 02:53:14 localhost systemd[1]: Starting Hostname Service...
Jan 08 02:53:14 localhost audit[1425]: AVC avc:  denied  { mounton } for  pid=1425 comm="(ostnamed)" path="/run/systemd/unit-root/dev" dev="dm-0" ino=131329 scontext=system_u:system_r:init_t:s0 tcontext=system_u:object_r:unlabeled_t:s0 tclass=dir permissive=0
Jan 08 02:53:14 localhost systemd[1425]: systemd-hostnamed.service: Failed to set up mount namespacing: Permission denied
Jan 08 02:53:14 localhost systemd[1425]: systemd-hostnamed.service: Failed at step NAMESPACE spawning /usr/lib/systemd/systemd-hostnamed: Permission denied
Jan 08 02:53:14 localhost systemd[1]: systemd-hostnamed.service: Main process exited, code=exited, status=226/NAMESPACE
Jan 08 02:53:14 localhost systemd[1]: systemd-hostnamed.service: Failed with result 'exit-code'.
Jan 08 02:53:14 localhost systemd[1]: Failed to start Hostname Service.
Jan 08 02:53:14 localhost audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=systemd-hostnamed comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=failed'

Now we're getting somewhere. There's a number of units with PrivateDevices=yes:
dbus-org.freedesktop.hostname1.service
dbus-org.freedesktop.locale1.service
mlocate-updatedb.service
systemd-coredump@.service
systemd-hostnamed.service
systemd-journal-gatewayd.service
systemd-journal-remote.service
systemd-journal-upload.service
systemd-localed.service
systemd-resolved.service
systemd-timesyncd.service

They all fail to start in this way, even mlocate-updatedb.service. It's just that systemd-hostnamed.service is the only one needed during normal boot of the live image.

"/run/systemd/unit-root/dev" is unlabelled. The parent directory is labelled:
drwx------. 2 root root system_u:object_r:init_var_run_t:s0 40 Jan  8 02:24 /run/systemd/unit-root
This directory is created here:
https://github.com/systemd/systemd/blob/master/src/core/namespace.c#L727
so it should be labelled too.

So... what do we need to do to make this operation allowed?

Comment 21 Lukas Vrabec 2019-01-08 10:04:59 UTC

Taking back. 

Brian, you're right. It's not lorax related. Sorry. 

Zbigniew, 

Is there any change how "/run/systemd/unit-root/dev" is created? Or when it's created? This directory should inherit SELinux context from parent directory as you mentioned in your comment which is init_var_run_t. I tried explicitly label /run/systemd/unit-root/dev using semanage fcontext with no effect. 

Line #727 was added in commit from 2016 by Lennart, so this is not new in systemd, I guess. Maybe some change in mkdir_p_label function? Something must change in systemd because policy is still same between live composes.

Comment 22 Zbigniew Jędrzejewski-Szmek 2019-01-08 16:30:24 UTC

Bah. I decided to use brute force and simply bisected the failure to https://github.com/systemd/systemd/commit/1beab8b0d0.
I looked at it previously, but it looked harmless.

The commit changes the error checking that is performed when setting up the process sandbox.
Before, any EPERM/EACCES failure would be ignored and simply logged at debug level as "Failed to set up namespace, assuming containerized execution".
After the patch, this failure is ignored only if two conditions are met:
- there is no RootDir= or RootImage= or other filesystem relocations
- the failure is from `unshare(CLONE_NEWNS)` and is EACCES/EPERM/EOPNOTSUPP/ENOSYS.
The old code was certainly sloppy, but it was written when the namespacing operations were understood
to be only an optional sandbox. Nowadays, the unit will likely not work correctly if certain parts
of the setup fail.

In this particular case, we get EACCES from mount, and previously it was ignored but now
it is (correctly) treated as a failure.

One option would be to extend the logic in systemd which detects failures which only
affect sandboxing and not the filesystem seen by the service, and also ignore them.
But I don't find this appealing at all, because it makes use of the sandboxing very
fragile. We trade one sandbox (PrivateDevices=yes) for another (SELinux).

The other option of course is to figure out why the AVC occurs, just on the live image.

In case you want to test systemd, I use the following procedure:
1. wget https://kojipkgs.fedoraproject.org/compose/rawhide/Fedora-Rawhide-20190102.n.0/compose/Workstation/x86_64/iso/Fedora-Workstation-Live-x86_64-Rawhide-20190102.n.0.iso
2. qemu-system-x86_64 -machine accel=kvm -smp 2 -m 2048 -drive if=pflash,format=raw,readonly,file=/usr/share/edk2/ovmf/OVMF_CODE.fd -cdrom Fedora-Workstation-Live-x86_64-Rawhide-20190102.n.0.iso
3. add '3' at the grub prompt
4. remove /usr/share/doc and /usr/share/locale and some other stuff so there's more space
5. git clone https://github.com/systemd/systemd
6. sudo dnf build-dep systemd
7. meson build -Dman=false -Defi=false -Dtests=false && ninja -C build && sudo ninja -C build install && sudo systemctl daemon-reexec && sudo systemctl restart systemd-hostnamed

Comment 23 Lukas Vrabec 2019-01-10 17:41:01 UTC

Hi Zbigniew, 

Okay, this can be older problem but it was mask, as you mentioned EACCES was ignored. Could you setup right SELinux context to the directory "dev" when you creating this file in /run/systemd/unit-root/dev ? 

Thanks,
Lukas.

Comment 24 Zbigniew Jędrzejewski-Szmek 2019-01-10 22:34:12 UTC

Actually, in this case, the target directory (/run/systemd/unit-root/dev, I'm assuming that's the one you mean), is just a plain directory on the root filesystem. (Systemd bind-mounts / to /run/systemd/unit-root after unsharing the namespace. So the directories like /run/systemd/unit-root/dev are the same as /dev in the root filesystem). This context is not something that systemd has influence over.

On the live image, the directory (*) looks like:
drwxr-xr-x.   2 root root system_u:object_r:unlabeled_t:s0       4096 Jan 2 05:03 dev

On the installed system:
drwxr-xr-x.   2 root root unconfined_u:object_r:root_t:s0       4096 01-06 18:29 dev

Frankly, both contexts look suspicious.
So maybe the issue is not in systemd or selinux policy, but in the way that the filesystem is prepared.

(*) The procedure to look at the directory context:
$ mkdir /tmp/foo
$ sudo mount --bind / /tmp/foo
$ ls -lZd /tmp/foo/dev

Comment 25 Zbigniew Jędrzejewski-Szmek 2019-01-11 13:34:37 UTC

I added an untested work-around for this: https://src.fedoraproject.org/rpms/systemd/c/4557ee08723d29170153dfe9a115d4916dcb9ee6?branch=master. This should be enough to get the live image to run.

Comment 26 Adam Williamson 2019-01-15 02:10:40 UTC

In the 20190114.n.0 compose, the Workstation live does not boot at all. Adding 'enforcing=0' to the boot parameters makes it boot. So that sounds somewhat like the workaround made it worse, not better.

The failure mode is that dbus and many other services fail to start, and the boot just gets stuck, like this:

https://openqa.fedoraproject.org/tests/344644#step/_boot_to_anaconda/19

booting to runlevel 3 shows the same services failing to start; it reaches a login screen, but you cannot actually log in as root or liveuser, it fails with "Cannot make/remove an entry for the specified session".

Neither ausearch nor the journal show any denials.

Comment 27 Zbigniew Jędrzejewski-Szmek 2019-01-15 07:41:53 UTC

With systemd-240-4.gitf02b547.fc30.x86_64, systemd-hostnamed starts fine. (At least on an installed system. I didn't test the installer yet). So the new failure appears to be a different bug.

Comment 28 Zbigniew Jędrzejewski-Szmek 2019-01-15 09:08:49 UTC

It's the same bug in the sense that it has the same underlying cause. This time dbus-broker.service cannot start, and things go south from there.

I'll push a build of systemd that neuters PrivateDevices=, as a work-around, so we can boot until this gets a proper fix.

Comment 29 Zbigniew Jędrzejewski-Szmek 2019-01-15 10:39:42 UTC

systemd-240-6.gitf02b547.fc30 makes dbus-broker start again. Let's see how it goes.

Comment 30 Lukas Vrabec 2019-01-15 12:44:12 UTC

Hi Zbigniew, 

Problem is still the same. /var/run/systemd/unit-root/dev is labeled as unlabeled_t on live filesystem. I believe there is a way in systemd code how systemd can label it during creation of the directory, something like: 

* Check how directory should be labeled (I'll add label device_t for /var/run/systemd/unit-root/dev)) 
* Create directory and assign correct label. 

I tried several experiments and I think it's not possible to handle this issue in selinux-policy without systemd assistance. 

Thanks,
Lukas.

Comment 31 Zbigniew Jędrzejewski-Szmek 2019-01-15 14:25:58 UTC

It's just the /dev directory. Systemd does not create it, it's part of the root file system image. It is provided by filesystem.rpm. I guess it's created by dnf (?) when it is called by lorax (?) when the iso is created. @adamw?

Normally, this directory is mounted over very early in boot, so we see the context of the mounted filesystem. But when arranging a new namespace, the mount point itself is exposed, and its context matters then. That's the reason for the procedure in https://bugzilla.redhat.com/show_bug.cgi?id=1663040#c24.

So it's possible that this is not a bug in selinux-policy, but in lorax/livemedia-creator/something. How do we set the selinux context of the files in that image?

I have a vague memory of there being a bug about restorecon not setting the context underneath mountpoints, but I can't find it now.

Comment 32 Zbigniew Jędrzejewski-Szmek 2019-01-15 14:29:59 UTC

If I run dnf locally, the context seems to be correct:
$ sudo dnf install --installroot=/tmp/machine --releasever=rawhide --nogpgcheck filesystem glibc-minimal-langpack
$ ls -lZd /tmp/machine/dev
drwxr-xr-x. 2 root root system_u:object_r:device_t:s0 60 Jan 15 15:27 /tmp/machine/dev/

So it seems that it's something in the live-image pipeline that goes wrong.

Comment 33 Adam Williamson 2019-01-15 16:26:40 UTC

bcl would be the best person to ping about the fundamentals here, so CCing him. He will correct but, but my best memory of how the current live image creation process works is something like this. within a mock chroot, lorax creates a filesystem image and loopback mounts it. It then runs anaconda and tells it to run a kickstart installation *to that mount point*, using a special mode that exists in anaconda more or less for this purpose. Once the install is complete, lorax does a bit more fiddling around (HANDWAVE HANDWAVE isohybrid HANDWAVE HANDWAVE) to convert or embed the image file into an actual bootable ISO.

I *think* that means that ultimately the /dev within the filesystem image will be created when anaconda, using libdnf, installs the filesystem package, yes.

Looking at the filesystem package, it creates /dev during %install:

mkdir -p boot dev \
...


and does this in %posttrans:

restorecon /dev 2>/dev/null >/dev/null || :

I'm honestly not sure if anything would be mounted on this /dev at that point during a live media creation install.

Notably there is a %pretrans block which uses posix.mkdir to create several directories before the install transaction starts, but /dev is not one of these...

Comment 34 Adam Williamson 2019-01-15 16:31:20 UTC

There is a rather interesting block in lorax's pylorax/installer.py `novirt_install`, which is definitely on the path we're using here (installing to a disk image in mock, as opposed to lmc's other mode, which installs to a virtual machine):

https://github.com/weldr/lorax/blob/f9ade497d32590d89bcbe671da9cd4a62174ed75/src/pylorax/installer.py#L391-L400

which rather suggests that, in this mode, lmc/lorax pre-creates /dev in the disk image, and attempts to label it correctly. That could conceivably be going wrong.

Comment 35 Adam Williamson 2019-01-15 16:38:09 UTC

Ah, I misread that a bit: the setfiles is run *after* the install completes, and -e means to exclude /dev. FWIW, the logs from the most recent compose indicate that 'setfiles' was run successfully:

https://koji.fedoraproject.org/koji/taskinfo?taskID=32041061
https://kojipkgs.fedoraproject.org//work/tasks/1061/32041061/root.log
https://kojipkgs.fedoraproject.org//work/tasks/1061/32041061/program.log

root.log does not have the "Running setfiles on install tree failed" error, and program.log shows the actual execution:

2019-01-15 10:31:48,540 INFO: Running... setfiles -e /proc -e /sys -e /dev /etc/selinux/targeted/contexts/files/file_contexts /
2019-01-15 10:32:41,178 INFO: setfiles: conflicting specifications for /usr/sbin/e2mmpstatus and /usr/sbin/dumpe2fs, using system_u:object_r:fsadm_exec_t:s0.
2019-01-15 10:32:41,179 DEBUG: Return code: 0

(it's not indicated in the log, but that should have run chrooted to the mounted filesystem image directory).

Comment 36 Adam Williamson 2019-01-15 16:41:46 UTC

This 'setfiles' invocation was added to livemedia-creator back in 2014, with an interesting commit message:

https://github.com/weldr/lorax/commit/ffa40722dd0c4c2147272c1048fb237a96a98f90

"Depending on the environment that --no-virt is run inside the resulting
filesystem may need to have the SELinux labels updates. Run setfiles on
the new filesystem after the anaconda run has finished.

eg. when run from inside mock the labels will be incorrect."

That's exactly the scenario we're dealing with here, so it seems interesting. The exclusion of /dev from the `setfiles` command has been there since the start, but is not explained in the commit message or the comments, so I'm not clear on why it's excluded. It may be worth testing a live image build done with the `-e /dev` dropped from that setfiles command...

Comment 37 Lukas Vrabec 2019-01-15 17:03:01 UTC

Thanks a lot for help Adam. 

There was some changes in lorax in recent past touching also SELinux bits in lorax. 

Adam, 

Who should we contact to test live image with dropped '-e /dev' ? 

Thanks,
Lukas.

Comment 38 Adam Williamson 2019-01-15 17:11:12 UTC

I'm doing it right now.

Comment 39 Adam Williamson 2019-01-15 18:33:55 UTC

OK, so that seems to work: I set up a mock chroot as Koji does, shelled into it, edited installer.py to drop the `-e /dev` from the setfiles invocation, then ran the live image build the same way Koji does, and the built live image boots fine.

I'm now doing a build following the same method but *without* editing installer.py - just to make sure that image fails to boot, i.e. the fix was really the edit to installer.py and not just some inadvertent difference in my environment vs. the Koji environment.

Comment 40 Adam Williamson 2019-01-15 18:42:22 UTC

The sanity check is still running, but I have to go out soon, so I sent the pull request:

https://github.com/weldr/lorax/pull/568

will report back on the result of the sanity check in ~7-8 hours.

Comment 41 Adam Williamson 2019-01-16 03:02:03 UTC

*** Bug 1666497 has been marked as a duplicate of this bug. ***

Comment 42 Adam Williamson 2019-01-16 03:24:17 UTC

OK, the sanity check image did *not* boot, as expected - so the setfiles tweak really does seem to be the fix here. zbyszek thinks we should also drop the exceptions of /proc and /sys, so I'm going to run more test build with that further change to make sure there are no problems, then backport the change to lorax for tomorrow's Rawhide compose and see how that goes.

Comment 43 Adam Williamson 2019-01-16 06:46:59 UTC

The image I built with the /proc and /sys exceptions dropped did not boot correctly, so I went back to only dropping the /dev exception for now, and backported that patch to the lorax package in Rawhide; we'll see how the next compose goes...

Comment 44 Adam Williamson 2019-01-16 06:49:10 UTC

If I boot the image I built with the /proc and /sys exceptions dropped using 'enforcing=0', it boots correctly and I see these AVCs:

----
time->Wed Jan 16 01:47:54 2019
type=AVC msg=audit(1547621274.612:208): avc:  denied  { map_create } for  pid=1 comm="systemd" scontext=system_u:system_r:kernel_t:s0 tcontext=system_u:system_r:kernel_t:s0 tclass=bpf permissive=1
----
time->Wed Jan 16 01:47:54 2019
type=AVC msg=audit(1547621274.612:209): avc:  denied  { map_read map_write } for  pid=1 comm="systemd" scontext=system_u:system_r:kernel_t:s0 tcontext=system_u:system_r:kernel_t:s0 tclass=bpf permissive=1
----
time->Wed Jan 16 01:47:54 2019
type=AVC msg=audit(1547621274.612:210): avc:  denied  { prog_load } for  pid=1 comm="systemd" scontext=system_u:system_r:kernel_t:s0 tcontext=system_u:system_r:kernel_t:s0 tclass=bpf permissive=1
----
time->Wed Jan 16 01:47:54 2019
type=AVC msg=audit(1547621274.612:211): avc:  denied  { prog_run } for  pid=1 comm="systemd" scontext=system_u:system_r:kernel_t:s0 tcontext=system_u:system_r:kernel_t:s0 tclass=bpf permissive=1
----
time->Wed Jan 16 06:47:46 2019
type=AVC msg=audit(1547639266.251:130): avc:  denied  { transition } for  pid=1432 comm="(systemd)" path="/usr/lib/systemd/systemd" dev="dm-0" ino=153316 scontext=system_u:system_r:kernel_t:s0 tcontext=unconfined_u:unconfined_r:unconfined_t:s0 tclass=process permissive=1
----
time->Wed Jan 16 06:47:46 2019
type=AVC msg=audit(1547639266.251:131): avc:  denied  { entrypoint } for  pid=1432 comm="(systemd)" path="/usr/lib/systemd/systemd" dev="dm-0" ino=153316 scontext=unconfined_u:unconfined_r:unconfined_t:s0 tcontext=unconfined_u:object_r:unlabeled_t:s0 tclass=file permissive=1
----
time->Wed Jan 16 06:47:46 2019
type=AVC msg=audit(1547639266.439:133): avc:  denied  { transition } for  pid=1439 comm="gdm-session-wor" path="/usr/bin/gnome-keyring-daemon" dev="dm-0" ino=52824 scontext=system_u:system_r:kernel_t:s0 tcontext=unconfined_u:unconfined_r:unconfined_t:s0 tclass=process permissive=1

Comment 45 Adam Williamson 2019-01-16 07:08:28 UTC

Note, the 20190116.n.0 compose started before I got this build done, so we'll see the results in 20190117.n.0 I guess.

Comment 46 Zbigniew Jędrzejewski-Szmek 2019-01-16 09:01:55 UTC

Can you post the image from comment 44? (Or maybe instructions how to build it locally, if that's easier.)

Comment 47 Zbigniew Jędrzejewski-Szmek 2019-01-16 17:04:15 UTC

*** Bug 1663582 has been marked as a duplicate of this bug. ***

Comment 48 Mikhail 2019-01-17 06:03:03 UTC

I am confirm that the problem was fixed in Fedora-Workstation-Live-x86_64-Rawhide-20190116.n.0.iso

Comment 49 Lukas Vrabec 2019-01-17 09:47:33 UTC

Thank you all for help on this issue.

Comment 50 Adam Williamson 2019-01-17 16:20:00 UTC

Do you mean 20190117.n.0? Cos it certainly wasn't fixed in 20190116.n.0, that was the one that wouldn't boot.

Comment 51 Adam Williamson 2019-01-17 16:43:47 UTC

Confirmed fixed in 20190117.n.0, though.

Comment 52 Mikhail 2019-01-17 18:34:54 UTC

(In reply to Adam Williamson from comment #50)
> Do you mean 20190117.n.0? Cos it certainly wasn't fixed in 20190116.n.0,
> that was the one that wouldn't boot.

No I am sure that it was be 20190116.n.0.
And I able launch Fedora Live from this image and make fresh install on HDD.

I am also can provide sha1 of my workable iso:

$ sha1sum Downloads/Fedora-Workstation-Live-x86_64-Rawhide-20190116.n.0.iso 
f7d76f1b23103b01df36f81a18a8d416cf0b15a1  Downloads/Fedora-Workstation-Live-x86_64-Rawhide-20190116.n.0.iso

Comment 53 satellitgo 2019-01-17 19:10:32 UTC

Fedora-SoaS-Live-x86_64-Rawhide-20190117.n.0.iso

liveinst: "gtk_is_style_provider_private (provider) failed"

https://fedoraproject.org/wiki/Test_Results:Fedora_30_Rawhide_20190117.n.0_Installation#Default_boot_and_install_.28x86_64.29

Comment 54 Adam Williamson 2019-01-17 20:33:49 UTC

satellit: that is very likely not the same bug. Please open a new one.

Comment 55 satellitgo 2019-01-21 22:41:14 UTC

Fedora-SoaS-Live-x86_64-Rawhide-20190121.n.0.iso  seems similar

Comment 56 satellitgo 2019-01-21 22:41:56 UTC

https://bugzilla.redhat.com/show_bug.cgi?id=1668105

Comment 57 dac.override 2019-02-01 12:55:06 UTC

(In reply to Zbigniew Jędrzejewski-Szmek from comment #24)

<snip> 
> 
> Frankly, both contexts look suspicious.
> So maybe the issue is not in systemd or selinux policy, but in the way that
> the filesystem is prepared.
> 

I agree. The mountpoints (/home /sys /dev etc) need to be labeled properly by the installer before anything gets mounted on it. If you relabel / when theres all kinds of stuff mounted then the underlying mountpoints are not included in the relabel, and so they may end up with bad or no labels.

This will be exposed with mount --bind.

Comment 58 Hans de Goede 2019-04-10 13:12:42 UTC

So I hit an interesting variant of this with the Fedora install on my main workstation, when I moved from a hdd to a ssd and did a cp -pr of the non mount dirs and a straight forward mkdir to create the mount-points on the ssd. Leaving to the same unlabelled mount-points issue.

And on upgrading to F30 everything broke unless I switched selinux to permissive mode. I'm not sure if there is anything we can do about this, but it is a scenario to keep an eye out for. I've written a blog post about this, to help others avoid the long debug process I've gone through:
https://hansdegoede.livejournal.com/20910.html

Perhaps this is something for the F30 common bugs page? Although I guess it will not be that common, I do expect other peoples to have moved old installs around in a similar way.

Comment 59 Adam Williamson 2019-04-10 15:11:53 UTC

My gut feeling would be that it's probably not common enough for commonbugs :P but if it comes up more than once or twice on e.g. forums, discourse etc., that would be a signal to add it...

Comment 60 Oleg Girko 2019-04-30 22:24:48 UTC

I had the same problem as Hans described in comment 58 when upgrading my CubieTruck and Pandaboard to Fedora 30. I appeared that for some reason I had unlabeled "/boot", "/dev", "/proc" and "/sys", and setting proper labels for these directories as he describes in his blog helped to solve the problem.

I don't know how it happened that these directories were unlabeled. I've installed Fedora on these computers at around 2010 .. 2011 using Fedora ARM images, and I was gradually upgrading Fedora using dnf since then. Probably that was an artefact from the ancient times when I was first installing Fedora there, but it was causing no problems until now.

On the other hand, Raspberry PI 3 where I installed Fedora first in 2016 had no such problem. Neither had this problem my desktop computer that had Fedora installed when it was not Fedora yet.

Comment 61 Ralf Ertzinger 2019-05-08 08:36:03 UTC

I ran into this after upgrading to F30 from F29, and it was quite a journey to find this bug report.

Just to make it easier for people to find this I'm going to drop the error message from a failed unit start here, as I was unable to turn up anything useful with the strings in there:

● systemd-hostnamed.service - Hostname Service
   Loaded: loaded (/usr/lib/systemd/system/systemd-hostnamed.service; static; vendor preset: disabled)
  Drop-In: /usr/lib/systemd/system/systemd-hostnamed.service.d
           └─disable-privatedevices.conf
   Active: failed (Result: exit-code) since Wed 2019-05-08 10:13:43 CEST; 18min ago
     Docs: man:systemd-hostnamed.service(8)
           man:hostname(5)
           man:machine-info(5)
           https://www.freedesktop.org/wiki/Software/systemd/hostnamed
  Process: 1207 ExecStart=/usr/lib/systemd/systemd-hostnamed (code=exited, status=226/NAMESPACE)
 Main PID: 1207 (code=exited, status=226/NAMESPACE)

May 08 10:13:43 kendra systemd[1]: Starting Hostname Service...
May 08 10:13:43 kendra systemd[1207]: systemd-hostnamed.service: Failed to set up mount namespacing: Permission denied
May 08 10:13:43 kendra systemd[1207]: systemd-hostnamed.service: Failed at step NAMESPACE spawning /usr/lib/systemd/systemd-hostnamed: Permission denied
May 08 10:13:43 kendra systemd[1]: systemd-hostnamed.service: Main process exited, code=exited, status=226/NAMESPACE
May 08 10:13:43 kendra systemd[1]: systemd-hostnamed.service: Failed with result 'exit-code'.
May 08 10:13:43 kendra systemd[1]: Failed to start Hostname Service.


Following the instructions from Hans' blog post made the services work again.

Comment 62 Hans de Goede 2019-05-08 09:25:35 UTC

Adam, since this bug nog has gotten 2 me too comments and I received a thank you / another me too by email, I wonder if we should revisit putting this on the common bus page? 4 people (me included) is still not a lot, OTOH this one is nasty to debug, so having some extra documentation might be good...

Comment 63 Oleg Girko 2019-05-08 14:11:59 UTC

Alternatively, can restoring proper labels of top-level directories be somehow automated?

Comment 64 Hans de Goede 2019-05-09 07:47:16 UTC

(In reply to Oleg Girko from comment #63)
> Alternatively, can restoring proper labels of top-level directories be
> somehow automated?

This is something which I've been thinking about too. This would be possible if we do it in the initrd before the pivot-root (or what ever we do now a days).

Comment 65 Oleg Girko 2019-05-09 13:16:38 UTC

(In reply to Hans de Goede from comment #64)
> (In reply to Oleg Girko from comment #63)
> > Alternatively, can restoring proper labels of top-level directories be
> > somehow automated?
> 
> This is something which I've been thinking about too. This would be possible
> if we do it in the initrd before the pivot-root (or what ever we do now a
> days).

Also, fixing labels can be done in already booted system by exposing directories hidden under mounted filesystems using your bind mount trick.
This can be done by systemd oneshot service that performs bind mount of root directory in its own namespace and then fixes labels.

Comment 66 Ralf Ertzinger 2019-05-09 13:31:57 UTC

Going this in a general fashion might be tricky, as one would have to take into consideration not only / (even if this was the main issue here) but also mount points further down in the tree (a very common example being /, /boot and /boot/efi)

Comment 67 Leszek Matok 2019-05-11 14:30:32 UTC

Hans, you should really update your blog post to make it easier to Google! Include error messages I was searching for:
systemd[10118]: dbus-broker.service: Failed at step NAMESPACE spawning /usr/bin/dbus-broker-launch: Permission denied
systemd[1]: dbus-broker.service: Main process exited, code=exited, status=226/NAMESPACE

Also, actually, you still can login to the system remotely (or use systemd.debug-shell=1, or rd.break + another relabel - but anyone can SSH from their laptop or phone, no? :))

When you access the system, after confirming the problem with Zbyszek's quick bind mount 3-liner, simply continue. cd /tmp/foo and run the 4 chcons directly, then safely reboot - no need for a live CD, it just worked \o/

Thanks for taking the time to document this, I found this bug by googling but ignored it at first, esp. since setenforce 0 did NOT make dbus-broker start for me! Maybe disabling SELinux completely would do the trick, but switching to permissive mode was NOT enough.

So I struggled for some time until coming back to it and checking later comments where Oleg and Ralf praised your blog post from last month. Please SEO it so others in need can find it quicker :)

(I know for you guys I must seem slow to be upgrading to F30 today, aka so late, but I can assure you there are even more planning to do it even later ;) And it's guys like us, who migrate systems to new drives with cp -a + fixfiles onboot ;))

Many thanks again, life saver indeed \o/

PS. Actually, fixfiles onboot should be upgraded to detect and fix this... or SELinux shouldn't care about privileges of a mountpoint after it's mounted, as 1. this was never Linux behavior (in AIX you can get -EPERM if a mountpoint is root:root 750, in Linux that never happened before) and 2. it just doesn't seem a good practice to have the final privileges set even before mounting (in AIX you would make it 755 but some admins would also chown to the actual user who was supposed to write to the fs - then users can mess up).

Comment 68 Dick Marinus 2019-05-12 05:36:17 UTC

Had the same problem as Hans described, thanks for documentating this.

Comment 69 Mikel Pérez 2019-05-14 05:10:07 UTC

I ran into it too, on a machine that's been through many system upgrades, I don't know if it's related.
And Hans' blogpost workaround worked.

It would be nice if it was done somehow during upgrade.

Comment 70 Peter Kotvan 2019-05-20 10:45:36 UTC

I've experienced this on my Banana Pi machine that went through a lot of upgrades after the upgrade to F30 yesterday.

Comment 71 Michal Ambroz 2019-07-03 11:41:58 UTC

Just run into this issue on Fedora machine which went through some number of fedup/dnf upgrades already.

It seems that some mount points in the root filesystem were having wrong context for some reason,
but as those are already mounted during the pre-boot, context on thes is not fixed by conventional weapons like simple "fixfile onboot" or "restorecon /dev /proc /sys".

This helped:
mkdir -p /mnt/root
mount --bind / /mnt/root
chcon -t device_t /mnt/root/dev
chcon -t proc_t   /mnt/root/proc
chcon -t sysfs_t  /mnt/root/sys

I would recommend something like that finds its way to "dnf system-upgrade".

Comment 72 blah 2019-07-03 17:54:44 UTC

I would like to know why this bug was mysteriously closed (presumably by Adam Williamson, since the bug 'last closed' time exactly matches his comment at 2019-01-17 20:33:49 UTC ) without notice. No offense to Adam, he seems like a good guy, but why is there no notice within the comment thread that this BUG was closed, and WHY? Usually wouldn't there be a reason given for closing it?

This BUG is still affecting users in July 2019, 6 months later, presumably because it was CLOSED without reason.

What's worse, because it was closed without proper discussion, it is hard for affected users to even know the proper workaround.

6 months later, I've hit this issue on two servers.

The WORKAROUND or FIX described correctly by Hans de Goede above, but ignored by Adam when he (presumably) closed the bug,
is to mount the root partition and after that, set correct selinux contexts on various directories.
(see the blog post linked for Hans, or else the comment immediately above from Michal Ambroz).

A full selinux relabel doesn't help, as described above. This should be fixed / backported, rather than just leaving the
unfortunate users that run with selinux enabled to figure it out for themselves.

Similar to the blog post by Hans, and the insructions from Michal above, I ran this which helped me
(after checking for the 'unlabeled' selinux file context as described in Hans' blog)

mkdir /mnt/sys && mount --bind / /mnt/sys && cd /mnt/sys && chcon -t device_t dev && chcon -t root_t proc sys && chcon -t var_run_t run

As an aside, in this one upgrade from fc29 to fc30, I have hit 3 bugs affecting the upgrade, that have not been fixed six months after it was released.

Firstly, the upgrade does not continue if the filesystem used is in the fstab with 'nofail'. Secondly, the bootloader grub.cfg is empty after the upgrade (that was really not a nice feeling, seeing that) - no linux OS's are detected any more, except by os-prober (and with incorrect root filesystems, and no kernel parameters), and third, after the system does boot, any service with PrivateDevices=true fails (yay, bug, closed without any resolution or explanation 6 months ago). This bug could at the very least be fixed by one command (similar to that above) run in a postinstall script, but it wasn't because the bug was closed without explanation and subsequently ignored.

I understand this is free software, but some of you are actually employed/paid to work for RedHat, and closing issues like this with no explanation is really just not good enough.

If you don't take issues like this seriously, you should stop making representations that the dnf upgrade process actually works. You clearly don't test it properly and you then you close serious bugs without fixing them.

Comment 73 blah 2019-07-03 18:04:37 UTC

No way that I can see to re-open the bug.

This will continue to affect users until you fix it.

Perhaps an update of the code that does a reboot after the install, to relabel these important directories.

Or perhaps a postinstall script for systemd, backported, since that's what's actually failing / changed in its behaviour.

But you need to fix it and not just leave upgraders to find this thread and fix it themselves.

I might add, this thread did not come up easily from a google thread, at all. Hence, I'm adding some text from the
log messages which would have helped me to find it quicker:


    dbus-broker.service: About to execute: /usr/bin/dbus-broker-launch --scope system --audit

and:

    Main process exited, code=exited, status=226/NAMESPACE

and

    PrivateDevices=true

Comment 74 blah 2019-07-03 18:12:03 UTC

This bug was tagged Priority: high, Severity: urgent, it has been closed and ReOpened, and presumably closed again - closed without notice or proper reason both times.

It is currently CLOSED with no explanation.

None of the comments state that the bug is being closed, or why it is being closed.

This is a serious issue, and should be taken seriously, not flippantly closed.

Comment 75 blah 2019-07-03 18:24:50 UTC

"Confirmed fixed in 20190117.n.0, though."
Status: MODIFIED → CLOSED
Resolution: --- → RAWHIDE
Last Closed: 2019-01-17 16:43:47

I can confirm 6 months later, that this issue persists with upgrades from fc29 to fc30.  Presumably your testing of 20190117.n.0 did not include an upgrade of a host that did not have the unlabeled selinux context on these particular directories, and is thus useless for confirming whether this bug is fixed or not.  Or else your rawhide fix for this significant issue never made its way into fc30 not even in the updates repo.

Comment 76 Adam Williamson 2019-07-03 18:50:01 UTC

The bug was for this being broken on *clean install of a live image*, not on system upgrade. That is a different case. The fix for this bug was to stop images being generated with the incorrect mount point labels. That fix was done and tested and so the bug was closed.

You can file a separate bug to request some kind of handling for this on system upgrade, when the system being upgraded has the incorrectly labelled mount points. But it's possible that it could be difficult to actually fix this during the upgrade process.

Comment 77 Adam Williamson 2019-07-03 19:18:25 UTC

BTW, a key question for the upgrade scenario is "how did your system come to have the mount points wrongly labelled?" So far we know Hans hit it because of the way he transferred an install from one disk to another, but we don't know why they were mislabelled for anyone else who has run into this on upgrade. It would be useful to know if you did anything similar to what Hans did, or anything else which may have affected the mount point labelling, and how and when the system was initially installed.

Comment 78 blah 2019-07-03 23:55:42 UTC

Good point (live install / system upgrade), though in essence the bug behind it was the same, I do take your point about the fix for a live install.

In answer to your question, these systems were both initially installed years ago (it was Fedora 17 at the time) and have been upgraded over the years through each fedora release, so the cause of the unlabeled directories may be lost in the winds of time, but it affected both machines of that vintage. Perhaps the installer at the time (and until your fix) had the same issue, and it just hasn't been a problem before. Thanks for your work on finding and fixing that and my apologies for any misdirected attitude that I allowed to leak into my comments above, it was really a separate bug (the system upgrade rebooting after writing an empty grub.cfg file) that had disappointed me so much and caused my overreaction.

Comment 79 Leszek Matok 2019-07-08 15:07:35 UTC

I can confirm I've copied over everything to a new md raid some time ago (I assume it was Fedora 27 or so at that time, it did survive at least one upgrade afterwards).

Most likely I've just mounted new /dev/dm1 as /mnt, cp -a'd everything with files and created proc/dev/sys manually, then relabeled with autorelabel on boot. As we can see, autorelabel operates when /proc etc. are already mounted and doesn't fix the inodes in actual root fs. This was never a problem before, and only in F30 there's a check for labels of a directory that is already overlaid with another fs, which is super counter-intuitive.

This bug has been reported against live image, but from the looks of it, it can happen on any already installed system if you migrate a "live" system (even single mode aka rescue.target has those mounted, so you'll likely just mkdir and expect them to mount fine).

Personally I think a new bug is warranted - there's still a strong "disable selinux because it only causes issue" front and strange to explain effects like this only fuel their fire :)

Comment 80 Adam Williamson 2019-07-08 17:47:11 UTC

"This bug has been reported against live image, but from the looks of it, it can happen on any already installed system if you migrate a "live" system (even single mode aka rescue.target has those mounted, so you'll likely just mkdir and expect them to mount fine)."

Yes, it can, but the point is that this bug is not "SELinux has a policy that requires these labels to be correct and systemd bails if the check fails". This bug is "Those mount points are wrongly labelled in live images". We fixed that: the mount points are no longer wrongly labelled in live images. So this bug is fixed.

If you want to report *the SELinux policy denial itself* as a bug you certainly can, though it's possible/likely the SELinux maintainers would disagree. Note that the SELinux policy has existed for a long time and it has not changed; what changed that made this suddenly start happening is that systemd previously ignored the denial by SELinux, but it now respects it - see https://bugzilla.redhat.com/show_bug.cgi?id=1663040#c22 .

If we can identify any other cases in which these mount points wind up incorrectly labelled which are actually fixable, each of those cases can be reported as a separate bug.

The only options we plausibly have to "fix" the case where these labels are wrong due to a user transferring an install between storage devices in the way you and Hans did, AFAICS, is to either relax the SELinux policy or make systemd ignore it again. As I said, you can certainly request that, but it's quite likely the answer would be "no" in both cases. Well, I guess it might be possible to try and make autorelabel fix the mount point labels somehow too, that's another thing we could file separately.

Comment 81 Zbigniew Jędrzejewski-Szmek 2020-10-27 13:03:27 UTC

Is it OK to revert the workaround in
https://src.fedoraproject.org/rpms/systemd/c/4557ee08723d29170153dfe9a115d4916dcb9ee6?branch=master
now?

Comment 82 Zbigniew Jędrzejewski-Szmek 2022-03-23 12:21:47 UTC

Arrgh, I forgot to do this. The workaround will be dropped in the next build in rawhide.