Bug 1308771
| Field | Value |
|---|---|
| Summary | Current Rawhide Workstation live image does not reach GDM due to mislabelled /run/systemd/inhibit and /run/user/1000 |
| Product | [Fedora] Fedora |
| Component | systemd |
| Reporter | Adam Williamson <awilliam> |
| Assignee | systemd-maint |
| QA Contact | Fedora Extras Quality Assurance <extras-qa> |
| Status | CLOSED RAWHIDE |
| Severity | urgent |
| Priority | high |
| Version | 24 |
| CC | awilliam, cra, dominick.grift, dwalsh, jfrieben, johannbg, juliux.pigface, lnykryn, lvrabec, mgrepl, msekleta, muadda, petersen, plautrba, pschindl, robatino, satellitgo, s, systemd-maint, vlee, zbyszek |
| Target Milestone | --- |
| Target Release | --- |
| Hardware | x86_64 |
| OS | Linux |
| Whiteboard | AcceptedBlocker |
| Doc Type | Bug Fix |
| Story Points | --- |
|  | 1314372 |
| Last Closed | 2016-03-07 17:23:32 UTC |
| Type | Bug |
| Regression | --- |
| Mount Type | --- |
| Documentation | --- |
| Category | --- |
| oVirt Team | --- |
| Cloudforms Team | --- |
| Bug Blocks | 1230431, 1314372 |
Description
Adam Williamson
2016-02-16 01:03:36 UTC
Affected image: https://kojipkgs.fedoraproject.org/work/tasks/3353/12993353/Fedora-Live-Workstation-x86_64-rawhide-20160215.iso

Note, there was a new systemd in Rawhide in the relevant period: systemd-229-1 landed on 2016-02-11.

Aha. When I boot with systemd.log_level=debug, I can see that the problem seems to be that X crashes:

Received SIGCHLD from PID 1734 (Xorg).
Child 1734 (Xorg) died (code=exited, status=1/FAILURE)

Furthermore, I managed to get to a tty, and looking at the journal, I see a whole bunch of errors including a ton of SELinux denials. Booting with 'enforcing=0' reaches a GDM screen (which is wrong, it should auto-login, but it at least boots).

So, we seem to have an SELinux issue. Re-assigning, and attaching a log extract.

Created attachment 1127480 [details]
extract from journal on affected boot
as the attempt to start the session just keeps looping the log grows too big to get out easily, but here's an extract which I think covers a couple of iterations of the loop.
Created attachment 1127481 [details]
ausearch output after enforcing=0 boot
After booting with enforcing=0, here's what I get with 'ausearch -m avc -ts recent' - several dozen denials.
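For anyone wading through several dozen denials, a small script can group them by source type, target type and object class. The following is only a sketch: it assumes the usual `avc: denied { ... } ... scontext=... tcontext=... tclass=...` record layout, and the sample records and type names in it are made up for illustration, not copied from the attachment.

```python
import re
from collections import Counter

# Pulls the permission set, source context, target context and object class
# out of one AVC denial record.
AVC_FIELDS = re.compile(
    r"avc:\s+denied\s+\{(?P<perms>[^}]*)\}.*?"
    r"scontext=(?P<scontext>\S+)\s+tcontext=(?P<tcontext>\S+)\s+tclass=(?P<tclass>\S+)"
)


def selinux_type(context):
    """Return the type field (third component) of a user:role:type[:level] context."""
    return context.split(":")[2]


def summarize_denials(lines):
    """Count denials per (source type, target type, object class)."""
    counts = Counter()
    for line in lines:
        match = AVC_FIELDS.search(line)
        if match is None:
            continue  # not an AVC denial record
        key = (
            selinux_type(match["scontext"]),
            selinux_type(match["tcontext"]),
            match["tclass"],
        )
        counts[key] += 1
    return counts


# Hypothetical sample records, invented for this sketch.
SAMPLE = [
    'type=AVC msg=audit(1455584616.100:201): avc:  denied  { read } for  pid=1734 '
    'comm="Xorg" name="inhibit" dev="tmpfs" ino=100 '
    'scontext=system_u:system_r:xserver_t:s0 tcontext=system_u:object_r:default_t:s0 '
    'tclass=dir permissive=1',
    'type=AVC msg=audit(1455584616.200:202): avc:  denied  { search } for  pid=1734 '
    'comm="Xorg" name="inhibit" dev="tmpfs" ino=100 '
    'scontext=system_u:system_r:xserver_t:s0 tcontext=system_u:object_r:default_t:s0 '
    'tclass=dir permissive=1',
    'type=AVC msg=audit(1455584616.300:203): avc:  denied  { write } for  pid=1800 '
    'comm="gdm" name="bus" dev="tmpfs" ino=101 '
    'scontext=system_u:system_r:xdm_t:s0 tcontext=unconfined_u:object_r:default_t:s0 '
    'tclass=file permissive=1',
    'type=SYSCALL msg=audit(1455584616.300:203): arch=c000003e syscall=2 success=yes',
]

if __name__ == "__main__":
    for key, count in summarize_denials(SAMPLE).most_common():
        print(count, *key)
```

Denials that all share `default_t` as the target type would point at mislabelled files rather than at missing policy rules.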
Aha. It looks like /run/systemd/inhibit is mislabelled. All its contents seem to have label:
system_u:object_r:default_t
when they should have:
system_u:object_r:systemd_logind_inhibit_var_run_t:s0
/run/user/1000 also seems to have issues:
unconfined_u:object_r:default_t:s0->unconfined_u:object_r:config_home_t:s0
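The comparison above comes down to the type field of each context. As a minimal sketch (the helper names `parse_context`, `parse_transition` and `is_mislabelled` are hypothetical, not part of any SELinux library), the contexts and the `old->new` form quoted above can be split and compared like this:

```python
from typing import NamedTuple


class Context(NamedTuple):
    user: str
    role: str
    setype: str
    level: str  # empty string when the context carries no MLS/MCS level


def parse_context(text):
    """Split a user:role:type[:level] context; the level may itself contain colons."""
    parts = text.strip().split(":", 3)
    if len(parts) < 3:
        raise ValueError(f"not an SELinux context: {text!r}")
    level = parts[3] if len(parts) == 4 else ""
    return Context(parts[0], parts[1], parts[2], level)


def parse_transition(text):
    """Split an 'old->new' pair into two Context values."""
    old, sep, new = text.partition("->")
    if not sep:
        raise ValueError(f"no '->' in {text!r}")
    return parse_context(old), parse_context(new)


def is_mislabelled(actual, expected):
    """True when the type fields differ (user, role and level are ignored)."""
    return parse_context(actual).setype != parse_context(expected).setype


if __name__ == "__main__":
    print(is_mislabelled(
        "system_u:object_r:default_t",
        "system_u:object_r:systemd_logind_inhibit_var_run_t:s0",
    ))
    old, new = parse_transition(
        "unconfined_u:object_r:default_t:s0->unconfined_u:object_r:config_home_t:s0"
    )
    print(old.setype, "->", new.setype)
```

Comparing only the type is a simplification chosen for this sketch; a stricter check would compare the full context.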
Yup, I confirmed with the 2016-02-06 image - on that one, only /run/user/1000/keyring files seem to have labelling issues, nothing else in /run/user/1000 and nothing in /run/systemd is mislabelled. I'm not sure what's changed in terms of how those files are created.

Hi, I can also reproduce this issue. I would say the systemd folks need to look at this, because systemd runs restorecon to fix labels in "/run". Maybe the issue is somewhere there.

What does matchpathcon show you?

Can we confirm it is a systemd issue? Did you try to downgrade?

You can't really 'downgrade' systemd in a live image. The information we have is:

1) it broke between 2016-02-06 and 2016-02-14
2) selinux hasn't changed noticeably in that time
3) systemd is responsible for labelling the affected paths
4) systemd had a major change in the relevant timeframe

If you look in https://bugzilla.redhat.com/attachment.cgi?id=1127481 there's output from 'restorecon -nvr', which does more or less what matchpathcon does (the -n option to restorecon tells it not to actually make the changes, -v tells it to print out what it *would* change, so it's effectively a way to check the labels for a given path).

(In reply to awilliam from comment #9)
> You can't really 'downgrade' systemd in a live image. The information we
> have is:
>
> 1) it broke between 2016-02-06 and 2016-02-14

Yeap, I meant a live image with an older version of systemd.

> 2) selinux hasn't changed noticeably in that time
> 3) systemd is responsible for labelling the affected paths
> 4) systemd had a major change in the relevant timeframe

Ok, it looks like a systemd issue here. Thank you.

> If you look in https://bugzilla.redhat.com/attachment.cgi?id=1127481 there's
> output from 'restorecon -nvr', which does more or less what matchpathcon
> does (the -n option to restorecon tells it not to actually make the changes,
> -v tells it to print out what it *would* change, so it's effectively a way
> to check the labels for a given path).
*** Bug 1309975 has been marked as a duplicate of this bug. ***
*** Bug 1309896 has been marked as a duplicate of this bug. ***
*** Bug 1309897 has been marked as a duplicate of this bug. ***

Even using today's network install media, a freshly installed system does not boot into graphical mode. It turns out that the labels mentioned in comment 5 are wrong after the install but they are set correctly after forcing a full relabeling of the file system (touch /.autorelabel and reboot), see attachments https://bugzilla.redhat.com/attachment.cgi?id=1128860 https://bugzilla.redhat.com/attachment.cgi?id=1128861 to bug 1309903.

Description of problem:
Boot into Fedora-Live-Workstation-x86_64-rawhide-20160220.iso with enforcing=0.

Version-Release number of selected component:
selinux-policy-3.13.1-171.fc24.noarch

Additional info:
reporter: libreport-2.6.4
hashmarkername: setroubleshoot
kernel: 4.5.0-0.rc4.git3.1.fc24.x86_64
type: libreport

*** Bug 1310377 has been marked as a duplicate of this bug. ***
*** Bug 1310376 has been marked as a duplicate of this bug. ***
*** Bug 1310378 has been marked as a duplicate of this bug. ***
*** Bug 1310398 has been marked as a duplicate of this bug. ***

qemu/kvm - "enforcing=0" on boot gets to GUI on live; only "liveinst-T" works for installer. graphical boot fails on reboot with systemctl set-default graphical.target [1]

[1] https://fedoraproject.org/wiki/Test_Results:Fedora_24_Rawhide_20160220_Installation

So I'm planning to do some manual bisection of this (by building systemd packages at various git commits and building live images with them included). So far I've confirmed that a live image built from current Rawhide with systemd returned to the state of 228-8.gite35a787 (and epoch-bumped) reaches a desktop and does not have the mislabellings - /run/systemd/inhibit is correctly labelled, and in /run/user/1000 only the keyring tree is mislabelled (as it was before this bug appeared).
I'll try and pin things down to a git commit tomorrow.

(In reply to Adam Williamson from comment #21)
It might be more economical to start from a current Fedora rawhide system installed in a virtual machine and to downgrade the systemd-related packages and reboot the system successively until the labeling is done correctly. This issue, also reported in bug 1309903, is by no means restricted to the live image.

I was able to get Fedora-Live-Workstation-x86_64-rawhide-20160220.iso installed by booting with:

enforcing=0 systemd.unit=multi-user.target

Log in on the text console as root, set a root password and liveuser password, and edit /etc/gdm/custom.conf to turn off AutoLogin. Then:

systemctl isolate graphical.target

Log in graphically as root and run liveinst from there. Installation went fine. Boot the installed system using enforcing=0.

The labeling problem doesn't happen with systemd-228-10.gite35a787.fc24 but does happen with systemd-229-1.fc24.

Joachim: I can't actually reproduce that version. I run Rawhide on my desktop, and the labelling is correct for me.

Charles: the issues beyond labelling are I think to do with GNOME and Wayland in the live session and are not related to this bug.

Discussed at 2016-02-22 blocker review meeting: [1]. This bug was accepted as Alpha blocker: clear violation of "All release-blocking images must boot in their supported configurations."

[1] http://meetbot.fedoraproject.org/fedora-blocker-review/2016-02-22/f24-blocker-review.2016-02-22-17.00.html

(In reply to Adam Williamson from comment #26)
1. Current live media boot correctly into GNOME (on Wayland) on bare metal after adding kernel option "enforcing=0". The steps suggested in comment 23 are unnecessary.
2. Current live media boot correctly into GNOME (on Wayland) in a -virtual machine- with kernel option "enforcing=0" but heavy flickering related to a QXL DRM issue (qxl 0000:00:02.0: ... unpin not necessary) makes the session unusable.
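The manual bisection planned earlier in this bug (build systemd at a commit, build a live image with it, check the labels) is a binary search over an ordered commit list. A rough sketch, assuming the oldest commit is known good and the newest known bad; `first_bad` and the commit names `c0`..`c9` are hypothetical, and in practice the `is_bad` check is the slow manual build-and-boot step rather than a function call:

```python
def first_bad(commits, is_bad):
    """Return the first bad commit.

    commits is ordered oldest to newest, commits[0] is known good and
    commits[-1] is known bad. is_bad(commit) reports whether a build at
    that commit shows the mislabelling.
    """
    if len(commits) < 2:
        raise ValueError("need at least one known-good and one known-bad commit")
    good, bad = 0, len(commits) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(commits[mid]):
            bad = mid   # the breakage is at mid or earlier
        else:
            good = mid  # the breakage is after mid
    return commits[bad]


if __name__ == "__main__":
    # Hypothetical history: ten commits, with the regression introduced at c6.
    history = [f"c{i}" for i in range(10)]
    tested = []

    def check(commit):
        tested.append(commit)
        return int(commit[1:]) >= 6

    print(first_bad(history, check), "found after testing", tested)
```

Each test halves the remaining range, which matters when every test means building a package and a live image.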
I got sidetracked into other work but I'll try to get back to triaging it soon. So far I had found that 35ad41d361a2d9e766f2d7689b92cfbc4304ddbd - Jan 1st - is good, no mislabelling.

Bisect news: the bug appears to be somewhere between:

https://github.com/systemd/systemd/commit/795ab08f783e78e85f1493879f13ac44cb113b00 (Feb 1)

and:

https://github.com/systemd/systemd/commit/ef9fde5378c0b2614991f9e3c4ac525cc07736a8 (Feb 7)

Now we've reduced the range, this commit rather catches the eye:

https://github.com/systemd/systemd/commit/4b51966cf6c06250036e428608da92f8640beb96

I'm gonna check that one.

Yep, that indeed turned out to be the culprit. A systemd built at git commit https://github.com/systemd/systemd/commit/d58669f08abefcc4300e1f476b6482e5f7e87098 - the one immediately before the selinux one - works OK. systemd built at https://github.com/systemd/systemd/commit/4b51966cf6c06250036e428608da92f8640beb96 fails.

I guess that change in how labelling gets done makes Arch work but breaks Fedora? See my comments here: https://github.com/systemd/systemd/pull/2508#issuecomment-188235477

This bug appears to have been reported against 'rawhide' during the Fedora 24 development cycle. Changing version to '24'. More information and reason for this action is here: https://fedoraproject.org/wiki/Fedora_Program_Management/HouseKeeping/Fedora24#Rawhide_Rebase

FYI https://github.com/keszybz/systemd/commit/c3dacc8bbf2dc2f5d498072418289c3ba79160ac should fix this problem.

OK, thanks for looking into this. Can you comment on https://github.com/keszybz/systemd/commit/5c5433ad32c3d911f0c66cc124d190d40a2b5f5b too?

Commented, the change is right and wanted.

I added fixes for this issue in the selinux-policy rpm package (version selinux-policy-3.13.1-176.fc24). So /etc/selinux/targeted/contexts/files/file_contexts.bin will be available in Fedora Live images.

@Adam: Could you try creating a new Live image with this new version of selinux-policy?
http://koji.fedoraproject.org/koji/buildinfo?buildID=741436

Thank you.

Will do - I'd usually have tested the patch right away, but I'm buried in getting QA stuff synced up with the new compose process ATM :/ but I'll get it checked one way or another (a new compose should be along soon enough anyhow).

Great! Thank you!

Fedora-Workstation-Live-x86_64-24-20160305.0.iso boots correctly into the GNOME (on Wayland) session in enforcing mode. Installed packages include selinux-policy-targeted-3.13.1-176.fc24.

I guess we can close this for now. There's still some stuff to figure out in the systemd/selinux interface, but /run/systemd/inhibit and /run/user/1000 are fine (apart from /run/user/1000/keyring, but that's a separate issue).