Bug 1265295 - f23 beta atomic - journal from first boot is lost [NEEDINFO]
f23 beta atomic - journal from first boot is lost
Status: CLOSED ERRATA
Product: Fedora
Classification: Fedora
Component: ostree (Show other bugs)
24
Unspecified Unspecified
high Severity high
: ---
: ---
Assigned To: Colin Walters
Fedora Extras Quality Assurance
: Reopened
: 1289902 1353688 (view as bug list)
Depends On:
Blocks:
  Show dependency treegraph
 
Reported: 2015-09-22 11:10 EDT by Dusty Mabe
Modified: 2016-10-24 12:21 EDT (History)
11 users (show)

See Also:
Fixed In Version: ostree-2016.10-4.fc24 ostree-2016.10-8.fc24
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-09-22 20:23:24 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
jlebon: needinfo? (lpoetter)


Attachments (Terms of Use)
Kickstart patch (588 bytes, patch)
2016-01-10 17:44 EST, Matthew Barnes
no flags Details | Diff

  None (edit)
Description Dusty Mabe 2015-09-22 11:10:19 EDT
Description of problem:

The journal on first boot of the Atomic cloud image is lost because it is logged to /run vs /var. Filing against rpm-ostree as I don't really know a better component to use. Please move to correct component when that is determined.

##### FIRST BOOT ######

-bash-4.3# ls -l /proc/$(pgrep journal)/fd/ | grep journal
lrwx------. 1 root root 64 Sep 22 13:21 13 -> /run/log/journal/bd336ade3c5a47d9b59d7cd2fe01fa0a/system.journal

###### 1st REBOOT #########

-bash-4.3# journalctl --list-boots
 0 e5976df3fb6645edb10b9ca5a5269cb0 Tue 2015-09-22 13:57:31 UTC—Tue 2015-09-22 13:58:20 UTC
-bash-4.3# 
-bash-4.3# ls -l /proc/$(pgrep journal)/fd/ | grep journal
lrwx------. 1 root root 64 Sep 22 13:57 18 -> /var/log/journal/bd336ade3c5a47d9b59d7cd2fe01fa0a/system.journal
lrwx------. 1 root root 64 Sep 22 13:57 28 -> /var/log/journal/bd336ade3c5a47d9b59d7cd2fe01fa0a/user-1000.journal

###### 2nd REBOOT #########

-bash-4.3# journalctl --list-boots
-1 e5976df3fb6645edb10b9ca5a5269cb0 Tue 2015-09-22 13:57:31 UTC—Tue 2015-09-22 14:00:02 UTC
 0 ba533750e5534b9e860b457c8677dead Tue 2015-09-22 14:00:08 UTC—Tue 2015-09-22 14:03:13 UTC
-bash-4.3# 
-bash-4.3# ls -l /proc/$(pgrep journal)/fd/ | grep journal
lrwx------. 1 root root 64 Sep 22 14:00 14 -> /var/log/journal/bd336ade3c5a47d9b59d7cd2fe01fa0a/system.journal
lrwx------. 1 root root 64 Sep 22 14:03 26 -> /var/log/journal/bd336ade3c5a47d9b59d7cd2fe01fa0a/user-1000.journal

###########################

Version-Release number of selected component (if applicable):
-bash-4.3# rpm -q rpm-ostree ostree kernel systemd
rpm-ostree-2015.9-2.fc23.x86_64
ostree-2015.8-1.fc23.x86_64
kernel-4.2.0-300.fc23.x86_64
systemd-222-5.fc23.x86_64


How reproducible:
Always


Steps to Reproduce:
See description
Comment 1 Colin Walters 2015-09-22 15:03:44 EDT
Well, what's in /etc/systemd/journald.conf ?  storage=auto?

This might be due to a change with respect to journald in the initramfs.
Comment 2 Dusty Mabe 2015-09-22 15:10:44 EDT
Everything in journald.conf is commented out which is the same as it is in the cloud base image. The cloud base image does not have this problem.
Comment 3 Dusty Mabe 2015-10-13 09:44:11 EDT
This is still a problem as of 'Fedora-Cloud-Atomic-23-20151005.x86_64.qcow2'. I haven't tried a newer image but I assume it hasn't been fixed.
Comment 4 Stef Walter 2015-12-10 08:11:25 EST
*** Bug 1289902 has been marked as a duplicate of this bug. ***
Comment 5 Colin Walters 2015-12-16 09:34:17 EST
I'm thinking this is likely because we're pruning /var in rpm-ostree composes, likely need to do what we do in RHELAH to set storage=persistent.
Comment 6 Colin Walters 2016-01-10 17:06:48 EST
Another alternative is to patch systemd to default to persistent.  I lean a bit towards this - the "auto" value is a bit weird - configuration-by-directory-presence.
Comment 7 Matthew Barnes 2016-01-10 17:44 EST
Created attachment 1113405 [details]
Kickstart patch

I think this should do for now.  Based on RHELAH's treecompose-post.sh.

Still trying to (learn how to) get an image built so I can test this.
Comment 8 Matthew Barnes 2016-01-10 18:28:22 EST
FTR, I'm doing:

# imagefactory --debug base_image --file-parameter install_script fedora-cloud-atomic.ks ../f23/fedora-atomic-f23.tdl

But that eventually aborts with:

Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/imgfac/Builder.py", line 132, in _build_image_from_template
    self.os_plugin.create_base_image(self, template, parameters)
  File "/usr/lib/python2.7/site-packages/imagefactory_plugins/TinMan/TinMan.py", line 365, in create_base_image
    builder.base_image.icicle = self.guest.customize_and_generate_icicle(libvirt_xml)
  File "/usr/lib/python2.7/site-packages/oz/Linux.py", line 353, in customize_and_generate_icicle
    return self._internal_customize(libvirt_xml, "gen_and_mod")
  File "/usr/lib/python2.7/site-packages/oz/Linux.py", line 303, in _internal_customize
    self._collect_setup(modified_xml)
  File "/usr/lib/python2.7/site-packages/oz/RedHat.py", line 421, in _collect_setup
    g_handle = self._guestfs_handle_setup(libvirt_xml)
  File "/usr/lib/python2.7/site-packages/oz/Guest.py", line 1061, in _guestfs_handle_setup
    raise oz.OzException.OzException("No operating systems found on the disk")
OzException: No operating systems found on the disk

Image build FAILED with error: No operating systems found on the disk


Not sure if the ksflatten step is necessary.  I get the same error either way.  Hints appreciated.
Comment 9 Colin Walters 2016-01-11 10:12:32 EST
The problem with doing this in that kickstart is it doesn't apply to bare metal.  There's a different kickstart for that that we *could* do things in, but it's probably simplest to do it in the treecompose post (where the locale bits are) so it applies everywhere.

I lean a bit towards at least pinging the systemd guys and flipping the default.
Comment 11 Colin Walters 2016-09-14 12:57:45 EDT
https://github.com/ostreedev/ostree/pull/511
Comment 12 Colin Walters 2016-09-14 13:06:07 EDT
*** Bug 1353688 has been marked as a duplicate of this bug. ***
Comment 13 Fedora Update System 2016-09-16 07:39:04 EDT
ostree-2016.10-4.fc24 has been submitted as an update to Fedora 24. https://bodhi.fedoraproject.org/updates/FEDORA-2016-702fb80598
Comment 14 Fedora Update System 2016-09-16 20:53:22 EDT
ostree-2016.10-4.fc24 has been pushed to the Fedora 24 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-702fb80598
Comment 15 Jason Brooks 2016-09-19 14:25:14 EDT
I left feedback in bodhi, but I'm getting the same behavior w/ ostree-2016.10-4.fc24.
Comment 16 Jonathan Lebon 2016-09-20 11:50:08 EDT
> I left feedback in bodhi, but I'm getting the same behavior w/ ostree-2016.10-4.fc24.

I can confirm this as well. This is due to an SELinux violation:

Sep 20 03:08:47 localhost.localdomain audit[625]: AVC avc:  denied  { write } for  pid=625 comm="systemd-journal" name="log" dev="dm-0" ino=4756389 scontext=system_u:system_r:syslogd_t:s0 tcontext=system_u:object_r:var_t:s0 tclass=dir permissive=0

What happens is that the journal tries to write to /var/log, but it's labeled as var_t instead of the expected var_log_t, so it gets blocked by SELinux. Now, /var/log *does* get relabeled later on by systemd-tmpfiles. And in fact, a following `systemctl restart systemd-journald.service` would do the trick.

The issue is that during installation, the dir $ANA_INSTALL_PATH/var is labeled as var_t by ostree when anaconda creates the deployment, but then anaconda copies log files into $ANA_INSTALL_PATH/var/log/anaconda, creating log in the process with an inherited var_t label (since they don't match the final paths SELinux expects). So a workaround here is to simply do:

https://github.com/ostreedev/ostree/pull/513

Or we can transfer this to anaconda so that they ensure that /var/log is properly labeled during installation.

Or we can pass inst.nosave=logs to anaconda so that it doesn't even create those log files.

I'm gonna let Colin chime in here.
Comment 17 Colin Walters 2016-09-20 14:02:45 EDT
Ah, so there were two bugs here.  I did test the ordering fix, but I tested it by nuking /var/log/journal, not /var/log.

Hmm.  I don't quite understand how your ostree PR helps - oh wait, I see, it's that ostree does a relabeling on the first deployment for an osname.

Eh.  Something is bugging me here - why can't we just have systemd-tmpfiles handle this, basically ensure that systemd-journal-flush is ordered after it?

A quick annotate leads to:

https://github.com/systemd/systemd/commit/74055aa76278232ff05574fc47c4e6b3560554a7

Which I've read three times and am still confused by.  Don't we want to order *before*?
Comment 18 Jonathan Lebon 2016-09-20 14:27:07 EDT
(In reply to Colin Walters from comment #17)
> Eh.  Something is bugging me here - why can't we just have systemd-tmpfiles
> handle this, basically ensure that systemd-journal-flush is ordered after it?
> 
> A quick annotate leads to:
> 
> https://github.com/systemd/systemd/commit/
> 74055aa76278232ff05574fc47c4e6b3560554a7
> 
> Which I've read three times and am still confused by.  Don't we want to
> order *before*?

Yeah, that's confusing indeed. I had tested this before by adding systemd-journal-flush.service as an After= in systemd-tmpfiles-setup.service and didn't investigate further when I saw the circular dependency error without realizing it came from right there in the systemd-journal-flush.service definition.

But I tried just now to simply move systemd-tmpfiles-setup.service from a Before= to an After= in systemd-journal-flush.service and everything went smooth, so I'm wondering if that isn't what was intended upstream.

Lennart, could you have a quick look? Is this ordered the other way around?
Comment 20 Jason Brooks 2016-09-20 21:44:56 EDT
Seems the same? https://paste.fedoraproject.org/431727/47442219/
Comment 21 Jonathan Lebon 2016-09-21 09:33:19 EDT
(In reply to Jason Brooks from comment #20)
> Seems the same? https://paste.fedoraproject.org/431727/47442219/

Works for me here. Are you using an installer that has this ostree build?
Comment 22 Fedora Update System 2016-09-21 20:22:17 EDT
ostree-2016.10-4.fc24 has been pushed to the Fedora 24 stable repository. If problems still persist, please make note of it in this bug report.
Comment 23 Fedora Update System 2016-09-21 21:24:51 EDT
ostree-2016.10-8.fc24 has been pushed to the Fedora 24 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-eea2281eff
Comment 24 Fedora Update System 2016-09-22 20:23:24 EDT
ostree-2016.10-8.fc24 has been pushed to the Fedora 24 stable repository. If problems still persist, please make note of it in this bug report.
Comment 25 Jonathan Lebon 2016-09-23 09:55:50 EDT
We'll need the Koji createImage tasks to use newer boot media than what is currently in https://kojipkgs.fedoraproject.org/pub/alt/releases/24/Cloud/x86_64/os/ for this fix to take effect in the F24 cloud images.
Comment 26 Dusty Mabe 2016-09-27 15:01:29 EDT
Any update on this happening?
Comment 27 Colin Walters 2016-09-27 17:05:35 EDT
I moved that part to https://pagure.io/releng/issue/114
Comment 28 Dusty Mabe 2016-10-24 12:21:25 EDT
For the record it seems journal is getting written to /var/log/ just fine on F25 atomic host:

-bash-4.3# rpm-ostree status
State: idle
Deployments:
● fedora-atomic:fedora-atomic/25/x86_64/docker-host
       Version: 25.33 (2016-10-23 10:17:49)
        Commit: 049beb1175ad5763df5ba345c9f062fc56f5fd70c8f1d6a4f4c3cd2882192ae8
        OSName: fedora-atomic
-bash-4.3# ls -l /proc/659/fd | grep journal
lrwx------. 1 root root 64 Oct 24 16:15 23 -> /var/log/journal/9c797dc6c9a24adcbc1d88cf0cf932a9/system.journal
lrwx------. 1 root root 64 Oct 24 16:15 29 -> /var/log/journal/9c797dc6c9a24adcbc1d88cf0cf932a9/user-1000.journal

Note You need to log in before you can comment on or make changes to this bug.