Bug 1265295 - f23 beta atomic - journal from first boot is lost [NEEDINFO]
Summary: f23 beta atomic - journal from first boot is lost
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: ostree
Version: 24
Hardware: Unspecified
OS: Unspecified
high
high
Target Milestone: ---
Assignee: Colin Walters
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
: 1289902 1353688 (view as bug list)
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2015-09-22 15:10 UTC by Dusty Mabe
Modified: 2016-10-24 16:21 UTC (History)
11 users (show)

Fixed In Version: ostree-2016.10-4.fc24 ostree-2016.10-8.fc24
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-09-23 00:23:24 UTC
Type: Bug
jlebon: needinfo? (lpoetter)


Attachments (Terms of Use)
Kickstart patch (588 bytes, patch)
2016-01-10 22:44 UTC, Matthew Barnes
no flags Details | Diff

Description Dusty Mabe 2015-09-22 15:10:19 UTC
Description of problem:

The journal on first boot of the Atomic cloud image is lost because it is logged to /run vs /var. Filing against rpm-ostree as I don't really know a better component to use. Please move to correct component when that is determined.

##### FIRST BOOT ######

-bash-4.3# ls -l /proc/$(pgrep journal)/fd/ | grep journal
lrwx------. 1 root root 64 Sep 22 13:21 13 -> /run/log/journal/bd336ade3c5a47d9b59d7cd2fe01fa0a/system.journal

###### 1st REBOOT #########

-bash-4.3# journalctl --list-boots
 0 e5976df3fb6645edb10b9ca5a5269cb0 Tue 2015-09-22 13:57:31 UTC—Tue 2015-09-22 13:58:20 UTC
-bash-4.3# 
-bash-4.3# ls -l /proc/$(pgrep journal)/fd/ | grep journal
lrwx------. 1 root root 64 Sep 22 13:57 18 -> /var/log/journal/bd336ade3c5a47d9b59d7cd2fe01fa0a/system.journal
lrwx------. 1 root root 64 Sep 22 13:57 28 -> /var/log/journal/bd336ade3c5a47d9b59d7cd2fe01fa0a/user-1000.journal

###### 2nd REBOOT #########

-bash-4.3# journalctl --list-boots
-1 e5976df3fb6645edb10b9ca5a5269cb0 Tue 2015-09-22 13:57:31 UTC—Tue 2015-09-22 14:00:02 UTC
 0 ba533750e5534b9e860b457c8677dead Tue 2015-09-22 14:00:08 UTC—Tue 2015-09-22 14:03:13 UTC
-bash-4.3# 
-bash-4.3# ls -l /proc/$(pgrep journal)/fd/ | grep journal
lrwx------. 1 root root 64 Sep 22 14:00 14 -> /var/log/journal/bd336ade3c5a47d9b59d7cd2fe01fa0a/system.journal
lrwx------. 1 root root 64 Sep 22 14:03 26 -> /var/log/journal/bd336ade3c5a47d9b59d7cd2fe01fa0a/user-1000.journal

###########################

Version-Release number of selected component (if applicable):
-bash-4.3# rpm -q rpm-ostree ostree kernel systemd
rpm-ostree-2015.9-2.fc23.x86_64
ostree-2015.8-1.fc23.x86_64
kernel-4.2.0-300.fc23.x86_64
systemd-222-5.fc23.x86_64


How reproducible:
Always


Steps to Reproduce:
See description

Comment 1 Colin Walters 2015-09-22 19:03:44 UTC
Well, what's in /etc/systemd/journald.conf ?  storage=auto?

This might be due to a change with respect to journald in the initramfs.

Comment 2 Dusty Mabe 2015-09-22 19:10:44 UTC
Everything in journald.conf is commented out which is the same as it is in the cloud base image. The cloud base image does not have this problem.

Comment 3 Dusty Mabe 2015-10-13 13:44:11 UTC
This is still a problem as of 'Fedora-Cloud-Atomic-23-20151005.x86_64.qcow2'. I haven't tried a newer image but I assume it hasn't been fixed.

Comment 4 Stef Walter 2015-12-10 13:11:25 UTC
*** Bug 1289902 has been marked as a duplicate of this bug. ***

Comment 5 Colin Walters 2015-12-16 14:34:17 UTC
I'm thinking this is likely because we're pruning /var in rpm-ostree composes, likely need to do what we do in RHELAH to set storage=persistent.

Comment 6 Colin Walters 2016-01-10 22:06:48 UTC
Another alternative is to patch systemd to default to persistent.  I lean a bit towards this - the "auto" value is a bit weird - configuration-by-directory-presence.

Comment 7 Matthew Barnes 2016-01-10 22:44:06 UTC
Created attachment 1113405 [details]
Kickstart patch

I think this should do for now.  Based on RHELAH's treecompose-post.sh.

Still trying to (learn how to) get an image built so I can test this.

Comment 8 Matthew Barnes 2016-01-10 23:28:22 UTC
FTR, I'm doing:

# imagefactory --debug base_image --file-parameter install_script fedora-cloud-atomic.ks ../f23/fedora-atomic-f23.tdl

But that eventually aborts with:

Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/imgfac/Builder.py", line 132, in _build_image_from_template
    self.os_plugin.create_base_image(self, template, parameters)
  File "/usr/lib/python2.7/site-packages/imagefactory_plugins/TinMan/TinMan.py", line 365, in create_base_image
    builder.base_image.icicle = self.guest.customize_and_generate_icicle(libvirt_xml)
  File "/usr/lib/python2.7/site-packages/oz/Linux.py", line 353, in customize_and_generate_icicle
    return self._internal_customize(libvirt_xml, "gen_and_mod")
  File "/usr/lib/python2.7/site-packages/oz/Linux.py", line 303, in _internal_customize
    self._collect_setup(modified_xml)
  File "/usr/lib/python2.7/site-packages/oz/RedHat.py", line 421, in _collect_setup
    g_handle = self._guestfs_handle_setup(libvirt_xml)
  File "/usr/lib/python2.7/site-packages/oz/Guest.py", line 1061, in _guestfs_handle_setup
    raise oz.OzException.OzException("No operating systems found on the disk")
OzException: No operating systems found on the disk

Image build FAILED with error: No operating systems found on the disk


Not sure if the ksflatten step is necessary.  I get the same error either way.  Hints appreciated.

Comment 9 Colin Walters 2016-01-11 15:12:32 UTC
The problem with doing this in that kickstart is it doesn't apply to bare metal.  There's a different kickstart for that that we *could* do things in, but it's probably simplest to do it in the treecompose post (where the locale bits are) so it applies everywhere.

I lean a bit towards at least pinging the systemd guys and flipping the default.

Comment 11 Colin Walters 2016-09-14 16:57:45 UTC
https://github.com/ostreedev/ostree/pull/511

Comment 12 Colin Walters 2016-09-14 17:06:07 UTC
*** Bug 1353688 has been marked as a duplicate of this bug. ***

Comment 13 Fedora Update System 2016-09-16 11:39:04 UTC
ostree-2016.10-4.fc24 has been submitted as an update to Fedora 24. https://bodhi.fedoraproject.org/updates/FEDORA-2016-702fb80598

Comment 14 Fedora Update System 2016-09-17 00:53:22 UTC
ostree-2016.10-4.fc24 has been pushed to the Fedora 24 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-702fb80598

Comment 15 Jason Brooks 2016-09-19 18:25:14 UTC
I left feedback in bodhi, but I'm getting the same behavior w/ ostree-2016.10-4.fc24.

Comment 16 Jonathan Lebon 2016-09-20 15:50:08 UTC
> I left feedback in bodhi, but I'm getting the same behavior w/ ostree-2016.10-4.fc24.

I can confirm this as well. This is due to an SELinux violation:

Sep 20 03:08:47 localhost.localdomain audit[625]: AVC avc:  denied  { write } for  pid=625 comm="systemd-journal" name="log" dev="dm-0" ino=4756389 scontext=system_u:system_r:syslogd_t:s0 tcontext=system_u:object_r:var_t:s0 tclass=dir permissive=0

What happens is that the journal tries to write to /var/log, but it's labeled as var_t instead of the expected var_log_t, so it gets blocked by SELinux. Now, /var/log *does* get relabeled later on by systemd-tmpfiles. And in fact, a following `systemctl restart systemd-journald.service` would do the trick.

The issue is that during installation, the dir $ANA_INSTALL_PATH/var is labeled as var_t by ostree when anaconda creates the deployment, but then anaconda copies log files into $ANA_INSTALL_PATH/var/log/anaconda, creating log in the process with an inherited var_t label (since they don't match the final paths SELinux expects). So a workaround here is to simply do:

https://github.com/ostreedev/ostree/pull/513

Or we can transfer this to anaconda so that they ensure that /var/log is properly labeled during installation.

Or we can pass inst.nosave=logs to anaconda so that it doesn't even create those log files.

I'm gonna let Colin chime in here.

Comment 17 Colin Walters 2016-09-20 18:02:45 UTC
Ah, so there were two bugs here.  I did test the ordering fix, but I tested it by nuking /var/log/journal, not /var/log.

Hmm.  I don't quite understand how your ostree PR helps - oh wait, I see, it's that ostree does a relabeling on the first deployment for an osname.

Eh.  Something is bugging me here - why can't we just have systemd-tmpfiles handle this, basically ensure that systemd-journal-flush is ordered after it?

A quick annotate leads to:

https://github.com/systemd/systemd/commit/74055aa76278232ff05574fc47c4e6b3560554a7

Which I've read three times and am still confused by.  Don't we want to order *before*?

Comment 18 Jonathan Lebon 2016-09-20 18:27:07 UTC
(In reply to Colin Walters from comment #17)
> Eh.  Something is bugging me here - why can't we just have systemd-tmpfiles
> handle this, basically ensure that systemd-journal-flush is ordered after it?
> 
> A quick annotate leads to:
> 
> https://github.com/systemd/systemd/commit/
> 74055aa76278232ff05574fc47c4e6b3560554a7
> 
> Which I've read three times and am still confused by.  Don't we want to
> order *before*?

Yeah, that's confusing indeed. I had tested this before by adding systemd-journal-flush.service as an After= in systemd-tmpfiles-setup.service and didn't investigate further when I saw the circular dependency error without realizing it came from right there in the systemd-journal-flush.service definition.

But I tried just now to simply move systemd-tmpfiles-setup.service from a Before= to an After= in systemd-journal-flush.service and everything went smooth, so I'm wondering if that isn't what was intended upstream.

Lennart, could you have a quick look? Is this ordered the other way around?

Comment 20 Jason Brooks 2016-09-21 01:44:56 UTC
Seems the same? https://paste.fedoraproject.org/431727/47442219/

Comment 21 Jonathan Lebon 2016-09-21 13:33:19 UTC
(In reply to Jason Brooks from comment #20)
> Seems the same? https://paste.fedoraproject.org/431727/47442219/

Works for me here. Are you using an installer that has this ostree build?

Comment 22 Fedora Update System 2016-09-22 00:22:17 UTC
ostree-2016.10-4.fc24 has been pushed to the Fedora 24 stable repository. If problems still persist, please make note of it in this bug report.

Comment 23 Fedora Update System 2016-09-22 01:24:51 UTC
ostree-2016.10-8.fc24 has been pushed to the Fedora 24 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-eea2281eff

Comment 24 Fedora Update System 2016-09-23 00:23:24 UTC
ostree-2016.10-8.fc24 has been pushed to the Fedora 24 stable repository. If problems still persist, please make note of it in this bug report.

Comment 25 Jonathan Lebon 2016-09-23 13:55:50 UTC
We'll need the Koji createImage tasks to use newer boot media than what is currently in https://kojipkgs.fedoraproject.org/pub/alt/releases/24/Cloud/x86_64/os/ for this fix to take effect in the F24 cloud images.

Comment 26 Dusty Mabe 2016-09-27 19:01:29 UTC
Any update on this happening?

Comment 27 Colin Walters 2016-09-27 21:05:35 UTC
I moved that part to https://pagure.io/releng/issue/114

Comment 28 Dusty Mabe 2016-10-24 16:21:25 UTC
For the record it seems journal is getting written to /var/log/ just fine on F25 atomic host:

-bash-4.3# rpm-ostree status
State: idle
Deployments:
● fedora-atomic:fedora-atomic/25/x86_64/docker-host
       Version: 25.33 (2016-10-23 10:17:49)
        Commit: 049beb1175ad5763df5ba345c9f062fc56f5fd70c8f1d6a4f4c3cd2882192ae8
        OSName: fedora-atomic
-bash-4.3# ls -l /proc/659/fd | grep journal
lrwx------. 1 root root 64 Oct 24 16:15 23 -> /var/log/journal/9c797dc6c9a24adcbc1d88cf0cf932a9/system.journal
lrwx------. 1 root root 64 Oct 24 16:15 29 -> /var/log/journal/9c797dc6c9a24adcbc1d88cf0cf932a9/user-1000.journal


Note You need to log in before you can comment on or make changes to this bug.