Bug 963182

Summary: [rfe] Please make abrt activated on demand instead of continously running
Product: [Fedora] Fedora Reporter: Lennart Poettering <lpoetter>
Component: abrtAssignee: abrt <abrt-devel-list>
Status: NEW --- QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: low Docs Contact:
Priority: unspecified    
Version: rawhideCC: abrt-devel-list, awilliam, dvlasenk, iprikryl, jay.hilliard, laurent.rineau__fedora, msuchy
Target Milestone: ---Keywords: FutureFeature, Triaged
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 963210    

Description Lennart Poettering 2013-05-15 10:57:31 UTC
It would be great if you guys could turn abrt into something that is auto-spawned as soon as the first crash happens rather than all the time. This would help us bringing down boot-times and resource usage of Fedora.

systemd allows auto-activation of abrt (or its components) when a file shows up in the fs (via .path units), when a client requests it on the bus or on a socket.

If you need any hints how to implement this best, I am happy to help.

Comment 1 Jakub Filak 2013-05-15 13:40:35 UTC
ABRT installs these services:

* abrt-oops.service - ABRT kernel log watcher
* abrt-xorg.service - ABRT Xorg log watcher
    - watch /var/log/messages and if a suspicious string is found, spawns a handler process

* abrt-ccpp.service - Install ABRT coredump hook
    - replaces /proc/sys/kernel/core_pattern

* abrt-vmcore.service - Harvest vmcores for ABRT
    - can be omitted if /var/crash directory is empty

* abrt-uefioops.service - Collect UEFI-saved oopses for ABRT
    - can be omitted if /sys/fs/pstore directory is empty

* abrtd.service - ABRT Automated Bug Reporting Tool
    - provides two ways of communication:
        1. processes arbitrary data created by clients in /var/tmp/abrt directory
        2. accepts data through /var/run/abrt/abrt.socket and stores them in /var/tmp/abrt
    - ensures that /var/tmp/abrt directory exists


abrt-oops, abrt-xorg and abrt-ccpp must be started at boot time and continuously run all the time but they do not consume many resources.

I see room for improvement only in abrt-vmcore, abrt-uefioops and abrtd. In order to start abrtd on demand we probably have to switch abrtd to "socket mode only" and drop the support of arbitrary data.

While the changes in the first two services will probably be trivial, the changes in abrtd will affect entire abrt environment.

Comment 2 Lennart Poettering 2013-05-15 18:34:44 UTC
(In reply to comment #1)
> ABRT installs these services:
> 
> * abrt-oops.service - ABRT kernel log watcher
> * abrt-xorg.service - ABRT Xorg log watcher
>     - watch /var/log/messages and if a suspicious string is found, spawns a
> handler process

Does the latter actually check /var/log/messages? If it watches /var/log/Xorg.* then it might make sense to conditionalize this with ConditionGlobExists=/var/log/Xorg.*.log

> * abrt-ccpp.service - Install ABRT coredump hook
>     - replaces /proc/sys/kernel/core_pattern

Any chance to simply replace this with a drop-in file in /usr/lib/sysctl.d? See sysctl.d(5) for more information.
 
> * abrt-vmcore.service - Harvest vmcores for ABRT
>     - can be omitted if /var/crash directory is empty

"ConditionDirectoryNotEmpty=/var/crash" could be used for this. Or do you need to start this as soon as something appears there? In that case use a .path unit with DirectoryNotEmpty=/var/cache.

> * abrt-uefioops.service - Collect UEFI-saved oopses for ABRT
>     - can be omitted if /sys/fs/pstore directory is empty

"ConditionDirectoryNotEmpty=/sys/fs/pstore" sounds like the simplest solution.

> * abrtd.service - ABRT Automated Bug Reporting Tool
>     - provides two ways of communication:
>         1. processes arbitrary data created by clients in /var/tmp/abrt
> directory

For this one, please add a .path unit with DirectoryNotEmpty=/var/tmp/abrt or so.

>         2. accepts data through /var/run/abrt/abrt.socket and stores them in
> /var/tmp/abrt

This one could be made socket-activatable nicely, I guess. Please have a look at:

http://0pointer.de/blog/projects/socket-activation.html

The patch necessary is usually just a few lines and adding support for this does not break support for non-socket activation.

>     - ensures that /var/tmp/abrt directory exists

For this, it should be sufficient to drop in a snippet into /usr/lib/tmpfiles.d/ that creates this at early boot.

> abrt-oops, abrt-xorg and abrt-ccpp must be started at boot time and
> continuously run all the time but they do not consume many resources.

Well, it's all these little things that cost quite a bit alltogether, since they mean seeks, and so on...
 
> I see room for improvement only in abrt-vmcore, abrt-uefioops and abrtd. In
> order to start abrtd on demand we probably have to switch abrtd to "socket
> mode only" and drop the support of arbitrary data.

Getting the socket from systemd instead of creating it directly doesn't really change much in the semantics really. You get the same socket, just somebody else created it for you.
 
> While the changes in the first two services will probably be trivial, the
> changes in abrtd will affect entire abrt environment.

It's not invasive. Basically, instead of doing socket()+bind()+listen() yourself unconditionally, you first check with sd_listen_fds() if systemd already passed you a socket. If so, you take it, otherwise you execute socket()+bind()+listen() as before. Usually something like this results in patches < 10lines...

Comment 3 Lennart Poettering 2013-05-15 19:28:27 UTC
*** Bug 963184 has been marked as a duplicate of this bug. ***

Comment 4 Lennart Poettering 2013-05-16 18:45:34 UTC
Also see Adam's findings:

https://lists.fedoraproject.org/pipermail/devel/2013-May/182850.html

Please, remove that sleep loop. We really shouldn't do things like that anymore. We can boot entire systems including BIOS, kernel and initrd and < 4s and you guys add a loop that sleeps at least 1s! Not cool that!

Comment 5 Adam Williamson 2013-05-16 19:27:21 UTC
I've got a trivial patch to make abrt-uefioops.service conditional, I just need to test that it works as expected and then I'll send it.

Comment 6 Adam Williamson 2013-05-16 20:06:35 UTC
Patch for abrt-uefioops.service submitted:

https://lists.fedorahosted.org/pipermail/crash-catcher/2013-May/004616.html

I tested at least that it prevents the service starting and wasting 1 second when /sys/fs/pstore is empty, and the failure condition looks correct (i.e. it's failing because the condition is not satisfied, it's not failing because I broke the service). It seems impossible to create any kind of dummy data in /sys/fs/pstore manually to test if the service still works correctly if there's actual crash data in there, though.

<adamw> mjg59: is there any way I can get any kind of dummy data into /sys/fs/pstore ?
<adamw> in a VM ideally
<mjg59> No
<mjg59> Because you probably don't have an efi VM and also the feature is currently disabled

Comment 7 Jakub Filak 2013-05-16 20:17:15 UTC
(In reply to comment #6)
> Patch for abrt-uefioops.service submitted:

Thank you! I was about to push this patch and similar patches for abrt-oops.service and abrt-xorg.service. I'm really sorry, but I can't do it right now. I hope it can wait a few hours since.

Comment 8 Adam Williamson 2013-05-16 20:20:28 UTC
Yes, there's no urgency. It won't go into F19 Beta anyhow.

Comment 9 Jakub Filak 2013-05-17 08:10:46 UTC
Thank you for your help! The trivial part has been merged upstream.

commit 6c040f78a7031c031c8ae4bcb17fb0d713d90af5
Author: Jakub Filak <jfilak>
Date:   Thu May 16 18:03:03 2013 +0200

    systemd units: start services only if it make sense
    
    Thanks Lennart Poettering <lpoetter>
    
    Related to rhbz#963182
    
    Signed-off-by: Jakub Filak <jfilak>

commit 870f4c0ba64b008c0e5b11e22a3cb20ce14ced2c
Author: Adam Williamson <awilliam>
Date:   Thu May 16 13:10:46 2013 -0700

    abrt-harvest-uefioops.in: test for abrtd after testing for pstore, not before
    
    If I'm reading this script right, this seems to make sense. At least this way, if /sys/fs/pstore does not exist at all, we'll exit 0 before we sleep for 1 second for no good reason.
    
    Signed-off-by: Jakub Filak <jfilak>

commit 858f1ba4a0a2c96e34660f118e57b73425f0d334
Author: Adam Williamson <awilliam>
Date:   Thu May 16 12:58:40 2013 -0700

    make abrt-uefioops.service conditional on /sys/fs/pstore being populated
    
    Signed-off-by: Jakub Filak <jfilak>

Comment 10 Fedora End Of Life 2015-01-09 22:07:12 UTC
This message is a notice that Fedora 19 is now at end of life. Fedora 
has stopped maintaining and issuing updates for Fedora 19. It is 
Fedora's policy to close all bug reports from releases that are no 
longer maintained. Approximately 4 (four) weeks from now this bug will
be closed as EOL if it remains open with a Fedora 'version' of '19'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 19 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged  change the 'version' to a later Fedora 
version prior this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 11 Adam Williamson 2015-01-09 22:17:51 UTC
Lennart, Jakub, do you think this bug as it stands makes sense any more? Should we bump it to Rawhide as an RFE, or close?

Comment 12 Jakub Filak 2015-01-12 12:38:27 UTC
(In reply to Adam Williamson (Red Hat) from comment #11)
> Lennart, Jakub, do you think this bug as it stands makes sense any more?
> Should we bump it to Rawhide as an RFE, or close?

Once we switch from dumping core files on our own to systemd-coredumpctl, we can make abrt completely socket activated.

I have spent some time on this task and I have several patches for ABRT ready on a development branch [1].


1: https://github.com/abrt/abrt/commits/socket_activation