Bug 2128662 - Abrt does not report a segfault which is reported in journalctl.
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: abrt
Version: 37
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Michal Srb
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard: RejectedBlocker AcceptedFreezeException
Depends On:
Blocks: F37FinalFreezeException 2211640
Reported: 2022-09-21 11:00 UTC by Lukas Ruzicka
Modified: 2023-06-01 10:14 UTC (History)

Fixed In Version: abrt-2.15.1-5.fc38 abrt-2.15.1-5.fc37 abrt-2.15.1-6.fc38 abrt-2.15.1-6.fc37
Clone Of:
: 2211640 (view as bug list)
Environment:
Last Closed: 2022-10-24 17:50:46 UTC
Type: Bug
Embargoed:


Attachments
system journal (307.61 KB, text/plain)
2022-09-21 12:50 UTC, Kamil Páral
rpm -qa output (62.32 KB, text/plain)
2022-09-21 12:50 UTC, Kamil Páral
System journal (new occurence) (1.93 MB, text/plain)
2022-10-18 12:36 UTC, Lukas Ruzicka

Description Lukas Ruzicka 2022-09-21 11:00:55 UTC
Description of problem:

Abrt does not show any notification, and it neither lists nor reports a simulated segfault on Fedora 37 pre-Final (20220920).

The segmentation fault is correctly displayed in journalctl, which means that the system knows about it. 

Version-Release number of selected component (if applicable):
abrt-2.15.1-4
kernel 5.19.9-300

How reproducible:
Always

Steps to Reproduce:
1. Install the `will-crash` package that simulates various crashes.
2. Run `will_segfault` to simulate a segfault.
3. Wait for notifications -> none will come in 120 seconds
4. Open Abrt -> the segfault will not be displayed there.
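
For anyone retesting, steps 2 to 4 can be sketched as a small shell check. This is only a sketch: the helper name `abrt_lists_component` is hypothetical (it is not part of abrt), and the parsing assumes that `abrt ls` prints one `Component` line per problem in the two-column layout quoted in comment 2 of this bug.

```shell
#!/bin/sh
# abrt_lists_component NAME
# Hypothetical helper: reads `abrt ls` output on stdin and succeeds if any
# listed problem has a "Component" line whose value is NAME.
# Assumes the "Component    will-crash" layout shown in this report.
abrt_lists_component() {
    awk -v want="$1" '
        $1 == "Component" && $2 == want { found = 1 }
        END { exit !found }
    '
}

# Intended use on a test machine with will-crash and abrt installed:
#   will_segfault
#   sleep 120
#   abrt ls | abrt_lists_component will-crash || echo "ABRT missed the crash"
```

The helper only inspects text, so it can also be run against a saved copy of the `abrt ls` output.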

Actual results:
Abrt might not be able to catch system issues which means that it cannot report them either.

Expected results:
Abrt should catch the issue, notify about it, show it in the application, and allow the user to report it (or trace it locally).

Additional info:

Journalctl has the info about the segfault:
---
Sep 21 12:41:09 platypus systemd-coredump[46119]: Process 46117 (will_segfault) of user 1000 dumped core.
                                                  
                                                  Module linux-vdso.so.1 with build-id 6daaf8b06a8e9606d0faa8151c085374ea451a22
                                                  Module ld-linux-x86-64.so.2 with build-id 653dfb54d6e6d9c27c349f698a8af1ab86d5501d
                                                  Module libc.so.6 with build-id a6572cd46182057d3dbacf1685a12edab0e2eda1
                                                  Module libwillcrash.so with build-id d17a8ed3a0098089a90ecaa1fd2c2dd21a85341f
                                                  Metadata for module libwillcrash.so owned by FDO found: {
                                                          "type" : "rpm",
                                                          "name" : "will-crash",
                                                          "version" : "0.13.5-2.fc37",
                                                          "architecture" : "x86_64",
                                                          "osCpe" : "cpe:/o:fedoraproject:fedora:37"
                                                  }
                                                  
                                                  Module will_segfault with build-id fc7b52fa0e3611a64622a30d94b5bdd4a86e4d9b
                                                  Metadata for module will_segfault owned by FDO found: {
                                                          "type" : "rpm",
                                                          "name" : "will-crash",
                                                          "version" : "0.13.5-2.fc37",
                                                          "architecture" : "x86_64",
                                                          "osCpe" : "cpe:/o:fedoraproject:fedora:37"
                                                  }
                                                  
                                                  Stack trace of thread 46117:
                                                  #0  0x000055abb5664262 crash.constprop.0 (will_segfault + 0x1262)
                                                  #1  0x000055abb5664335 varargs (will_segfault + 0x1335)
                                                  #2  0x000055abb566436e f (will_segfault + 0x136e)
                                                  #3  0x000055abb566439d callback (will_segfault + 0x139d)
                                                  #4  0x00007ff0dca8113d call_me_back (libwillcrash.so + 0x113d)
                                                  #5  0x000055abb5664232 recursive.constprop.0 (will_segfault + 0x1232)
                                                  #6  0x000055abb56640f6 main (will_segfault + 0x10f6)
                                                  #7  0x00007ff0dc8b7510 __libc_start_call_main (libc.so.6 + 0x23510)
                                                  #8  0x00007ff0dc8b75c9 __libc_start_main@@GLIBC_2.34 (libc.so.6 + 0x235c9)
                                                  #9  0x000055abb5664155 _start (will_segfault + 0x1155)
                                                  ELF object binary architecture: AMD x86-64
Sep 21 12:41:09 platypus systemd[1]: systemd-coredump: Deactivated successfully.
---

Comment 1 Fedora Blocker Bugs Application 2022-09-21 11:05:36 UTC
Proposed as a Blocker for 37-final by Fedora user lruzicka using the blocker tracking app because:

 In my opinion, this violates the "Basic Functionality Criterion" because Abrt is useless at the moment.

Comment 2 Michal Srb 2022-09-21 11:24:17 UTC
Thanks for the bug report. I just tested this in a fully-updated f37 virtual machine:

[msrb@fedora ~]$ will_segfault
Will segfault.
Segmentation fault (core dumped)
[msrb@fedora ~]$ abrt ls
Id           cec4c37  
Component    will-crash  
Count        1  
Time         2022-09-21 13:20:18  
User id      1000 (msrb)  
Reported to    
ABRT Server  https://retrace.fedoraproject.org/faf/reports/bthash/2b650f55015e773bd65e11cd4a3dee6cfeb328df
[msrb@fedora ~]$ cat /etc/os-release 
NAME="Fedora Linux"
VERSION="37 (Workstation Edition Prerelease)"

I got the notification immediately.
Note this was a fresh install, not an upgrade from f36.

Could you please double-check that you have the latest-greatest packages installed?

Was this an upgrade from f36?

Comment 3 Kamil Páral 2022-09-21 12:49:31 UTC
I tested in a fully updated F37 (using updates-testing, clean installed some weeks back) and I can reproduce the issue. ABRT doesn't detect any crashes (tested will_segfault, will_abort, will_cpp_segfault and manually killing gnome-calculator with SIGABRT). The journal doesn't contain any messages from abrt. Coredumpctl shows all the crashes, though. All abrt* systemd services are shown as running. Attaching logs.

Comment 4 Kamil Páral 2022-09-21 12:50:06 UTC
Created attachment 1913310 [details]
system journal

Comment 5 Kamil Páral 2022-09-21 12:50:14 UTC
Created attachment 1913311 [details]
rpm -qa output

Comment 6 Michal Srb 2022-09-21 13:38:23 UTC
Thanks! Could you please check the following:

# cat /var/lib/abrt/abrt-dump-journal-core.state
# journalctl --cursor-file=/var/lib/abrt/abrt-dump-journal-core.state

The file should contain a valid journal cursor, but in case there is something wrong with it, you can try to remove the file and let abrt recreate it the next time a crash happens.
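
The state-file check above can be sketched as a small shell helper. It assumes the usual textual form of a journal cursor (semicolon-separated `s=`, `i=`, `b=`, `m=`, `t=` and `x=` fields with hexadecimal values). The function name `looks_like_cursor` is made up for illustration, and a match only means the file is plausibly well-formed, not that journald will accept it.

```shell
#!/bin/sh
# looks_like_cursor FILE
# Hypothetical helper: succeeds if FILE is non-empty and its first line
# resembles a journal cursor. The field layout in the pattern is an
# assumption about the cursor format, not something abrt guarantees.
looks_like_cursor() {
    [ -s "$1" ] || return 1
    head -n 1 "$1" | grep -Eq \
        '^s=[0-9a-f]+;i=[0-9a-f]+;b=[0-9a-f]+;m=[0-9a-f]+;t=[0-9a-f]+;x=[0-9a-f]+$'
}

# Intended use (as root, path taken from the comment above):
#   looks_like_cursor /var/lib/abrt/abrt-dump-journal-core.state \
#       || echo "state file is empty or does not look like a cursor"
```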

Comment 7 Kamil Páral 2022-09-21 14:14:19 UTC
Hmm, so, I used a VM before (cleanly booted). When I started it again now, ABRT notifications work perfectly. I restarted it a few times, it still works fine. So perhaps some race condition?

But, I also saw this problem on my laptop. That wasn't restarted since then, and I can still reproduce the problem. The journalctl cursor value in /var/lib/abrt/abrt-dump-journal-core.state seems to only change when I run journalctl --cursor-file=/var/lib/abrt/abrt-dump-journal-core.state. Otherwise it stays the same. On the other hand, in the VM where things work now, the value changes when I run journalctl --cursor-file=/var/lib/abrt/abrt-dump-journal-core.state or when a crash happens.
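
The comparison described above (does the stored cursor change after a crash?) can be sketched by snapshotting the state file before the crash and comparing afterwards. The helper name `state_changed` and the `/tmp/state.before` path are hypothetical; only the state-file path comes from this bug.

```shell
#!/bin/sh
# state_changed BEFORE AFTER
# Hypothetical helper: succeeds if the two files differ.
# cmp -s prints nothing and exits 0 only when the files are identical,
# so a missing file also counts as "changed".
state_changed() {
    if cmp -s "$1" "$2"; then
        return 1
    fi
    return 0
}

# Intended use (as root) on a machine with will-crash installed:
#   cp /var/lib/abrt/abrt-dump-journal-core.state /tmp/state.before
#   will_segfault; sleep 5
#   state_changed /tmp/state.before /var/lib/abrt/abrt-dump-journal-core.state \
#       || echo "cursor did not advance after the crash"
```

Reading the file with `cp` rather than `journalctl --cursor-file=...` avoids updating the cursor as a side effect of the check.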

Comment 8 Michal Srb 2022-09-21 14:58:29 UTC
This is great feedback -- thanks!

My VM has been restarted as well, so I will create a fresh one and observe. I will keep you guys posted here.

Comment 9 Lukas Ruzicka 2022-09-21 16:28:14 UTC
Hello, 
mine was a fresh installation of the latest compose run in the virtual machine.
I am experiencing the same problem with my laptop that runs the latest greatest packages, but is an upgrade from Fedora 36.

Comment 10 Michal Srb 2022-09-21 17:42:53 UTC
Would it be possible to check whether a simple "# systemctl restart abrt-journal-core" fixes the problem on the system?

Comment 11 Lukas Ruzicka 2022-09-21 18:04:25 UTC
Of course.
I have restarted the abrt-journal-core service and Abrt is working correctly now.

Thanks.

Comment 12 Michal Srb 2022-09-21 18:10:09 UTC
Dammit. Thank you :) That explains why it somehow fixes itself after restart. But I still don't know what might be triggering the problem after clean installation or upgrade from previous version...

Comment 13 Adam Williamson 2022-09-23 15:50:25 UTC
+4 in https://pagure.io/fedora-qa/blocker-review/issue/915 , marking accepted.

Comment 14 Kamil Páral 2022-09-27 15:00:00 UTC
I tried to reproduce this issue today and I have to say I'm currently unable to. I restarted an existing system numerous times, but ABRT was intercepting crashes on every boot. I also installed a completely fresh Workstation multiple times, and ABRT was again working OK after each installation. I have no clue how to help trigger this (quite possibly) race condition.

Comment 15 Lukas Ruzicka 2022-09-27 15:55:05 UTC
(In reply to Kamil Páral from comment #14)
> I tried to reproduce this issue today and I have to say I'm currently unable
> to. I restarted an existing system numerous times, but ABRT was intercepting
> crashes on every boot. I also installed a completely fresh Workstation
> multiple times, and ABRT was again working OK after each installation. I
> have no clue how to help trigger this (quite possibly) race condition.

I did not focus on this specifically, but I used the above workflow to test Abrt on several freshly installed VMs, and it always worked fine. Maybe this has somehow magically gone away? Maybe some other update fixed the race?

Comment 16 Kamil Páral 2022-10-10 10:05:47 UTC
Unfortunately I saw this again yesterday. I caused Nautilus to crash twice, coredumpctl saw the crashes but ABRT did not. After restarting abrtd.service and causing Nautilus to crash again, ABRT finally detected the crash.

Comment 17 Kamil Páral 2022-10-10 10:07:01 UTC
Michal, were you able to look into this? Find a probable cause for this? How can we provide better debug info? Thanks!

Comment 18 Michal Srb 2022-10-11 07:57:52 UTC
Yes, I've been looking into this and I found a reproducible scenario. I upgraded 2 laptops and 1 VM from f36 to f37 recently and unfortunately none of the machines suffered from the problem, but I observed the same issue happening in Rawhide (f38). Fix incoming (later today/tomorrow morning).

Comment 19 Adam Williamson 2022-10-11 14:47:26 UTC
Thanks! Any chance still for today? We could....kind of...do a candidate compose if we get this fix, I guess.

Comment 20 Fedora Update System 2022-10-12 08:27:24 UTC
FEDORA-2022-7e53c89754 has been submitted as an update to Fedora 38. https://bodhi.fedoraproject.org/updates/FEDORA-2022-7e53c89754

Comment 21 Fedora Update System 2022-10-12 08:35:58 UTC
FEDORA-2022-7cf3cad7c7 has been submitted as an update to Fedora 37. https://bodhi.fedoraproject.org/updates/FEDORA-2022-7cf3cad7c7

Comment 22 Michal Srb 2022-10-12 08:38:01 UTC
Sorry! I didn't make it yesterday. The update is in Bodhi now.

Comment 23 Fedora Update System 2022-10-12 08:38:26 UTC
FEDORA-2022-7e53c89754 has been pushed to the Fedora 38 stable repository.
If the problem still persists, please make note of it in this bug report.

Comment 24 Adam Williamson 2022-10-12 08:50:00 UTC
Setting back to ON_QA as this is for F37.

Comment 25 Fedora Update System 2022-10-12 09:39:03 UTC
FEDORA-2022-7cf3cad7c7 has been pushed to the Fedora 37 testing repository.
Soon you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2022-7cf3cad7c7`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2022-7cf3cad7c7

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 26 Kamil Páral 2022-10-13 08:13:18 UTC
According to the patch description [1], this seems to be a race condition, so it's hard to verify. I guess we'll just push the update and reopen this bug if we still see it happening in the future.

[1] https://github.com/abrt/abrt/commit/0cd7e3d7faa0c36e6ef2ecdd1d599fe54e9e87be

Comment 27 Fedora Update System 2022-10-17 22:55:03 UTC
FEDORA-2022-7cf3cad7c7 has been pushed to the Fedora 37 stable repository.
If the problem still persists, please make note of it in this bug report.

Comment 28 Lukas Brabec 2022-10-18 11:56:11 UTC
This happened to me again on Raspberry Pi 4, F37 Final RC 1.1 with abrt-2.15.1-5.fc37

Comment 29 Lukas Ruzicka 2022-10-18 12:14:50 UTC
I can confirm that it also happened to me on a fresh install of Fedora WS, F37 Final RC 1.1 when will-crash was freshly installed and used.
The problem went away after a system restart.

Comment 30 Michal Srb 2022-10-18 12:17:41 UTC
Could you please share the journal log from the boot when this happened?

Comment 31 Lukas Ruzicka 2022-10-18 12:34:03 UTC
(In reply to Michal Srb from comment #30)
> Could you please share the journal log from the boot when this happened?

I am providing all journal logs from the affected machine as I have rebooted the machine several times and I am not sure in which cycle I attempted to simulate the crash. I believe that it was one of the first three boot cycles.

Comment 32 Lukas Ruzicka 2022-10-18 12:36:47 UTC
Created attachment 1918739 [details]
System journal (new occurence)

Comment 33 Michal Srb 2022-10-19 05:47:33 UTC
Thanks ;)

Indeed, I also see the problem on a freshly installed f37 RC. It seems like the problem goes away when:

* systemd-coredump reports a crash in the journal
* abrt-journal-core.service or the whole system is restarted

I.e. abrt-journal-core.service somehow doesn't see the first crash (or any crashes that happen before the first restart following them).

I see the same error in journal:

Oct 18 11:13:22 fedora abrt-dump-journal-core[792]: Cannot save journal watch's position

I am wondering if abrt-journal-core.service is somehow incorrectly initializing the journal cursor if there are no prior (valid) crashes in the journal.
I think I have an idea where the problem might be. Another fix incoming today.
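
To check whether a given boot hit the same error, the journal can be searched for the message quoted above. A minimal sketch, assuming the message text is exactly as logged in this comment; the helper name `count_cursor_save_errors` is hypothetical.

```shell
#!/bin/sh
# count_cursor_save_errors
# Hypothetical helper: reads journal text on stdin and prints how many
# lines contain the error quoted in this comment.
# grep -c exits non-zero when the count is 0, hence the "|| true".
count_cursor_save_errors() {
    grep -cF "Cannot save journal watch's position" || true
}

# Intended use for the current boot (identifier taken from the log line above):
#   journalctl -b -t abrt-dump-journal-core | count_cursor_save_errors
```

A non-zero count on a boot where crashes were missed would be consistent with the cursor-initialization theory, though it does not prove it.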

Comment 34 Fedora Update System 2022-10-19 14:43:43 UTC
FEDORA-2022-dcd00ab021 has been submitted as an update to Fedora 38. https://bodhi.fedoraproject.org/updates/FEDORA-2022-dcd00ab021

Comment 35 Fedora Update System 2022-10-19 14:46:10 UTC
FEDORA-2022-7fb34453de has been submitted as an update to Fedora 37. https://bodhi.fedoraproject.org/updates/FEDORA-2022-7fb34453de

Comment 36 Fedora Update System 2022-10-19 14:50:26 UTC
FEDORA-2022-dcd00ab021 has been pushed to the Fedora 38 stable repository.
If the problem still persists, please make note of it in this bug report.

Comment 37 Adam Williamson 2022-10-19 18:10:05 UTC
I'm gonna kick this back to proposed blocker as it seems like our experience of it is quite different now from when it was accepted as a blocker (it's intermittent and goes away on a reboot). Please vote again in https://pagure.io/fedora-qa/blocker-review/issue/915

Comment 38 Lukas Ruzicka 2022-10-21 09:24:21 UTC
(In reply to Fedora Update System from comment #35)
> FEDORA-2022-7fb34453de has been submitted as an update to Fedora 37.
> https://bodhi.fedoraproject.org/updates/FEDORA-2022-7fb34453de

I tried to install a fresh system and update Abrt to this build before I attempted to induce any crashes or issues.
With the updated version, even the very first issue was properly caught and could be reported. So I tend to believe that this fix helped.
Can anyone else confirm, @kparal?

Comment 39 Kamil Páral 2022-10-21 11:10:24 UTC
The problem was not about the first boot; it happened randomly on any boot (at least for me). So the fix can't be easily verified right away, only after some time. I tested the update and it worked, which means it's surely worth a try to push it stable.

Comment 40 Fedora Update System 2022-10-21 14:40:41 UTC
FEDORA-2022-7fb34453de has been pushed to the Fedora 37 testing repository.
Soon you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2022-7fb34453de`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2022-7fb34453de

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 41 Adam Williamson 2022-10-21 18:26:53 UTC
We have -7 blocker / +3 FE in https://pagure.io/fedora-qa/blocker-review/issue/915 , marking rejected blocker, accepted FE.

Comment 42 Fedora Update System 2022-10-24 17:50:46 UTC
FEDORA-2022-7fb34453de has been pushed to the Fedora 37 stable repository.
If the problem still persists, please make note of it in this bug report.

